Posted to user@spark.apache.org by Marcelo Vanzin <va...@cloudera.com> on 2014/07/08 21:04:28 UTC

Re: error when spark access hdfs with Kerberos enable

Someone might be able to correct me if I'm wrong, but I don't believe
standalone mode supports Kerberos. You'd have to use YARN for that.

On Tue, Jul 8, 2014 at 1:40 AM, 许晓炜 <xu...@qiyi.com> wrote:
> Hi all,
>
>
>
> I ran into a strange issue when using Spark 1.0 to access HDFS with
> Kerberos enabled.
>
> I have a single Spark test node, and HADOOP_CONF_DIR is set to the
> location containing the HDFS configuration files (hdfs-site.xml and
> core-site.xml).
>
> When I use spark-shell in local mode, access to HDFS succeeds.
>
> However, when I use spark-shell connected to the standalone cluster (I
> configured Spark in standalone cluster mode with only one node), access
> to HDFS fails with the following error: “Can't get Master Kerberos
> principal for use as renewer”
>
>
>
> Does anyone have any ideas on this?
>
> Thanks a lot.
>
>
>
> Regards,
> Xiaowei
>
>



-- 
Marcelo
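
For readers following the advice above, here is a minimal sketch of what launching spark-shell on YARN against a Kerberos-secured HDFS might look like. It is not taken from the thread: the function names, the principal, and the configuration directory are hypothetical placeholders, and it assumes a Spark 1.0-era installation where `--master yarn-client` selects YARN client mode (later releases use a different form). Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: the principal and config directory passed in are placeholders.

# Run a command, or just print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: get a Kerberos ticket, then start spark-shell on YARN.
launch_spark_shell_on_yarn() {
  principal="$1"   # e.g. alice@EXAMPLE.COM (hypothetical)
  conf_dir="$2"    # directory holding core-site.xml, hdfs-site.xml, yarn-site.xml

  # Spark picks up the cluster configuration from HADOOP_CONF_DIR.
  export HADOOP_CONF_DIR="$conf_dir"

  # Obtain a Kerberos ticket for the submitting user; stop if that fails.
  run kinit "$principal" || return 1

  # "yarn-client" is the Spark 1.0-era master URL for YARN client mode.
  run ./bin/spark-shell --master yarn-client
}
```

Calling `DRY_RUN=1 launch_spark_shell_on_yarn alice@EXAMPLE.COM /etc/hadoop/conf` shows the two commands without touching a cluster; drop DRY_RUN to run them for real from the Spark installation directory.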

Re: error when spark access hdfs with Kerberos enable

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Cheney,

I haven't heard of anybody deploying non-secure YARN on top of secure HDFS.
It's conceivable that you might be able to get it to work, but my guess is that
you'd run into some issues. Also, without authentication turned on in YARN, you
could be leaving your HDFS tokens exposed, and others could steal them and use
them to get to your data.

-Sandy


On Tue, Jul 8, 2014 at 7:28 PM, Cheney Sun <su...@gmail.com> wrote:

> Hi Sandy,
>
> We are also going to read data from a security-enabled (Kerberos) HDFS
> in our Spark application. Per your answer, we have to switch to Spark on
> YARN to achieve this.
> We plan to deploy a separate Hadoop cluster (with YARN) only to run Spark.
> Is it necessary to deploy YARN with security enabled? Or is it possible to
> access data in a secure HDFS from a non-secure Spark on YARN deployment?

Re: error when spark access hdfs with Kerberos enable

Posted by Cheney Sun <su...@gmail.com>.
Hi Sandy,

We are also going to read data from a security-enabled (Kerberos) HDFS
in our Spark application. Per your answer, we have to switch to Spark on YARN
to achieve this.
We plan to deploy a separate Hadoop cluster (with YARN) only to run Spark.
Is it necessary to deploy YARN with security enabled? Or is it possible to
access data in a secure HDFS from a non-secure Spark on YARN deployment?


On Wed, Jul 9, 2014 at 4:19 AM, Sandy Ryza <sa...@cloudera.com> wrote:

> That's correct.  Only Spark on YARN supports Kerberos.
>
> -Sandy

RE: error when spark access hdfs with Kerberos enable

Posted by 许晓炜 <xu...@qiyi.com>.
Thanks a lot, Marcelo and Sandy. I will try Spark on YARN.

Xiaowei

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Wednesday, July 09, 2014 4:20 AM
To: user@spark.apache.org
Subject: Re: error when spark access hdfs with Kerberos enable

That's correct.  Only Spark on YARN supports Kerberos.

-Sandy


Re: error when spark access hdfs with Kerberos enable

Posted by Sandy Ryza <sa...@cloudera.com>.
That's correct.  Only Spark on YARN supports Kerberos.

-Sandy


On Tue, Jul 8, 2014 at 12:04 PM, Marcelo Vanzin <va...@cloudera.com> wrote:

> Someone might be able to correct me if I'm wrong, but I don't believe
> standalone mode supports Kerberos. You'd have to use YARN for that.