You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by Ruslan Dautkhanov <da...@gmail.com> on 2016/01/18 07:04:17 UTC

Spark + Sentry + Kerberos don't add up?

Getting following error stack

The Spark session could not be created in the cluster:
>     at org.apache.hadoop.security.*UserGroupInformation.doAs*
> (UserGroupInformation.java:1671)
>     at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:160)
>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) )
>     at org.*apache.hadoop.hive.metastore.HiveMetaStoreClient*
> .open(HiveMetaStoreClient.java:466)
>     at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:234)
>     at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>     ... 35 more


My understanding that hive.server2.enable.impersonation and
hive.server2.enable.doAs should be enabled to make
UserGroupInformation.doAs() work?

When I try to enable these parameters, Cloudera Manager shows error

Hive Impersonation is enabled for Hive Server2 role 'HiveServer2
> (hostname)'.
> Hive Impersonation should be disabled to enable Hive authorization using
> Sentry


So Spark-Hive conflicts with Sentry!?

Environment: Hue 3.9 Spark Notebooks + Livy Server (built from master). CDH
5.5.

This is a kerberized cluster with Sentry.

I was using hue's keytab as hue user is normally (by default in CDH) is
allowed to impersonate to other users.
So very convenient for Spark Notebooks.

Any information to help solve this will be highly appreciated.


-- 
Ruslan Dautkhanov

Re: Spark + Sentry + Kerberos don't add up?

Posted by Ruslan Dautkhanov <da...@gmail.com>.
I took liberty and created a JIRA https://github.com/cloudera/livy/issues/36
Feel free to close it if doesn't belong to Livy project.
I really don't know if this is a Spark or a Livy/Sentry problem.

Any ideas for possible workarounds?

Thank you.



-- 
Ruslan Dautkhanov

On Mon, Jan 18, 2016 at 4:25 PM, Ruslan Dautkhanov <da...@gmail.com>
wrote:

> Hi Romain,
>
> Thank you for your response.
>
> Adding Kerberos support might be as simple as
> https://issues.cloudera.org/browse/LIVY-44 ? I.e. add Livy --principal
> and --keytab parameters to be passed to spark-submit.
>
> As a workaround I just did kinit (using hues' keytab) and then launched
> Livy Server. It probably will work as long as kerberos ticket doesn't
> expire. That's it would be great to have support for --principal and
> --keytab parameters for spark-submit as explined in
> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_sg_yarn_long_jobs.html
>
>
> The only problem I have currently is the above error stack in my previous
> email:
>
> The Spark session could not be created in the cluster:
>>     at org.apache.hadoop.security.*UserGroupInformation.doAs*(
>> UserGroupInformation.java:1671)
>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
>> SparkSubmit.scala:160)
>
>
>
> >> AFAIK Hive impersonation should be turned off when using Sentry
>
> Yep, exactly. That's what I did. It is disabled now. But looks like on
> other hand, Spark or Spark Notebook want to have that enabled?
> It tries to do org.apache.hadoop.security.UserGroupInformation.doAs()
> hence the error.
>
> So Sentry isn't compatible with Spark in kerberized clusters? Is any
> workaround for this problem?
>
>
> --
> Ruslan Dautkhanov
>
> On Mon, Jan 18, 2016 at 3:52 PM, Romain Rigaux <ro...@cloudera.com>
> wrote:
>
>> Livy does not support any Kerberos yet
>> https://issues.cloudera.org/browse/LIVY-3
>>
>> Are you focusing instead about HS2 + Kerberos with Sentry?
>>
>> AFAIK Hive impersonation should be turned off when using Sentry:
>> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/sg_sentry_service_config.html
>>
>> On Sun, Jan 17, 2016 at 10:04 PM, Ruslan Dautkhanov <dautkhanov@gmail.com
>> > wrote:
>>
>>> Getting following error stack
>>>
>>> The Spark session could not be created in the cluster:
>>>>     at org.apache.hadoop.security.*UserGroupInformation.doAs*
>>>> (UserGroupInformation.java:1671)
>>>>     at
>>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:160)
>>>>     at
>>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) )
>>>>     at org.*apache.hadoop.hive.metastore.HiveMetaStoreClient*
>>>> .open(HiveMetaStoreClient.java:466)
>>>>     at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:234)
>>>>     at
>>>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>>>>     ... 35 more
>>>
>>>
>>> My understanding that hive.server2.enable.impersonation and
>>> hive.server2.enable.doAs should be enabled to make
>>> UserGroupInformation.doAs() work?
>>>
>>> When I try to enable these parameters, Cloudera Manager shows error
>>>
>>> Hive Impersonation is enabled for Hive Server2 role 'HiveServer2
>>>> (hostname)'.
>>>> Hive Impersonation should be disabled to enable Hive authorization
>>>> using Sentry
>>>
>>>
>>> So Spark-Hive conflicts with Sentry!?
>>>
>>> Environment: Hue 3.9 Spark Notebooks + Livy Server (built from master).
>>> CDH 5.5.
>>>
>>> This is a kerberized cluster with Sentry.
>>>
>>> I was using hue's keytab as hue user is normally (by default in CDH) is
>>> allowed to impersonate to other users.
>>> So very convenient for Spark Notebooks.
>>>
>>> Any information to help solve this will be highly appreciated.
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Hue-Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to hue-user+unsubscribe@cloudera.org.
>>>
>>
>>
>

Re: Spark + Sentry + Kerberos don't add up?

Posted by Ruslan Dautkhanov <da...@gmail.com>.
Turns to be it is a Spark issue

https://issues.apache.org/jira/browse/SPARK-13478




-- 
Ruslan Dautkhanov

On Mon, Jan 18, 2016 at 4:25 PM, Ruslan Dautkhanov <da...@gmail.com>
wrote:

> Hi Romain,
>
> Thank you for your response.
>
> Adding Kerberos support might be as simple as
> https://issues.cloudera.org/browse/LIVY-44 ? I.e. add Livy --principal
> and --keytab parameters to be passed to spark-submit.
>
> As a workaround I just did kinit (using hues' keytab) and then launched
> Livy Server. It probably will work as long as kerberos ticket doesn't
> expire. That's it would be great to have support for --principal and
> --keytab parameters for spark-submit as explined in
> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_sg_yarn_long_jobs.html
>
>
> The only problem I have currently is the above error stack in my previous
> email:
>
> The Spark session could not be created in the cluster:
>>     at org.apache.hadoop.security.*UserGroupInformation.doAs*(
>> UserGroupInformation.java:1671)
>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
>> SparkSubmit.scala:160)
>
>
>
> >> AFAIK Hive impersonation should be turned off when using Sentry
>
> Yep, exactly. That's what I did. It is disabled now. But looks like on
> other hand, Spark or Spark Notebook want to have that enabled?
> It tries to do org.apache.hadoop.security.UserGroupInformation.doAs()
> hence the error.
>
> So Sentry isn't compatible with Spark in kerberized clusters? Is any
> workaround for this problem?
>
>
> --
> Ruslan Dautkhanov
>
> On Mon, Jan 18, 2016 at 3:52 PM, Romain Rigaux <ro...@cloudera.com>
> wrote:
>
>> Livy does not support any Kerberos yet
>> https://issues.cloudera.org/browse/LIVY-3
>>
>> Are you focusing instead about HS2 + Kerberos with Sentry?
>>
>> AFAIK Hive impersonation should be turned off when using Sentry:
>> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/sg_sentry_service_config.html
>>
>> On Sun, Jan 17, 2016 at 10:04 PM, Ruslan Dautkhanov <dautkhanov@gmail.com
>> > wrote:
>>
>>> Getting following error stack
>>>
>>> The Spark session could not be created in the cluster:
>>>>     at org.apache.hadoop.security.*UserGroupInformation.doAs*
>>>> (UserGroupInformation.java:1671)
>>>>     at
>>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:160)
>>>>     at
>>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) )
>>>>     at org.*apache.hadoop.hive.metastore.HiveMetaStoreClient*
>>>> .open(HiveMetaStoreClient.java:466)
>>>>     at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:234)
>>>>     at
>>>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>>>>     ... 35 more
>>>
>>>
>>> My understanding that hive.server2.enable.impersonation and
>>> hive.server2.enable.doAs should be enabled to make
>>> UserGroupInformation.doAs() work?
>>>
>>> When I try to enable these parameters, Cloudera Manager shows error
>>>
>>> Hive Impersonation is enabled for Hive Server2 role 'HiveServer2
>>>> (hostname)'.
>>>> Hive Impersonation should be disabled to enable Hive authorization
>>>> using Sentry
>>>
>>>
>>> So Spark-Hive conflicts with Sentry!?
>>>
>>> Environment: Hue 3.9 Spark Notebooks + Livy Server (built from master).
>>> CDH 5.5.
>>>
>>> This is a kerberized cluster with Sentry.
>>>
>>> I was using hue's keytab as hue user is normally (by default in CDH) is
>>> allowed to impersonate to other users.
>>> So very convenient for Spark Notebooks.
>>>
>>> Any information to help solve this will be highly appreciated.
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Hue-Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to hue-user+unsubscribe@cloudera.org.
>>>
>>
>>
>

Re: Spark + Sentry + Kerberos don't add up?

Posted by Ruslan Dautkhanov <da...@gmail.com>.
Hi Romain,

Thank you for your response.

Adding Kerberos support might be as simple as
https://issues.cloudera.org/browse/LIVY-44 ? I.e. add Livy --principal and
--keytab parameters to be passed to spark-submit.

As a workaround I just did kinit (using hues' keytab) and then launched
Livy Server. It probably will work as long as kerberos ticket doesn't
expire. That's it would be great to have support for --principal and
--keytab parameters for spark-submit as explined in
http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cm_sg_yarn_long_jobs.html


The only problem I have currently is the above error stack in my previous
email:

The Spark session could not be created in the cluster:
>     at org.apache.hadoop.security.*UserGroupInformation.doAs*(
> UserGroupInformation.java:1671)
>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:160)



>> AFAIK Hive impersonation should be turned off when using Sentry

Yep, exactly. That's what I did. It is disabled now. But looks like on
other hand, Spark or Spark Notebook want to have that enabled?
It tries to do org.apache.hadoop.security.UserGroupInformation.doAs() hence
the error.

So Sentry isn't compatible with Spark in kerberized clusters? Is any
workaround for this problem?


-- 
Ruslan Dautkhanov

On Mon, Jan 18, 2016 at 3:52 PM, Romain Rigaux <ro...@cloudera.com> wrote:

> Livy does not support any Kerberos yet
> https://issues.cloudera.org/browse/LIVY-3
>
> Are you focusing instead about HS2 + Kerberos with Sentry?
>
> AFAIK Hive impersonation should be turned off when using Sentry:
> http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/sg_sentry_service_config.html
>
> On Sun, Jan 17, 2016 at 10:04 PM, Ruslan Dautkhanov <da...@gmail.com>
> wrote:
>
>> Getting following error stack
>>
>> The Spark session could not be created in the cluster:
>>>     at org.apache.hadoop.security.*UserGroupInformation.doAs*
>>> (UserGroupInformation.java:1671)
>>>     at
>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:160)
>>>     at
>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) )
>>>     at org.*apache.hadoop.hive.metastore.HiveMetaStoreClient*
>>> .open(HiveMetaStoreClient.java:466)
>>>     at
>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:234)
>>>     at
>>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>>>     ... 35 more
>>
>>
>> My understanding that hive.server2.enable.impersonation and
>> hive.server2.enable.doAs should be enabled to make
>> UserGroupInformation.doAs() work?
>>
>> When I try to enable these parameters, Cloudera Manager shows error
>>
>> Hive Impersonation is enabled for Hive Server2 role 'HiveServer2
>>> (hostname)'.
>>> Hive Impersonation should be disabled to enable Hive authorization using
>>> Sentry
>>
>>
>> So Spark-Hive conflicts with Sentry!?
>>
>> Environment: Hue 3.9 Spark Notebooks + Livy Server (built from master).
>> CDH 5.5.
>>
>> This is a kerberized cluster with Sentry.
>>
>> I was using hue's keytab as hue user is normally (by default in CDH) is
>> allowed to impersonate to other users.
>> So very convenient for Spark Notebooks.
>>
>> Any information to help solve this will be highly appreciated.
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Hue-Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to hue-user+unsubscribe@cloudera.org.
>>
>
>