Posted to user@spark.apache.org by Julien Carme <ju...@gmail.com> on 2016/01/20 18:42:06 UTC

Using Spark, SparkR and Ranger, please help.

Hello,

I have been able to use Spark with Apache Ranger: I added the right
configuration files to the Spark conf directory and the Ranger jars to the
classpath, and it works; Spark complies with Ranger rules when I access Hive tables.
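
Concretely, the launch looks roughly like this (the jar path is a
placeholder, not my exact layout; hive-site.xml and the Ranger plugin XMLs
sit in $SPARK_HOME/conf):

# placeholder path; the Ranger plugin jars must be on the driver classpath
spark-shell --driver-class-path "/path/to/ranger/lib/*"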

However, with SparkR it does not work, which is rather surprising
considering SparkR is supposed to be just a layer over Spark. I don't
understand why SparkR seems to behave differently; maybe I am just missing
something.

So in Spark, when I do:

sqlContext.sql("show databases").collect()

it works: I get all my Hive databases.

But SparkR does not behave the same way.

When I do:

sql(sqlContext,"show databases")

...
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier show found
...

From the documentation, it seems that I need to instantiate a HiveContext.

hiveContext <- sparkRHive.init(sc)
sql(hiveContext, "show databases")

...
16/01/20 18:37:20 ERROR RBackendHandler: sql on 2 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  java.lang.AssertionError: Authorization plugins not initialized!
        at org.apache.hadoop.hive.ql.session.SessionState.getAuthorizationMode(SessionState.java:1511)
        at org.apache.hadoop.hive.ql.session.SessionState.isAuthorizationModeV2(SessionState.java:1515)
        at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:566)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:484)
        at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:473)
        at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.appl
...
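
In case it helps, the full SparkR sequence I am describing, as a minimal
sketch (the SparkR shell pre-creates sc; the init call is shown only to make
the example self-contained):

library(SparkR)

# Not needed in the SparkR shell, where sc already exists.
sc <- sparkR.init(appName = "ranger-test")

# Hive statements such as SHOW DATABASES need a HiveContext; the plain
# sqlContext only uses the basic Spark SQL parser, hence the earlier
# "``with'' expected" parse error.
hiveContext <- sparkRHive.init(sc)

collect(sql(hiveContext, "show databases"))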

Any help would be appreciated.

Regards,

Julien

Re: Using Spark, SparkR and Ranger, please help.

Posted by Ted Yu <yu...@gmail.com>.
The tail of the stack trace seems to be chopped off.

Can you include the whole trace?

Which version of Spark / Hive / Ranger are you using?

Cheers
