Posted to issues@spark.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2015/10/22 22:17:27 UTC

[jira] [Commented] (SPARK-11265) YarnClient cant get tokens to talk to Hive in a secure cluster

    [ https://issues.apache.org/jira/browse/SPARK-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969818#comment-14969818 ] 

Steve Loughran commented on SPARK-11265:
----------------------------------------

Initial report from Chester Chen:

{noformat}

  This was tested against:

   Spark 1.5.1 (branch 1.5, labelled 1.5.2-SNAPSHOT, at commit 84f510c4fa06e43bd35e2dc8e1008d0590cbe266 from Tue Oct 6)

   Spark deployment mode: Spark-Cluster

   With Kerberos enabled, the Spark YARN client fails with the following:

Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$obtainTokenForHiveMetastore(Client.scala:1252)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:271)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:629)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:119)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:907)


Digging into the YARN Client.scala code and testing against different dependencies, I noticed the following: if Kerberos is enabled, Client.obtainTokenForHiveMetastore() uses Scala reflection to load the Hive and HiveConf classes and invoke methods on them.
 
      val hiveClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
      val hive = hiveClass.getMethod("get").invoke(null)

      val hiveConf = hiveClass.getMethod("getConf").invoke(hive)
      val hiveConfClass = mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")

      val hiveConfGet = (param: String) => Option(hiveConfClass
        .getMethod("get", classOf[java.lang.String])
        .invoke(hiveConf, param))

   If "org.spark-project.hive" % "hive-exec" % "1.2.1.spark" is used, you get the exception above. But if we use
       "org.apache.hive" % "hive-exec" % "0.13.1-cdh5.2.0"
   the same code does not throw an exception.
{noformat}
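
For illustration only, here is a minimal sketch of how a reflective call to a static factory (such as the Hive "get" call in the report) can report its failures instead of aborting the caller. This is not the Spark code: the names ReflectiveLookup and invokeStaticFactory are hypothetical, and the sketch assumes only the standard JVM reflection API. Note that a class whose static initializer failed surfaces as a LinkageError (NoClassDefFoundError or ExceptionInInitializerError), not as an ordinary exception, so it has to be caught explicitly.

```scala
import java.lang.reflect.InvocationTargetException

// Hypothetical helper, not part of Spark or Hive.
object ReflectiveLookup {

  /** Load `className` via `loader` and call its public static no-arg
    * method `factoryMethod`. Returns the result on the right, or a
    * short description of what went wrong on the left.
    */
  def invokeStaticFactory(
      loader: ClassLoader,
      className: String,
      factoryMethod: String): Either[String, AnyRef] = {
    try {
      val cls = loader.loadClass(className)
      // Passing null as the receiver is correct for a static method.
      Right(cls.getMethod(factoryMethod).invoke(null))
    } catch {
      case e: ClassNotFoundException =>
        Left(s"class not on classpath: ${e.getMessage}")
      case e: NoSuchMethodException =>
        Left(s"method missing: ${e.getMessage}")
      case e: InvocationTargetException =>
        // The invoked method itself threw; the real error is the cause.
        Left(s"$className.$factoryMethod threw: ${e.getCause}")
      case e: LinkageError =>
        // Covers NoClassDefFoundError and ExceptionInInitializerError,
        // i.e. the class was found but could not be linked or initialized.
        Left(s"class failed to link or initialize: $e")
    }
  }
}
```

A caller could try it against a JDK class, for example ReflectiveLookup.invokeStaticFactory(ClassLoader.getSystemClassLoader, "java.lang.Runtime", "getRuntime"), and then decide whether a Left result should be logged and skipped or treated as fatal.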

> YarnClient cant get tokens to talk to Hive in a secure cluster
> --------------------------------------------------------------
>
>                 Key: SPARK-11265
>                 URL: https://issues.apache.org/jira/browse/SPARK-11265
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.1
>         Environment: Kerberized Hadoop cluster
>            Reporter: Steve Loughran
>
> As reported on the dev list, trying to run a YARN client which wants to talk to Hive in a Kerberized Hadoop cluster fails. This appears to be because the constructor of the {{org.apache.hadoop.hive.ql.metadata.Hive}} class was made private and replaced with a factory method. The YARN client uses reflection to get the tokens, so the signature changes weren't picked up in SPARK-8064.


