Posted to issues@spark.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2018/01/25 01:28:00 UTC

[jira] [Created] (SPARK-23209) HiveDelegationTokenProvider throws an exception if Hive jars are not on the classpath

Sahil Takiar created SPARK-23209:
------------------------------------

             Summary: HiveDelegationTokenProvider throws an exception if Hive jars are not on the classpath
                 Key: SPARK-23209
                 URL: https://issues.apache.org/jira/browse/SPARK-23209
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.0
         Environment: OSX, Java(TM) SE Runtime Environment (build 1.8.0_92-b14), Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
            Reporter: Sahil Takiar


While doing some Hive-on-Spark testing against the Spark 2.3.0 release candidates we came across a bug (see HIVE-18436).

Stack-trace:

{code}
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
        at org.apache.spark.deploy.security.HadoopDelegationTokenManager.getDelegationTokenProviders(HadoopDelegationTokenManager.scala:68)
        at org.apache.spark.deploy.security.HadoopDelegationTokenManager.<init>(HadoopDelegationTokenManager.scala:54)
        at org.apache.spark.deploy.yarn.security.YARNHadoopDelegationTokenManager.<init>(YARNHadoopDelegationTokenManager.scala:44)
        at org.apache.spark.deploy.yarn.Client.<init>(Client.scala:123)
        at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1502)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 10 more
{code}

The bug appears to have been introduced by SPARK-20434, which changed {{HiveDelegationTokenProvider}} so that it constructs {{o.a.h.hive.conf.HiveConf}} directly inside {{HiveCredentialProvider#hiveConf}} rather than loading the class manually via the class loader. With the new code, the JVM seems to try to load {{HiveConf}} as soon as {{HiveDelegationTokenProvider}} is referenced. Since there is no try-catch around the construction of {{HiveDelegationTokenProvider}}, the {{ClassNotFoundException}} surfaces as a {{NoClassDefFoundError}} (see the stack trace above), which causes spark-submit to crash. Spark's {{docs/running-on-yarn.md}} says "a Hive token will be obtained if Hive is on the classpath", which implies that a missing Hive should simply be tolerated. The current behavior contradicts that.
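
To illustrate the failure mode and one possible guard, here is a minimal, self-contained Java sketch. It is not Spark's code and not necessarily the fix that should be applied: {{GuardedProviderLoading}}, {{TokenProvider}}, {{FileSystemProvider}}, {{HiveLikeProvider}}, {{isOnClasspath}} and {{loadProviders}} are all hypothetical names, and the missing Hive dependency is simulated by throwing {{NoClassDefFoundError}} from a constructor. It shows two ideas mentioned above: probing for a class through the class loader, and wrapping provider construction in a try-catch so that one unusable provider does not abort startup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical names throughout; this only sketches the idea and is not Spark's code.
class GuardedProviderLoading {

    /** Minimal stand-in for a delegation token provider. */
    interface TokenProvider {
        String serviceName();
    }

    /** A provider with no optional dependencies. */
    static final class FileSystemProvider implements TokenProvider {
        @Override public String serviceName() { return "hadoopfs"; }
    }

    /** Simulates a provider whose dependency (HiveConf) cannot be resolved. */
    static final class HiveLikeProvider implements TokenProvider {
        HiveLikeProvider() {
            // Assumption: mimics the error the JVM raises when linking fails.
            throw new NoClassDefFoundError("org/apache/hadoop/hive/conf/HiveConf");
        }
        @Override public String serviceName() { return "hive"; }
    }

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, GuardedProviderLoading.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Builds each provider, skipping any whose classes are missing. */
    static List<TokenProvider> loadProviders(List<Supplier<TokenProvider>> factories) {
        List<TokenProvider> loaded = new ArrayList<>();
        for (Supplier<TokenProvider> factory : factories) {
            try {
                loaded.add(factory.get());
            } catch (NoClassDefFoundError e) {
                // Log and continue instead of letting the error reach main().
                System.err.println("Skipping provider, missing class: " + e.getMessage());
            }
        }
        return loaded;
    }

    /** Loads both sample providers and returns the names of those that survived. */
    static List<String> loadedServiceNames() {
        List<Supplier<TokenProvider>> factories = new ArrayList<>();
        factories.add(FileSystemProvider::new);
        factories.add(HiveLikeProvider::new);
        List<String> names = new ArrayList<>();
        for (TokenProvider provider : loadProviders(factories)) {
            names.add(provider.serviceName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(loadedServiceNames());
    }
}
```

In this sketch the Hive-like provider is skipped with a message on stderr and only the file system provider is returned. Catching {{NoClassDefFoundError}} at the construction site is the narrower of the two options; probing with something like {{isOnClasspath}} before referencing the provider avoids the error entirely, at the cost of hard-coding a class name.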
