Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/01/05 22:34:00 UTC
[jira] [Assigned] (SPARK-17088) IsolatedClientLoader fails to load Hive client when sharesHadoopClasses is false

     [ https://issues.apache.org/jira/browse/SPARK-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17088:
------------------------------------

    Assignee: Apache Spark

> IsolatedClientLoader fails to load Hive client when sharesHadoopClasses is false
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-17088
>                 URL: https://issues.apache.org/jira/browse/SPARK-17088
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Marcelo Vanzin
>            Assignee: Apache Spark
>            Priority: Minor
>
> There's a bug in a very rare code path in {{IsolatedClientLoader}}:
> {code}
>           case e: RuntimeException if e.getMessage.contains("hadoop") =>
>             // If the error message contains hadoop, it is probably because the hadoop
>             // version cannot be resolved (e.g. it is a vendor specific version like
>             // 2.0.0-cdh4.1.1). If it is the case, we will try just
>             // "org.apache.hadoop:hadoop-client:2.4.0". "org.apache.hadoop:hadoop-client:2.4.0"
>             // is used just because we used to hard code it as the hadoop artifact to download.
>             logWarning(s"Failed to resolve Hadoop artifacts for the version ${hadoopVersion}. " +
>               s"We will change the hadoop version from ${hadoopVersion} to 2.4.0 and try again. " +
>               "Hadoop classes will not be shared between Spark and Hive metastore client. " +
>               "It is recommended to set jars used by Hive metastore client through " +
>               "spark.sql.hive.metastore.jars in the production environment.")
>             sharesHadoopClasses = false
> {code}
> That fallback is the rare part. But once it sets {{sharesHadoopClasses}} to false, the instantiation of {{HiveClientImpl}} fails:
> {code}
>       classLoader
>         .loadClass(classOf[HiveClientImpl].getName)
>         .getConstructors.head
>         .newInstance(version, sparkConf, hadoopConf, config, classLoader, this)
>         .asInstanceOf[HiveClient]
> {code}
> {{hadoopConf}} here is an instance of {{Configuration}} loaded by the main Spark class loader, but in this case {{HiveClientImpl}} expects an instance of {{Configuration}} loaded by the isolated class loader (yay class loaders are fun). The JVM treats the two as unrelated types even though they share a name, so the reflective constructor call rejects the argument and you get an error like this:
> {noformat}
> 2016-08-10 13:51:20.742 - stderr> Exception in thread "main" java.lang.IllegalArgumentException: argument type mismatch
> 2016-08-10 13:51:20.743 - stderr> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 2016-08-10 13:51:20.743 - stderr> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 2016-08-10 13:51:20.743 - stderr> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 2016-08-10 13:51:20.743 - stderr> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 2016-08-10 13:51:20.744 - stderr> 	at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
> 2016-08-10 13:51:20.744 - stderr> 	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:354)
> 2016-08-10 13:51:20.744 - stderr> 	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:258)
> 2016-08-10 13:51:20.744 - stderr> 	at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
> 2016-08-10 13:51:20.745 - stderr> 	at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
> {noformat}
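
The mismatch described above can be reproduced outside Spark with a small sketch. Everything in it is a hypothetical stand-in and not Spark code: {{Conf}} plays the role of Hadoop's {{Configuration}}, {{Client}} plays the role of {{HiveClientImpl}}, and a {{URLClassLoader}} with a null parent approximates the isolated loader. It assumes the file is compiled as an ordinary source file in the default package (not run as a script), so that the classes have a code source location to reload from.

```scala
import java.net.URLClassLoader

// Hypothetical stand-ins: Conf plays the role of Hadoop's Configuration,
// Client plays the role of HiveClientImpl.
class Conf
class Client(val conf: Conf)

object LoaderMismatchDemo {
  /** Builds a loader that re-reads the demo classes instead of delegating to the application loader. */
  def isolatedLoader(): ClassLoader = {
    // Directory or jar that the demo classes were originally loaded from.
    val location = classOf[Conf].getProtectionDomain.getCodeSource.getLocation
    // A null parent means only the bootstrap loader is consulted before `location`.
    new URLClassLoader(Array(location), null)
  }

  /** Reflectively constructs the isolated Client with `arg`; returns the failure message, if any. */
  def tryCreate(loader: ClassLoader, arg: AnyRef): Option[String] = {
    val ctor = loader.loadClass(classOf[Client].getName).getConstructors.head
    try { ctor.newInstance(arg); None }
    catch { case e: IllegalArgumentException => Some(e.getMessage) }
  }

  def main(args: Array[String]): Unit = {
    val loader = isolatedLoader()
    val isolatedConfClass = loader.loadClass(classOf[Conf].getName)

    // Same class name, different defining loader: the Class objects are not equal.
    println(isolatedConfClass == classOf[Conf])

    // A Conf from the application loader is rejected by the isolated Client constructor.
    println(tryCreate(loader, new Conf))

    // A Conf created through the isolated loader is accepted.
    val isolatedConf = isolatedConfClass.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef]
    println(tryCreate(loader, isolatedConf))
  }
}
```

In this sketch the constructor call only succeeds when the argument comes from the same loader as the class being constructed, which suggests that any fix has to either share the {{Configuration}} class between the loaders or rebuild the configuration on the isolated side. Which of these the eventual patch chooses is not stated in this message.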


