Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/01/05 22:34:00 UTC
[jira] [Assigned] (SPARK-17088) IsolatedClientLoader fails to load
Hive client when sharesHadoopClasses is false
[ https://issues.apache.org/jira/browse/SPARK-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-17088:
------------------------------------
Assignee: Apache Spark
> IsolatedClientLoader fails to load Hive client when sharesHadoopClasses is false
> --------------------------------------------------------------------------------
>
> Key: SPARK-17088
> URL: https://issues.apache.org/jira/browse/SPARK-17088
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Marcelo Vanzin
> Assignee: Apache Spark
> Priority: Minor
>
> There's a bug in a very rare code path in {{IsolatedClientLoader}}:
> {code}
>     case e: RuntimeException if e.getMessage.contains("hadoop") =>
>       // If the error message contains hadoop, it is probably because the hadoop
>       // version cannot be resolved (e.g. it is a vendor specific version like
>       // 2.0.0-cdh4.1.1). If it is the case, we will try just
>       // "org.apache.hadoop:hadoop-client:2.4.0". "org.apache.hadoop:hadoop-client:2.4.0"
>       // is used just because we used to hard code it as the hadoop artifact to download.
>       logWarning(s"Failed to resolve Hadoop artifacts for the version ${hadoopVersion}. " +
>         s"We will change the hadoop version from ${hadoopVersion} to 2.4.0 and try again. " +
>         "Hadoop classes will not be shared between Spark and Hive metastore client. " +
>         "It is recommended to set jars used by Hive metastore client through " +
>         "spark.sql.hive.metastore.jars in the production environment.")
>       sharesHadoopClasses = false
> {code}
> That's the rare part. But when {{sharesHadoopClasses}} is set to false, the instantiation of {{HiveClientImpl}} fails:
> {code}
> classLoader
>   .loadClass(classOf[HiveClientImpl].getName)
>   .getConstructors.head
>   .newInstance(version, sparkConf, hadoopConf, config, classLoader, this)
>   .asInstanceOf[HiveClient]
> {code}
> {{hadoopConf}} here is an instance of {{Configuration}} loaded by the main Spark class loader, but in this case {{HiveClientImpl}} expects an instance of {{Configuration}} loaded by the isolated class loader (yay class loaders are fun). So you get an error like this:
> {noformat}
> 2016-08-10 13:51:20.742 - stderr> Exception in thread "main" java.lang.IllegalArgumentException: argument type mismatch
> 2016-08-10 13:51:20.743 - stderr> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 2016-08-10 13:51:20.743 - stderr> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 2016-08-10 13:51:20.743 - stderr> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 2016-08-10 13:51:20.743 - stderr> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 2016-08-10 13:51:20.744 - stderr> at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
> 2016-08-10 13:51:20.744 - stderr> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:354)
> 2016-08-10 13:51:20.744 - stderr> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:258)
> 2016-08-10 13:51:20.744 - stderr> at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
> 2016-08-10 13:51:20.745 - stderr> at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
> {noformat}
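
The mismatch described above can be reproduced in isolation. The following is only a minimal sketch, not Spark's code: {{FakeConf}}, {{FakeClient}} and {{IsolatingLoader}} are hypothetical stand-ins for {{Configuration}}, {{HiveClientImpl}} and the isolated class loader. It assumes the compiled class files are readable as resources from the base class loader (a normal compiled run, not a REPL session).

```scala
import java.io.ByteArrayOutputStream
import scala.util.Try

object ClassLoaderMismatchDemo {
  // Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
  class FakeConf

  // Hypothetical stand-in for HiveClientImpl: its constructor takes a FakeConf.
  class FakeClient(val conf: FakeConf)

  // Classes the isolated loader defines itself instead of delegating to its parent.
  private val isolatedNames: Set[String] =
    Set(classOf[FakeConf].getName, classOf[FakeClient].getName)

  // Re-defines the isolated classes from their bytes; delegates everything else to `base`.
  final class IsolatingLoader(base: ClassLoader) extends ClassLoader(base) {
    override protected def loadClass(name: String, resolve: Boolean): Class[_] =
      if (!isolatedNames.contains(name)) super.loadClass(name, resolve)
      else getClassLoadingLock(name).synchronized {
        val loaded = findLoadedClass(name)
        if (loaded != null) loaded
        else {
          val bytes = readClassBytes(name)
          defineClass(name, bytes, 0, bytes.length)
        }
      }

    // Reads the .class file of `name` through the base loader.
    private def readClassBytes(name: String): Array[Byte] = {
      val in = base.getResourceAsStream(name.replace('.', '/') + ".class")
      require(in != null, s"class file for $name not found")
      try {
        val out = new ByteArrayOutputStream()
        val buf = new Array[Byte](4096)
        var n = in.read(buf)
        while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
        out.toByteArray
      } finally in.close()
    }
  }

  // Same reflective pattern as in the issue: load the client class through the
  // isolated loader and call its first constructor with the given argument.
  def instantiate(loader: ClassLoader, conf: AnyRef): Try[AnyRef] = Try {
    loader
      .loadClass(classOf[FakeClient].getName)
      .getConstructors.head
      .newInstance(conf)
      .asInstanceOf[AnyRef]
  }

  def main(args: Array[String]): Unit = {
    val loader = new IsolatingLoader(classOf[FakeConf].getClassLoader)
    val confName = classOf[FakeConf].getName

    // Same class name, but a different Class object per defining loader.
    println(s"same Class object: ${loader.loadClass(confName) eq classOf[FakeConf]}")

    // A conf created by the main loader is rejected by the isolated constructor.
    println(s"conf from main loader: ${instantiate(loader, new FakeConf)}")

    // A conf created through the isolated loader is accepted.
    val isolatedConf =
      loader.loadClass(confName).getConstructor().newInstance().asInstanceOf[AnyRef]
    println(s"conf from isolated loader: ${instantiate(loader, isolatedConf)}")
  }
}
```

The JVM identifies a class by its name together with its defining class loader, so the reflective constructor call rejects an argument whose class has the right name but was defined by a different loader. The sketch only illustrates the failure mode; it does not suggest how the issue should be fixed.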