Posted to dev@phoenix.apache.org by "Josh Mahonin (JIRA)" <ji...@apache.org> on 2015/12/10 19:07:10 UTC

[jira] [Commented] (PHOENIX-2503) Multiple Java NoClass/Method Errors with Spark and Phoenix

    [ https://issues.apache.org/jira/browse/PHOENIX-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051382#comment-15051382 ] 

Josh Mahonin commented on PHOENIX-2503:
---------------------------------------

Patch creates a new 'client-spark' JAR that is compatible with Spark. The root cause of this issue is that the regular client JAR ships with versions of com.fasterxml.jackson that are incompatible with Spark.
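
As a quick diagnostic (not part of the patch), you can ask the JVM which JAR a given Jackson class was actually loaded from; if spark-shell reports the Phoenix client JAR rather than Spark's own Jackson, the conflict is confirmed. A minimal sketch, using jackson-databind's ObjectMapper as the probe class:

// Run in spark-shell: print the location ObjectMapper was loaded from.
// Any Jackson class on the disputed classpath would work equally well.
val src = classOf[com.fasterxml.jackson.databind.ObjectMapper]
  .getProtectionDomain.getCodeSource
println(if (src != null) src.getLocation else "bootstrap/unknown source")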

This has likely been an issue for some time, but it was hidden in the unit tests, as well as in my local environment, due to newer com.fasterxml.jackson JARs taking precedence over those in the client JAR.

What's confusing to me is why these are included in the client JAR at all. Looking at the output of mvn dependency:tree, the only artifacts that pull in the offending fasterxml JARs are 'phoenix-server-client' and 'phoenix-server'. Both of these are explicitly excluded in the 'components-minimal.xml' dependency set, so I'm not entirely sure why they're making it into the client JAR in the first place, or whether they're required at all. Unfortunately my maven-fu isn't good enough to figure out exactly what's going on here.
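
(For reference, the tree can be filtered down to just the Jackson artifacts; assuming the usual groupIds, running something like

mvn dependency:tree -Dincludes=com.fasterxml.jackson.core,com.fasterxml.jackson.module

from the module that builds the client JAR shows exactly which dependencies drag them in.)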

If those components are not required in the client JAR, ideally we could find a way to remove them and let the Spark integration keep using the regular client JAR, rather than requiring a custom one.
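
If the fasterxml artifacts really are unnecessary, one option would be to exclude them directly in the dependency set. A rough sketch only, untested, using the maven-assembly-plugin's groupId:artifactId exclude patterns:

<dependencySet>
  <excludes>
    <exclude>com.fasterxml.jackson.core:*</exclude>
    <exclude>com.fasterxml.jackson.module:*</exclude>
  </excludes>
</dependencySet>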

> Multiple Java NoClass/Method Errors with Spark and Phoenix
> ----------------------------------------------------------
>
>                 Key: PHOENIX-2503
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2503
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>         Environment: Debian 8 (Jessie) x64
> hadoop-2.6.2
> hbase-1.1.2
> phoenix-4.6.0-HBase-1.1
> spark-1.5.2-bin-without-hadoop
>            Reporter: Jonathan Cox
>            Priority: Blocker
>         Attachments: PHOENIX-2503.patch
>
>
> I have encountered a variety of Java errors while trying to get Apache Phoenix working with Spark. In particular, I encounter these errors when submitting Python jobs, or when running interactively in the Scala Spark shell.
> ------- Issue 1 -------
> The first issue I encountered was that Phoenix would not work with the binary Spark release that includes Hadoop 2.6 (spark-1.5.2-bin-hadoop2.6.tgz). I tried adding the phoenix-4.6.0-HBase-1.1-client.jar to both spark-env.sh and spark-defaults.conf, but encountered the same error when launching spark-shell:
> 15/12/08 18:38:05 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
> 15/12/08 18:38:05 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
> 15/12/08 18:38:05 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>                 at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
> ----- Issue 2 -----
> Alright, having given up on getting Phoenix to work with the Spark package that includes Hadoop, I decided to download hadoop-2.6.2.tar.gz and spark-1.5.2-bin-without-hadoop.tgz. I installed these, and again added phoenix-4.6.0-HBase-1.1-client.jar to spark-defaults.conf. In addition, I added the following lines to spark-env.sh:
> SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
> export SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/usr/local/hadoop/share/hadoop/tools/lib/*" 
> export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
> This solved "Issue 1" described above, and now spark-shell launches without generating an error. Nevertheless, other Spark functionality is now broken:
> 15/12/09 13:55:46 INFO repl.SparkILoop: Created spark context..
> Spark context available as sc.
> 15/12/09 13:55:46 INFO repl.SparkILoop: Created sql context..
> SQL context available as sqlContext.
> scala> val textFile = sc.textFile("README.md")
> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
> 	at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
> 	at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
> Note, this error goes away if I omit phoenix-4.6.0-HBase-1.1-client.jar (but then I have no Phoenix support, obviously). This makes me believe that phoenix-4.6.0-HBase-1.1-client.jar contains a conflicting version of the FasterXML Jackson classes, which overrides Spark's Jackson classes with an earlier version that doesn't include this particular method. In other words, Spark needs one version of the Jackson JARs, but Phoenix is bundling another that breaks Spark. Does this make any sense?
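> (One quick way to verify this, assuming the standard unzip tool, is to list the Jackson classes bundled inside the client JAR:)
> unzip -l phoenix-4.6.0-HBase-1.1-client.jar | grep com/fasterxml/jackson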
> Sincerely,
> Jonathan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)