Posted to user@spark.apache.org by Lian Jiang <ji...@gmail.com> on 2018/03/07 00:17:50 UTC

dependencies conflict in oozie spark action for spark 2

I am using HDP 2.6.4 and have followed
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html
to make oozie use spark2.

After this, I found there are still a bunch of issues:

1. Oozie and Spark try to add the same jars to the distributed cache multiple
times. I resolved this by removing the duplicate jars
from the /user/oozie/share/lib/lib_20180303065325/spark2/ folder (a sketch for
enumerating the overlap follows the exception below).

2. A jar conflict that is not yet resolved. The exception is below:

18/03/06 23:51:18 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchFieldError: USE_DEFAULTS
java.lang.NoSuchFieldError: USE_DEFAULTS
    at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.findSerializationInclusion(JacksonAnnotationIntrospector.java:498)
    at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findSerializationInclusion(AnnotationIntrospectorPair.java:332)
    at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findSerializationInclusion(AnnotationIntrospectorPair.java:332)
    at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.findSerializationInclusion(BasicBeanDescription.java:381)
    at com.fasterxml.jackson.databind.ser.PropertyBuilder.<init>(PropertyBuilder.java:41)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.constructPropertyBuilder(BeanSerializerFactory.java:507)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.findBeanProperties(BeanSerializerFactory.java:558)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.constructBeanSerializer(BeanSerializerFactory.java:361)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.findBeanSerializer(BeanSerializerFactory.java:272)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:225)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:153)
    at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1203)
    at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1157)
    at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:481)
    at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:679)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:107)
    at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
    at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:145)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
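
Regarding issue 1: for reference, a minimal sketch that enumerates the overlap
via the Hadoop FileSystem API. The object name is hypothetical, and I am
assuming the duplicates are jar names present in both the oozie and spark2
sharelib directories:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper: list jar names that appear in both sharelib
// directories, i.e. the ones added to the distributed cache twice.
object FindSharelibDupes {
  private def jars(fs: FileSystem, dir: String): Set[String] =
    fs.listStatus(new Path(dir))
      .map(_.getPath.getName)
      .filter(_.endsWith(".jar"))
      .toSet

  def main(args: Array[String]): Unit = {
    val fs   = FileSystem.get(new Configuration())
    val base = "/user/oozie/share/lib/lib_20180303065325"
    val dupes = jars(fs, s"$base/oozie") intersect jars(fs, s"$base/spark2")
    dupes.toSeq.sorted.foreach(println)  // candidates to remove from spark2/
  }
}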



My dependencies are:

libraryDependencies += "com.typesafe.scala-logging" %%
"scala-logging-api" % "2.1.2"
libraryDependencies += "com.typesafe.scala-logging" %%
"scala-logging-slf4j" % "2.1.2"
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.2.3"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.2.0"
libraryDependencies += "com.typesafe" % "config" % "1.3.2"
libraryDependencies += "org.scalactic" %% "scalactic" % "3.0.4"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.4" % "test"
libraryDependencies += "org.scalamock" %% "scalamock" % "4.1.0" % "test"
libraryDependencies += "com.jsuereth" %% "scala-arm" % "2.0"
libraryDependencies += "com.github.scopt" %% "scopt" % "3.7.0"
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.3.8"
libraryDependencies += "io.dropwizard.metrics" % "metrics-core" % "4.0.2"
libraryDependencies += "com.typesafe.slick" %% "slick" % "3.2.1"
libraryDependencies += "com.typesafe.slick" %% "slick-hikaricp" % "3.2.1"
libraryDependencies += "com.typesafe.slick" %% "slick-extensions" % "3.0.0"
libraryDependencies += "org.scalaz" %% "scalaz-core" % "7.2.19"
libraryDependencies += "org.json4s" %% "json4s-native" % "3.5.3"
libraryDependencies += "com.softwaremill.retry" %% "retry" % "0.3.0"
libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.5.5"
libraryDependencies += "org.apache.httpcomponents" % "httpcore" % "4.4.9"


The sbt dependency tree shows that Jackson 2.6.5, pulled in by spark-core, is
the version in use. But per
https://stackoverflow.com/questions/36982173/java-lang-nosuchfielderror-use-defaults-thrown-while-validating-json-schema-thr,
this error means a Jackson version older than 2.6 is winning at runtime:
JsonInclude.Include.USE_DEFAULTS only exists in jackson-annotations 2.6+, so
"NoSuchFieldError: USE_DEFAULTS" points to an older jackson-annotations on the
classpath.

I have done the following:

1. Successfully ran the same application through spark-submit.

2. Made sure the Spark dependencies are 2.2.0, consistent with the component
versions in
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_release-notes/content/comp_versions.html.

What else have I missed? I'd appreciate any help!

Re: dependencies conflict in oozie spark action for spark 2

Posted by Lian Jiang <ji...@gmail.com>.
I found the following version inconsistencies between the oozie and spark2
sharelib jars:

jackson-core-2.4.4.jar oozie
jackson-core-2.6.5.jar spark2

jackson-databind-2.4.4.jar oozie
jackson-databind-2.6.5.jar spark2

jackson-annotations-2.4.0.jar oozie
jackson-annotations-2.6.5.jar spark2
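
Since the cluster will always put some Jackson on the classpath, an
alternative to reconciling the sharelib jars would be shading Jackson inside
the application jar, so the user class only ever sees its own copy. A sketch,
assuming the sbt-assembly plugin (the shaded package prefix is arbitrary):

// project/plugins.sbt (assumed):
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

// build.sbt: relocate Jackson classes inside the fat jar so the sharelib
// version no longer matters to the user class.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.jackson.@1").inAll
)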

For now, I removed the lower-version jars from the oozie sharelib instead.
Then the Spark job cannot communicate with YARN due to this error:

18/03/07 18:24:30 INFO Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto$Builder.setMemory(J)Lorg/apache/hadoop/yarn/proto/YarnProtos$ResourceProto$Builder;
    at org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.setMemorySize(ResourcePBImpl.java:78)
    at org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.setMemory(ResourcePBImpl.java:72)
    at org.apache.hadoop.yarn.api.records.Resource.newInstance(Resource.java:58)
    at org.apache.spark.deploy.yarn.YarnAllocator.<init>(YarnAllocator.scala:140)
    at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:77)
    at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:387)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:430)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
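
One debugging step that might help pin down which copies win on the container
classpath: print where each contested class is loaded from. A sketch to drop
into the user class (the class list is taken from the two errors above):

// Print the jar each contested class is loaded from inside the AM container.
Seq(
  "com.fasterxml.jackson.annotation.JsonInclude",
  "org.apache.hadoop.yarn.proto.YarnProtos"
).foreach { name =>
  val src = Class.forName(name).getProtectionDomain.getCodeSource
  println(s"$name -> ${Option(src).map(_.getLocation).getOrElse("bootstrap/unknown")}")
}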


Any ideas?

