Posted to issues@beam.apache.org by "Kai Jiang (Jira)" <ji...@apache.org> on 2020/05/30 00:24:00 UTC

[jira] [Comment Edited] (BEAM-9239) Dependency conflict with Spark using aws io

    [ https://issues.apache.org/jira/browse/BEAM-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120035#comment-17120035 ] 

Kai Jiang edited comment on BEAM-9239 at 5/30/20, 12:23 AM:
------------------------------------------------------------

Spark has two experimental configs, `spark.driver.userClassPathFirst` and `spark.executor.userClassPathFirst`. If both are set to true, user-added jars take precedence over Spark's own jars when loading classes in cluster mode.


was (Author: vectorijk):
Spark has two experimental configs `spark.driver.userClassPathFirst` and `spark.executor.userClassPathFirst`. It allows user-added jars precedence over Spark's own jars when loading classes in cluster mode.

> Dependency conflict with Spark using aws io
> -------------------------------------------
>
>                 Key: BEAM-9239
>                 URL: https://issues.apache.org/jira/browse/BEAM-9239
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws, runner-spark
>    Affects Versions: 2.17.0
>            Reporter: David McIntosh
>            Priority: P1
>
> Starting with beam 2.17.0 I get this error in the Spark 2.4.4 driver when aws io is also used:
> {noformat}
> java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.jsontype.TypeSerializer.typeId(Ljava/lang/Object;Lcom/fasterxml/jackson/core/JsonToken;)Lcom/fasterxml/jackson/core/type/WritableTypeId;
> 	at org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:163)
> 	at org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:134)
> 	at com.fasterxml.jackson.databind.ser.impl.TypeWrappedSerializer.serialize(TypeWrappedSerializer.java:32)
> 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
> 	at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
> 	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:721)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:647)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:635)
> 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
> 	at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
> 	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
> 	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.serializeToJson(SerializablePipelineOptions.java:67)
> 	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.<init>(SerializablePipelineOptions.java:43)
> 	at org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:71)
> 	at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:215)
> 	at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:90)
> {noformat}
> The cause seems to be that the Spark driver environment uses an older version of Jackson. I tried to update Jackson on the Spark cluster, but that led to several other errors.
> The change that started causing this was:
> https://github.com/apache/beam/commit/b68d70a47b68ad84efcd9405c1799002739bd116
> After reverting that change I was able to successfully run my job.
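
One way to check the diagnosis above is to print which jar actually supplies the Jackson class named in the stack trace. The following is a minimal sketch, not part of Beam or Spark: the class name `JarLocator` and its `locate` helper are hypothetical, and it assumes the check runs inside the same JVM (for example the Spark driver) that throws the error.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Beam or Spark).
final class JarLocator {

    // Returns the location (usually a jar URL) that the named class was loaded from.
    static String locate(String className) {
        // Prefer the context class loader, since that is what a child-first
        // setup would replace; fall back to this class's own loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = JarLocator.class.getClassLoader();
        }
        try {
            // 'false' avoids running static initializers of the target class.
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            return source == null
                    ? "(bootstrap or unknown location)"
                    : source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // The class whose method is reported missing in the stack trace.
        String name = "com.fasterxml.jackson.databind.jsontype.TypeSerializer";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed location points at a jar under the Spark installation rather than at the pipeline's own jar, that would be consistent with the older Jackson on the cluster shadowing the version Beam was compiled against.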



--
This message was sent by Atlassian Jira
(v8.3.4#803005)