Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2020/11/13 18:03:00 UTC

[jira] [Commented] (BEAM-9239) Dependency conflict with Spark using aws io

    [ https://issues.apache.org/jira/browse/BEAM-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231705#comment-17231705 ] 

Kenneth Knowles commented on BEAM-9239:
---------------------------------------

I suggest filing a separate bug for Flink and cross-referencing the two. It is still not clear to me whether this is solved by finding the best way to configure the backend or by a change we can make in Beam.

> Dependency conflict with Spark using aws io
> -------------------------------------------
>
>                 Key: BEAM-9239
>                 URL: https://issues.apache.org/jira/browse/BEAM-9239
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws, runner-spark
>    Affects Versions: 2.17.0
>            Reporter: David McIntosh
>            Priority: P1
>
> Starting with Beam 2.17.0, I get this error in the Spark 2.4.4 driver when the AWS IO connector is also used:
> {noformat}
> java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.jsontype.TypeSerializer.typeId(Ljava/lang/Object;Lcom/fasterxml/jackson/core/JsonToken;)Lcom/fasterxml/jackson/core/type/WritableTypeId;
> 	at org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:163)
> 	at org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:134)
> 	at com.fasterxml.jackson.databind.ser.impl.TypeWrappedSerializer.serialize(TypeWrappedSerializer.java:32)
> 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
> 	at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
> 	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:721)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:647)
> 	at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:635)
> 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
> 	at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
> 	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
> 	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.serializeToJson(SerializablePipelineOptions.java:67)
> 	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.<init>(SerializablePipelineOptions.java:43)
> 	at org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:71)
> 	at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:215)
> 	at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:90)
> {noformat}
> The cause seems to be that the Spark driver environment ships an older version of Jackson. I tried updating Jackson on the Spark cluster, but that led to several other errors.
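> To confirm which Jackson the driver actually loads, here is a minimal diagnostic sketch (the class name is made up, not part of Beam) that prints the jackson-databind version and the jar it came from. TypeSerializer.typeId was added in Jackson 2.9, so anything older on the driver classpath would explain the NoSuchMethodError above:
> {code:java}
> import com.fasterxml.jackson.databind.ObjectMapper;
>
> // Hypothetical diagnostic: run with the Spark driver's classpath to see
> // which jackson-databind actually wins.
> public class JacksonVersionCheck {
>   public static void main(String[] args) {
>     // Version reported by jackson-databind itself
>     System.out.println("jackson-databind: " + new ObjectMapper().version());
>     // Jar that ObjectMapper was actually loaded from
>     System.out.println("loaded from: "
>         + ObjectMapper.class.getProtectionDomain().getCodeSource().getLocation());
>   }
> }
> {code}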
> The change that started causing this was:
> https://github.com/apache/beam/commit/b68d70a47b68ad84efcd9405c1799002739bd116
> After reverting that change I was able to successfully run my job.
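> For reference, the failing frame (AwsModule$AWSCredentialsProviderSerializer.serializeWithType) uses Jackson's newer typed-serialization API. Below is a sketch of that pattern with an illustrative serializer, not Beam's actual code; the typeId(...) and writeTypePrefix(...) calls only exist in jackson-databind 2.9+, so loading this against an older Jackson fails at link time exactly as in the trace:
> {code:java}
> import java.io.IOException;
> import com.fasterxml.jackson.core.JsonGenerator;
> import com.fasterxml.jackson.core.JsonToken;
> import com.fasterxml.jackson.core.type.WritableTypeId;
> import com.fasterxml.jackson.databind.SerializerProvider;
> import com.fasterxml.jackson.databind.jsontype.TypeSerializer;
> import com.fasterxml.jackson.databind.ser.std.StdSerializer;
>
> // Illustrative serializer using the Jackson 2.9+ typed-serialization API.
> // With jackson-databind < 2.9 on the classpath, the typeId(...) call below
> // throws the NoSuchMethodError shown above.
> class TypedValueSerializer extends StdSerializer<Object> {
>   TypedValueSerializer() {
>     super(Object.class);
>   }
>
>   @Override
>   public void serialize(Object value, JsonGenerator gen, SerializerProvider provider)
>       throws IOException {
>     gen.writeString(String.valueOf(value));
>   }
>
>   @Override
>   public void serializeWithType(Object value, JsonGenerator gen,
>       SerializerProvider provider, TypeSerializer typeSer) throws IOException {
>     // typeId(...) and writeTypePrefix(...) were added in Jackson 2.9
>     WritableTypeId typeId = typeSer.typeId(value, JsonToken.START_OBJECT);
>     typeSer.writeTypePrefix(gen, typeId);
>     gen.writeStringField("value", String.valueOf(value));
>     typeSer.writeTypeSuffix(gen, typeId);
>   }
> }
> {code}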


