Posted to user@beam.apache.org by Jesse Lord <jl...@vectra.ai> on 2020/06/22 16:01:51 UTC

Flink/Portable Runner error on AWS EMR

I am trying to run the wordcount quickstart example on a Flink cluster on AWS EMR, using Beam 2.22 and Flink 1.10.

I get the following error:

ERROR:root:java.util.ServiceConfigurationError: com.fasterxml.jackson.databind.Module: Provider com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule not a subtype

This happens with both the portable runner (using the Python SDK) and the classic Flink runner (using the quickstart Maven project).

I think this error is related to https://issues.apache.org/jira/browse/BEAM-9239. Based on the comments on that issue, I tried changing whether Flink loads child (user) jars or parent (Flink) jars first from the classpath, but that did not resolve the error.

I am looking for any suggestions for a workaround, and I am wondering whether I should open a new Jira issue or only comment on the existing one (which I have already done).

Thanks,
Jesse

Re: Flink/Portable Runner error on AWS EMR

Posted by Jesse Lord <jl...@vectra.ai>.
Hi Max,

Thanks for the help. I will certainly look into shading the library, but my setup is very straightforward if someone is interested in reproducing this issue.

I started an AWS EMR cluster (version 5.30) with Flink 1.10.0 installed. I created the Beam wordcount example project using Maven (Beam version 2.22). Finally, I started a Flink YARN session and ran the example project against the Flink master.

If exact code examples would make it easier to re-create this from scratch, I would be happy to provide them.

Thanks,
Jesse

On 6/24/20, 4:01 AM, "Maximilian Michels" <mx...@apache.org> wrote:

    Hi Jesse,

    This is hard to debug without knowing more about the setup. You have
    conflicting versions of the Jackson library: one ships with Beam, and
    another may be loaded by the AWS setup. You mentioned in the issue that
    switching between parent-first and child-first classloading did not
    resolve the problem.

    I'd suggest relocating (shading) Jackson in the jar you submit to AWS.

    -Max


Re: Flink/Portable Runner error on AWS EMR

Posted by Maximilian Michels <mx...@apache.org>.
Hi Jesse,

This is hard to debug without knowing more about the setup. You have
conflicting versions of the Jackson library: one ships with Beam, and
another may be loaded by the AWS setup. You mentioned in the issue that
switching between parent-first and child-first classloading did not
resolve the problem.

I'd suggest relocating (shading) Jackson in the jar you submit to AWS.

-Max
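
A minimal sketch of what relocating Jackson could look like, assuming the jar submitted to the cluster is built with the Maven Shade Plugin in the quickstart project's pom.xml. The "shaded." prefix is an arbitrary, hypothetical choice, and this fragment has not been verified against the EMR setup described in this thread. Because the error comes from java.util.ServiceLoader, the sketch also includes the plugin's ServicesResourceTransformer so that META-INF/services entries are merged and rewritten to the relocated class names.

```xml
<!-- Goes under <build><plugins> in pom.xml. The plugin version is assumed
     to be managed elsewhere in the POM; add a <version> element if not. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Move the bundled Jackson classes to a private package so they
               cannot clash with a Jackson copy already on the cluster
               classpath. "shaded." is a hypothetical prefix. -->
          <relocation>
            <pattern>com.fasterxml.jackson</pattern>
            <shadedPattern>shaded.com.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
        <transformers>
          <!-- Merge META-INF/services files and apply the relocation to the
               provider class names listed in them. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After rebuilding with "mvn package", listing the jar contents (for example with "jar tf") should show the Jackson classes under the relocated package rather than under com/fasterxml/jackson.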

On 23.06.20 17:22, Jesse Lord wrote:
> Hi Max,
> 
> The error message shows up in the Flink web UI, so I think the job must be reaching the cluster.
> 
> For the portable runner, the error also appears in the Docker container output, but I assume the container is just relaying the error from the Flink cluster.
> 
> Thanks,
> Jesse

Re: Flink/Portable Runner error on AWS EMR

Posted by Jesse Lord <jl...@vectra.ai>.
Hi Max,

The error message shows up in the Flink web UI, so I think the job must be reaching the cluster.

For the portable runner, the error also appears in the Docker container output, but I assume the container is just relaying the error from the Flink cluster.

Thanks,
Jesse

On 6/23/20, 8:11 AM, "Maximilian Michels" <mx...@apache.org> wrote:

    Hey Jesse,

    Could you share the context of the error? Where does it occur? In the
    client code or on the cluster?

    Cheers,
    Max


Re: Flink/Portable Runner error on AWS EMR

Posted by Maximilian Michels <mx...@apache.org>.
Hey Jesse,

Could you share the context of the error? Where does it occur? In the
client code or on the cluster?

Cheers,
Max

On 22.06.20 18:01, Jesse Lord wrote:
> I am trying to run the wordcount quickstart example on a Flink cluster
> on AWS EMR, using Beam 2.22 and Flink 1.10.
>
> I get the following error:
>
> ERROR:root:java.util.ServiceConfigurationError: com.fasterxml.jackson.databind.Module: Provider com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule not a subtype
>
> This happens with both the portable runner (using the Python SDK) and
> the classic Flink runner (using the quickstart Maven project).
>
> I think this error is related to
> https://issues.apache.org/jira/browse/BEAM-9239. Based on the comments
> on that issue, I tried changing whether Flink loads child (user) jars
> or parent (Flink) jars first from the classpath, but that did not
> resolve the error.
>
> I am looking for any suggestions for a workaround, and I am wondering
> whether I should open a new Jira issue or only comment on the existing
> one (which I have already done).
>
> Thanks,
> Jesse