Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/03/08 18:25:00 UTC

[jira] [Work logged] (BEAM-10430) Can't run WordCount on EMR With Flink Runner via YARN

     [ https://issues.apache.org/jira/browse/BEAM-10430?focusedWorklogId=738253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-738253 ]

ASF GitHub Bot logged work on BEAM-10430:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Mar/22 18:24
            Start Date: 08/Mar/22 18:24
    Worklog Time Spent: 10m 
      Work Description: VictorPlusC edited a comment on pull request #14953:
URL: https://github.com/apache/beam/pull/14953#issuecomment-1062074876


   Hi folks,
   
   I am currently working on enabling a feature that relies on a 2.0 Dataproc image ([BEAM-13973](https://issues.apache.org/jira/browse/BEAM-13973)). I am looking to enable Interactive Beam to have the capability of creating a Dataproc cluster and sending a Flink job to it. For such a job to run successfully though, the dependencies listed in this PR are necessary. For this feature, I am using a 2.0 image because the 1.5 Dataproc images all use Flink 1.9.3, and it appears that Flink 1.9 has been deprecated for nearly a year now.
   
   Would there be any other potential workarounds that we can add into Beam to have Flink work on Dataproc? Would it be suitable to add these dependencies for now and label them with a ticket addressing this behavior with Dataproc and EMR?
   
   Thanks in advance!




Issue Time Tracking
-------------------

    Worklog Id:     (was: 738253)
    Time Spent: 5h 20m  (was: 5h 10m)

> Can't run WordCount on EMR With Flink Runner via YARN
> -----------------------------------------------------
>
>                 Key: BEAM-10430
>                 URL: https://issues.apache.org/jira/browse/BEAM-10430
>             Project: Beam
>          Issue Type: Bug
>          Components: examples-java, runner-flink
>    Affects Versions: 2.22.0
>         Environment: AWS EMR 5.30.0 running Spark 2.4.5, Flink 1.10.0
>            Reporter: Shashi
>            Assignee: Etienne Chauchot
>            Priority: P3
>              Labels: Clarified
>             Fix For: Missing
>
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> 1) I set up the WordCount project as detailed on the Beam website.
>  {{mvn archetype:generate \
>       -DarchetypeGroupId=org.apache.beam \
>       -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
>       -DarchetypeVersion=2.22.0 \
>       -DgroupId=org.example \
>       -DartifactId=word-count-beam \
>       -Dversion="0.1" \
>       -Dpackage=org.apache.beam.examples \
>       -DinteractiveMode=false}}
> 2) mvn clean package -Pflink-runner
> 3) Ran the application on AWS EMR 5.30.0 with Flink 1.10.0
> flink run -m yarn-cluster -yid <yarn_application_id> -p 4  -c org.apache.beam.examples.WordCount word-count-beam-bundled-0.1.jar --runner=FlinkRunner --inputFile <path_in_s3_of_input_file> --output <path_in_s3_of_output_dir>
> 4) Launch failed with the following exception stack trace:
> java.util.ServiceConfigurationError: com.fasterxml.jackson.databind.Module: Provider com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule not a subtype
>  at java.util.ServiceLoader.fail(ServiceLoader.java:239)
>  at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
>  at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
>  at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>  at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>  at com.fasterxml.jackson.databind.ObjectMapper.findModules(ObjectMapper.java:1054)
>  at org.apache.beam.sdk.options.PipelineOptionsFactory.<clinit>(PipelineOptionsFactory.java:471)
>  at org.apache.beam.examples.WordCount.main(WordCount.java:190)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
>  at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>  at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>  at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>  at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>  at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>  at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
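
A "not a subtype" ServiceConfigurationError is commonly a sign that the provider class and the service interface were loaded from different copies of the same library (for example, one bundled in the job jar and one already on the cluster classpath). The sketch below is one hedged way to check for that; it is not part of Beam or Flink. The class name ServiceProviderLocator and its helper methods are hypothetical, and only the two Jackson class names are taken from the stack trace above. It uses JDK calls only and prints where each service registration file and each class comes from.

```java
import java.io.IOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper: not part of Beam, Flink, or Jackson.
public class ServiceProviderLocator {

    /** Returns the URL of every META-INF/services file registered for serviceName. */
    static List<URL> findServiceFiles(String serviceName, ClassLoader loader) throws IOException {
        return Collections.list(loader.getResources("META-INF/services/" + serviceName));
    }

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    static String codeSourceOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Class names copied from the stack trace in this issue.
        String service = "com.fasterxml.jackson.databind.Module";
        String provider = "com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();

        // More than one entry here suggests several jars register Jackson modules.
        for (URL url : findServiceFiles(service, loader)) {
            System.out.println("service file: " + url);
        }

        // Different locations for the interface and the provider hint at a version clash.
        for (String name : new String[] {service, provider}) {
            try {
                Class<?> type = Class.forName(name, false, loader);
                System.out.println(name + " <- " + codeSourceOf(type));
            } catch (ClassNotFoundException | LinkageError e) {
                System.out.println(name + " not loadable: " + e);
            }
        }
    }
}
```

Running this with the same classpath the Flink client uses (an assumption about how the job is launched) would show whether the interface and the JAXB module resolve to different jars. If they do, relocating or excluding one copy of Jackson in the bundled jar is a typical direction to investigate.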


