Posted to issues@flink.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2017/10/23 08:59:00 UTC

[jira] [Commented] (FLINK-7882) Writing to S3 from EMR fails with exception

    [ https://issues.apache.org/jira/browse/FLINK-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214836#comment-16214836 ] 

Aljoscha Krettek commented on FLINK-7882:
-----------------------------------------

I can verify this, and I suspect it's caused by the exclusion of {{jaxb-core}}, which is a transitive dependency of {{akka-camel}}. For some reason the EMR connector only works when this dependency is present, and since it is not on the default Hadoop classpath, excluding it from the Flink dist breaks the job.
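
A quick way to check this hypothesis is to grep a flink-dist fat jar for the class named in the stack trace below; if the exclusion theory holds, the lookup comes up empty on a current build. The dist jar path here is an assumption for a local 1.4 build:

{code}
# If jaxb-core was dropped from the dist, the class named in the
# NoClassDefFoundError is no longer bundled (jar path is an assumption):
unzip -l build-target/lib/flink-dist_*.jar \
  | grep 'com/sun/xml/bind/v2/model/impl/ModelBuilderI' \
  || echo "ModelBuilderI is not bundled"
{code}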

> Writing to S3 from EMR fails with exception
> -------------------------------------------
>
>                 Key: FLINK-7882
>                 URL: https://issues.apache.org/jira/browse/FLINK-7882
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.4.0
>            Reporter: Gary Yao
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> Writing to S3 from EMR fails with the following exception:
> {code}
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:485)
> 	at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:215)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:449)
> 	at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
> 	at org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:89)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
> 	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389)
> 	at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:819)
> 	at org.apache.flink.client.CliFrontend.run(CliFrontend.java:282)
> 	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1071)
> 	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1118)
> 	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1115)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
> 	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> 	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1115)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:923)
> 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:866)
> 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:866)
> 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> 	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.NoClassDefFoundError: com/sun/xml/bind/v2/model/impl/ModelBuilderI
> 	at java.lang.ClassLoader.defineClass1(Native Method)
> 	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> 	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> 	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
> 	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	at java.lang.ClassLoader.defineClass1(Native Method)
> 	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> 	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> 	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
> 	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	at com.sun.xml.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:444)
> 	at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:292)
> 	at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:139)
> 	at com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1138)
> 	at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:162)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:247)
> 	at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:234)
> 	at javax.xml.bind.ContextFinder.find(ContextFinder.java:441)
> 	at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:641)
> 	at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
> 	at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.util.Base64.<clinit>(Base64.java:44)
> 	at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.util.BinaryUtils.fromBase64(BinaryUtils.java:71)
> 	at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1715)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.call.PutObjectCall.performCall(PutObjectCall.java:34)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.call.PutObjectCall.performCall(PutObjectCall.java:9)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.call.AbstractUploadingS3Call.perform(AbstractUploadingS3Call.java:62)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:80)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:176)
> 	at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.putObject(AmazonS3LiteClient.java:104)
> 	at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:165)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 	at com.sun.proxy.$Proxy26.storeFile(Unknown Source)
> 	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem$NativeS3FsOutputStream.close(S3NativeFileSystem.java:364)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
> 	at org.apache.flink.runtime.fs.hdfs.HadoopDataOutputStream.close(HadoopDataOutputStream.java:52)
> 	at org.apache.flink.core.fs.ClosingFSDataOutputStream.close(ClosingFSDataOutputStream.java:64)
> 	at org.apache.flink.api.common.io.FileOutputFormat.close(FileOutputFormat.java:267)
> 	at org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.close(OutputFormatSinkFunction.java:93)
> 	at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
> 	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:109)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:394)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:281)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: com.sun.xml.bind.v2.model.impl.ModelBuilderI
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	... 68 more
> {code}
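> The trace shows the shaded EMRFS AWS SDK initializing its Base64 codec via {{JAXBContext.newInstance}}, which fails because the JAXB implementation class is missing. A hypothetical check on the EMR master, walking the jars that {{hadoop classpath}} reports:
> {code}
> # Search every jar on the Hadoop classpath for the class named in the
> # NoClassDefFoundError above; no output means it is missing everywhere:
> for jar in $(hadoop classpath --glob | tr ':' '\n' | grep '\.jar$'); do
>   unzip -l "$jar" 2>/dev/null \
>     | grep -q 'com/sun/xml/bind/v2/model/impl/ModelBuilderI' && echo "$jar"
> done
> {code}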
> Git bisect between commits {{84a07a34ac22af14f2dd0319447ca5f45de6d0bb}} (good) and {{c81a6db44817ce818c949c2fd55ebfc2af0cc913}} (bad) identified {{5a5006ceb8d19bc0f3cc490451a18b8fc21197cb}} as the first bad commit.
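> For reference, the bisect boils down to the following procedure (a sketch of the workflow, not the exact commands used):
> {code}
> git bisect start
> git bisect bad  c81a6db44817ce818c949c2fd55ebfc2af0cc913
> git bisect good 84a07a34ac22af14f2dd0319447ca5f45de6d0bb
> # at each step: rebuild flink-dist, rerun the job below, then mark the commit
> git bisect good   # or: git bisect bad
> git bisect reset
> {code}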
> Command to start the job:
> {code}
> HADOOP_CONF_DIR=/etc/hadoop/conf bin/flink run -m yarn-cluster -yn 1 examples/streaming/WordCount.jar --output s3://mybucket/out --input s3://mybucket/input
> {code}
> EMR release label: emr-5.9.0
> Hadoop distribution: Amazon 2.7.3 
> Commands used to compile Flink:
> {code}
> mvn clean install -Pdocs-and-source -DskipTests -Dhadoop.version=2.7.3
> cd flink-dist
> mvn clean install -Pdocs-and-source -DskipTests -Dhadoop.version=2.7.3
> {code}
> Java and Maven versions used to compile Flink:
> {code}
> java -version
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (Zulu 8.23.0.3-macosx) (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (Zulu 8.23.0.3-macosx) (build 25.144-b01, mixed mode)
> {code}
> mvn -version
> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T17:41:47+01:00)
> {code}


