Posted to issues@flink.apache.org by "PengFei Li (Jira)" <ji...@apache.org> on 2019/12/24 07:34:00 UTC

[jira] [Comment Edited] (FLINK-15355) Nightly streaming file sink fails with unshaded hadoop

    [ https://issues.apache.org/jira/browse/FLINK-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002703#comment-17002703 ] 

PengFei Li edited comment on FLINK-15355 at 12/24/19 7:33 AM:
--------------------------------------------------------------

This test began to fail after FLINK-11956 with commit 8ec545d56f007645ca8f2a2374386882132ffc7a. The failing test is "e2e - misc - hadoop 2.8", whose profile contains "-Dinclude-hadoop -Dhadoop.version=2.8.3", so flink-shaded-hadoop-2-uber-2.8.3-9.0.jar is put into the lib directory. After enabling the JVM option "-verbose:class" in jobmanager.sh, we can see that "org.apache.hadoop.conf.Configuration" is loaded from flink-shaded-hadoop-2-uber-2.8.3-9.0.jar rather than from the hadoop-s3 plugin. "Configuration#getTimeDuration(String name, String defaultValue, TimeUnit unit)" only exists in hadoop-3.1.0, so a NoSuchMethodError is thrown. The purpose of the filesystem plugin mechanism is to isolate exactly this kind of class conflict, but something doesn't seem to work as expected here. I'm not familiar with the plugin implementation yet, so I need some time to find the root cause. Any feedback is welcome. [~arvid heise]

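To make the classpath situation quicker to check than reading -verbose:class output, here is a minimal, hypothetical probe (HadoopConfProbe is not part of the test or of Flink) that, when run with the same lib/ and plugin setup as the jobmanager, reports which jar "org.apache.hadoop.conf.Configuration" is resolved from and whether the getTimeDuration(String, String, TimeUnit) overload used by S3ARetryPolicy is present:

{code:java}
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic, not part of the e2e test: run it on the same
// classpath as the jobmanager to see which Configuration class is visible.
public class HadoopConfProbe {

    public static void main(String[] args) throws Exception {
        // Resolve the Configuration class exactly as application code would.
        Class<?> conf = Class.forName("org.apache.hadoop.conf.Configuration");

        // Prints the jar the class came from, e.g. flink-shaded-hadoop-2-uber-2.8.3-9.0.jar
        // when the uber jar in lib/ is the one that provides the class.
        System.out.println("Configuration loaded from: "
                + conf.getProtectionDomain().getCodeSource().getLocation());

        try {
            // The overload that S3ARetryPolicy calls during S3AFileSystem.initialize.
            conf.getMethod("getTimeDuration", String.class, String.class, TimeUnit.class);
            System.out.println("getTimeDuration(String, String, TimeUnit) is available");
        } catch (NoSuchMethodException e) {
            System.out.println("getTimeDuration(String, String, TimeUnit) is missing"
                    + " -> S3AFileSystem.initialize would fail with NoSuchMethodError");
        }
    }
}
{code}

The same information can be obtained from the "-verbose:class" log mentioned above; the probe just condenses it to the two facts that matter for this failure.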

> Nightly streaming file sink fails with unshaded hadoop
> ------------------------------------------------------
>
>                 Key: FLINK-15355
>                 URL: https://issues.apache.org/jira/browse/FLINK-15355
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystems
>    Affects Versions: 1.10.0, 1.11.0
>            Reporter: Arvid Heise
>            Assignee: PengFei Li
>            Priority: Blocker
>             Fix For: 1.10.0
>
>
> {code:java}
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>  at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>  at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>  at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>  at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>  at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>  at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>  at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>  at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
>  at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1751)
>  at org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:94)
>  at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:63)
>  at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1628)
>  at StreamingFileSinkProgram.main(StreamingFileSinkProgram.java:77)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
>  ... 11 more
> Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>  at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>  at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>  at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1746)
>  ... 20 more
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>  at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:326)
>  at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>  at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
>  at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>  at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>  at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:274)
>  at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>  at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
>  at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>  at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
>  at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
>  at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Internal server error., <Exception on server side:
> org.apache.flink.runtime.client.JobSubmissionException: Failed to submit job.
>  at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$internalSubmitJob$3(Dispatcher.java:336)
>  at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
>  at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
>  at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>  at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>  at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>  at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>  at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>  at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
>  at org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:36)
>  at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>  ... 6 more
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
>  at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:152)
>  at org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:84)
>  at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:379)
>  at org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:34)
>  ... 7 more
> Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getTimeDuration(Ljava/lang/String;Ljava/lang/String;Ljava/util/concurrent/TimeUnit;)J
>  at org.apache.hadoop.fs.s3a.S3ARetryPolicy.<init>(S3ARetryPolicy.java:113)
>  at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:257)
>  at org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory.create(AbstractS3FileSystemFactory.java:126)
>  at org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:61)
>  at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:441)
>  at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
>  at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
>  at org.apache.flink.runtime.state.memory.MemoryBackendCheckpointStorage.<init>(MemoryBackendCheckpointStorage.java:85)
>  at org.apache.flink.runtime.state.memory.MemoryStateBackend.createCheckpointStorage(MemoryStateBackend.java:295)
>  at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:279)
>  at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:205)
>  at org.apache.flink.runtime.executiongraph.ExecutionGraph.enableCheckpointing(ExecutionGraph.java:486)
>  at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:338)
>  at org.apache.flink.runtime.scheduler.SchedulerBase.createExecutionGraph(SchedulerBase.java:245)
>  at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:217)
>  at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:205)
>  at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:119)
>  at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:105)
>  at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:278)
>  at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:266)
>  at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:98)
>  at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:40)
>  at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:146)
>  ... 10 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)