Posted to issues@flink.apache.org by "Till Rohrmann (Jira)" <ji...@apache.org> on 2020/01/31 10:31:00 UTC
[jira] [Closed] (FLINK-8068) Failed single-job Flink cluster on YARN shows as SUCCEEDED
[ https://issues.apache.org/jira/browse/FLINK-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann closed FLINK-8068.
--------------------------------
Resolution: Invalid
Should no longer be a problem.
> Failed single-job Flink cluster on YARN shows as SUCCEEDED
> ----------------------------------------------------------
>
> Key: FLINK-8068
> URL: https://issues.apache.org/jira/browse/FLINK-8068
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.4.0
> Reporter: Nico Kruber
> Priority: Major
>
> A single-job Flink cluster running a failing program, e.g. this word count example with a nonexistent input file, {{flink run -m yarn-cluster -yn 2 -ys 1 -yjm 768 -ytm 1024 ./examples/batch/WordCount.jar --input foobar}}, is shown by YARN as {{SUCCEEDED}}. This was fixed in the past by FLINK-2226 but has apparently resurfaced.
> FYI: Flink's CLI exits with a non-zero status and also correctly reports the error in its output, such as:
> {code}
> ------------------------------------------------------------
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Failed to submit job eede3007fc880de24e4ad24f8de3d4a6 (Flink Java Job at Tue Nov 14 10:09:27 UTC 2017)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:492)
> at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:215)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:444)
> at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
> at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:815)
> at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
> at org.apache.flink.api.java.DataSet.print(DataSet.java:1652)
> at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:89)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
> at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:396)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:802)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:282)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1054)
> at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1101)
> at org.apache.flink.client.CliFrontend$1.call(CliFrontend.java:1098)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1098)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Failed to submit job eede3007fc880de24e4ad24f8de3d4a6 (Flink Java Job at Tue Nov 14 10:09:27 UTC 2017)
> at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1325)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:447)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.clusterframework.ContaineredJobManager$$anonfun$handleContainerMessage$1.applyOrElse(ContaineredJobManager.scala:107)
> at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
> at org.apache.flink.yarn.YarnJobManager$$anonfun$handleYarnShutdown$1.applyOrElse(YarnJobManager.scala:110)
> at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
> at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:38)
> at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
> at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
> at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
> at akka.actor.ActorCell.invoke(ActorCell.scala:495)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
> at akka.dispatch.Mailbox.run(Mailbox.scala:224)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: File foobar does not exist or the user running Flink ('yarn') has insufficient permissions to access it.
> at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:262)
> at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:801)
> at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:180)
> at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1277)
> ... 23 more
> Caused by: java.io.FileNotFoundException: File foobar does not exist or the user running Flink ('yarn') has insufficient permissions to access it.
> at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:114)
> at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:473)
> at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:62)
> at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:248)
> ... 26 more
> {code}
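
The mismatch described in the issue (the CLI fails, yet YARN records SUCCEEDED) could be checked for in a script. The following is only a minimal sketch, not part of the original report: it assumes the application report text was obtained separately (for example from `yarn application -status <application id>`) and contains a `Final-State` field. The helper names and the sample report are hypothetical and not captured from a real cluster.

```python
import re
from typing import Optional


def parse_final_state(report: str) -> Optional[str]:
    """Return the value of the 'Final-State' field in a YARN report, or None."""
    # Anchored at line start so the plain 'State' field is not matched.
    match = re.search(r"^\s*Final-State\s*:\s*(\S+)\s*$", report, re.MULTILINE)
    return match.group(1) if match else None


def status_mismatch(cli_exit_code: int, final_state: Optional[str]) -> bool:
    """True when the Flink CLI failed but YARN recorded SUCCEEDED."""
    return cli_exit_code != 0 and final_state == "SUCCEEDED"


# Hypothetical report text for illustration only (assumed field layout).
SAMPLE_REPORT = """Application Report :
\tApplication-Id : application_0000000000000_0001
\tState : FINISHED
\tFinal-State : SUCCEEDED
"""

if __name__ == "__main__":
    state = parse_final_state(SAMPLE_REPORT)
    # Pretend the Flink CLI returned exit status 1, as in the report above.
    print(status_mismatch(1, state))
```

Keeping the parsing separate from the comparison means the check can be exercised without a running cluster.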