Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/10/01 09:36:00 UTC

[jira] [Commented] (FLINK-10312) Wrong / missing exception when submitting job

    [ https://issues.apache.org/jira/browse/FLINK-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633771#comment-16633771 ] 

ASF GitHub Bot commented on FLINK-10312:
----------------------------------------

zentol commented on a change in pull request #6731: [FLINK-10312] Propagate exception from server to client in REST API
URL: https://github.com/apache/flink/pull/6731#discussion_r221546026
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java
 ##########
 @@ -207,12 +210,10 @@
 								resultFuture.whenComplete(
 									(innerT, innerThrowable) -> scheduledFuture.cancel(false));
 							} else {
-								final String errorMsg = retries == 0 ?
-									"Number of retries has been exhausted." :
-									"Exception is not retryable.";
-								resultFuture.completeExceptionally(new RetryException(
-									"Could not complete the operation. " + errorMsg,
-									throwable));
+								RetryException retryException = new RetryException(
+									"Could not complete the operation: number of retries has been exhausted.",
 
 Review comment:
   I would stick with the original formatting: `Could not complete the operation. Number of retries has been exhausted.`
   
   When seeing a `:` I would expect a reason as to _why_ it couldn't be completed.
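
   For reference, a minimal, self-contained sketch of the message composition that the removed lines performed. `RetryMessageSketch` and `describeFailure` are hypothetical names used only for illustration, and the nested `RetryException` is a stand-in for `FutureUtils.RetryException`, not the Flink class itself. It only shows that the original format keeps the two parts as separate sentences, so the second sentence reads as the reason:

```java
/** Illustration only: a stand-in for the message handling in FutureUtils, not the Flink code. */
final class RetryMessageSketch {

    /** Hypothetical stand-in for FutureUtils.RetryException. */
    static final class RetryException extends Exception {
        RetryException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Composes the message as the removed lines of the diff did:
     * a fixed first sentence followed by a sentence that depends on the retry count.
     */
    static RetryException describeFailure(int retries, Throwable cause) {
        final String errorMsg = retries == 0
            ? "Number of retries has been exhausted."
            : "Exception is not retryable.";
        return new RetryException("Could not complete the operation. " + errorMsg, cause);
    }

    public static void main(String[] args) {
        Throwable cause = new IllegalStateException("simulated failure");
        // prints "Could not complete the operation. Number of retries has been exhausted."
        System.out.println(describeFailure(0, cause).getMessage());
        // prints "Could not complete the operation. Exception is not retryable."
        System.out.println(describeFailure(3, cause).getMessage());
    }
}
```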

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Wrong / missing exception when submitting job
> ---------------------------------------------
>
>                 Key: FLINK-10312
>                 URL: https://issues.apache.org/jira/browse/FLINK-10312
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.5.2, 1.6.0
>            Reporter: Stephan Ewen
>            Assignee: Andrey Zagrebin
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.7.0, 1.6.2, 1.5.5
>
>         Attachments: lmerge-TR.pdf
>
>
> h3. Problem
> When submitting a job that cannot be created / initialized on the JobManager, the client does not receive a proper error message. The exception only says *"Could not retrieve the execution result. (JobID: 5a7165e1260c6316fa11d2760bd3d49f)"*
> h3. Steps to Reproduce
> Create a streaming job and set a state backend with a nonexistent file system scheme.
> h3. Full Stack Trace
> Submitting a job where instantiation on the JobManager fails yields this, which seems like a major regression from seeing the actual exception:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 5a7165e1260c6316fa11d2760bd3d49f)
> 	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:260)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
> 	at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
> 	at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1511)
> 	at com.dataartisans.streamledger.examples.simpletrade.SimpleTradeExample.main(SimpleTradeExample.java:98)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
> 	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
> 	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
> 	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
> 	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
> 	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
> 	at org.apache.flink.client.cli.CliFrontend.lambda$main$16(CliFrontend.java:1120)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> 	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> 	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
> 	at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$25(RestClusterClient.java:379)
> 	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> 	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
> 	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> 	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
> 	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$32(FutureUtils.java:213)
> 	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> 	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> 	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> 	at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
> 	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
> 	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
> 	at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
> 	at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
> 	at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
> 	at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
> 	... 12 more
> Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
> 	... 10 more
> Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
> 	at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
> 	at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
> 	at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
> 	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
> 	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
> 	... 4 more
> Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
> 	at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:310)
> 	at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$364(RestClient.java:294)
> 	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
> 	... 5 more
> {code}
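
The trace above nests several "Caused by" levels, and even the innermost one (`RestClientException: [Job submission failed.]`) does not name the server-side reason. As a rough sketch of how a client could walk such a chain to its innermost cause, using only the JDK: `CauseChainSketch` and `rootCause` are hypothetical names, and the exceptions built in `main` are plain stand-ins that merely reuse the messages from the trace, not the Flink exception types.

```java
import java.util.concurrent.CompletionException;

/** Illustration only: walks a cause chain with plain JDK calls. */
final class CauseChainSketch {

    /** Returns the innermost cause of the given throwable, or the throwable itself if it has none. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in chain that mirrors the nesting of the trace above.
        Throwable chain = new RuntimeException(
            "Could not retrieve the execution result.",
            new RuntimeException(
                "Failed to submit JobGraph.",
                new CompletionException(
                    new IllegalStateException("[Job submission failed.]"))));

        // prints "[Job submission failed.]"
        System.out.println(rootCause(chain).getMessage());
    }
}
```

Even after unwrapping, the innermost message in this trace is generic, which is why the fix has to propagate the actual exception from the server to the client rather than rely on the client unwrapping causes.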



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)