You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/07/02 12:23:00 UTC

[jira] [Commented] (FLINK-8785) JobSubmitHandler does not handle JobSubmissionExceptions

    [ https://issues.apache.org/jira/browse/FLINK-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529799#comment-16529799 ] 

ASF GitHub Bot commented on FLINK-8785:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6222#discussion_r199477966
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobSubmitHandler.java ---
    @@ -66,6 +67,9 @@ public JobSubmitHandler(
     		}
     
     		return gateway.submitJob(jobGraph, timeout)
    -			.thenApply(ack -> new JobSubmitResponseBody("/jobs/" + jobGraph.getJobID()));
    +			.thenApply(ack -> new JobSubmitResponseBody("/jobs/" + jobGraph.getJobID()))
    +			.exceptionally(exception -> {
    +				throw new CompletionException(new RestHandlerException("Job submission failed.", HttpResponseStatus.INTERNAL_SERVER_ERROR, exception));
    --- End diff --
    
    well, there's no doubt that it _could_ be helpful; my point is that it can be _harmful_ if not done properly.
    
    The `submitJob` should either provide the `JobSubmitHandler` with means to detect these exceptions and create adequate responses, or explicitly throw exceptions with messages that we can safely pass on to users.
    
    That said, I do not know how to do either of these things in a good way. 😞 
    
    For completeness sake, here are ideas that came to mind:
    
    ## 1
    Introduce a special `FlinkUserFacingException` that we "trust" to contain a good error message.
    
    Con: This provides little additional safety and will never provide proper HTTP response code.
    
    ## 2
    Introduce dedicated exceptions for the scenarios that you listed and explicitly look for them in the `exceptionally` block, i.e
    ```
    .exceptionally(exception -> {
    	if (exception instanceof JobAlreadyExistsException) {
    		throw new CompletionException(new RestHandlerException("Job already exists.", HttpResponseStatus.BAD_REQUEST, exception));
    	} else {
    		throw new CompletionException(new RestHandlerException("Job submission failed.", HttpResponseStatus.INTERNAL_SERVER_ERROR, exception));
    	}
    }
    ```
    
    Con: Obviously, this approach is inherently flawed as there is no guarantee that a given exception can be thrown; we would have to manually keep it in sync with the actual implementation because `CompletableFuture` throw a wrench into sane exception handling. 😡 
    
    ## 3
    Encode possible user-facing exceptions in the return value of `submitJob`, i.e. return a `AckOrException`
    ```
    public class AckOrException {
    	// holds exception, could also be a series of nullable fields
    	private final SuperEither<ExceptionA, ExceptionB, ExceptionC> exception; 
    	...
    	public void throwIfError() throws ExceptionA, ExceptionB, ExceptionC;
    }
    ```
    Con: Relies on users to call `throwIfError` and introduces an entirely separate channel for passing errors, but it would allow exception matching.


> JobSubmitHandler does not handle JobSubmissionExceptions
> --------------------------------------------------------
>
>                 Key: FLINK-8785
>                 URL: https://issues.apache.org/jira/browse/FLINK-8785
>             Project: Flink
>          Issue Type: Bug
>          Components: Job-Submission, JobManager, REST
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>              Labels: flip-6, pull-request-available
>
> If the job submission, i.e. {{DispatcherGateway#submitJob}} fails with a {{JobSubmissionException}} the {{JobSubmissionHandler}} returns "Internal server error" instead of signaling the failed job submission.
> This can for example occur if the transmitted execution graph is faulty, as tested by the \{{JobSubmissionFailsITCase}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)