Posted to issues@flink.apache.org by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org> on 2017/02/28 11:39:45 UTC

[jira] [Commented] (FLINK-4816) Executions failed from "DEPLOYING" should retain restored checkpoint information

    [ https://issues.apache.org/jira/browse/FLINK-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887850#comment-15887850 ] 

ramkrishna.s.vasudevan commented on FLINK-4816:
-----------------------------------------------

Going through the code, would it be better if, in the CheckpointCoordinator, when we assign the restored checkpoint state to the execution job vertices, we also set the latest checkpoint ID on the vertices?
Then, when we call fail() on the Execution and find that the job vertex has a non-negative checkpoint ID, we could wrap the throwable in a RestoreTaskException together with the checkpoint ID; if the job vertex has a negative ID (meaning no checkpoint was restored), we would wrap it in just a DeployTaskException.
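
A minimal sketch of the wrapping decision described above, assuming -1 is the sentinel for "no checkpoint restored". The class DeployFailureSketch, the method wrapDeployFailure, and the constant NO_RESTORED_CHECKPOINT are hypothetical names; the two exception classes are simplified stand-ins for the ones named in the issue and are not taken from the Flink code base:

```java
/** Hypothetical sketch; none of these classes are taken from the Flink code base. */
final class DeployFailureSketch {

    /** Assumed sentinel meaning "no checkpoint was restored". */
    static final long NO_RESTORED_CHECKPOINT = -1L;

    /** Stand-in for the exception used when no checkpoint was restored. */
    static class DeployTaskException extends Exception {
        DeployTaskException(Throwable cause) {
            super("Task failed during deployment", cause);
        }
    }

    /** Stand-in for the exception that records the restored checkpoint ID. */
    static class RestoreTaskException extends Exception {
        private final long checkpointId;

        RestoreTaskException(long checkpointId, Throwable cause) {
            super("Task failed while restoring checkpoint " + checkpointId, cause);
            this.checkpointId = checkpointId;
        }

        long getCheckpointId() {
            return checkpointId;
        }
    }

    /**
     * Wraps a failure that occurred in DEPLOYING. A non-negative ID is treated
     * as "a checkpoint was restored"; any negative ID as "none was restored".
     */
    static Exception wrapDeployFailure(Throwable cause, long restoredCheckpointId) {
        if (restoredCheckpointId >= 0) {
            return new RestoreTaskException(restoredCheckpointId, cause);
        }
        return new DeployTaskException(cause);
    }

    public static void main(String[] args) {
        Throwable cause = new RuntimeException("slot allocation failed");
        System.out.println(wrapDeployFailure(cause, NO_RESTORED_CHECKPOINT).getMessage());
        System.out.println(wrapDeployFailure(cause, 42L).getMessage());
    }
}
```

In both branches the original throwable stays reachable through getCause(), so the root failure is not lost by the wrapping.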
Ping [~StephanEwen]?

> Executions failed from "DEPLOYING" should retain restored checkpoint information
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-4816
>                 URL: https://issues.apache.org/jira/browse/FLINK-4816
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Distributed Coordination
>            Reporter: Stephan Ewen
>
> When an execution fails from state {{DEPLOYING}}, it should wrap the failure to better report the failure cause:
>   - If no checkpoint was restored, it should wrap the exception in a {{DeployTaskException}}
>   - If a checkpoint was restored, it should wrap the exception in a {{RestoreTaskException}} and record the id of the checkpoint that was attempted to be restored.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)