Posted to issues@flink.apache.org by "Till Rohrmann (JIRA)" <ji...@apache.org> on 2018/11/18 12:44:10 UTC

[jira] [Updated] (FLINK-10573) Support task revocation

     [ https://issues.apache.org/jira/browse/FLINK-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-10573:
----------------------------------
    Fix Version/s: 1.8.0

> Support task revocation
> -----------------------
>
>                 Key: FLINK-10573
>                 URL: https://issues.apache.org/jira/browse/FLINK-10573
>             Project: Flink
>          Issue Type: Sub-task
>          Components: JobManager
>            Reporter: JIN SUN
>            Assignee: JIN SUN
>            Priority: Major
>             Fix For: 1.7.0, 1.8.0
>
>
> In batch mode, when a downstream task fails with a missing-partition error, it indicates that the output of an upstream task has been lost. For the job to succeed, we need to rerun the upstream task to reproduce the data. We call this task revocation (revoking the success of the upstream task).
> For revocation, we need to identify the missing-partition failure, and it is better to detect the missing partition accurately:
>  * Ideally, we get a specific exception indicating that the source data is missing, which makes things much easier.
>  * When a task gets an IOException, it does not necessarily mean the source data has a problem. The cause might also lie with the target task, for example network issues on the target side.
>  * If multiple tasks cannot read the same source, it is highly likely the source data is missing.
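
The three detection rules above could be sketched roughly as follows. This is only an illustration, not Flink code: the `PartitionMissingException` and `RevocationDetector` classes, the `reportFailure` method, and the consumer-count threshold are all hypothetical names and assumptions made for this example.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical exception for illustration only; not part of Flink's API. */
class PartitionMissingException extends IOException {
    PartitionMissingException(String message) {
        super(message);
    }
}

/** Hypothetical helper that decides whether the producer of a partition should be rerun. */
class RevocationDetector {
    // Assumed tuning knob: how many distinct consumers must fail on the
    // same partition before a plain IOException is treated as data loss.
    private final int threshold;

    // Partition id -> ids of the consumer tasks that failed to read it.
    private final Map<String, Set<String>> failedConsumers = new HashMap<>();

    RevocationDetector(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true if the upstream task that produced partitionId should be revoked. */
    boolean reportFailure(String partitionId, String consumerTaskId, Throwable cause) {
        // Rule 1: a specific exception states directly that the data is gone.
        if (cause instanceof PartitionMissingException) {
            return true;
        }
        // Rules 2 and 3: a single IOException may be the consumer's own
        // network problem, so revoke only once enough distinct consumers
        // have failed on the same partition.
        if (cause instanceof IOException) {
            Set<String> consumers =
                    failedConsumers.computeIfAbsent(partitionId, k -> new HashSet<>());
            consumers.add(consumerTaskId);
            return consumers.size() >= threshold;
        }
        // Any other failure is unrelated to partition loss.
        return false;
    }
}
```

With a threshold of 2, a lone IOException from one consumer does not trigger revocation, while a second consumer failing on the same partition does. The right threshold, and whether repeated failures from the same consumer should count, are open design questions that this sketch does not settle.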



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)