Posted to issues@tez.apache.org by "Zhiyuan Yang (JIRA)" <ji...@apache.org> on 2016/08/15 22:05:20 UTC

[jira] [Comment Edited] (TEZ-3397) Better fault tolerance heuristics for custom edge

    [ https://issues.apache.org/jira/browse/TEZ-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421761#comment-15421761 ] 

Zhiyuan Yang edited comment on TEZ-3397 at 8/15/16 10:05 PM:
-------------------------------------------------------------

Closing this for now. Currently, if a destination task keeps reporting errors for longer than the configured time limit, the source task is re-executed. This is good enough for now.


was (Author: aplusplus):
Closing this for now. Currently, if a destination task keeps reporting errors for longer than a time limit, the source task is re-executed. This is good enough for now.

> Better fault tolerance heuristics for custom edge
> -------------------------------------------------
>
>                 Key: TEZ-3397
>                 URL: https://issues.apache.org/jira/browse/TEZ-3397
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>
> Today, a source task calculates its failure fraction by dividing the number of unique destination tasks that report failure by the number of destination tasks that depend on this source task. A better way is to divide the number of destination tasks that report failure by the number of *unfinished* destination tasks that depend on the source task.
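
The two ratios in the description can be compared with a small sketch. This is only an illustration of the arithmetic, not Tez code: the function names, the task identifiers, and the choice to return 0.0 when the denominator is empty are all assumptions made for the example.

```python
def current_fraction(failed_reporters, dependents):
    """Unique failure reporters divided by all dependent destination tasks.

    Hypothetical helper that mirrors the "today" heuristic in the description.
    """
    dependents = set(dependents)
    if not dependents:
        return 0.0  # assumption: no dependents means nothing to measure
    return len(set(failed_reporters)) / len(dependents)


def proposed_fraction(failed_reporters, dependents, finished):
    """Unique failure reporters divided by *unfinished* dependent destination tasks.

    Hypothetical helper that mirrors the proposed heuristic.
    """
    unfinished = set(dependents) - set(finished)
    if not unfinished:
        return 0.0  # assumption: every dependent has already finished
    return len(set(failed_reporters)) / len(unfinished)


# Ten destination tasks depend on the source; eight have already finished,
# and one of the two remaining tasks reports a failure.
dependents = [f"d{i}" for i in range(10)]
finished = dependents[:8]
failed = ["d9"]

print(current_fraction(failed, dependents))             # 1 / 10
print(proposed_fraction(failed, dependents, finished))  # 1 / 2
```

In this made-up scenario the first ratio stays small because finished tasks dilute the denominator, while the second reflects that half of the tasks still needing the source output cannot read it.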



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)