Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/10/01 09:58:00 UTC

[jira] [Updated] (TEZ-4139) Tez should consider node information for computing failure fraction - downstream problems

     [ https://issues.apache.org/jira/browse/TEZ-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated TEZ-4139:
------------------------------
    Summary: Tez should consider node information for computing failure fraction - downstream problems  (was: Tez should consider node information for computing failure fraction)

> Tez should consider node information for computing failure fraction - downstream problems
> -----------------------------------------------------------------------------------------
>
>                 Key: TEZ-4139
>                 URL: https://issues.apache.org/jira/browse/TEZ-4139
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: TEZ-4139.01.WIP.patch, TEZ-4139.02.WIP.patch
>
>
> When many downstream attempts fail to pull data from a source task, the source task is marked as failed and retried. Currently, the failure fraction is computed from the unique downstream task attempts that reported a failure. However, the computation of "failureFraction" should also take node information into account.
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java#L1845-L1849
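
To make the difference described above concrete, here is a minimal sketch, not the actual TaskAttemptImpl code. The class FailureFractionSketch, the FetchFailure type, both method names, and in particular the choice of denominator for the node-based variant are assumptions made for illustration only. It contrasts a fraction based on unique downstream attempts with one based on unique reporting nodes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FailureFractionSketch {

    /** Hypothetical report of one failed fetch: the downstream attempt and the node it ran on. */
    public static final class FetchFailure {
        final String attemptId;
        final String nodeId;

        public FetchFailure(String attemptId, String nodeId) {
            this.attemptId = attemptId;
            this.nodeId = nodeId;
        }
    }

    /** Attempt-based fraction: unique failing downstream attempts over all downstream consumers. */
    public static double fractionByAttempts(List<FetchFailure> failures, int totalConsumers) {
        Set<String> attempts = new HashSet<>();
        for (FetchFailure f : failures) {
            attempts.add(f.attemptId);
        }
        return (double) attempts.size() / totalConsumers;
    }

    /** Node-based fraction (assumed definition): unique reporting nodes over all nodes running consumers. */
    public static double fractionByNodes(List<FetchFailure> failures, int totalNodes) {
        Set<String> nodes = new HashSet<>();
        for (FetchFailure f : failures) {
            nodes.add(f.nodeId);
        }
        return (double) nodes.size() / totalNodes;
    }

    public static void main(String[] args) {
        // Four distinct downstream attempts report a failure, but all of them run on one node.
        List<FetchFailure> failures = Arrays.asList(
                new FetchFailure("attempt_1", "node_A"),
                new FetchFailure("attempt_2", "node_A"),
                new FetchFailure("attempt_3", "node_A"),
                new FetchFailure("attempt_4", "node_A"));

        System.out.println("by attempts: " + fractionByAttempts(failures, 5));
        System.out.println("by nodes:    " + fractionByNodes(failures, 10));
    }
}
```

In this made-up scenario the attempt-based fraction is high (4 of 5 consumers) while the node-based one is low (1 of 10 nodes), which suggests the problem may lie with the single downstream node rather than with the source task. How the two signals should be combined is left open here.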



--
This message was sent by Atlassian Jira
(v8.3.4#803005)