Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2016/04/02 23:27:25 UTC

[jira] [Updated] (TEZ-3161) Allow task to report different kinds of errors - fatal / kill

     [ https://issues.apache.org/jira/browse/TEZ-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-3161:
--------------------------------
    Attachment: TEZ-3161.4.txt

Updated patch with the following changes.
- FailureType renamed to TaskFailureType
- Retained the APIs introduced in the patch; the existing API would get confusing otherwise. Added specific javadocs on fatalError explaining the behaviour, along with a deprecation notice. This seems like the least confusing option to me.
- Marked killSelf as private
- Renamed unsuccessfulEnd to taskFailureType
- Added writing to history. Is there some place where ATS data is read back as well? I couldn't find one.
- Changed the TaskImpl log line to be easier to understand

bq. Wouldn't there be only one specific termination cause to indicate that the user-code told the framework to abort itself or kill itself?
The TaskAttemptEndReason is set based on which component reported the error (Input / Processor / Output), at least for errors reported from the task. There are a number of other EndReasons which are independent of this. TaskFailureType would now indicate the type of failure on top of whatever EndReason is set.
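
To illustrate how the two pieces of information could combine, here is a minimal sketch in Java. It is not the code from the patch: the class TaskFailureSketch, the enum constants, the AttemptReport type, the shouldRetry method and the attempt limit are all hypothetical names chosen for this example. It assumes only what is described in this issue: the end reason records which component reported the error, the failure type is tracked separately, a fatal failure should not be retried, and a killed attempt represents a temporary error.

```java
public class TaskFailureSketch {

    // Hypothetical stand-in for the TaskFailureType named in the patch notes.
    enum TaskFailureType { NON_FATAL, FATAL }

    // Hypothetical subset of end reasons: which component reported the error.
    enum EndReason { INPUT_ERROR, PROCESSOR_ERROR, OUTPUT_ERROR }

    // How the attempt ended: as a failure, or killed because of a temporary error.
    enum Outcome { FAILED, KILLED }

    // What a task attempt reports when it ends unsuccessfully.
    static final class AttemptReport {
        final Outcome outcome;
        final EndReason endReason;
        final TaskFailureType failureType; // null when the attempt was killed

        private AttemptReport(Outcome outcome, EndReason endReason,
                              TaskFailureType failureType) {
            this.outcome = outcome;
            this.endReason = endReason;
            this.failureType = failureType;
        }

        static AttemptReport failure(EndReason endReason, TaskFailureType type) {
            return new AttemptReport(Outcome.FAILED, endReason, type);
        }

        static AttemptReport killed(EndReason endReason) {
            return new AttemptReport(Outcome.KILLED, endReason, null);
        }
    }

    // Decides whether another attempt of the same task is worth scheduling.
    // failedAttempts counts failed (not killed) attempts so far, including
    // the one being reported. The limit is an assumption of this sketch.
    static boolean shouldRetry(AttemptReport report, int failedAttempts,
                               int maxFailedAttempts) {
        if (report.outcome == Outcome.KILLED) {
            // Temporary error: retry regardless of the failure count.
            return true;
        }
        if (report.failureType == TaskFailureType.FATAL) {
            // The same failure is expected on every attempt, so stop here.
            return false;
        }
        // Non-fatal failure: retry until the failure limit is reached.
        return failedAttempts < maxFailedAttempts;
    }
}
```

In this sketch the decision never looks at the end reason, which reflects the point that the failure type sits on top of whatever EndReason is set. For example, a FATAL failure reported by a Processor is not retried even on its first attempt, while a killed attempt is retried however many failures came before it.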

> Allow task to report different kinds of errors - fatal / kill
> -------------------------------------------------------------
>
>                 Key: TEZ-3161
>                 URL: https://issues.apache.org/jira/browse/TEZ-3161
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-3161.1.txt, TEZ-3161.2.txt, TEZ-3161.3.txt, TEZ-3161.4.txt
>
>
> In some cases, task failures will be the same across all attempts - e.g. exceeding memory utilization on an operation. In this case, there's no point in running another attempt of the same task.
> There are other cases where a task may want to mark itself as KILLED - i.e. because of a temporary error. An example of this is pipelined shuffle.
> Tez should allow both operations.
> cc [~vikram.dixit], [~rajesh.balamohan]


