Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2014/07/10 03:53:04 UTC

[jira] [Updated] (TEZ-657) Tez should process the Container exit status - specifically when the RM preempts a container

     [ https://issues.apache.org/jira/browse/TEZ-657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-657:
---------------------------

    Attachment: TEZ-657.1.patch

TaskSchedulerEventHandler sets the appropriate info in AMContainerEventCompleted.
When AMContainer receives the event, it checks whether the container was preempted or hit a disk failure and sends the corresponding event to the TaskAttempt.
Renamed TaskAttemptEventType.PREEMPTED to TERMINATED_BY_SYSTEM so that one type covers system terminations generically, instead of having separate PREEMPTION and DISK_FAILED types. If needed, the actual status can be passed via the event later on.
Added tests.
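
For readers following along, here is a rough sketch of the mapping described above. It is not the code in the patch. The integer constants are copied from YARN's ContainerExitStatus so the snippet compiles without Hadoop on the classpath. The class ContainerExitSketch, the enums AttemptEvent and Outcome, the OTHER_TERMINATION value, and both methods are hypothetical names made up for illustration. Treating a system termination as KILLED follows the issue description for preemption and is an assumption for disk failures.

```java
// Illustrative sketch only; all names here are hypothetical, not Tez classes.
final class ContainerExitSketch {
    // Values mirror org.apache.hadoop.yarn.api.records.ContainerExitStatus.
    static final int SUCCESS = 0;
    static final int ABORTED = -100;
    static final int DISKS_FAILED = -101;
    static final int PREEMPTED = -102;

    /** Hypothetical stand-in for the task attempt event types. */
    enum AttemptEvent { TERMINATED_BY_SYSTEM, OTHER_TERMINATION }

    /** Hypothetical final state of the affected task attempt. */
    enum Outcome { KILLED, FAILED }

    /** Preemption and disk failure both map to the generic system event. */
    static AttemptEvent eventFor(int exitStatus) {
        switch (exitStatus) {
            case PREEMPTED:
            case DISKS_FAILED:
                return AttemptEvent.TERMINATED_BY_SYSTEM;
            default:
                return AttemptEvent.OTHER_TERMINATION;
        }
    }

    /** Assumption: system terminations do not count as task failures. */
    static Outcome outcomeFor(AttemptEvent event) {
        return event == AttemptEvent.TERMINATED_BY_SYSTEM
                ? Outcome.KILLED
                : Outcome.FAILED;
    }

    public static void main(String[] args) {
        int[] statuses = { PREEMPTED, DISKS_FAILED, ABORTED };
        for (int status : statuses) {
            AttemptEvent event = eventFor(status);
            System.out.println(status + " -> " + event + " -> " + outcomeFor(event));
        }
    }
}
```

Keeping a single generic event means the TaskAttempt state machine needs only one extra transition. The specific exit status could still ride along as a field on the event if a later change needs it.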

There is no need to fail nodes on disk failure because the NM remains usable with its remaining disks. If too many disks fail, the NM will mark itself unhealthy, and we already handle that case.
[~sseth] please review.

> Tez should process the Container exit status - specifically when the RM preempts a container
> --------------------------------------------------------------------------------------------
>
>                 Key: TEZ-657
>                 URL: https://issues.apache.org/jira/browse/TEZ-657
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: TEZ-657.1.patch
>
>
> Containers preempted by the RM will currently register as task failures - these tasks should be considered to be KILLED instead.
> Handling the entire preemption hint logic would be a separate jira.



--
This message was sent by Atlassian JIRA
(v6.2#6252)