Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/04/29 00:13:15 UTC

[jira] [Updated] (SAMZA-117) Jobs that end due to failed container count should be marked as failed rather than finished

     [ https://issues.apache.org/jira/browse/SAMZA-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-117:
----------------------------------

    Attachment: Screen Shot 2014-04-28 at 3.10.46 PM.png

Dug into this a bit more. I don't think this is possible. The YARN RM has two fields: "state" and "final status". It looks like the "state" field is only marked as failed when the AM can't even start up. Once the AM has started, the application is always marked as finished. The "final status" is controlled by the AM, which can set it before shutting down.

We can't simply throw an exception in the AM as a hack to make YARN mark the "state" as "failed", because that would cause the RM to retry the AM, which we don't want--we want to stop the job immediately.
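
To illustrate the distinction, here is a minimal, self-contained sketch of an AM choosing its "final status" and then shutting down cleanly instead of throwing. The class name, the FinalStatus enum, the decide method and the failure threshold are all hypothetical stand-ins for illustration, not Samza's actual code. In a real AM, the chosen value would be handed to YARN via AMRMClient.unregisterApplicationMaster(...) using YARN's own FinalApplicationStatus enum.

```java
// Sketch only: FinalStatusSketch, FinalStatus and decide are hypothetical names.
class FinalStatusSketch {

    // Hypothetical stand-in for YARN's FinalApplicationStatus enum.
    enum FinalStatus { SUCCEEDED, FAILED }

    // Assumed policy: report FAILED once the number of failed containers
    // exceeds the allowed maximum, otherwise report SUCCEEDED.
    static FinalStatus decide(int failedContainers, int maxFailures) {
        return failedContainers > maxFailures ? FinalStatus.FAILED : FinalStatus.SUCCEEDED;
    }

    public static void main(String[] args) {
        FinalStatus status = decide(4, 3);
        // The AM exits normally here rather than throwing, so YARN does not
        // retry it. A real AM would pass the equivalent FinalApplicationStatus
        // to AMRMClient.unregisterApplicationMaster(...) before exiting.
        System.out.println("finalStatus=" + status);
    }
}
```

With this approach the RM would still show the "state" as finished, and only the "final status" would read as failed.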

Attaching a screenshot that shows the difference.

OK if I mark this one as won't fix?

> Jobs that end due to failed container count should be marked as failed rather than finished
> -------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-117
>                 URL: https://issues.apache.org/jira/browse/SAMZA-117
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jakob Homan
>         Attachments: Screen Shot 2014-04-28 at 3.10.46 PM.png
>
>
> Currently if a job ends because of too many failed containers within a specific time, the YARN job correctly ends but is marked as "finished."  It would be more accurate to consider these jobs failed and set their status as such.


