Posted to dev@slider.apache.org by "Sherry Guo (JIRA)" <ji...@apache.org> on 2015/10/08 23:19:26 UTC

[jira] [Updated] (SLIDER-930) Incorporate Yarn feature of resetting AM failure count into Slider AM

     [ https://issues.apache.org/jira/browse/SLIDER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Guo updated SLIDER-930:
------------------------------
    Attachment: SLIDER-930-001.patch

> Incorporate Yarn feature of resetting AM failure count into Slider AM
> ---------------------------------------------------------------------
>
>                 Key: SLIDER-930
>                 URL: https://issues.apache.org/jira/browse/SLIDER-930
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>    Affects Versions: Slider 0.80
>            Reporter: Gour Saha
>            Assignee: thomas liu
>             Fix For: Slider 0.90
>
>         Attachments: SLIDER-930-001.patch
>
>
> YARN-611 provides this feature. Currently, Slider apps are bound by the number set for yarn.resourcemanager.am.max-retries in the cluster. By default this value is 2, which is very low for long-running services.
> The Slider AM should use the feature provided in YARN-611 and set an interval after which the failure count is reset to 0.
> I believe the API to call on ApplicationSubmissionContext is attemptFailuresValidityInterval. To start with, Slider can set it to 5 minutes, which should be a reasonable default.
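
The effect of the proposed validity interval can be illustrated with a small, self-contained sketch. This is a toy model, not Slider or YARN code: the class AttemptFailureWindow and its methods are hypothetical names, and the rule that a non-positive interval disables the window is an assumption about how YARN-611 behaves. In a real submission the interval would be set on ApplicationSubmissionContext, as the description suggests, most likely in milliseconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Toy model (hypothetical, not the YARN implementation) of an AM failure
 * count that only considers failures inside a validity interval.
 */
final class AttemptFailureWindow {
    private final int maxAttempts;
    private final long validityIntervalMs;
    // Failure timestamps in milliseconds, oldest first.
    private final Deque<Long> failureTimesMs = new ArrayDeque<>();

    AttemptFailureWindow(int maxAttempts, long validityIntervalMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /** Records an AM attempt failure at the given time. */
    void recordFailure(long nowMs) {
        failureTimesMs.addLast(nowMs);
    }

    /** Returns the number of failures that still count at the given time. */
    int countedFailures(long nowMs) {
        // Assumption: an interval <= 0 disables the window, so every failure counts.
        if (validityIntervalMs > 0) {
            while (!failureTimesMs.isEmpty()
                    && nowMs - failureTimesMs.peekFirst() > validityIntervalMs) {
                failureTimesMs.removeFirst(); // too old, no longer counted
            }
        }
        return failureTimesMs.size();
    }

    /** True once the counted failures reach the attempt limit. */
    boolean attemptsExhausted(long nowMs) {
        return countedFailures(nowMs) >= maxAttempts;
    }

    public static void main(String[] args) {
        long fiveMinutesMs = 5L * 60L * 1000L;
        AttemptFailureWindow windowed = new AttemptFailureWindow(2, fiveMinutesMs);
        AttemptFailureWindow unbounded = new AttemptFailureWindow(2, -1L);

        // Two failures, ten minutes apart.
        long[] failures = {0L, 10L * 60L * 1000L};
        for (long t : failures) {
            windowed.recordFailure(t);
            unbounded.recordFailure(t);
        }
        long now = failures[1];
        System.out.println("windowed exhausted:  " + windowed.attemptsExhausted(now));
        System.out.println("unbounded exhausted: " + unbounded.attemptsExhausted(now));
    }
}
```

With a limit of 2 and a 5 minute interval, two failures spaced ten minutes apart leave the windowed counter at one, so the application could be retried again. The unbounded counter reaches the limit, which is the situation the issue describes for long-running services.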



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)