Posted to yarn-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2014/04/30 16:29:17 UTC
[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985575#comment-13985575 ]
Jason Lowe commented on YARN-2005:
----------------------------------
This would be particularly helpful on a busy cluster where one node is in a state where it can't launch containers for some reason but hasn't declared itself UNHEALTHY. In that scenario, the only place with spare capacity is a node that fails every container attempt, and apps can fail because the RM doesn't realize that repeated AM attempts on the same node aren't working.
In that sense a fix for YARN-1073 could help quite a bit, but there could still be scenarios where a particular app's AMs end up failing on certain nodes while other containers run just fine.
> Blacklisting support for scheduling AMs
> ---------------------------------------
>
> Key: YARN-2005
> URL: https://issues.apache.org/jira/browse/YARN-2005
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 0.23.10, 2.4.0
> Reporter: Jason Lowe
>
> It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side.
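The behavior described above could be sketched roughly as follows. This is only an illustration of the counting logic, not RM code: the class AmNodeBlacklist, its methods, and the threshold parameter are hypothetical names, and it assumes failures are tracked per application and per node.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in YARN.
// One instance is assumed to track the AM attempts of a single application.
class AmNodeBlacklist {
    // Configurable number of failed AM attempts on one node before it is blacklisted.
    private final int maxAmFailuresPerNode;
    // Failed AM attempts counted per node.
    private final Map<String, Integer> amFailuresByNode = new HashMap<>();

    AmNodeBlacklist(int maxAmFailuresPerNode) {
        if (maxAmFailuresPerNode < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxAmFailuresPerNode = maxAmFailuresPerNode;
    }

    // Called when an AM attempt of this application fails on the given node.
    void recordAmFailure(String node) {
        amFailuresByNode.merge(node, 1, Integer::sum);
    }

    // The scheduler would skip blacklisted nodes when placing the next AM attempt.
    boolean isBlacklisted(String node) {
        return amFailuresByNode.getOrDefault(node, 0) >= maxAmFailuresPerNode;
    }

    public static void main(String[] args) {
        AmNodeBlacklist blacklist = new AmNodeBlacklist(2);
        blacklist.recordAmFailure("node-a");
        System.out.println(blacklist.isBlacklisted("node-a")); // false, one failure so far
        blacklist.recordAmFailure("node-a");
        System.out.println(blacklist.isBlacklisted("node-a")); // true, threshold reached
        System.out.println(blacklist.isBlacklisted("node-b")); // false, no failures recorded
    }
}
```

Keeping the count per application rather than cluster-wide is a choice made for this sketch; it matches the case where one app's AMs fail on a node while other containers run there without trouble.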