Posted to yarn-issues@hadoop.apache.org by "Jian He (JIRA)" <ji...@apache.org> on 2014/08/27 02:13:57 UTC
[jira] [Commented] (YARN-2456) Possible deadlock in CapacityScheduler when RM is recovering apps
[ https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111624#comment-14111624 ]
Jian He commented on YARN-2456:
-------------------------------
One thing we can do is add the applications to the scheduler in submission order, i.e., sort the apps by applicationId before recovering them.
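
A minimal sketch of that ordering idea, not the actual RM recovery code. AppId is a hypothetical stand-in for an application ID (cluster start timestamp plus a sequence number), and inSubmissionOrder is an illustrative helper name; it assumes IDs are assigned in submission order, so sorting by ID reproduces that order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RecoveryOrderSketch {

    /** Hypothetical stand-in for an application ID: cluster timestamp plus sequence number. */
    static final class AppId {
        final long clusterTimestamp;
        final int sequence;

        AppId(long clusterTimestamp, int sequence) {
            this.clusterTimestamp = clusterTimestamp;
            this.sequence = sequence;
        }

        @Override
        public String toString() {
            return "application_" + clusterTimestamp + "_" + String.format("%04d", sequence);
        }
    }

    /** Returns a copy of the stored IDs sorted by timestamp, then by sequence number. */
    static List<AppId> inSubmissionOrder(List<AppId> stored) {
        List<AppId> sorted = new ArrayList<>(stored);
        sorted.sort(Comparator
                .comparingLong((AppId a) -> a.clusterTimestamp)
                .thenComparingInt(a -> a.sequence));
        return sorted;
    }

    public static void main(String[] args) {
        // The state store may hand the apps back in arbitrary order (assumption).
        List<AppId> stored = Arrays.asList(
                new AppId(1409098437000L, 2),
                new AppId(1409098437000L, 1));
        for (AppId id : inSubmissionOrder(stored)) {
            System.out.println("recovering " + id);
        }
    }
}
```

With the apps recovered in this order, the earlier-submitted application is the first one handed to the scheduler, so it is the one that takes the single active slot.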
> Possible deadlock in CapacityScheduler when RM is recovering apps
> -----------------------------------------------------------------
>
> Key: YARN-2456
> URL: https://issues.apache.org/jira/browse/YARN-2456
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
>
> Consider this scenario:
> 1. RM is configured with a single queue and only one application can be active at a time.
> 2. Submit App1 which uses up the queue's whole capacity
> 3. Submit App2 which remains pending.
> 4. Restart RM.
> 5. App2 is recovered before App1, so App2 is added to the activeApplications list. App1 now remains pending (because of the max-active-app limit).
> 6. All containers of App1 are recovered when the NM registers, and they use up the whole queue capacity again.
> 7. Since the queue is full, App2 cannot proceed to allocate its AM container.
> 8. Meanwhile, App1 cannot become active because of the max-active-app limit.
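
The scenario above can be reproduced with a toy model. This is a simplified sketch, not CapacityScheduler code: the class, the constants (QUEUE_CAPACITY, MAX_ACTIVE_APPS, AM_UNITS) and the method stuckAfterRecovery are all hypothetical, and it assumes that recovered containers keep holding their capacity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecoveryDeadlockSketch {
    // Hypothetical numbers chosen to mirror the scenario.
    static final int QUEUE_CAPACITY = 10;  // total resource units in the single queue
    static final int MAX_ACTIVE_APPS = 1;  // only one application may be active
    static final int AM_UNITS = 1;         // units an AM container needs

    static final class App {
        final String name;
        final int recoveredUnits; // units held by containers that survived the restart

        App(String name, int recoveredUnits) {
            this.name = name;
            this.recoveredUnits = recoveredUnits;
        }
    }

    /** Returns true if no application can make progress after recovery in the given order. */
    static boolean stuckAfterRecovery(List<App> recoveryOrder) {
        List<App> active = new ArrayList<>();
        List<App> pending = new ArrayList<>();

        // Step 5: apps take active slots in the order they are recovered.
        for (App app : recoveryOrder) {
            if (active.size() < MAX_ACTIVE_APPS) {
                active.add(app);
            } else {
                pending.add(app);
            }
        }

        // Step 6: recovered containers are charged to the queue whether or not
        // their application is active.
        int used = 0;
        for (App app : recoveryOrder) {
            used += app.recoveredUnits;
        }
        int free = QUEUE_CAPACITY - used;

        // Steps 7 and 8: an active app progresses if it already has containers
        // or can still get an AM container; pending apps wait for an active slot.
        for (App app : active) {
            if (app.recoveredUnits > 0 || free >= AM_UNITS) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        App app1 = new App("App1", QUEUE_CAPACITY); // held the whole queue before restart
        App app2 = new App("App2", 0);              // was still pending before restart
        System.out.println("App2 first: stuck = " + stuckAfterRecovery(Arrays.asList(app2, app1)));
        System.out.println("App1 first: stuck = " + stuckAfterRecovery(Arrays.asList(app1, app2)));
    }
}
```

In this model only the recovery order differs between the two calls: recovering App2 first leaves both apps waiting on each other, while recovering App1 first lets App1 keep running in the active slot.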