Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2015/06/03 20:43:38 UTC

[jira] [Commented] (YARN-3764) CapacityScheduler should properly handle moving LeafQueue from one parent to another

    [ https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571473#comment-14571473 ] 

Wangda Tan commented on YARN-3764:
----------------------------------

CS's reinitialize logic creates new queues, but it only copies their configuration properties to the old queues; the new queues are discarded after reinitialization.

A comprehensive fix would be to copy the old queue's runtime information (runningApplications, etc.) to the new queue, and to discard the old queue after reinitialization.
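
To illustrate the comprehensive fix described above, here is a minimal sketch. It is not the CapacityScheduler code: SimpleQueue, its fields, and reinitialize() are hypothetical stand-ins, and only a list of running application IDs is carried over. The new queue objects (built from the new configuration, so they have the right parent) survive, and the old ones are dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class QueueStateCarryOver {
    /** Hypothetical stand-in for a leaf queue; not the real LeafQueue class. */
    static final class SimpleQueue {
        final String name;
        final String parent;   // parent queue name taken from the configuration
        final int capacity;    // configured capacity in percent
        final List<String> runningApplications = new ArrayList<>();

        SimpleQueue(String name, String parent, int capacity) {
            this.name = name;
            this.parent = parent;
            this.capacity = capacity;
        }
    }

    /**
     * Copies runtime state from each old queue into the new queue with the
     * same name, then returns the new queues. The caller is expected to
     * discard the old map afterwards.
     */
    static Map<String, SimpleQueue> reinitialize(Map<String, SimpleQueue> oldQueues,
                                                 Map<String, SimpleQueue> newQueues) {
        for (SimpleQueue fresh : newQueues.values()) {
            SimpleQueue old = oldQueues.get(fresh.name);
            if (old != null) {
                // Assumption: running applications are the only runtime state here.
                fresh.runningApplications.addAll(old.runningApplications);
            }
        }
        return newQueues;
    }

    public static void main(String[] args) {
        Map<String, SimpleQueue> oldQueues = new HashMap<>();
        SimpleQueue oldX = new SimpleQueue("x", "a", 50);
        oldX.runningApplications.add("app_1");
        oldQueues.put("x", oldX);

        Map<String, SimpleQueue> newQueues = new HashMap<>();
        newQueues.put("x", new SimpleQueue("x", "root", 50));

        SimpleQueue x = reinitialize(oldQueues, newQueues).get("x");
        System.out.println(x.name + " parent=" + x.parent
            + " apps=" + x.runningApplications);
    }
}
```

Because the surviving object is the new one, x ends up under its new parent while keeping its applications.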

A short-term fix is to disallow removing a queue from under its parentQueue. In other words, CS will throw an exception if a LeafQueue is moved from one parent to another. I prefer to do the comprehensive fix for 2.8.0, and the short-term fix for 2.7.1/2.6.1 (if required).
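
The short-term check could look roughly like the sketch below. This is an illustration only: validateNoLeafQueueMoved() and the leaf-name-to-parent-name maps are hypothetical, not existing CapacityScheduler APIs, and treating a deleted leaf the same as a moved one is an assumption.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class QueueMoveCheck {
    /**
     * Hypothetical validation step run before reinitialization.
     * Each map goes from a leaf queue name to the name of its parent queue.
     * Throws if an existing leaf is no longer under its old parent.
     */
    static void validateNoLeafQueueMoved(Map<String, String> oldParents,
                                         Map<String, String> newParents)
            throws IOException {
        for (Map.Entry<String, String> entry : oldParents.entrySet()) {
            String leaf = entry.getKey();
            String oldParent = entry.getValue();
            String newParent = newParents.get(leaf); // null if the leaf was removed
            if (!oldParent.equals(newParent)) {
                throw new IOException("LeafQueue " + leaf + " cannot be moved from parent "
                    + oldParent + " to " + newParent);
            }
        }
    }

    public static void main(String[] args) {
        // The example from the issue description: x moves from a to root.
        Map<String, String> oldParents = new HashMap<>();
        oldParents.put("x", "a");
        oldParents.put("y", "a");

        Map<String, String> newParents = new HashMap<>();
        newParents.put("x", "root");
        newParents.put("y", "a");

        try {
            validateNoLeafQueueMoved(oldParents, newParents);
            System.out.println("Reinitialization accepted");
        } catch (IOException e) {
            System.out.println("Reinitialization rejected: " + e.getMessage());
        }
    }
}
```

Leaf queues that appear only in the new configuration pass the check, so adding queues would still work.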

Thoughts?

> CapacityScheduler should properly handle moving LeafQueue from one parent to another
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-3764
>                 URL: https://issues.apache.org/jira/browse/YARN-3764
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>
> Currently CapacityScheduler doesn't handle this case well. For example, given this queue structure:
> {code}
>     root
>       |
>       a (100)
>     /   \
>    x     y
>   (50)   (50)
> {code}
> And reinitializing with the following structure:
> {code}
>      root
>      /   \ 
> (50)a     x (50)
>     |
>     y
>    (100)
> {code}
> The actual queue structure after reinitialization is:
> {code}
>      root
>     /    \
>    a (50) x (50)
>   /  \
>  x    y
> (50)  (100)
> {code}
> We should handle this case better.


