You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Gergely Pollak (Jira)" <ji...@apache.org> on 2021/03/11 03:16:00 UTC

[jira] [Comment Edited] (YARN-10571) Refactor dynamic queue handling logic

    [ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299260#comment-17299260 ] 

Gergely Pollak edited comment on YARN-10571 at 3/11/21, 3:15 AM:
-----------------------------------------------------------------

It's in the original code as well, but we can have a NPE here:
CapacitySchedulerQueueManager#removeLegacyDynamicQueue
{code:java}
CSQueue q = this.getQueue(queueName);
if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass()))) {
{code}
 q can be null if queueName is ambiguous (or queue doesn't exist);

 

CapacitySchedulerQueueManager#determineMissingParents:

1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency).

Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit.

2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation.

3) And here is some super weird edge case:

In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen:
 - CapacitySchedulerQueueManager minds his business
 - A wild call to createQueue with queue (parent.leaf) appears
{code:java}
Line 531 CSQueue parentQueue = getQueue(parentQueueName);  determines it does not exists (because parent is ambiguous){code}

 - CreateQueue uses 
{code:java}
Line 537 return createAutoQueue(queue); //It's not very effective{code}

 - CreateAutoQueue tries to determine the missing parents
{code:java}
565     ApplicationPlacementContext queueCandidateContext = parentContext;
566     CSQueue firstExistingQueue = getQueue(
567         queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous
559   public List<ApplicationPlacementContext> determineMissingParents(
560       ApplicationPlacementContext queue) throws SchedulerDynamicEditException {  //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf'
561     ApplicationPlacementContext parentContext =
562         CSQueueUtils.extractQueuePath(queue.getParentQueue());  //parentContext parent: null  leaf: 'parent'  fullPath: 'parent'
563     List<ApplicationPlacementContext> parentsToCreate = new ArrayList<>();
569     while (firstExistingQueue == null) {
570       parentsToCreate.add(queueCandidateContext);   //parent added to list
571       queueCandidateContext = CSQueueUtils.extractQueuePath(
572           queueCandidateContext.getParentQueue());  //NPE because queueCandidateContext.getParentQueue() == null{code}
 

 


was (Author: shuzirra):
It's in the original code as well, but we can have a NPE here:

CapacitySchedulerQueueManager#removeLegacyDynamicQueue

 
{code:java}
CSQueue q = this.getQueue(queueName);
if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass()))) {
{code}
 q can be null if queueName is ambiguous (or queue doesn't exist);

 

CapacitySchedulerQueueManager#determineMissingParents:

1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency).

Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit.

2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation.

3) And here is some super weird edge case:

In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen:
 - CapacitySchedulerQueueManager minds his business
 - A wild call to createQueue with queue (parent.leaf) appears
{code:java}
Line 531 CSQueue parentQueue = getQueue(parentQueueName);  determines it does not exists (because parent is ambiguous){code}

 - CreateQueue uses 
{code:java}
Line 537 return createAutoQueue(queue); //It's not very effective{code}

 - CreateAutoQueue tries to determine the missing parents
{code:java}
565     ApplicationPlacementContext queueCandidateContext = parentContext;
566     CSQueue firstExistingQueue = getQueue(
567         queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous
559   public List<ApplicationPlacementContext> determineMissingParents(
560       ApplicationPlacementContext queue) throws SchedulerDynamicEditException {  //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf'
561     ApplicationPlacementContext parentContext =
562         CSQueueUtils.extractQueuePath(queue.getParentQueue());  //parentContext parent: null  leaf: 'parent'  fullPath: 'parent'
563     List<ApplicationPlacementContext> parentsToCreate = new ArrayList<>();
569     while (firstExistingQueue == null) {
570       parentsToCreate.add(queueCandidateContext);   //parent added to list
571       queueCandidateContext = CSQueueUtils.extractQueuePath(
572           queueCandidateContext.getParentQueue());  //NPE because queueCandidateContext.getParentQueue() == null{code}
 

 

> Refactor dynamic queue handling logic
> -------------------------------------
>
>                 Key: YARN-10571
>                 URL: https://issues.apache.org/jira/browse/YARN-10571
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Andras Gyori
>            Assignee: Andras Gyori
>            Priority: Minor
>         Attachments: YARN-10571.001.patch
>
>
> As per YARN-10506 we have introduced an other mode for auto queue creation and a new class, which handles it. We should move the old, managed queue related logic to CSAutoQueueHandler as well, and do additional cleanup regarding queue management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org