You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Szilard Nemeth (JIRA)" <ji...@apache.org> on 2018/07/10 09:42:00 UTC

[jira] [Comment Edited] (YARN-8468) Limit container sizes per queue in FairScheduler

    [ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16538323#comment-16538323 ] 

Szilard Nemeth edited comment on YARN-8468 at 7/10/18 9:41 AM:
---------------------------------------------------------------

Hi [~bsteinbach]!
Thanks for the patch. This is high quality code here.

I noticed a couple of things: 
- {{AllocationFileQueueParser: MAX_CONTAINER_RESOURCES}} could be package-private (without any modifier)
- {{QueueMaxContainerAllocationValidator.createExceptionText}}: please use {{String.format()}} instead of concatenating the parts of the string.
- {{QueueMaxContainerAllocationValidator}}: you used the method {{throwException}} 2 times, and you also used {{throw new YarnRuntimeException}} as is. I think you should either use the method for all 3 invocations or just use {{throw new YarnRuntimeException()}} everywhere. I prefer the latter.
- {{QueueMaxContainerAllocationValidator.validate}}: I would use this kind of message instead: "Invalid queue resource allocation, it does not allowed to override " + MAX_CONTAINER_RESOURCES + " for the root queue!"
- {{QueueMaxContainerAllocationValidator.validate}}: Logging maxMem and maxCores on INFO level is unnecessary. I would not log these at all, even not on DEBUG level as it does not hold any meaningful information for the users like this.
- {{QueueMaxContainerAllocationValidator.checkContainerResources}}: Same as above, remove the logged queueMem and queueCores log statements.
- {{AllocationConfiguration.queueMaxContainerResourcesMap}}: Please add comments about what is this field for, as we have comments for other fields as well.
{{FSLeafQueue.getMaximumResourceCapability // FsParentQueue.getMaximumResourceCapability}}: I accidentally noticed there's a space missing between the "if" and the parentheses.
- {{TestQueueMaxContainerAllocationValidator}}: I think the convention is to use method names like {{testXXX}} so {{tooHighMemoryMaxContainerAllocationTest}} should change to {{testTooHighMemoryMaxContainerAllocation}}. In addition, I would change the name to {{testMaxContainerAllocationWithTooHighMemory}} and the rest of the methods similarly.
- {{TestQueueMaxContainerAllocationValidator}}: Please don't use {{QueueMaxContainerAllocationValidator.createExceptionText}} in the tests, as if the production code generates the text in a wrong format, then this test won't fail. I would simply use Strings here to assert the message.
- {{TestFairScheduler}}: Once again, the convention for method names is testXXX.
- In the {{FairScheduler.md}} documentation, I would replace "This property is invalid for root queue." with "This property cannot be defined for the root queue"

Please fix the lines longer than 80 chars, at least I saw one occurence in {{FairSchedulerTestBase}} and {{TestFairScheduler}}.

Thanks!


was (Author: snemeth):
Hi [~bsteinbach]!
Thanks for the patch. This is high quality code here.

I noticed a couple of things: 
- {{AllocationFileQueueParser: MAX_CONTAINER_RESOURCES}} could be package-private (without any modifier)
- {{QueueMaxContainerAllocationValidator.createExceptionText}}: please use {{String.format()}} instead of concatenating the parts of the string.
- {{QueueMaxContainerAllocationValidator}}: you used the method {{throwException}} 2 times, and you also used {{throw new YarnRuntimeException}} as is. I think you should either use the method for all 3 invocations or just use {{throw new YarnRuntimeException()}} everywhere. I prefer the latter.
- {{QueueMaxContainerAllocationValidator.validate}}: I would use this kind of message instead: "Invalid queue resource allocation, it does not allowed to override " + MAX_CONTAINER_RESOURCES + " for the root queue!"
- {{QueueMaxContainerAllocationValidator.validate}}: Logging maxMem and maxCores on INFO level is unnecessary. I would not log these at all, even not on DEBUG level as it does not hold any meaningful information for the users like this.
- {{QueueMaxContainerAllocationValidator.checkContainerResources}}: Same as above, remove the logged queueMem and queueCores log statements.
- {{AllocationConfiguration.queueMaxContainerResourcesMap}}: Please add comments about what is this field for, as we have comments for other fields as well.
{{FSLeafQueue.getMaximumResourceCapability // FsParentQueue.getMaximumResourceCapability}}: I accidentally noticed there's a space missing between the "if" and the parentheses.
- {{TestQueueMaxContainerAllocationValidator}}: I think the convention is to use method names like {{testXXX}} so {{tooHighMemoryMaxContainerAllocationTest}} should change to {{testTooHighMemoryMaxContainerAllocation}}. In addition, I would change the name to {{testMaxContainerAllocationWithTooHighMemory}} and the rest of the methods similarly.
- {{TestQueueMaxContainerAllocationValidator}}: Please don't use {{QueueMaxContainerAllocationValidator.createExceptionText}} in the tests, as if the production code generates the text in a wrong format, then this test won't fail. I would simply use Strings here to assert the message.
- {{TestFairScheduler}}: Once again, the convention for method names is testXXX.
- In the {{FairScheduler.md}} documentation, I would replace "This property is invalid for root queue." with "This property cannot be defined for the root queue"

Please fix the lines longer than 80 chars, at least I saw one occurence in {{FairSchedulerTestBase}} and {{TestFairScheduler}}.

> Limit container sizes per queue in FairScheduler
> ------------------------------------------------
>
>                 Key: YARN-8468
>                 URL: https://issues.apache.org/jira/browse/YARN-8468
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Antal Bálint Steinbach
>            Assignee: Antal Bálint Steinbach
>            Priority: Critical
>              Labels: patch
>         Attachments: YARN-8468.000.patch, YARN-8468.001.patch, YARN-8468.002.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org