You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Wilfred Spiegelenburg (JIRA)" <ji...@apache.org> on 2019/03/18 14:16:00 UTC

[jira] [Comment Edited] (YARN-8967) Change FairScheduler to use PlacementRule interface

    [ https://issues.apache.org/jira/browse/YARN-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795049#comment-16795049 ] 

Wilfred Spiegelenburg edited comment on YARN-8967 at 3/18/19 2:15 PM:
----------------------------------------------------------------------

Thank you for the review [~yufeigu]

AllocationFileLoaderService file:
1) yes it did clean up nicely
2) The class is marked as {{@Unstable}} that should cover the change. Leaving the old constructors in could allow you to create a new {{AllocationFileLoaderService}} without a scheduler reference. That would cause a NPE on scheduler init and every single time the reload thread would run, leaving the RM in a failed state. I don't think it would be wise to leave them in.  _Based on all this I do think I need to file a follow up jira to fix the Hive SHIM that uses the policy at the moment and move that to the new code in a backward compatible way._
3) fixed that
4) fixed that
5) The difference between recovery and normal is just two if statements: in the first we ignore and empty context on recovery and the second one is to not generate an event on recovery. Moving the code out would not help. The checks are on opposite sides of the method and simple.
6) We could still have an empty queue that was why I left it. I just noticed that that case would be caught by the {{getLeafQueue}} so we should be OK with removing.
7) fixed that, it should have been removed

QueuePlacementPolicy file:
1) I have chosen to use the utility class solution and clean up a bit more. Keeping the QueuePlacementPolicy around in the allocation does not really help as the rules are really only relevant in the QueuePlacementManager in the new setup. There is no logic beside the rule list which is not 1:1 with the config that we could keep around.
2) fixed the reference (I used javadoc as there was nothing for other comments, now it is just a plain comment)
3) removed the comment and code
4) fixed
5) the tests look really similar but they are not. They test slight variations: the first two checks make sure the specified rule and create user rule trigger correctly. The last two make sure that the specified rule triggers but not the user rule and that the default rule does the catch it correctly.
6) fixed that, I left it at first with a view on possible extension later with other bits. I now moved the parent create code out and left the loop for elements which clears things up.
7) added a RuleMap class based on the suggestion
8) I think it is better to file a follow up jira as the same has happened in all new rule classes. We must have overlooked them in the previous jira when we did the cleanup. I checked and the exception is logged in the client service so it can be done.




was (Author: wilfreds):
Thank you for the review [~yufeigu]

1) yes it did clean up nicely
2) The class is marked as {{@ Unstable}} that should cover the change. Leaving the old constructors in could allow you to create a new {{AllocationFileLoaderService}} without a scheduler reference. That would cause a NPE on scheduler init and every single time the reload thread would run, leaving the RM in a failed state. I don't think it would be wise to leave them in. 
Based on all this I do think I need to file a follow up jira to fix the Hive SHIM that uses the policy at the moment and move that to the new code in a backward compatible way.
3) fixed that
4) fixed that
5) The difference between recovery and normal is just two if statements: in the first we ignore and empty context on recovery and the second one is to not generate an event on recovery. Moving the code out would not help. The checks are on opposite sides of the method and simple.
6) We could still have an empty queue that was why I left it. I just noticed that that case would be caught by the {{getLeafQueue}} so we should be OK with removing.
7) fixed that, it should have been removed

1) I have chosen to use the utility class solution and clean up a bit more. Keeping the QueuePlacementPolicy around in the allocation does not really help as the rules are really only relevant in the QueuePlacementManager in the new setup. There is no logic beside the rule list which is not 1:1 with the config that we could keep around.
2) fixed the reference (I used javadoc as there was nothing for other comments, now it is just a plain comment)
3) removed the comment and code
4) fixed
5) the tests look really similar but they are not. They test slight variations: the first two checks make sure the specified rule and create user rule trigger correctly. The last two make sure that the specified rule triggers but not the user rule and that the default rule does the catch it correctly.
6) fixed that, I left it at first with a view on possible extension later with other bits. I now moved the parent create code out and left the loop for elements which clears things up.
7) added a RuleMap class based on the suggestion
8) I think it is better to file a follow up jira as the same has happened in all new rule classes. We must have overlooked them in the previous jira when we did the cleanup. I checked and the exception is logged in the client service so it can be done.



> Change FairScheduler to use PlacementRule interface
> ---------------------------------------------------
>
>                 Key: YARN-8967
>                 URL: https://issues.apache.org/jira/browse/YARN-8967
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler, fairscheduler
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>         Attachments: YARN-8967.001.patch, YARN-8967.002.patch, YARN-8967.003.patch, YARN-8967.004.patch, YARN-8967.005.patch, YARN-8967.006.patch, YARN-8967.007.patch, YARN-8967.008.patch
>
>
> The PlacementRule interface was introduced to be used by all schedulers as per YARN-3635. The CapacityScheduler is using it but the FairScheduler is not and is using its own rule definition.
> YARN-8948 cleans up the implementation and removes the CS references which should allow this change to go through.
> This would be the first step in using one placement rule engine for both schedulers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org