You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Ahmed Hussein (Jira)" <ji...@apache.org> on 2021/01/22 20:58:01 UTC

[jira] [Comment Edited] (YARN-10425) Replace the legacy placement engine in CS with the new one

    [ https://issues.apache.org/jira/browse/YARN-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270425#comment-17270425 ] 

Ahmed Hussein edited comment on YARN-10425 at 1/22/21, 8:57 PM:
----------------------------------------------------------------

[~shuzirra], [~pbacsko], [~snemeth], [~BilwaST]  thanks for the contribution.
I have a question about the changes introduced by this Ticket. The following code block 
 is from [CSMappingPlacementRule#L128|https://github.com/apache/hadoop/commit/567600fd80896c1c9b0db1f228368d4eb2a694a2#diff-92b5797cf7739d330364d967172e65e61a859c776d9ebe526aba03ea33039033R127] 

{code:java}
    if (groups == null) {
      //We cannot use Groups#getUserToGroupsMappingService here, because when
      //tests change the HADOOP_SECURITY_GROUP_MAPPING, Groups won't refresh its
      //cached instance of groups, so we might get a Group instance which
      //ignores the HADOOP_SECURITY_GROUP_MAPPING settings.
      groups = new Groups(conf);
    }
{code}

IIUC, the design of groups caching "{{Groups.cache}}" relies on the fact that the Groups being a singleton. Otherwise, there will be inconsistent behavior especially in classes like {{JniBasedUnixGroupsNetgroupMapping}} and {{ShellBasedUnixGroupsNetgroupMapping}}. Both mapping implementations have a second caching layer for the netgroups "{{NetgroupCache}}".
I have the following two concerns regarding an independent Groups instance in {{CSMappingPlacementRule.java}}
* It breaks the design leading to inconsistent behaviors that do not match the expected. As I mentioned, {{NetgroupCache}} contents won't be defined.
* Performance considerations. Allocating "N" instances of {{Groups}} means fetching the user's groups  "N" times. Therefore, Guava cacheLoader's refresh will be done "N" times, and so on.

Why did you decide to make that change instead of fixing the design of the unit tests?
IIUC, there is a need to fix that bug in a follow up Jira.


was (Author: ahussein):
[~shuzirra], [~pbacsko] thanks for the contribution.
I have a question about the changes introduced by this Ticket. The following code block 
 is from [CSMappingPlacementRule#L128|https://github.com/apache/hadoop/commit/567600fd80896c1c9b0db1f228368d4eb2a694a2#diff-92b5797cf7739d330364d967172e65e61a859c776d9ebe526aba03ea33039033R127] 

{code:java}
    if (groups == null) {
      //We cannot use Groups#getUserToGroupsMappingService here, because when
      //tests change the HADOOP_SECURITY_GROUP_MAPPING, Groups won't refresh its
      //cached instance of groups, so we might get a Group instance which
      //ignores the HADOOP_SECURITY_GROUP_MAPPING settings.
      groups = new Groups(conf);
    }
{code}

IIUC, the design of groups caching "{{Groups.cache}}" relies on the fact that the Groups being a singleton. Otherwise, there will be inconsistent behavior especially in classes like {{JniBasedUnixGroupsNetgroupMapping}} and {{ShellBasedUnixGroupsNetgroupMapping}}. Both mapping implementations have a second caching layer for the netgroups "{{NetgroupCache}}".
I have the following two concerns regarding an independent Groups instance in {{CSMappingPlacementRule.java}}
* It breaks the design leading to inconsistent behaviors that do not match the expected. As I mentioned, {{NetgroupCache}} contents won't be defined.
* Performance considerations. Allocating "N" instances of {{Groups}} means fetching the user's groups  "N" times. Therefore, Guava cacheLoader's refresh will be done "N" times, and so on.

Why did you decide to make that change instead of fixing the design of the unit tests?
IIUC, there is a need to fix that bug in a follow up Jira.

> Replace the legacy placement engine in CS with the new one
> ----------------------------------------------------------
>
>                 Key: YARN-10425
>                 URL: https://issues.apache.org/jira/browse/YARN-10425
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gergely Pollak
>            Assignee: Gergely Pollak
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-10425.001.patch, YARN-10425.002.patch, YARN-10425.003.patch, YARN-10425.004.patch, YARN-10425.005.patch, YARN-10425.006.patch, YARN-10425.007.patch
>
>
> Remove the UserGroupMapping and ApplicationName mapping classes, and use the new CSMappingPlacementRule instead. Also cleanup the orphan classes which are used by these classes only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org