Posted to yarn-issues@hadoop.apache.org by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org> on 2016/04/12 01:33:25 UTC

[jira] [Commented] (YARN-3452) Bogus token usernames cause many invalid group lookups

    [ https://issues.apache.org/jira/browse/YARN-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236192#comment-15236192 ] 

Vinod Kumar Vavilapalli commented on YARN-3452:
-----------------------------------------------

Old JIRA.

bq. However YARN really should not be using bogus users on tokens anyway in case the RPC layer (or other non-YARN systems) try to do something with those users like HADOOP-10650 did.
bq. if someone else tries to do something with the ugi assuming it actually was a valid user.
They aren't really bogus. For many of these calls, we need them to identify the incoming security context / identity when per-app or per-container tokens are used. Server logging / audit logs can also depend on this for operations where the identifier should be the app, container, etc.

Even though our core layer is named UserGroupInformation, in many parts of YARN and MapReduce (other than application submission) it is used as a way of propagating "IdentityInformation". Arguably, the server-side code could simply look at the incoming tokens, find the incoming ID, and ignore the user name altogether. On the flip side, the service-level authorization layer (hadoop-policy.xml) and similar pieces are obviously wired into it as system-level users (HADOOP-10650 being a symptom), so I agree with you that it is kind of disconnected.
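
As a rough illustration of that alternative (the server deriving the caller's identity from the token identifier rather than from a user name), here is a minimal sketch. It assumes a simplified token model: the names TokenId, Caller and resolve, and the "NMToken" / "ClientToken" labels, are hypothetical and are not Hadoop APIs. The audit identity is always kept, while a user name for group lookup is only exposed when the token carries a real user.

```java
import java.util.Optional;

public class IdentitySketch {

    /** Hypothetical stand-in for a token identifier; NOT Hadoop's TokenIdentifier API. */
    static final class TokenId {
        final String kind;      // illustrative label, e.g. "NMToken" or "ClientToken"
        final String subject;   // application attempt ID, container ID, or a user name
        final boolean realUser; // true only when subject names an actual system user

        TokenId(String kind, String subject, boolean realUser) {
            this.kind = kind;
            this.subject = subject;
            this.realUser = realUser;
        }
    }

    /** What the server keeps about the caller after authentication (assumed shape). */
    static final class Caller {
        final String auditId;                   // always present; used for logging / audit
        final Optional<String> groupLookupUser; // present only for real users

        Caller(String auditId, Optional<String> groupLookupUser) {
            this.auditId = auditId;
            this.groupLookupUser = groupLookupUser;
        }
    }

    /** Derive the caller from the token identifier instead of from a user name. */
    static Caller resolve(TokenId id) {
        Optional<String> user = id.realUser ? Optional.of(id.subject) : Optional.empty();
        return new Caller(id.kind + ":" + id.subject, user);
    }

    public static void main(String[] args) {
        Caller nm = resolve(new TokenId("NMToken", "appattempt_1_0001_000001", false));
        Caller client = resolve(new TokenId("ClientToken", "alice", true));
        System.out.println(nm.auditId + " groupLookup=" + nm.groupLookupUser.isPresent());
        System.out.println(client.auditId + " groupLookup=" + client.groupLookupUser.isPresent());
    }
}
```

In this sketch, a per-app or per-container token still yields an identity for audit logs, but nothing downstream is handed a name that it might try to resolve to groups.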

Most of this code goes all the way back to when I originally implemented security for YARN, and I borrowed this way of doing things strictly from how JobTokens were done in Hadoop 1.x MapReduce. I doubt we can change this now: we would have to change every API that depends on this, and their usages, to understand both the user name and the specific identifier (like the Application ID).

Given that HADOOP-12413 took care of the invalid group lookups, we are good for now. Changing the usage of UGI to only use real Kerberos names is likely going to be a huge change in YARN / MR.


> Bogus token usernames cause many invalid group lookups
> ------------------------------------------------------
>
>                 Key: YARN-3452
>                 URL: https://issues.apache.org/jira/browse/YARN-3452
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: security
>            Reporter: Jason Lowe
>         Attachments: tactical_defense.patch
>
>
> YARN uses a number of bogus usernames for tokens, like application attempt IDs for NM tokens or even the hardcoded "testing" for the container localizer token.  These tokens cause the RPC layer to do group lookups on these bogus usernames which will never succeed but can take a long time to perform.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)