Posted to common-issues@hadoop.apache.org by "Yicong Cai (Jira)" <ji...@apache.org> on 2019/08/20 09:07:00 UTC

[jira] [Commented] (HADOOP-16521) Subject has a contradiction between proxy user and real user

    [ https://issues.apache.org/jira/browse/HADOOP-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911128#comment-16911128 ] 

Yicong Cai commented on HADOOP-16521:
-------------------------------------

The following usage scenario shows the problem:

The Hadoop Archive Logs tool uses the Distributed Shell tool to submit tasks to the YARN cluster.
The Distributed Shell client collects HDFS tokens for the task container, which therefore holds only HDFS/YARN token information and no Kerberos authentication information.
The Hadoop Archive Logs runner executes inside the container. In the runner, the proxy user is built as follows:

{code:java}
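// The login user is built from the token file, so it holds tokens only.
UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
// The proxy UGI wraps loginUser as its real user.
UserGroupInformation proxyUser = UserGroupInformation.createProxyUser(proxyUserName, loginUser);
proxyUser.doAs(new PrivilegedExceptionAction<Integer>() {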
  @Override
  public Integer run() throws Exception {
    //......
    FileSystem fs = p.getFileSystem(getConf());
    //......
    return 1;
  }
});
{code}
 

First part:

UserGroupInformation.getLoginUser() logs in using HADOOP_TOKEN_FILE_LOCATION, so the resulting UGI includes only the HDFS HA token.
loginUser is the RealUser, and its Credentials contain the token authentication information.

UserGroupInformation.createProxyUser(proxyUserName, loginUser)
creates a new UserGroupInformation instance with a new Subject instance. The Subject includes User and RealUser, and its Credentials are empty.
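
The difference between the two subjects can be illustrated with plain JDK javax.security.auth.Subject objects. The sketch below is not Hadoop code: NamedPrincipal, RealUserPrincipal, the helper methods, and the string that stands in for a token are hypothetical names introduced only to mirror the layout described above, in which the proxy subject references the real user but carries no credentials of its own.

```java
import java.security.Principal;
import javax.security.auth.Subject;

// Hypothetical stand-ins; Hadoop's own principal classes are not used here.
class ProxySubjectSketch {

    /** Principal carrying the effective user name. */
    static final class NamedPrincipal implements Principal {
        private final String name;
        NamedPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    /** Principal that points back to the real user's subject. */
    static final class RealUserPrincipal implements Principal {
        private final String name;
        private final Subject realSubject;
        RealUserPrincipal(String name, Subject realSubject) {
            this.name = name;
            this.realSubject = realSubject;
        }
        @Override public String getName() { return name; }
        Subject getRealSubject() { return realSubject; }
    }

    /** Real user: the token is stored in this subject's private credentials. */
    static Subject loginSubject(String realName, String token) {
        Subject subject = new Subject();
        subject.getPrincipals().add(new NamedPrincipal(realName));
        subject.getPrivateCredentials().add(token);
        return subject;
    }

    /** Proxy user: a new subject that references the real one; no credentials are copied. */
    static Subject proxySubject(String proxyName, Subject real) {
        Subject subject = new Subject();
        subject.getPrincipals().add(new NamedPrincipal(proxyName));
        String realName =
            real.getPrincipals(NamedPrincipal.class).iterator().next().getName();
        subject.getPrincipals().add(new RealUserPrincipal(realName, real));
        return subject;
    }

    public static void main(String[] args) {
        Subject real = loginSubject("yarn", "ha-hdfs:ns1"); // illustrative values
        Subject proxy = proxySubject("alice", real);
        System.out.println("real credentials:  " + real.getPrivateCredentials(String.class));
        System.out.println("proxy credentials: " + proxy.getPrivateCredentials(String.class));
    }
}
```

In this model, any code that reads credentials from the proxy subject sees an empty set, while the token remains reachable only through the real user's subject.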


Second part:

DFSClient creates a new UGI instance using UserGroupInformation.getCurrentUser(); its subject is that of proxyUser (Credentials are empty).
ConfiguredFailoverProxyProvider uses the DFSClient's UGI instance to obtain the logical token and convert it into tokens for the corresponding NameNode IP addresses
by calling HAUtilClient.cloneDelegationTokenForLogicalUri(ugi, uri, addressesOfNns).
However, this UGI is the proxyUser and contains no token information, so the conversion fails here.

 

Third part:

When setupIOstreams makes a NameNode connection, it uses the RealUser.
However, because the conversion in the second part was attempted on the proxy user's UGI, the RealUser has no token for the NameNode IP address. As a result, getServerToken fails in createSaslClient, and other authentication methods have to be sought.
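
The mismatch between the second and third parts can be sketched in a few lines. This is a simplified model under stated assumptions, not the Hadoop implementation: a credentials store is reduced to a map from service name to token, and cloneLogicalToken and selectToken are hypothetical helpers that stand in for the token cloning and token selection steps. The service name and addresses are illustrative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a credentials store is a map of service name -> token.
class TokenLookupSketch {

    /**
     * Copies the token stored under the logical service name to one entry per
     * NameNode address. Returns the number of entries added.
     */
    static int cloneLogicalToken(Map<String, String> creds, String logicalService,
                                 List<String> nnAddresses) {
        String token = creds.get(logicalService);
        if (token == null) {
            return 0; // no logical token in this store, so nothing is cloned
        }
        int added = 0;
        for (String address : nnAddresses) {
            creds.put(address, token);
            added++;
        }
        return added;
    }

    /** Looks up the token for a concrete NameNode address; null if absent. */
    static String selectToken(Map<String, String> creds, String nnAddress) {
        return creds.get(nnAddress);
    }

    public static void main(String[] args) {
        List<String> nns = Arrays.asList("10.0.0.1:8020", "10.0.0.2:8020"); // illustrative
        Map<String, String> realCreds = new HashMap<>();
        realCreds.put("ha-hdfs:ns1", "token-1");          // real user holds the logical token
        Map<String, String> proxyCreds = new HashMap<>(); // proxy user's store is empty

        // Second part: cloning runs against the proxy user's empty store.
        int cloned = cloneLogicalToken(proxyCreds, "ha-hdfs:ns1", nns);
        // Third part: the connection looks in the real user's store by address.
        String found = selectToken(realCreds, "10.0.0.1:8020");

        System.out.println("entries cloned on proxy store: " + cloned);
        System.out.println("token found on real store: " + found);
    }
}
```

In this model the lookup succeeds only when cloning and selection operate on the same store, which is the inconsistency the scenario above runs into.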

> Subject has a contradiction between proxy user and real user
> ------------------------------------------------------------
>
>                 Key: HADOOP-16521
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16521
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yicong Cai
>            Priority: Major
>
> In UserGroupInformation#loginUserFromSubject, if you specify the proxy user with HADOOP_PROXY_USER and a proxy UGI instance is created, the valid Credentials are stored in the User's PrivateCredentials. The UGI layout is as follows:
>  
> {code:java}
>  proxyUGI
>  |
>  |--subject 1
>  | |
>  | |--principals
>  | | |
>  | | |--user
>  | | |
>  | |  --real user
>  | |
>  |  --privCredentials(all cred)
>  |
>   --proxy user
> {code}
>  
> If you first log in as the real user and then use UserGroupInformation#createProxyUser to create a proxy UGI, the valid Credentials are stored in the PrivateCredentials of RealUser's subject. The UGI layout is as follows:
>  
> {code:java}
> proxyUGI
>  |
>  |--subject 1
>  | |
>  | |--principals
>  | | |
>  | | |--user
>  | | |
>  | |  --real user
>  | |    |
>  | |     --subject 2
>  | |       |
>  | |        --privCredentials(all cred)
>  | |
>  |  --privCredentials(empty)
>  |
>   --proxy user
> {code}
>  
> The HDFS FileSystem uses the proxy user to perform token-related operations.
> However, the RPC client connection uses the token in RealUser for SaslRpcClient#saslConnect.
> So the main contradiction is: should the proxy user's actual Credentials be placed in the proxy UGI's subject or in RealUser's subject?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org