You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by "Venkata krishnan Sowrirajan (Jira)" <ji...@apache.org> on 2023/02/21 06:21:00 UTC

[jira] [Created] (FLINK-31162) Avoid setting private tokens to AM container context when kerberos delegation token fetch is disabled

Venkata krishnan Sowrirajan created FLINK-31162:
---------------------------------------------------

             Summary: Avoid setting private tokens to AM container context when kerberos delegation token fetch is disabled
                 Key: FLINK-31162
                 URL: https://issues.apache.org/jira/browse/FLINK-31162
             Project: Flink
          Issue Type: Bug
          Components: Deployment / YARN
    Affects Versions: 1.16.1
            Reporter: Venkata krishnan Sowrirajan


In our internal env, we have enabled [Consistent Reads from HDFS Observer NameNode|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ObserverNameNode.html]. With this, some of the `ObserverReadProxyProvider` implementation clone the delegation token for HA service and mark those tokens private so that they won't be accessible through _ugi.getCredentials()._


But Flink internally uses [`currUsr.getTokens()`|https://github.com/apache/flink/blob/release-1.16.1/flink-yarn/src/main/java/org/apache/flink/yarn/Utils.java#L222] to get the current user credentials tokens to be set in AM context for submitting the YARN app to RM.

This fails with the following error:
{code:java}
Unable to add the application to the delegation token renewer.
java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: test01-ha4.abc:9000, Ident: (HDFS_DELEGATION_TOKEN token 151335106 for john)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:495)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$900(DelegationTokenRenewer.java:79)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:939)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:916)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2044)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1451)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:5348)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:733)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:1056)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:525)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:495)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1038)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1003)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:931)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1905)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2856)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1499)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1342)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy87.renewDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewDelegationToken(ClientNamenodeProtocolTranslatorPB.java:986)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy88.renewDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:761)
at org.apache.hadoop.security.token.Token.renew(Token.java:466)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:629)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:626)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1905)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:625)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:481)
... 6 more
{code}
Based on the [code comment here in HAUtilClient.java|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HAUtilClient.java#L128], it seems like the user credentials should be obtained using _ugi.getCredentials()_ instead of {_}ugi.getTokens(){_}. Also Spark seems to use _ugi.getCredentials()_ [here|https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L348] to set the credentials obtained to AM.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)