You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2015/12/08 09:59:11 UTC

[jira] [Commented] (HADOOP-11187) NameNode - KMS communication fails after a long period of inactivity

    [ https://issues.apache.org/jira/browse/HADOOP-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046601#comment-15046601 ] 

Harsh J commented on HADOOP-11187:
----------------------------------

This doesn't cover the retries properly if the connection setup itself fails (the retries are limited to the call(…) method being executed, but the connection is setup before that:

{code}
org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:743)
	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2530)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2664)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:289)
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
	at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
	at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2381)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2351)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
	at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
	at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
	at org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:266)
	at org.apache.hadoop.crypto.key.kms.ValueQueue.getNext(ValueQueue.java:226)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:738)
	... 16 more
Caused by: java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:488)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.access$100(KMSClientProvider.java:83)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$EncryptedQueueRefiller.fillQueueForKey(KMSClientProvider.java:132)
	at org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:181)
	at org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:175)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
	... 24 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:306)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:482)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:477)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:477)
	... 30 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:285)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:261)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:261)
	... 40 more
{code}

> NameNode - KMS communication fails after a long period of inactivity
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11187
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11187
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>             Fix For: 2.7.0
>
>         Attachments: HADOOP-11187.1.patch, HADOOP-11187.2.patch
>
>
> As reported by [~atm] :
> The issue is due to the authentication token that the NN has to talk to the KMS is expiring, AND the signature secret provider in the KMS authentication filter is discarding the old secret after 2x the authentication token validity period.
> If the token being supplied is under 1x the validity lifetime then the token will authenticate just fine. If the token being supplied is between 1x-2x the validity lifetime, then the token can be validated but it will be expired, so a 401 will be returned to the client and it will get a new token. But if the token being supplied is greater than 2x the validity lifetime, then the KMS authentication filter will not even be able to validate the token, and will return a 403, which will cause the client to not retry authentication to the KMS.
> The KMSClientProvider needs to be modified to retry authentication even in the above case



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)