Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2014/09/09 19:13:29 UTC

[jira] [Commented] (HADOOP-10523) Hadoop services (such as RM, NN and JHS) throw confusing exception during token auto-cancelation

    [ https://issues.apache.org/jira/browse/HADOOP-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127240#comment-14127240 ] 

Daryn Sharp commented on HADOOP-10523:
--------------------------------------

Re-reading the description, I'm a bit confused by:
bq. the system (such as RM, NN and JHS) also periodically tries to cancel the same token.  During the second cancel (originated by RM/NN/JHS).
and
bq. So this will be added as a WARN message in the caller of cancelToken(). It includes RM, JHS and NN. right?

Perhaps I'm misunderstanding your use of "originated by".  Token issuers like the NN and JHS do not try to cancel tokens, at least not each other's.  Each issuer's secret manager periodically purges its own tokens and won't ever emit the exception trace in the description.  A client, which is what the RM is while renewing on behalf of the job, will receive an exception for an already cancelled token if something else cancelled it first.
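
To make the scenario concrete, here is a minimal, self-contained Java sketch of a second cancel on an already-cancelled token, with the caller logging a one-line warning instead of a stack trace (the approach the description proposes). It does not use Hadoop's classes: SimpleSecretManager, InvalidTokenException and cancelQuietly are hypothetical stand-ins invented for illustration, and the real AbstractDelegationTokenSecretManager does considerably more (renewer checks, persistence, and so on).

```java
import java.util.HashSet;
import java.util.Set;

public class TokenCancelSketch {

    /** Hypothetical stand-in for Hadoop's SecretManager.InvalidToken. */
    static class InvalidTokenException extends Exception {
        InvalidTokenException(String message) {
            super(message);
        }
    }

    /** Hypothetical, heavily simplified issuer-side token store. */
    static class SimpleSecretManager {
        private final Set<String> liveTokens = new HashSet<>();

        void issue(String tokenId) {
            liveTokens.add(tokenId);
        }

        /** Cancels a token; fails if it was already cancelled or purged. */
        void cancelToken(String tokenId) throws InvalidTokenException {
            if (!liveTokens.remove(tokenId)) {
                throw new InvalidTokenException("Token not found");
            }
        }
    }

    /**
     * Caller-side wrapper (an assumption, not Hadoop API): treats
     * "token not found" as a benign condition and logs one warning line.
     * Returns true only if this call actually cancelled the token.
     */
    static boolean cancelQuietly(SimpleSecretManager manager, String tokenId) {
        try {
            manager.cancelToken(tokenId);
            return true;
        } catch (InvalidTokenException e) {
            System.err.println("WARN: token " + tokenId
                    + " was already cancelled or expired: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        SimpleSecretManager manager = new SimpleSecretManager();
        manager.issue("token-1");

        // First cancel (for example, by the user) succeeds.
        System.out.println("first cancel: " + cancelQuietly(manager, "token-1"));
        // Second cancel (for example, by a renewer acting for the job) finds nothing.
        System.out.println("second cancel: " + cancelQuietly(manager, "token-1"));
    }
}
```

In this sketch the issuer still rejects the second cancel; only the caller's handling changes, which is why it matters which process the log snippet came from.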

I should have asked this earlier: where is the log snippet from?  The RM?  A task?  Something else?  Can you please check whether the latest 2.x exhibits the same logging behavior?

> Hadoop services (such as RM, NN and JHS) throw confusing exception during token auto-cancelation 
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10523
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.3.0
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>             Fix For: 2.6.0
>
>         Attachments: HADOOP-10523.1.patch
>
>
> When a user explicitly cancels the token, the system (such as RM, NN and JHS) also periodically tries to cancel the same token. During the second cancel (originated by RM/NN/JHS), Hadoop processes throw the following error/exception in the log file. Although the exception is harmless, it causes a lot of confusion, and developers spend a lot of time investigating it.
> This JIRA proposes checking that the token is still available (not already cancelled) before attempting to cancel it, and replacing this exception with a proper warning message.
> {noformat}
> 2014-04-15 01:41:14,686 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token cancelation requested for identifier:: owner=<FULL_PRINCIPAL>.linkedin.com@REALM, renewer=yarn, realUser=, issueDate=1397525405921, maxDate=1398130205921, sequenceNumber=1, masterKeyId=2
> 2014-04-15 01:41:14,688 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/HOST@<REALM> (auth:KERBEROS) cause:org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not found
> 2014-04-15 01:41:14,689 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 10020, call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.cancelDelegationToken from 172.20.128.42:2783 Call#37759 Retry#0: error: org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not found
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not found
>         at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.cancelToken(AbstractDelegationTokenSecretManager.java:436)
>         at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.cancelDelegationToken(HistoryClientService.java:400)
>         at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.cancelDelegationToken(MRClientProtocolPBServiceImpl.java:286)
>         at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:301)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> {noformat}


