Posted to dev@twill.apache.org by "Alvin Wang (JIRA)" <ji...@apache.org> on 2014/10/28 22:43:34 UTC

[jira] [Commented] (TWILL-106) HDFS delegation token is not being refreshed properly

    [ https://issues.apache.org/jira/browse/TWILL-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187520#comment-14187520 ] 

Alvin Wang commented on TWILL-106:
----------------------------------

The dfs.namenode.delegation.key.update-interval in the test cluster was left at its default of 86400000 ms (1 day), which matches how long the CDAP transaction service runs before it goes down with this error:

{code}
Aborting transaction manager due to: Snapshot (timestamp 1414037562088) failed due to: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
{code}

No other property was set to 1 day, so it seems likely that the error is related to dfs.namenode.delegation.key.update-interval. I also remember that we made a fix to Twill for delegation key expiration, and that I was able to keep the CDAP transaction service running for longer than a day. Given that, this error could be caused by a configuration difference.
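
As a quick sanity check on the numbers above, here is a minimal sketch that converts the 86400000 value and the snapshot timestamp from the exception into readable units. It assumes both values are in milliseconds; the helper names (interval_from_ms, utc_from_epoch_ms) are made up for illustration and are not part of Twill, CDAP, or Hadoop.

```python
from datetime import datetime, timedelta, timezone

# Value reported for dfs.namenode.delegation.key.update-interval
# (assumed to be in milliseconds).
KEY_UPDATE_INTERVAL_MS = 86_400_000

# Snapshot timestamp from the exception message
# (assumed to be milliseconds since the Unix epoch).
SNAPSHOT_TIMESTAMP_MS = 1_414_037_562_088


def interval_from_ms(ms: int) -> timedelta:
    """Hypothetical helper: turn a millisecond interval into a timedelta."""
    return timedelta(milliseconds=ms)


def utc_from_epoch_ms(ms: int) -> datetime:
    """Hypothetical helper: turn epoch milliseconds into a UTC datetime."""
    # Adding a timedelta to the epoch keeps the millisecond part exact,
    # which a float-based conversion would not guarantee.
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ms)


if __name__ == "__main__":
    print("key update interval:", interval_from_ms(KEY_UPDATE_INTERVAL_MS))
    print("snapshot failed at: ", utc_from_epoch_ms(SNAPSHOT_TIMESTAMP_MS).isoformat())
```

The interval comes out as exactly one day, and the snapshot timestamp falls at 04:12:42 UTC on 2014-10-23, the same second as the ERROR log line in the issue description.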

> HDFS delegation token is not being refreshed properly
> -----------------------------------------------------
>
>                 Key: TWILL-106
>                 URL: https://issues.apache.org/jira/browse/TWILL-106
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.4.0-incubating
>            Reporter: Poorna Chandra
>
> We have a Twill app that runs in a secure Hadoop cluster. The app starts up fine and runs for a day. The logs show that the secure store was updated regularly. However, after a day I see exceptions that say "token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache".
> Exception:
> -------------
> 2014-10-23T04:12:42,101Z ERROR c.c.t.TransactionManager [cdap-secure120-1000.dev.continuuity.net] [tx-snapshot] TransactionManager:abortService(TransactionManager.java:594) - Aborting transaction manager due to: Snapshot (timestamp 1414037562088) failed due to: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
> org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 4287 for yarn) can't be found in cache
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)