Posted to user-zh@flink.apache.org by hss <10...@qq.com> on 2019/12/10 03:03:17 UTC

HDFS_DELEGATION_TOKEN expires automatically

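Hi all,

Our Hadoop cluster has Kerberos authentication enabled, and we submit jobs in per-job mode with Flink on YARN. Once a job has been running for more than seven days, the HDFS_DELEGATION_TOKEN expires automatically and checkpoints fail. Has anyone run into this problem?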

2019-12-02 00:00:00.283 ERROR org.apache.flink.yarn.YarnResourceManager - Could not start TaskManager in container container_e39_1563434037485_0606_01_552751.
 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for BDATA_UME_ADM: HDFS_DELEGATION_TOKEN owner=BDATA_UME_ADM@TRAVELSKY.BDP.COM, renewer=yarn, realUser=, issueDate=1574414126899, maxDate=1575018926899, sequenceNumber=800, masterKeyId=225) can't be found in cache
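
The issueDate and maxDate fields in the message above are milliseconds since the Unix epoch, so the token's maximum lifetime can be read directly from the log. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative only, and the values are copied from the log line.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the InvalidToken message (milliseconds since the Unix epoch).
ISSUE_DATE_MS = 1574414126899
MAX_DATE_MS = 1575018926899

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Maximum lifetime of the token: the gap between issue time and maxDate.
lifetime = timedelta(milliseconds=MAX_DATE_MS - ISSUE_DATE_MS)

print("issued  :", from_epoch_ms(ISSUE_DATE_MS).isoformat())
print("maxDate :", from_epoch_ms(MAX_DATE_MS).isoformat())
print("lifetime:", lifetime)
```

The difference is exactly seven days, which matches the "more than seven days" symptom. The token reached its maxDate on 2019-11-29 (UTC), a few days before the 2019-12-02 error, so a TaskManager container started after that point presents a token the NameNode no longer holds.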

Re: HDFS_DELEGATION_TOKEN expires automatically

Posted by Paul Lam <pa...@gmail.com>.
Hi,

You need to ship the keytab to the cluster along with the job. See the descriptions of the two options security.kerberos.login.principal and security.kerberos.login.keytab [1].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#kerberos-based-security

Best,
Paul Lam
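
For illustration, the two options named in the reply would typically be set in flink-conf.yaml roughly as follows. This is a sketch only: the keytab path and the principal are placeholders that do not come from this thread, and must be replaced with values valid for your own cluster.

```yaml
# flink-conf.yaml (fragment)
# Placeholder path: absolute path to a keytab readable by the submitting client.
security.kerberos.login.keytab: /path/to/flink-user.keytab
# Placeholder principal: must match an entry in the keytab above.
security.kerberos.login.principal: flink-user@EXAMPLE.COM
```

With a keytab available, the job can log in again from the keytab rather than relying only on a delegation token obtained at submission time.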
