Posted to common-dev@hadoop.apache.org by "Xiaoqiao He (Jira)" <ji...@apache.org> on 2023/09/03 16:22:00 UTC
[jira] [Created] (HADOOP-18881) ZKDTSM could be stuck when meet znode version overflow
Xiaoqiao He created HADOOP-18881:
------------------------------------
Summary: ZKDTSM could be stuck when meet znode version overflow
Key: HADOOP-18881
URL: https://issues.apache.org/jira/browse/HADOOP-18881
Project: Hadoop Common
Issue Type: Bug
Reporter: Xiaoqiao He
Assignee: Xiaoqiao He
ZKDTSM can get stuck when the version of the znode /zkdtsm/ZKDTSMRoot/ZKDTSMSeqNumRoot overflows the int range (2147483647). It cannot recover even after restarting the application, which may be YARN Router, DFS Router, KMS, or any other module that uses ZooKeeper to manage tokens. One workaround (not very smooth) is to delete this znode first and then restart the service.
The root cause is the following code snippet, together with the fact that Curator is not compatible with version overflow. I have tried to give a draft improvement at CURATOR-688. Any discussion is welcome on whether we could resolve it smoothly on the Hadoop side.
org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager#incrSharedCount
{code:java}
private int incrSharedCount(SharedCount sharedCount, int batchSize)
    throws Exception {
  while (true) {
    // Loop until we successfully increment the counter
    VersionedValue<Integer> versionedValue = sharedCount.getVersionedValue();
    if (sharedCount.trySetCount(
        versionedValue, versionedValue.getValue() + batchSize)) {
      return versionedValue.getValue();
    }
  }
}
{code}
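To make the overflow concrete, here is a minimal, self-contained sketch of how a 32-bit version counter behaves at 2147483647. It does not use Curator or ZooKeeper. The class VersionOverflowDemo and its methods are hypothetical names introduced only for illustration, and the "larger means newer" check is an assumed simplification of how a client might compare versions. This message does not confirm that Curator does exactly this; see CURATOR-688 for the real discussion.

```java
public class VersionOverflowDemo {

  /** Simulates a 32-bit version counter: Java int addition wraps silently. */
  static int nextVersion(int version) {
    return version + 1;
  }

  /**
   * Hypothetical staleness check that assumes versions only ever grow.
   * This is an illustrative assumption, not code taken from Curator.
   */
  static boolean isNewerByOrdering(int cachedVersion, int observedVersion) {
    return observedVersion > cachedVersion;
  }

  /** Alternative check that treats any different version as a change. */
  static boolean isNewerByInequality(int cachedVersion, int observedVersion) {
    return observedVersion != cachedVersion;
  }

  public static void main(String[] args) {
    int cached = Integer.MAX_VALUE;      // 2147483647
    int observed = nextVersion(cached);  // wraps around to a negative value

    System.out.println(observed);                              // -2147483648
    System.out.println(isNewerByOrdering(cached, observed));   // false
    System.out.println(isNewerByInequality(cached, observed)); // true
  }
}
```

If a client relied on an ordering check like the one sketched here, it would never accept the wrapped version as newer, so a retry loop with no exit condition, such as the while (true) in incrSharedCount, would keep retrying with a stale version and never return.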