Posted to issues@spark.apache.org by "Mohammad Islam (Jira)" <ji...@apache.org> on 2020/05/25 08:15:00 UTC
[jira] [Created] (SPARK-31812) Spark to support the auto cancelation of delegation token when an Application completes
Mohammad Islam created SPARK-31812:
--------------------------------------
Summary: Spark to support the auto cancelation of delegation token when an Application completes
Key: SPARK-31812
URL: https://issues.apache.org/jira/browse/SPARK-31812
Project: Spark
Issue Type: Improvement
Components: Spark Submit
Affects Versions: 2.4.5
Reporter: Mohammad Islam
Fix For: 2.4.7
*Context* :
YARN provides a client API, [setCancelTokensWhenComplete|http://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-api/apidocs/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.html#setCancelTokensWhenComplete(boolean)], to manage the delegation token (DT) lifecycle. By default, YARN [cancels the DT|https://github.com/apache/hadoop/blob/8f78aeb2500011e568929b585ed5b0987355f88d/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto#L513] when the application finishes. However, the user can override this so that the DT is NOT canceled after the application completes. In some cases this is required to reduce the HDFS/KMS memory footprint by lowering the number of outstanding DTs.
MR and Tez already allow this through the client configs _mapreduce.job.complete.cancel.delegation.tokens_ and _tez.cancel.delegation.tokens.on.completion_, respectively.
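To illustrate how such a client config could feed the YARN flag, here is a minimal Java sketch. It is not taken from the MR, Tez, or Spark code: the class name CancelTokenSetting and its methods are hypothetical, a plain Map stands in for the client configuration, and a Consumer<Boolean> stands in for ApplicationSubmissionContext#setCancelTokensWhenComplete(boolean) so the example needs no Hadoop dependency. It assumes the setting defaults to true, matching the YARN default described above.

```java
import java.util.Map;
import java.util.function.Consumer;

/**
 * Sketch only (hypothetical helper, not part of Hadoop or Spark):
 * resolves a "cancel delegation tokens on completion" client setting.
 */
final class CancelTokenSetting {
    // Config key quoted in the issue text for MapReduce.
    static final String MR_KEY = "mapreduce.job.complete.cancel.delegation.tokens";

    private CancelTokenSetting() {}

    /**
     * Returns the configured value for the key. Falls back to true
     * (cancel on completion) when the key is absent or is not a
     * recognizable boolean; that fallback is an assumption of this sketch.
     */
    static boolean resolve(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        if (raw == null) {
            return true;
        }
        String value = raw.trim();
        if (value.equalsIgnoreCase("true")) {
            return true;
        }
        if (value.equalsIgnoreCase("false")) {
            return false;
        }
        return true;
    }

    /**
     * Resolves the setting and passes it to the given setter, which stands in
     * for ApplicationSubmissionContext#setCancelTokensWhenComplete(boolean).
     */
    static boolean apply(Map<String, String> conf, String key, Consumer<Boolean> setter) {
        boolean cancel = resolve(conf, key);
        setter.accept(cancel);
        return cancel;
    }
}
```

In a real YARN client, the setter argument would presumably be a method reference on the application's ApplicationSubmissionContext, and the Map would be replaced by the framework's own configuration object.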
*Proposal* :
Spark currently does not support this override. However, we may need to manage the DT lifecycle outside the YARN/Spark framework, so Spark should expose an equivalent option.