Posted to commits@spark.apache.org by sr...@apache.org on 2023/06/30 13:20:19 UTC
[spark] branch master updated: [SPARK-41599] Memory leak in FileSystem.CACHE when submitting apps to secure cluster using InProcessLauncher
This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 7971e1c6a7c [SPARK-41599] Memory leak in FileSystem.CACHE when submitting apps to secure cluster using InProcessLauncher
7971e1c6a7c is described below
commit 7971e1c6a7c074c65829c2bdfad857a33e0a7a5d
Author: Xieming LI <ri...@gmail.com>
AuthorDate: Fri Jun 30 08:20:04 2023 -0500
[SPARK-41599] Memory leak in FileSystem.CACHE when submitting apps to secure cluster using InProcessLauncher
### What changes were proposed in this pull request?
Using `FileSystem.closeAllForUGI` to close the cache to prevent memory leak.
### Why are the changes needed?
There appears to be a memory leak in `FileSystem.CACHE` when submitting apps to a secure cluster using `InProcessLauncher`.
For more detail, see [SPARK-41599](https://issues.apache.org/jira/browse/SPARK-41599)
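The leak mechanism can be illustrated with a stdlib-only sketch (all names here are hypothetical stand-ins; Hadoop's real `FileSystem.CACHE` keys entries on scheme, authority, and the UGI instance, so every fresh UGI created per submission pins a new entry until it is closed explicitly):

```scala
import scala.collection.mutable

// Hypothetical stand-in for Hadoop's FileSystem.CACHE: one entry per
// distinct user identity, never evicted unless closed explicitly.
object FsCache {
  private val cache = mutable.Map.empty[String, AnyRef]

  def get(user: String): AnyRef =
    cache.getOrElseUpdate(user, new AnyRef) // grows once per distinct user

  def closeAllForUser(user: String): Unit =
    cache.remove(user) // analogous to FileSystem.closeAllForUGI

  def size: Int = cache.size
}

@main def leakDemo(): Unit = {
  // Each in-process submission creates a fresh identity, so the cache
  // grows without bound over the lifetime of the launcher JVM...
  for (i <- 1 to 100) FsCache.get(s"proxy-user-$i")
  println(s"entries retained: ${FsCache.size}")

  // ...the fix: release the per-user entries once the submission finishes.
  for (i <- 1 to 100) FsCache.closeAllForUser(s"proxy-user-$i")
  println(s"entries after cleanup: ${FsCache.size}")
}
```

This is only a sketch of the caching pattern, not Hadoop's implementation; the real fix calls `FileSystem.closeAllForUGI(proxyUser)` in a `finally` block, as the diff below shows.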
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
I have tested the patch with my own code, which uses `InProcessLauncher`, and confirmed that the memory leak is mitigated.
(Screenshot taken 2023-06-23 showing memory usage after the fix: https://github.com/apache/spark/assets/4378066/cfdef4d3-cb43-464c-bb46-de60f3b91622)
It would be very helpful to get some feedback, and I will add test cases if required.
Closes #41692 from risyomei/fix-SPARK-41599.
Authored-by: Xieming LI <ri...@gmail.com>
Signed-off-by: Sean Owen <sr...@gmail.com>
---
core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 2 ++
.../apache/spark/deploy/security/HadoopDelegationTokenManager.scala | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 8f9477385e7..60253ed5fda 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -186,6 +186,8 @@ private[spark] class SparkSubmit extends Logging {
} else {
throw e
}
+ } finally {
+ FileSystem.closeAllForUGI(proxyUser)
}
}
} else {
diff --git a/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala b/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala
index 6ce195b6c7a..54a24927ded 100644
--- a/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala
@@ -26,6 +26,7 @@ import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
import scala.collection.mutable
import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}
import org.apache.spark.SparkConf
@@ -149,6 +150,9 @@ private[spark] class HadoopDelegationTokenManager(
creds.addAll(newTokens)
}
})
+ if(!currentUser.equals(freshUGI)) {
+ FileSystem.closeAllForUGI(freshUGI)
+ }
}
}
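The guard added in the second hunk follows a common resource-ownership pattern: per-identity state is released only when the identity was created for this operation, so the current user's cached FileSystems (which other threads may still be using) are left intact. A hedged, stdlib-only sketch of that pattern (identifiers simplified and hypothetical; the real code operates on Hadoop `UserGroupInformation`):

```scala
// Simplified stand-in for a Hadoop UGI.
final case class Ugi(name: String)

// Obtain tokens as either the current user or a freshly created proxy
// identity; `close` stands in for FileSystem.closeAllForUGI. The guard
// ensures we never close caches belonging to the long-lived current user.
def obtainTokens(currentUser: Ugi, proxyUser: Option[String])
                (close: Ugi => Unit): Unit = {
  val freshUGI = proxyUser.map(Ugi(_)).getOrElse(currentUser)
  // ... doAs(freshUGI) { obtain delegation tokens } ...
  if (!currentUser.equals(freshUGI)) close(freshUGI) // only fresh identities
}

@main def guardDemo(): Unit = {
  val closed = scala.collection.mutable.Buffer.empty[Ugi]
  obtainTokens(Ugi("alice"), Some("bob"))(closed += _)
  println(closed) // only the proxy identity "bob" was closed
}
```

The design choice mirrors the diff: closing caches for `currentUser` would invalidate FileSystem handles shared across the JVM, while the fresh UGI is owned solely by this token-renewal call and can be cleaned up safely.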
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org