Posted to commits@spark.apache.org by do...@apache.org on 2022/06/23 15:52:46 UTC
[spark] branch master updated: [SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6
This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new a3fdf3b107b [SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6
a3fdf3b107b is described below
commit a3fdf3b107b62089890a4246c168ca516dff5d44
Author: Dongjoon Hyun <do...@apache.org>
AuthorDate: Thu Jun 23 08:52:22 2022 -0700
[SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6
### What changes were proposed in this pull request?
This PR aims to improve YARN cluster mode to support IPv6.
After this PR, all `YARN` unit tests pass when the JVM is configured to prefer IPv6 addresses.
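
As the diff below shows, the YARN `Client` exports `SPARK_PREFER_IPV6` into the launch environment, and `SparkSubmit.main` copies that variable into the `java.net.preferIPv6Addresses` system property when it is present. The following is a minimal standalone sketch of that hand-off, not the Spark code itself: the object name `PreferIPv6Sketch` and the helper `propagate` are hypothetical, and a plain `Map` and a `java.util.Properties` instance stand in for the real process environment and JVM system properties so the behavior can be checked in isolation.

```scala
import java.util.Properties

// Hypothetical sketch; names other than the env var and property key are not from Spark.
object PreferIPv6Sketch {
  val EnvKey = "SPARK_PREFER_IPV6"
  val PropKey = "java.net.preferIPv6Addresses"

  // Copy the env var into the property set only when it is defined.
  // When it is absent, the properties are left untouched.
  def propagate(env: Map[String, String], props: Properties): Unit = {
    env.get(EnvKey).foreach(props.setProperty(PropKey, _))
  }

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    propagate(Map(EnvKey -> "true"), props)
    println(props.getProperty(PropKey)) // prints: true
  }
}
```

Because the property is only set when the variable exists, a process launched without `SPARK_PREFER_IPV6` keeps whatever value it already had, which is consistent with the statement below that IPv4 users see no change.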
### Why are the changes needed?
**BEFORE**
```
$ SBT_OPTS='-Djava.net.preferIPv6Addresses=true' SPARK_LOCAL_HOSTNAME=::1 build/sbt "yarn/testOnly *.YarnClusterSuite" -Pyarn
[info] YarnClusterSuite:
[info] - run Spark in yarn-client mode (10 seconds, 204 milliseconds)
[info] - run Spark in yarn-cluster mode *** FAILED *** (2 seconds, 88 milliseconds)
[info] FAILED did not equal FINISHED (stdout/stderr was not captured) (BaseYarnClusterSuite.scala:240)
...
```
**AFTER**
```
$ SBT_OPTS='-Djava.net.preferIPv6Addresses=true' SPARK_LOCAL_HOSTNAME=::1 build/sbt "yarn/testOnly *.YarnClusterSuite" -Pyarn
[info] YarnClusterSuite:
[info] - run Spark in yarn-client mode (10 seconds, 204 milliseconds)
[info] - run Spark in yarn-cluster mode (10 seconds, 118 milliseconds)
[info] - run Spark in yarn-client mode with unmanaged am (7 seconds, 99 milliseconds)
[info] - run Spark in yarn-client mode with different configurations, ensuring redaction (8 seconds, 92 milliseconds)
[info] - run Spark in yarn-cluster mode with different configurations, ensuring redaction (10 seconds, 117 milliseconds)
[info] - yarn-cluster should respect conf overrides in SparkHadoopUtil (SPARK-16414, SPARK-23630) (9 seconds, 113 milliseconds)
[info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' (8 seconds, 113 milliseconds)
[info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' (9 seconds, 119 milliseconds)
[info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' and gateway-replacement path (8 seconds, 97 milliseconds)
[info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' and gateway-replacement path (10 seconds, 108 milliseconds)
[info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' and gateway-replacement path containing an environment variable (9 seconds, 111 milliseconds)
[info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'file' (8 seconds, 103 milliseconds)
[info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'file' (9 seconds, 108 milliseconds)
[info] - run Spark in yarn-cluster mode unsuccessfully (7 seconds, 126 milliseconds)
[info] - run Spark in yarn-cluster mode failure after sc initialized (15 seconds, 121 milliseconds)
[info] - run Python application in yarn-client mode (10 seconds, 113 milliseconds)
[info] - run Python application in yarn-cluster mode (11 seconds, 116 milliseconds)
[info] - run Python application in yarn-cluster mode using spark.yarn.appMasterEnv to override local envvar (11 seconds, 122 milliseconds)
[info] - user class path first in client mode (8 seconds, 103 milliseconds)
[info] - user class path first in cluster mode (9 seconds, 110 milliseconds)
[info] - monitor app using launcher library (4 seconds, 15 milliseconds)
[info] - running Spark in yarn-cluster mode displays driver log links (12 seconds, 125 milliseconds)
[info] - timeout to get SparkContext in cluster mode triggers failure (11 seconds, 112 milliseconds)
[info] - executor env overwrite AM env in client mode (7 seconds, 101 milliseconds)
[info] - executor env overwrite AM env in cluster mode (10 seconds, 108 milliseconds)
[info] - SPARK-34472: ivySettings file with no scheme or file:// scheme should be localized on driver in cluster mode (17 seconds, 202 milliseconds)
[info] - SPARK-34472: ivySettings file with no scheme or file:// scheme should retain user provided path in client mode (14 seconds, 190 milliseconds)
[info] - SPARK-34472: ivySettings file with non-file:// schemes should throw an error (1 second, 917 milliseconds)
[info] Run completed in 4 minutes, 51 seconds.
[info] Total number of tests run: 28
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 28, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 352 s (05:52), completed Jun 23, 2022, 2:40:50 AM
```
### Does this PR introduce _any_ user-facing change?
No. There is no change for IPv4 users.
### How was this patch tested?
Passed the CIs and tested manually with the above command.
Closes #36964 from dongjoon-hyun/SPARK-39566.
Authored-by: Dongjoon Hyun <do...@apache.org>
Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 2 ++
.../yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala | 1 +
.../src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala | 1 +
3 files changed, 4 insertions(+)
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index dab1474725d..783cf47df16 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -1022,6 +1022,8 @@ object SparkSubmit extends CommandLineUtils with Logging {
"org.apache.spark.deploy.k8s.submit.KubernetesClientApplication"
override def main(args: Array[String]): Unit = {
+ Option(System.getenv("SPARK_PREFER_IPV6"))
+ .foreach(System.setProperty("java.net.preferIPv6Addresses", _))
val submit = new SparkSubmit() {
self =>
diff --git a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index 66873961100..99741d9f759 100644
--- a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -912,6 +912,7 @@ private[spark] class Client(
populateClasspath(args, hadoopConf, sparkConf, env, sparkConf.get(DRIVER_CLASS_PATH))
env("SPARK_YARN_STAGING_DIR") = stagingDirPath.toString
env("SPARK_USER") = UserGroupInformation.getCurrentUser().getShortUserName()
+ env("SPARK_PREFER_IPV6") = Utils.preferIPv6.toString
// Pick up any environment variables for the AM provided through spark.yarn.appMasterEnv.*
val amEnvPrefix = "spark.yarn.appMasterEnv."
diff --git a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
index 1c20723ff7a..ca8ec642282 100644
--- a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
+++ b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
@@ -262,6 +262,7 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
test("monitor app using launcher library") {
val env = new JHashMap[String, String]()
env.put("YARN_CONF_DIR", hadoopConfDir.getAbsolutePath())
+ env.put("SPARK_PREFER_IPV6", Utils.preferIPv6.toString)
val propsFile = createConfFile()
val handle = new SparkLauncher(env)