Posted to commits@spark.apache.org by do...@apache.org on 2022/06/23 15:52:46 UTC

[spark] branch master updated: [SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a3fdf3b107b [SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6
a3fdf3b107b is described below

commit a3fdf3b107b62089890a4246c168ca516dff5d44
Author: Dongjoon Hyun <do...@apache.org>
AuthorDate: Thu Jun 23 08:52:22 2022 -0700

    [SPARK-39566][CORE][YARN] Improve YARN cluster mode to support IPv6
    
    ### What changes were proposed in this pull request?
    
    This PR aims to improve YARN cluster mode to support IPv6.
    After this PR, all `YARN` unit tests pass.
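    
    The propagation works in two hops: `Client.scala` exports `SPARK_PREFER_IPV6` into the AM container environment, and `SparkSubmit.main` mirrors that env var into the `java.net.preferIPv6Addresses` JVM property before any networking code runs. A minimal sketch of the mirroring step (the `mirrorPreferIPv6` helper is hypothetical, not Spark code; Spark itself does this inline in `SparkSubmit.main`):
    
    ```scala
    // Hypothetical sketch: mirror SPARK_PREFER_IPV6 into the JVM flag.
    // `env` abstracts System.getenv so the logic is testable.
    object PreferIPv6 {
      def mirrorPreferIPv6(env: String => Option[String]): Option[String] =
        env("SPARK_PREFER_IPV6").map { v =>
          // Must run before java.net classes start resolving addresses,
          // which is why Spark places it at the top of main().
          System.setProperty("java.net.preferIPv6Addresses", v)
          v
        }
    }
    ```
    
    Setting the property this early matters because `java.net.preferIPv6Addresses` is only consulted when address resolution first happens in the JVM.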
    
    ### Why are the changes needed?
    
    **BEFORE**
    ```
    $ SBT_OPTS='-Djava.net.preferIPv6Addresses=true' SPARK_LOCAL_HOSTNAME=::1 build/sbt "yarn/testOnly *.YarnClusterSuite" -Pyarn
    [info] YarnClusterSuite:
    [info] - run Spark in yarn-client mode (10 seconds, 204 milliseconds)
    [info] - run Spark in yarn-cluster mode *** FAILED *** (2 seconds, 88 milliseconds)
    [info]   FAILED did not equal FINISHED (stdout/stderr was not captured) (BaseYarnClusterSuite.scala:240)
    ...
    ```
    
    **AFTER**
    ```
    $ SBT_OPTS='-Djava.net.preferIPv6Addresses=true' SPARK_LOCAL_HOSTNAME=::1 build/sbt "yarn/testOnly *.YarnClusterSuite" -Pyarn
    [info] YarnClusterSuite:
    [info] - run Spark in yarn-client mode (10 seconds, 204 milliseconds)
    [info] - run Spark in yarn-cluster mode (10 seconds, 118 milliseconds)
    [info] - run Spark in yarn-client mode with unmanaged am (7 seconds, 99 milliseconds)
    [info] - run Spark in yarn-client mode with different configurations, ensuring redaction (8 seconds, 92 milliseconds)
    [info] - run Spark in yarn-cluster mode with different configurations, ensuring redaction (10 seconds, 117 milliseconds)
    [info] - yarn-cluster should respect conf overrides in SparkHadoopUtil (SPARK-16414, SPARK-23630) (9 seconds, 113 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' (8 seconds, 113 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' (9 seconds, 119 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'local' and gateway-replacement path (8 seconds, 97 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' and gateway-replacement path (10 seconds, 108 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'local' and gateway-replacement path containing an environment variable (9 seconds, 111 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-client mode with additional jar using URI scheme 'file' (8 seconds, 103 milliseconds)
    [info] - SPARK-35672: run Spark in yarn-cluster mode with additional jar using URI scheme 'file' (9 seconds, 108 milliseconds)
    [info] - run Spark in yarn-cluster mode unsuccessfully (7 seconds, 126 milliseconds)
    [info] - run Spark in yarn-cluster mode failure after sc initialized (15 seconds, 121 milliseconds)
    [info] - run Python application in yarn-client mode (10 seconds, 113 milliseconds)
    [info] - run Python application in yarn-cluster mode (11 seconds, 116 milliseconds)
    [info] - run Python application in yarn-cluster mode using spark.yarn.appMasterEnv to override local envvar (11 seconds, 122 milliseconds)
    [info] - user class path first in client mode (8 seconds, 103 milliseconds)
    [info] - user class path first in cluster mode (9 seconds, 110 milliseconds)
    [info] - monitor app using launcher library (4 seconds, 15 milliseconds)
    [info] - running Spark in yarn-cluster mode displays driver log links (12 seconds, 125 milliseconds)
    [info] - timeout to get SparkContext in cluster mode triggers failure (11 seconds, 112 milliseconds)
    [info] - executor env overwrite AM env in client mode (7 seconds, 101 milliseconds)
    [info] - executor env overwrite AM env in cluster mode (10 seconds, 108 milliseconds)
    [info] - SPARK-34472: ivySettings file with no scheme or file:// scheme should be localized on driver in cluster mode (17 seconds, 202 milliseconds)
    [info] - SPARK-34472: ivySettings file with no scheme or file:// scheme should retain user provided path in client mode (14 seconds, 190 milliseconds)
    [info] - SPARK-34472: ivySettings file with non-file:// schemes should throw an error (1 second, 917 milliseconds)
    [info] Run completed in 4 minutes, 51 seconds.
    [info] Total number of tests run: 28
    [info] Suites: completed 1, aborted 0
    [info] Tests: succeeded 28, failed 0, canceled 0, ignored 0, pending 0
    [info] All tests passed.
    [success] Total time: 352 s (05:52), completed Jun 23, 2022, 2:40:50 AM
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. There is no behavior change for IPv4 users.
    
    ### How was this patch tested?
    
    Pass the CIs and perform manual testing with the above command.
    
    Closes #36964 from dongjoon-hyun/SPARK-39566.
    
    Authored-by: Dongjoon Hyun <do...@apache.org>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala           | 2 ++
 .../yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala       | 1 +
 .../src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala  | 1 +
 3 files changed, 4 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index dab1474725d..783cf47df16 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -1022,6 +1022,8 @@ object SparkSubmit extends CommandLineUtils with Logging {
     "org.apache.spark.deploy.k8s.submit.KubernetesClientApplication"
 
   override def main(args: Array[String]): Unit = {
+    Option(System.getenv("SPARK_PREFER_IPV6"))
+      .foreach(System.setProperty("java.net.preferIPv6Addresses", _))
     val submit = new SparkSubmit() {
       self =>
 
diff --git a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index 66873961100..99741d9f759 100644
--- a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -912,6 +912,7 @@ private[spark] class Client(
     populateClasspath(args, hadoopConf, sparkConf, env, sparkConf.get(DRIVER_CLASS_PATH))
     env("SPARK_YARN_STAGING_DIR") = stagingDirPath.toString
     env("SPARK_USER") = UserGroupInformation.getCurrentUser().getShortUserName()
+    env("SPARK_PREFER_IPV6") = Utils.preferIPv6.toString
 
     // Pick up any environment variables for the AM provided through spark.yarn.appMasterEnv.*
     val amEnvPrefix = "spark.yarn.appMasterEnv."
diff --git a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
index 1c20723ff7a..ca8ec642282 100644
--- a/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
+++ b/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
@@ -262,6 +262,7 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
   test("monitor app using launcher library") {
     val env = new JHashMap[String, String]()
     env.put("YARN_CONF_DIR", hadoopConfDir.getAbsolutePath())
+    env.put("SPARK_PREFER_IPV6", Utils.preferIPv6.toString)
 
     val propsFile = createConfFile()
     val handle = new SparkLauncher(env)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org