Posted to commits@spark.apache.org by sr...@apache.org on 2021/11/22 01:13:12 UTC
[spark] branch master updated: [SPARK-37209][YARN][TESTS] Fix `YarnShuffleIntegrationSuite` releated UTs when using `hadoop-3.2` profile without `assembly/target/scala-%s/jars`

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a7b3fc7  [SPARK-37209][YARN][TESTS] Fix `YarnShuffleIntegrationSuite` releated UTs when using `hadoop-3.2` profile without `assembly/target/scala-%s/jars`
a7b3fc7 is described below

commit a7b3fc7cef4c5df0254b945fe9f6815b072b31dd
Author: yangjie01 <ya...@baidu.com>
AuthorDate: Sun Nov 21 19:11:40 2021 -0600

    [SPARK-37209][YARN][TESTS] Fix `YarnShuffleIntegrationSuite` releated UTs when using `hadoop-3.2` profile without `assembly/target/scala-%s/jars`
    
    ### What changes were proposed in this pull request?
    `YarnShuffleIntegrationSuite`, `YarnShuffleAuthSuite` and `YarnShuffleAlternateNameConfigSuite` fail when run with the `hadoop-3.2` profile and without `assembly/target/scala-%s/jars`. The failure is caused by `java.lang.NoClassDefFoundError: breeze/linalg/Matrix`.
    
    The above UTs succeed with the `hadoop-2.7` profile without `assembly/target/scala-%s/jars` because `KryoSerializer.loadableSparkClasses` can work around the missing classes when `Utils.isTesting` is true, but `Utils.isTesting` is false when using the `hadoop-3.2` profile.
    
    After investigation, I found that when the `hadoop-2.7` profile is used, `SPARK_TESTING` is propagated to the AM and Executor, but when the `hadoop-3.2` profile is used, `SPARK_TESTING` is not propagated to the AM and Executor.
    
    To keep the behavior consistent between `hadoop-2.7` and `hadoop-3.2`, this PR manually propagates the `SPARK_TESTING` environment variable, if it exists, so that `Utils.isTesting` is true in the above test scenario.
    
    ### Why are the changes needed?
    Ensure `YarnShuffleIntegrationSuite` related UTs can succeed when using the `hadoop-3.2` profile without `assembly/target/scala-%s/jars`.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    
    - Pass Jenkins or GitHub Actions
    
    - Manually test `YarnShuffleIntegrationSuite`. `YarnShuffleAuthSuite` and `YarnShuffleAlternateNameConfigSuite` can be verified in the same way.
    
    Please ensure that the `assembly/target/scala-%s/jars` directory does not exist before executing the test command. We can clean up the whole project by executing the following command, or clone a fresh local copy of the code repo.
    
    1. run with `hadoop-3.2` profile
    
    ```
    mvn clean install -Phadoop-3.2 -Pyarn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnShuffleIntegrationSuite
    ```
    
    **Before**
    
    ```
    YarnShuffleIntegrationSuite:
    - external shuffle service *** FAILED ***
      FAILED did not equal FINISHED (stdout/stderr was not captured) (BaseYarnClusterSuite.scala:227)
    Run completed in 48 seconds, 137 milliseconds.
    Total number of tests run: 1
    Suites: completed 2, aborted 0
    Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
    *** 1 TEST FAILED ***
    ```
    
    The error stack is as follows:
    
    ```
    21/11/20 23:00:09.682 main ERROR Client: Application diagnostics message: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times,
     most recent failure: Lost task 0.3 in stage 0.0 (TID 6) (localhost executor 1): java.lang.NoClassDefFoundError: breeze/linalg/Matrix
            at java.lang.Class.forName0(Native Method)
            at java.lang.Class.forName(Class.java:348)
            at org.apache.spark.util.Utils$.classForName(Utils.scala:216)
            at org.apache.spark.serializer.KryoSerializer$.$anonfun$loadableSparkClasses$1(KryoSerializer.scala:537)
            at scala.collection.immutable.List.flatMap(List.scala:366)
            at org.apache.spark.serializer.KryoSerializer$.loadableSparkClasses$lzycompute(KryoSerializer.scala:535)
            at org.apache.spark.serializer.KryoSerializer$.org$apache$spark$serializer$KryoSerializer$$loadableSparkClasses(KryoSerializer.scala:502)
            at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:226)
            at org.apache.spark.serializer.KryoSerializer$$anon$1.create(KryoSerializer.scala:102)
            at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
            at org.apache.spark.serializer.KryoSerializer$PoolWrapper.borrow(KryoSerializer.scala:109)
            at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:346)
            at org.apache.spark.serializer.KryoSerializationStream.<init>(KryoSerializer.scala:266)
            at org.apache.spark.serializer.KryoSerializerInstance.serializeStream(KryoSerializer.scala:432)
            at org.apache.spark.shuffle.ShufflePartitionPairsWriter.open(ShufflePartitionPairsWriter.scala:76)
            at org.apache.spark.shuffle.ShufflePartitionPairsWriter.write(ShufflePartitionPairsWriter.scala:59)
            at org.apache.spark.util.collection.WritablePartitionedIterator.writeNext(WritablePartitionedPairCollection.scala:83)
            at org.apache.spark.util.collection.ExternalSorter.$anonfun$writePartitionedMapOutput$1(ExternalSorter.scala:772)
            at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
            at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)
            at org.apache.spark.util.collection.ExternalSorter.writePartitionedMapOutput(ExternalSorter.scala:775)
            at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:70)
            at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
            at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
            at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
            at org.apache.spark.scheduler.Task.run(Task.scala:136)
            at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)
            at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.ClassNotFoundException: breeze.linalg.Matrix
            at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
            ... 32 more
    ```
    
    **After**
    ```
    YarnShuffleIntegrationSuite:
    - external shuffle service
    Run completed in 35 seconds, 188 milliseconds.
    Total number of tests run: 1
    Suites: completed 2, aborted 0
    Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
    All tests passed.
    ```
    
    2. run with `hadoop-2.7` profile
    
    ```
    mvn clean install -Phadoop-2.7 -Pyarn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnShuffleIntegrationSuite
    ```
    
    **Before**
    
    ```
    YarnShuffleIntegrationSuite:
    - external shuffle service
    Run completed in 30 seconds, 828 milliseconds.
    Total number of tests run: 1
    Suites: completed 2, aborted 0
    Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
    All tests passed.
    ```
    
    **After**
    ```
    YarnShuffleIntegrationSuite:
    - external shuffle service
    Run completed in 30 seconds, 967 milliseconds.
    Total number of tests run: 1
    Suites: completed 2, aborted 0
    Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
    All tests passed.
    ```
    
    Closes #34620 from LuciferYang/SPARK-37209.
    
    Authored-by: yangjie01 <ya...@baidu.com>
    Signed-off-by: Sean Owen <sr...@gmail.com>
---
 .../yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala   | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index 4763115..7787e2f 100644
--- a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -901,8 +901,8 @@ private[spark] class Client(
       sys.env.get("PYTHONHASHSEED").foreach(env.put("PYTHONHASHSEED", _))
     }
 
-    sys.env.get(ENV_DIST_CLASSPATH).foreach { dcp =>
-      env(ENV_DIST_CLASSPATH) = dcp
+    Seq(ENV_DIST_CLASSPATH, SPARK_TESTING).foreach { envVar =>
+      sys.env.get(envVar).foreach(value => env(envVar) = value)
     }
 
     env
@@ -1353,6 +1353,8 @@ private[spark] object Client extends Logging {
   // Subdirectory where Spark libraries will be placed.
   val LOCALIZED_LIB_DIR = "__spark_libs__"
 
+  val SPARK_TESTING = "SPARK_TESTING"
+
   /**
    * Return the path to the given application's staging directory.
    */
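
The first hunk of the diff above can be illustrated outside of Spark with a minimal, standalone sketch. It is not the code from `Client.scala`: the object name `EnvPropagationSketch` and the helper `propagate` are hypothetical, the string value given to `ENV_DIST_CLASSPATH` is an assumption, and plain Scala maps stand in for the container launch environment. It only shows the pattern of copying a fixed list of environment variables into a target map when, and only when, they are set in the source environment.

```scala
import scala.collection.mutable

// Hypothetical standalone sketch; not part of Spark.
object EnvPropagationSketch {
  // SPARK_TESTING matches the constant added in the diff.
  // The value of ENV_DIST_CLASSPATH is an assumption made for this example.
  val ENV_DIST_CLASSPATH = "SPARK_DIST_CLASSPATH"
  val SPARK_TESTING = "SPARK_TESTING"

  /** Copies each named variable from `source` into `target`, skipping unset ones. */
  def propagate(
      source: Map[String, String],
      target: mutable.Map[String, String],
      names: Seq[String]): Unit = {
    names.foreach { name =>
      // Option.foreach runs only when the variable exists in `source`.
      source.get(name).foreach(value => target(name) = value)
    }
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the environment handed to the AM and executors.
    val launchEnv = mutable.HashMap[String, String]()
    propagate(sys.env, launchEnv, Seq(ENV_DIST_CLASSPATH, SPARK_TESTING))
    println(launchEnv)
  }
}
```

Passing the source environment as a parameter, rather than reading `sys.env` inside the helper, keeps the sketch easy to exercise with a hand-built map. Variables that are unset in the source never appear in the target, which matches the "if it exists" wording in the commit message.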
