Posted to issues@spark.apache.org by "Yang Jie (Jira)" <ji...@apache.org> on 2021/11/16 12:11:00 UTC
[jira] [Comment Edited] (SPARK-37209) YarnShuffleIntegrationSuite and other two similar cases in `resource-managers` test failed
[ https://issues.apache.org/jira/browse/SPARK-37209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444502#comment-17444502 ]
Yang Jie edited comment on SPARK-37209 at 11/16/21, 12:10 PM:
--------------------------------------------------------------
After some investigation, I found that this issue may be related to `hadoop-3.x`. With the `hadoop-2.7` profile, the test above succeeds:
{code:java}
mvn clean install -DskipTests -pl resource-managers/yarn -Pyarn -Phadoop-2.7 -am
mvn test -pl resource-managers/yarn -Pyarn -Phadoop-2.7 -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnShuffleIntegrationSuite
Discovery starting.
Discovery completed in 259 milliseconds.
Run starting. Expected test count is: 1
YarnShuffleIntegrationSuite:
- external shuffle service
Run completed in 30 seconds, 765 milliseconds.
Total number of tests run: 1
Suites: completed 2, aborted 0
Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
All tests passed.{code}
It seems that when testing with hadoop-2.7, `Utils.isTesting` evaluates to true on the executor side, which lets the test case ignore the `NoClassDefFoundError`. When testing with hadoop-3.2, `Utils.isTesting` evaluates to false on the executor side.
I haven't yet investigated the root cause with hadoop-3.2.
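To illustrate why the value of `Utils.isTesting` matters here, below is a minimal Java sketch of the guard pattern, not the actual Spark code (which is Scala and lives in `KryoSerializer`). The class `OptionalClassLoading`, the `Loader` hook, and the `loadable` method are hypothetical names introduced for illustration. The `isTesting()` helper assumes the flag comes from the `SPARK_TESTING` environment variable or the `spark.testing` system property. The sketch also narrows the "skip" case to `ClassNotFoundException` for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionalClassLoading {

    /** Hypothetical lookup hook, so the behaviour can be exercised without a broken classpath. */
    interface Loader {
        Class<?> load(String name) throws ClassNotFoundException;
    }

    /** Assumption: approximates Utils.isTesting via SPARK_TESTING or the spark.testing property. */
    static boolean isTesting() {
        return System.getenv("SPARK_TESTING") != null
                || System.getProperty("spark.testing") != null;
    }

    /** Returns the classes that could be loaded; a missing dependency is tolerated only in test mode. */
    static List<Class<?>> loadable(List<String> names, Loader loader, boolean testing) {
        List<Class<?>> found = new ArrayList<>();
        for (String name : names) {
            try {
                found.add(loader.load(name));
            } catch (ClassNotFoundException e) {
                // The class itself is absent: skip it in every mode.
            } catch (NoClassDefFoundError e) {
                // A dependency of the class is absent (e.g. breeze/linalg/Matrix):
                // swallowed only when the testing flag is set, otherwise rethrown.
                if (!testing) {
                    throw e;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> names = List.of("java.lang.String", "no.such.Clazz");
        System.out.println(loadable(names, Class::forName, isTesting()).size());
    }
}
```

Under these assumptions, an executor JVM that does not see the testing flag would rethrow the `NoClassDefFoundError`, which is consistent with the stack trace in the issue description.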
cc [~hyukjin.kwon] [~dongjoon] [~srowen]
> YarnShuffleIntegrationSuite and other two similar cases in `resource-managers` test failed
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-37209
> URL: https://issues.apache.org/jira/browse/SPARK-37209
> Project: Spark
> Issue Type: Bug
> Components: Tests, YARN
> Affects Versions: 3.3.0
> Reporter: Yang Jie
> Priority: Minor
> Attachments: failed-unit-tests.log, success-unit-tests.log
>
>
> Execute :
> # build/mvn clean package -DskipTests -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive
> # build/mvn test -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive -Pscala-2.13 -pl resource-managers/yarn
> The tests succeed.
>
> Execute :
> # build/mvn clean -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive
> # build/mvn clean test -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive -Pscala-2.13 -pl resource-managers/yarn
> The tests fail.
>
> Execute :
> # build/mvn clean package -DskipTests -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive
> # Delete assembly/target/scala-2.12/jars manually
> # build/mvn test -Phadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive -Pscala-2.13 -pl resource-managers/yarn
> The tests fail.
>
> The error stack is :
> {code:java}
> 21/11/04 19:48:52.159 main ERROR Client: Application diagnostics message: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times,
> most recent failure: Lost task 0.3 in stage 0.0 (TID 6) (localhost executor 1): java.lang.NoClassDefFoundError: breeze/linalg/Matrix
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:216)
> at org.apache.spark.serializer.KryoSerializer$.$anonfun$loadableSparkClasses$1(KryoSerializer.scala:537)
> at scala.collection.immutable.List.flatMap(List.scala:293)
> at scala.collection.immutable.List.flatMap(List.scala:79)
> at org.apache.spark.serializer.KryoSerializer$.loadableSparkClasses$lzycompute(KryoSerializer.scala:535)
> at org.apache.spark.serializer.KryoSerializer$.org$apache$spark$serializer$KryoSerializer$$loadableSparkClasses(KryoSerializer.scala:502)
> at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:226)
> at org.apache.spark.serializer.KryoSerializer$$anon$1.create(KryoSerializer.scala:102)
> at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
> at org.apache.spark.serializer.KryoSerializer$PoolWrapper.borrow(KryoSerializer.scala:109)
> at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:346)
> at org.apache.spark.serializer.KryoSerializationStream.<init>(KryoSerializer.scala:266)
> at org.apache.spark.serializer.KryoSerializerInstance.serializeStream(KryoSerializer.scala:432)
> at org.apache.spark.shuffle.ShufflePartitionPairsWriter.open(ShufflePartitionPairsWriter.scala:76)
> at org.apache.spark.shuffle.ShufflePartitionPairsWriter.write(ShufflePartitionPairsWriter.scala:59)
> at org.apache.spark.util.collection.WritablePartitionedIterator.writeNext(WritablePartitionedPairCollection.scala:83)
> at org.apache.spark.util.collection.ExternalSorter.$anonfun$writePartitionedMapOutput$1(ExternalSorter.scala:772)
> at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)
> at org.apache.spark.util.collection.ExternalSorter.writePartitionedMapOutput(ExternalSorter.scala:775)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:70)
> at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
> at org.apache.spark.scheduler.Task.run(Task.scala:136)
> at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: breeze.linalg.Matrix
> at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
> {code}
>
>
>