Posted to issues@spark.apache.org by "huangtianhua (JIRA)" <ji...@apache.org> on 2019/07/22 04:10:00 UTC

[jira] [Created] (SPARK-28467) Tests failed if there are not enough executors up before running

huangtianhua created SPARK-28467:
------------------------------------

             Summary: Tests failed if there are not enough executors up before running
                 Key: SPARK-28467
                 URL: https://issues.apache.org/jira/browse/SPARK-28467
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: huangtianhua


We ran the unit tests on an arm64 instance, and some tests failed because the executors could not come up within the 10000 ms timeout:
- test driver discovery under local-cluster mode *** FAILED ***
  java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 milliseconds elapsed
  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
  at org.apache.spark.SparkContextSuite.$anonfun$new$78(SparkContextSuite.scala:753)
  at org.apache.spark.SparkContextSuite.$anonfun$new$78$adapted(SparkContextSuite.scala:741)
  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
  at org.apache.spark.SparkContextSuite.$anonfun$new$77(SparkContextSuite.scala:741)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  
- test gpu driver resource files and discovery under local-cluster mode *** FAILED ***
  java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 milliseconds elapsed
  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
  at org.apache.spark.SparkContextSuite.$anonfun$new$80(SparkContextSuite.scala:781)
  at org.apache.spark.SparkContextSuite.$anonfun$new$80$adapted(SparkContextSuite.scala:761)
  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
  at org.apache.spark.SparkContextSuite.$anonfun$new$79(SparkContextSuite.scala:761)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
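
For context, both failures come from TestUtils.waitUntilExecutorsUp, which polls the SparkContext's status tracker until the requested number of executors has registered or the deadline passes. A rough sketch of that waiting logic (my approximation, not the exact Spark source) looks like this:

  import java.util.concurrent.{TimeUnit, TimeoutException}
  import org.apache.spark.SparkContext

  // Approximation of TestUtils.waitUntilExecutorsUp: poll the status tracker
  // until numExecutors executors are registered, or fail after `timeout` ms.
  def waitUntilExecutorsUp(sc: SparkContext, numExecutors: Int, timeout: Long): Unit = {
    val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeout)
    while (System.nanoTime() < deadline) {
      // getExecutorInfos also counts the driver, hence the strict '>'
      if (sc.statusTracker.getExecutorInfos.length > numExecutors) {
        return
      }
      Thread.sleep(10)
    }
    throw new TimeoutException(
      s"Can't find $numExecutors executors before $timeout milliseconds elapsed")
  }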

After we increased the timeout to 20000 (or 30000) ms, the tests passed. I found earlier issues where this timeout was increased; see: https://issues.apache.org/jira/browse/SPARK-7989 and https://issues.apache.org/jira/browse/SPARK-10651
I think the timeout is fragile, and there seems to be no principle behind how its value is chosen. How can I fix this? Could I increase the timeout for these two tests?
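
For reference, here is a minimal sketch of the change we have in mind, assuming TestUtils is on the test classpath; the 30000 ms value is just what passed in our arm64 runs, not a recommended constant:

  import org.apache.spark.{SparkConf, SparkContext, TestUtils}

  // Minimal sketch: the same wait the failing tests perform, with the
  // timeout raised from 10000 ms to 30000 ms (value from our arm64 runs).
  val conf = new SparkConf()
    .setMaster("local-cluster[1, 1, 1024]")
    .setAppName("SPARK-28467")
  val sc = new SparkContext(conf)
  try {
    TestUtils.waitUntilExecutorsUp(sc, numExecutors = 1, timeout = 30000)
  } finally {
    sc.stop()
  }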



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org