You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/25 11:39:50 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request, #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

HyukjinKwon opened a new pull request, #37989:
URL: https://github.com/apache/spark/pull/37989

   ### What changes were proposed in this pull request?
   
   This PR is a followup of https://github.com/apache/spark/pull/37533 that works around the test failure by explicitly checking the element and length in the test.
   
   ### Why are the changes needed?
   
   For an unknown reason the test added in is flaky even though both `ArrayBuffer` and `List` are `Sep` and the test should pass up to my best knowledge. See https://github.com/apache/spark/actions/runs/3109851954/jobs/5040465291
   
   ```
   [info] - SPARK-40096: Send finalize events even if shuffle merger blocks indefinitely with registerMergeResults is false *** FAILED *** (90 milliseconds)
   [info]   ArrayBuffer("hostB") did not equal List("hostB") (DAGSchedulerSuite.scala:4498)
   [info]   org.scalatest.exceptions.TestFailedException:
   [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
   [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
   [info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
   [info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
   [info]   at org.apache.spark.scheduler.DAGSchedulerSuite.$anonfun$new$286(DAGSchedulerSuite.scala:4498)
   [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
   [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
   [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
   [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
   [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
   [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:207)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
   [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
   [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:66)
   [info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
   [info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
   [info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:66)
   [info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
   [info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
   [info]   at scala.collection.immutable.List.foreach(List.scala:431)
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, test-only.
   
   ### How was this patch tested?
   
   CI in this PR should verify that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37989:
URL: https://github.com/apache/spark/pull/37989#discussion_r980652463


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -4495,7 +4495,8 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
       sendRequestsLatch.await()
       verify(blockStoreClient, times(2))
         .finalizeShuffleMerge(any(), any(), any(), any(), any())
-      assert(sentHosts === Seq("hostB"))
+      assert(sentHosts.nonEmpty)
+      assert(sentHosts.head === "hostB" && sentHosts.length == 1)

Review Comment:
   Ah, okay. Seems like this is flaky because of the timeout:
   
   ```
   [info] - SPARK-40096: Send finalize events even if shuffle merger blocks indefinitely with registerMergeResults is false *** FAILED *** (106 milliseconds)
   [info]   ArrayBuffer("hostB") was empty (DAGSchedulerSuite.scala:4498)
   [info]   org.scalatest.exceptions.TestFailedException:
   [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
   [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
   [info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
   [info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
   [info]   at org.apache.spark.scheduler.DAGSchedulerSuite.$anonfun$new$286(DAGSchedulerSuite.scala:4498)
   [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
   [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
   [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
   [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
   [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
   ```
   https://github.com/apache/spark/actions/runs/3129263557/jobs/5078150518
   
   I think it was empty when the condition is checked. Later, I think the array is filled at the time when the exception is actually printed out.
   
   Should we increase the timeout, @mridulm and @wankunde?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
mridulm commented on code in PR #37989:
URL: https://github.com/apache/spark/pull/37989#discussion_r980719077


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -4495,7 +4495,8 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
       sendRequestsLatch.await()
       verify(blockStoreClient, times(2))
         .finalizeShuffleMerge(any(), any(), any(), any(), any())
-      assert(sentHosts === Seq("hostB"))
+      assert(sentHosts.nonEmpty)
+      assert(sentHosts.head === "hostB" && sentHosts.length == 1)

Review Comment:
   I think this might a consequence of using `MyDAGScheduler` - `scheduleShuffleMergeFinalize` is overridden and runs within the calling thread, instead of async - which this test is expecting.
   
   +CC @otterc, @venkata91 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #37989:
URL: https://github.com/apache/spark/pull/37989#issuecomment-1257175536

   cc @mridulm FYI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37989:
URL: https://github.com/apache/spark/pull/37989#issuecomment-1257299340

   Merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length
URL: https://github.com/apache/spark/pull/37989


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37989:
URL: https://github.com/apache/spark/pull/37989#discussion_r979395539


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -4469,7 +4469,7 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
       blockStoreClientField.setAccessible(true)
       blockStoreClientField.set(sc.env.blockManager, blockStoreClient)
 
-      val sentHosts = ArrayBuffer[String]()
+      val sentHosts = ArrayBuffer.empty[String]

Review Comment:
   This is unrelated but just a cleanup.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37989:
URL: https://github.com/apache/spark/pull/37989#discussion_r979395539


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -4469,7 +4469,7 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
       blockStoreClientField.setAccessible(true)
       blockStoreClientField.set(sc.env.blockManager, blockStoreClient)
 
-      val sentHosts = ArrayBuffer[String]()
+      val sentHosts = ArrayBuffer.empty[String]

Review Comment:
   This is unrelated but just a cleanup.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
mridulm commented on PR #37989:
URL: https://github.com/apache/spark/pull/37989#issuecomment-1257374954

   This is very interesting behavior !
   Thanks for fixing this @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37989:
URL: https://github.com/apache/spark/pull/37989#discussion_r983397463


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -4495,7 +4495,8 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
       sendRequestsLatch.await()
       verify(blockStoreClient, times(2))
         .finalizeShuffleMerge(any(), any(), any(), any(), any())
-      assert(sentHosts === Seq("hostB"))
+      assert(sentHosts.nonEmpty)
+      assert(sentHosts.head === "hostB" && sentHosts.length == 1)

Review Comment:
   Hmmm .. this is actually more flaky than I thought:
   - https://github.com/apache/spark/actions/runs/3145115911/jobs/5112006948
   - https://github.com/apache/spark/actions/runs/3146198025/jobs/5114387367
   
   Would be great if @wankunde has a chance to take a look ..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org