Posted to commits@spark.apache.org by do...@apache.org on 2019/07/27 18:07:08 UTC

[spark] branch master updated: [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 7f84104  [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
7f84104 is described below

commit 7f84104b3981dc69238730e0bed7c8c5bd113d76
Author: Marcelo Vanzin <va...@cloudera.com>
AuthorDate: Sat Jul 27 11:06:35 2019 -0700

    [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
    
    This test tries to detect correct behavior in racy code, where the event
    thread is racing with the executor thread that's trying to kill the running
    task.
    
    If the event that signals the stage end arrives first, any delay in
    delivering the message that kills the task lets the task keep processing
    elements at full speed, and may cause the test's assertion to fail. Adding
    a 10ms delay in LocalSchedulerBackend before the task kill makes the test
    run through ~1000 elements. A longer delay can easily cause all 10000
    elements to be processed.
    
    Instead, adding a small delay (10ms) to the test code that processes each
    element makes it much less likely that the kill event fails to arrive
    before the task finishes: 10000 elements * 10ms leaves a window of 100s
    for the event to be delivered to the executor. And because each element
    only sleeps for 10ms, the test is not really slowed down at all.
    
    Closes #25270 from vanzin/SPARK-28535.
    
    Authored-by: Marcelo Vanzin <va...@cloudera.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/test/scala/org/apache/spark/JobCancellationSuite.scala | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
index db90c31..b533304 100644
--- a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
+++ b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
@@ -365,7 +365,10 @@ class JobCancellationSuite extends SparkFunSuite with Matchers with BeforeAndAft
       }.foreachAsync { x =>
         // Block this code from being executed, until the job get cancelled. In this case, if the
         // source iterator is interruptible, the max number of increment should be under
-        // `numElements`.
+        // `numElements`. We sleep a little to make sure that we leave enough time for the
+        // "kill" message to be delivered to the executor (10000 * 10ms = 100s allowance for
+        // delivery, which should be more than enough).
+        Thread.sleep(10)
         taskCancelledSemaphore.acquire()
         executionOfInterruptibleCounter.getAndIncrement()
     }
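
The effect of the per-element sleep can be illustrated outside Spark with a
minimal sketch. This is a simplified model, not the code in
JobCancellationSuite: the object SlowTaskSketch, its methods, and the
AtomicBoolean "kill" flag are all hypothetical stand-ins (Spark's real task
kill path is different), and the 50ms kill delay is an arbitrary assumption.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

// Simplified, hypothetical model of the race; it does not use Spark at all.
object SlowTaskSketch {

  // Longest time the worker can stay busy, i.e. the window in which the
  // kill signal must arrive to stop it before all elements are handled.
  def deliveryWindowMs(numElements: Int, perElementSleepMs: Long): Long =
    numElements.toLong * perElementSleepMs

  // Runs a worker that handles up to `numElements` elements, sleeping
  // `perElementSleepMs` before each one, while the calling thread sets a
  // kill flag after `killDelayMs`. Returns how many elements were handled.
  def simulate(numElements: Int, perElementSleepMs: Long, killDelayMs: Long): Int = {
    val killed = new AtomicBoolean(false)
    val processed = new AtomicInteger(0)

    val worker = new Thread(new Runnable {
      override def run(): Unit = {
        var i = 0
        while (i < numElements && !killed.get()) {
          if (perElementSleepMs > 0) Thread.sleep(perElementSleepMs)
          processed.incrementAndGet()
          i += 1
        }
      }
    })
    worker.start()

    Thread.sleep(killDelayMs) // models the delayed delivery of the kill message
    killed.set(true)
    worker.join()
    processed.get()
  }

  def main(args: Array[String]): Unit = {
    println(s"delivery window: ${deliveryWindowMs(10000, 10L)} ms")
    println(s"processed with 10ms sleep: ${simulate(10000, 10L, 50L)}")
    println(s"processed without sleep:   ${simulate(10000, 0L, 50L)}")
  }
}
```

With the sleep in place the worker handles only a handful of elements before
the flag is set, while without it the count depends entirely on how fast the
machine is, which is the flakiness the commit removes.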

