Posted to commits@spark.apache.org by do...@apache.org on 2019/07/27 18:07:35 UTC

[spark] branch branch-2.3 updated: [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.3 by this push:
     new abff292  [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
abff292 is described below

commit abff29237255627cf8d56dad64f212b8ca2a9640
Author: Marcelo Vanzin <va...@cloudera.com>
AuthorDate: Sat Jul 27 11:06:35 2019 -0700

    [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
    
    This test tries to detect correct behavior in racy code, where the event
    thread is racing with the executor thread that's trying to kill the running
    task.
    
    If the event that signals the stage end arrives first, any delay in the
    delivery of the message to kill the task lets the task keep processing
    elements rapidly, and may cause the test's assertion to fail. For
    example, adding a 10ms delay in LocalSchedulerBackend before the task
    kill makes the test run through ~1000 elements. A longer delay can
    easily cause all 10000 elements to be processed.
    
    Instead, adding a small delay (10ms) to the test code that processes
    each element makes it much less likely that the kill event arrives only
    after all elements have been processed: 10000 elements * 10ms leaves a
    window of 100s for the event to be delivered to the executor. And
    because each element only sleeps for 10ms, the test is not really
    slowed down at all.
    
    Closes #25270 from vanzin/SPARK-28535.
    
    Authored-by: Marcelo Vanzin <va...@cloudera.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
    (cherry picked from commit 7f84104b3981dc69238730e0bed7c8c5bd113d76)
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/test/scala/org/apache/spark/JobCancellationSuite.scala | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
index 61da413..7b251c1 100644
--- a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
+++ b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
@@ -363,7 +363,10 @@ class JobCancellationSuite extends SparkFunSuite with Matchers with BeforeAndAft
       }.foreachAsync { x =>
         // Block this code from being executed, until the job get cancelled. In this case, if the
         // source iterator is interruptible, the max number of increment should be under
-        // `numElements`.
+        // `numElements`. We sleep a little to make sure that we leave enough time for the
+        // "kill" message to be delivered to the executor (10000 * 10ms = 100s allowance for
+        // delivery, which should be more than enough).
+        Thread.sleep(10)
         taskCancelledSemaphore.acquire()
         executionOfInterruptibleCounter.getAndIncrement()
     }
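
The effect of the per-element sleep can be sketched outside Spark. The following is a minimal illustration under stated assumptions, not the code in JobCancellationSuite: the object `SlowTaskCancellationSketch`, the helper `runAndCancel`, and its parameters are hypothetical names, a plain thread stands in for the executor task, and `Thread.interrupt()` stands in for the "kill" message. It only shows that a 10ms sleep per element gives a late kill a wide window in which to arrive before all elements are processed.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: not part of Spark or of JobCancellationSuite.
object SlowTaskCancellationSketch {

  // Processes up to `numElements` elements on a worker thread, sleeping
  // `perElementSleepMs` before each one, and interrupts the worker after
  // `cancelDelayMs` (a stand-in for a delayed "kill" message).
  // Returns how many elements were processed before the worker stopped.
  def runAndCancel(numElements: Int, perElementSleepMs: Long, cancelDelayMs: Long): Int = {
    val processed = new AtomicInteger(0)
    val worker = new Thread(new Runnable {
      override def run(): Unit = {
        try {
          var i = 0
          while (i < numElements && !Thread.currentThread().isInterrupted) {
            // Thread.sleep throws InterruptedException once the thread is interrupted.
            Thread.sleep(perElementSleepMs)
            processed.incrementAndGet()
            i += 1
          }
        } catch {
          case _: InterruptedException => // "task killed": stop processing
        }
      }
    })
    worker.start()
    Thread.sleep(cancelDelayMs) // simulated delay in delivering the kill
    worker.interrupt()
    worker.join()
    processed.get()
  }

  def main(args: Array[String]): Unit = {
    val numElements = 10000
    // 10000 elements * 10ms = 100s of total work, so a kill delayed by
    // 200ms still arrives long before the last element is reached.
    val processed = runAndCancel(numElements, perElementSleepMs = 10L, cancelDelayMs = 200L)
    println(s"processed $processed of $numElements elements before the kill arrived")
  }
}
```

In this sketch the exact count depends on thread scheduling, so only the bound (fewer than `numElements`) is meaningful, which mirrors the assertion style the commit message describes.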

