Posted to commits@spark.apache.org by do...@apache.org on 2019/07/27 18:07:08 UTC
[spark] branch master updated: [SPARK-28535][CORE][TEST] Slow down
tasks to de-flake JobCancellationSuite
This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 7f84104 [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
7f84104 is described below
commit 7f84104b3981dc69238730e0bed7c8c5bd113d76
Author: Marcelo Vanzin <va...@cloudera.com>
AuthorDate: Sat Jul 27 11:06:35 2019 -0700
[SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
This test tries to detect correct behavior in racy code, where the event
thread is racing with the executor thread that's trying to kill the running
task.
If the event that signals the stage end arrives first, any delay in the
delivery of the message to kill the task lets the code rapidly process
elements, and may cause the test's assertion to fail. Adding a 10ms delay in
LocalSchedulerBackend before the task kill makes the test run through
~1000 elements. A longer delay can easily cause all 10000 elements to
be processed.
Instead, by adding a small delay (10ms) in the test code that processes
elements, it becomes much less likely that the kill event fails to
arrive before the task finishes; that leaves a window of 100s
(10000 elements * 10ms) for the event to be delivered to the executor.
And because each element only sleeps for 10ms, the test is not really
slowed down at all.
Closes #25270 from vanzin/SPARK-28535.
Authored-by: Marcelo Vanzin <va...@cloudera.com>
Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
core/src/test/scala/org/apache/spark/JobCancellationSuite.scala | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
index db90c31..b533304 100644
--- a/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
+++ b/core/src/test/scala/org/apache/spark/JobCancellationSuite.scala
@@ -365,7 +365,10 @@ class JobCancellationSuite extends SparkFunSuite with Matchers with BeforeAndAft
}.foreachAsync { x =>
// Block this code from being executed, until the job get cancelled. In this case, if the
// source iterator is interruptible, the max number of increment should be under
- // `numElements`.
+ // `numElements`. We sleep a little to make sure that we leave enough time for the
+ // "kill" message to be delivered to the executor (10000 * 10ms = 100s allowance for
+ // delivery, which should be more than enough).
+ Thread.sleep(10)
taskCancelledSemaphore.acquire()
executionOfInterruptibleCounter.getAndIncrement()
}
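The reasoning behind the added sleep can be illustrated with a small standalone simulation. This is only a sketch and not Spark code: the object SlowTaskSketch, the method run, and its parameters are hypothetical names, and a plain AtomicBoolean stands in for the "kill" message that Spark delivers to the executor. It shows that when each element costs 10ms, a kill signal delayed by 50ms still arrives long before 10000 elements (a 100s budget) could be processed.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

// Hypothetical stand-in for the racy test; none of these names exist in Spark.
object SlowTaskSketch {

  /** Processes up to `numElements` items, sleeping `perElementSleepMs` before each,
    * while a second thread delivers a "kill" flag after `killDelayMs`.
    * Returns how many items were processed before the kill was observed. */
  def run(numElements: Int, perElementSleepMs: Long, killDelayMs: Long): Int = {
    val killed = new AtomicBoolean(false)
    val processed = new AtomicInteger(0)

    // Plays the role of the task body that increments a counter per element.
    val worker = new Thread(() => {
      var i = 0
      while (i < numElements && !killed.get()) {
        if (perElementSleepMs > 0) Thread.sleep(perElementSleepMs)
        processed.incrementAndGet()
        i += 1
      }
    })

    // Plays the role of the delayed "kill" message.
    val killer = new Thread(() => {
      Thread.sleep(killDelayMs)
      killed.set(true)
    })

    worker.start()
    killer.start()
    worker.join()
    killer.join()
    processed.get()
  }

  def main(args: Array[String]): Unit = {
    val numElements = 10000
    // Budget for kill delivery: 10000 elements * 10ms = 100000ms = 100s.
    val budgetMs = numElements * 10L
    val count = run(numElements, perElementSleepMs = 10L, killDelayMs = 50L)
    println(s"budget=${budgetMs}ms, processed $count of $numElements elements")
    assert(count < numElements)
  }
}
```

With perElementSleepMs set to 0, the worker may finish all elements before the flag is set, which corresponds to the flaky behavior the commit message describes; the exact count in either case depends on thread scheduling.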