You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by gu...@apache.org on 2020/04/08 08:57:03 UTC
[spark] branch branch-3.0 updated: [SPARK-31379][CORE][TEST] Fix
flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources
from executor
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 2ced982 [SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor
2ced982 is described below
commit 2ced9825913d4cb406263329c909619c1c386f75
Author: yi.wu <yi...@databricks.com>
AuthorDate: Wed Apr 8 17:54:28 2020 +0900
[SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor
### What changes were proposed in this pull request?
This PR (SPARK-31379) adds one line `when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)` to avoid allocating resources.
### Why are the changes needed?
The test is flaky and here's part of error stack:
```
sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException:
The code passed to eventually never returned normally. Attempted 325 times over 5.01070979 seconds.
Last failure message: ArrayBuffer("1", "3") did not equal Array("0", "1", "3").
...
org.apache.spark.scheduler.CoarseGrainedSchedulerBackendSuite.eventually(CoarseGrainedSchedulerBackendSuite.scala:45)
```
You can check [here](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120786/testReport/org.apache.spark.scheduler/CoarseGrainedSchedulerBackendSuite/extra_resources_from_executor/) for details.
And it is flaky because: after sending `StatusUpdate` to `CoarseGrainedSchedulerBackend`, `CoarseGrainedSchedulerBackend` will call `makeOffer` immediately once releasing the resources. So, it's possible that `availableAddrs` has allocated again before we assert `execResources(GPU).availableAddrs.sorted === Array("0", "1", "3")`.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
The issue can be stably reproduced by inserting `Thread.sleep(3000)` after the line of sending `StatusUpdate`. After applying this fix, the issue is gone.
Closes #28145 from Ngone51/fix_flaky.
Authored-by: yi.wu <yi...@databricks.com>
Signed-off-by: HyukjinKwon <gu...@apache.org>
(cherry picked from commit a2789c2a5179f6ed5613c24130bc80bf1bbfd753)
Signed-off-by: HyukjinKwon <gu...@apache.org>
---
.../apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala | 3 +++
1 file changed, 3 insertions(+)
diff --git a/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
index c063301..fad971f 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala
@@ -251,6 +251,9 @@ class CoarseGrainedSchedulerBackendSuite extends SparkFunSuite with LocalSparkCo
assert(execResources(GPU).assignedAddrs === Array("0"))
}
+ // To avoid allocating any resources immediately after releasing the resource from the task to
+ // make sure that `availableAddrs` below won't change
+ when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)
backend.driverEndpoint.send(
StatusUpdate("1", 1, TaskState.FINISHED, buffer, taskResources))
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org