You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by viirya <gi...@git.apache.org> on 2018/12/10 08:16:06 UTC
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/23272
[SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager
## What changes were proposed in this pull request?
In `BytesToBytesMap.MapIterator.advanceToNextPage`, We will first lock this `MapIterator` and then `TaskMemoryManager` when going to free a memory page by calling `freePage`. At the same time, it is possibly that another memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it acquires memory and causes spilling on this `MapIterator`.
So it ends with the `MapIterator` object holds lock to the `MapIterator` object and waits for lock on `TaskMemoryManager`, and the other consumer holds lock to `TaskMemoryManager` and waits for lock on the `MapIterator` object.
To avoid deadlock here, this patch proposes to keep reference to the page to free and free it after releasing the lock of `MapIterator`.
## How was this patch tested?
Added test and manually test by running the test 100 times to make sure there is no deadlock.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/viirya/spark-1 SPARK-26265
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23272.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23272
----
commit 25e8e068047b714f27706399e1e6c03c338ac178
Author: Liang-Chi Hsieh <vi...@...>
Date: 2018-12-10T07:59:09Z
Fix deadlock in BytesToBytesMap.
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5925/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5926/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/23272
cc @cloud-fan
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99900/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23272
**[Test build #99900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99900/testReport)** for PR 23272 at commit [`25e8e06`](https://github.com/apache/spark/commit/25e8e068047b714f27706399e1e6c03c338ac178).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240178993
--- Diff: core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java ---
@@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger) throws IOException {
return used;
}
- void use(long size) {
+ @VisibleForTesting
--- End diff --
This is a test class, we can just make it public.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240189245
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
}
}
+ @Test
+ public void avoidDeadlock() throws InterruptedException {
+ memoryManager.limit(PAGE_SIZE_BYTES);
+ MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
+ TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
+ BytesToBytesMap map =
+ new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
+
+ Runnable memoryConsumer = new Runnable() {
+ @Override
+ public void run() {
+ int i = 0;
+ long used = 0;
+ while (i < 10) {
+ c1.use(10000000);
+ used += 10000000;
+ i++;
+ }
+ c1.free(used);
+ }
+ };
+
+ Thread thread = new Thread(memoryConsumer);
+
+ try {
+ int i;
+ for (i = 0; i < 1024; i++) {
+ final long[] arr = new long[]{i};
+ final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
+ loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
+ }
+
+ // Starts to require memory at another memory consumer.
+ thread.start();
+
+ BytesToBytesMap.MapIterator iter = map.destructiveIterator();
+ for (i = 0; i < 1024; i++) {
+ iter.next();
+ }
+ assertFalse(iter.hasNext());
+ } finally {
+ map.free();
+ thread.join();
--- End diff --
Is this line where the test hangs without the fix?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240204508
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
}
}
+ @Test
+ public void avoidDeadlock() throws InterruptedException {
+ memoryManager.limit(PAGE_SIZE_BYTES);
+ MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
+ TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
+ BytesToBytesMap map =
+ new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
+
+ Runnable memoryConsumer = new Runnable() {
+ @Override
+ public void run() {
+ int i = 0;
+ long used = 0;
+ while (i < 10) {
+ c1.use(10000000);
+ used += 10000000;
+ i++;
+ }
+ c1.free(used);
+ }
+ };
+
+ Thread thread = new Thread(memoryConsumer);
+
+ try {
+ int i;
+ for (i = 0; i < 1024; i++) {
+ final long[] arr = new long[]{i};
+ final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
+ loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
+ }
+
+ // Starts to require memory at another memory consumer.
+ thread.start();
+
+ BytesToBytesMap.MapIterator iter = map.destructiveIterator();
+ for (i = 0; i < 1024; i++) {
+ iter.next();
+ }
+ assertFalse(iter.hasNext());
+ } finally {
+ map.free();
+ thread.join();
--- End diff --
This line just makes sure `memoryConsumer` to end and free memory.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240204245
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
}
}
+ @Test
+ public void avoidDeadlock() throws InterruptedException {
+ memoryManager.limit(PAGE_SIZE_BYTES);
+ MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
+ TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
+ BytesToBytesMap map =
+ new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
+
+ Runnable memoryConsumer = new Runnable() {
+ @Override
+ public void run() {
+ int i = 0;
+ long used = 0;
+ while (i < 10) {
+ c1.use(10000000);
+ used += 10000000;
+ i++;
+ }
+ c1.free(used);
+ }
+ };
+
+ Thread thread = new Thread(memoryConsumer);
+
+ try {
+ int i;
+ for (i = 0; i < 1024; i++) {
+ final long[] arr = new long[]{i};
+ final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
+ loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
+ }
+
+ // Starts to require memory at another memory consumer.
+ thread.start();
+
+ BytesToBytesMap.MapIterator iter = map.destructiveIterator();
+ for (i = 0; i < 1024; i++) {
+ iter.next();
+ }
+ assertFalse(iter.hasNext());
+ } finally {
+ map.free();
+ thread.join();
--- End diff --
When without this line, the test still hangs. The test thread hangs on the deadlock with the other thread of running `memoryConsumer`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/23272
> have you seen any bug report caused by this dead lock?
The original reporter of the JIRA ticket SPARK-26265 has hit with this bug in their workload.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23272
**[Test build #99900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99900/testReport)** for PR 23272 at commit [`25e8e06`](https://github.com/apache/spark/commit/25e8e068047b714f27706399e1e6c03c338ac178).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240170574
--- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -255,11 +255,18 @@ private MapIterator(int numRecords, Location loc, boolean destructive) {
}
private void advanceToNextPage() {
+ // SPARK-26265: We will first lock this `MapIterator` and then `TaskMemoryManager` when going
+ // to free a memory page by calling `freePage`. At the same time, it is possibly that another
+ // memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it
+ // acquires memory and causes spilling on this `MapIterator`. To avoid deadlock here, we keep
+ // reference to the page to free and free it after releasing the lock of `MapIterator`.
--- End diff --
Yea, OffHead also suffers from this issue. One change is needed for this test case to cover it. Thanks.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23272
have you seen any bug report caused by this dead lock?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23272
**[Test build #99915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99915/testReport)** for PR 23272 at commit [`9d52320`](https://github.com/apache/spark/commit/9d52320e24077a8c94639aad6b21a4af5d3e83d9).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240146031
--- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -255,11 +255,18 @@ private MapIterator(int numRecords, Location loc, boolean destructive) {
}
private void advanceToNextPage() {
+ // SPARK-26265: We will first lock this `MapIterator` and then `TaskMemoryManager` when going
+ // to free a memory page by calling `freePage`. At the same time, it is possibly that another
+ // memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it
+ // acquires memory and causes spilling on this `MapIterator`. To avoid deadlock here, we keep
+ // reference to the page to free and free it after releasing the lock of `MapIterator`.
--- End diff --
Do we have a deadlock situation for OffHeap situation, too? Does the new test case cover that?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5914/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240144674
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -667,4 +668,53 @@ public void testPeakMemoryUsed() {
}
}
+ @Test
+ public void avoidDeadlock() throws InterruptedException {
--- End diff --
Hi, @viirya . Since this test case reproduces `Deadlock` situation, we need a timeout logic. Otherwise, it will hang (instead of failures) when we hit this issue later.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240187678
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -667,4 +668,53 @@ public void testPeakMemoryUsed() {
}
}
+ @Test
+ public void avoidDeadlock() throws InterruptedException {
--- End diff --
I've tried few ways to set a timeout logic, but don't work. The deadlock hangs the test thread.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240187707
--- Diff: core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java ---
@@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger) throws IOException {
return used;
}
- void use(long size) {
+ @VisibleForTesting
--- End diff --
Ok.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23272
**[Test build #99914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99914/testReport)** for PR 23272 at commit [`4c621d2`](https://github.com/apache/spark/commit/4c621d2bd36c50a10591d93ccd77bd7c0432a873).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23272
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org