You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by we...@apache.org on 2019/04/04 01:58:31 UTC
[spark] branch master updated: [SPARK-27338][CORE] Fix deadlock in
UnsafeExternalSorter.SpillableIterator when locking both
UnsafeExternalSorter.SpillableIterator and TaskMemoryManager
This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 6c4552c6 [SPARK-27338][CORE] Fix deadlock in UnsafeExternalSorter.SpillableIterator when locking both UnsafeExternalSorter.SpillableIterator and TaskMemoryManager
6c4552c6 is described below
commit 6c4552c65045cfe82ed95212ee7cff684e44288b
Author: Venkata krishnan Sowrirajan <vs...@qubole.com>
AuthorDate: Thu Apr 4 09:58:05 2019 +0800
[SPARK-27338][CORE] Fix deadlock in UnsafeExternalSorter.SpillableIterator when locking both UnsafeExternalSorter.SpillableIterator and TaskMemoryManager
## What changes were proposed in this pull request?
In `UnsafeExternalSorter.SpillableIterator#loadNext()` takes lock on the `UnsafeExternalSorter` and calls `freePage` once the `lastPage` is consumed which needs to take a lock on `TaskMemoryManager`. At the same time, there can be another MemoryConsumer using `UnsafeExternalSorter` as part of sorting can try to `allocatePage` needs to get lock on `TaskMemoryManager` which can cause spill to happen which requires lock on `UnsafeExternalSorter` again causing deadlock. This is a classic [...]
To fix this, we can move the `freePage` call in `loadNext` outside of `Synchronized` block similar to the fix in SPARK-26265
## How was this patch tested?
Manual tests were being done and will also try to add a test.
Closes #24265 from venkata91/deadlock-sorter.
Authored-by: Venkata krishnan Sowrirajan <vs...@qubole.com>
Signed-off-by: Wenchen Fan <we...@databricks.com>
---
.../unsafe/sort/UnsafeExternalSorter.java | 34 +++++++++++++++-------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
index af5a934..2314514 100644
--- a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
+++ b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
@@ -573,19 +573,31 @@ public final class UnsafeExternalSorter extends MemoryConsumer {
@Override
public void loadNext() throws IOException {
- synchronized (this) {
- loaded = true;
- if (nextUpstream != null) {
- // Just consumed the last record from in memory iterator
- if (lastPage != null) {
- freePage(lastPage);
- lastPage = null;
+ MemoryBlock pageToFree = null;
+ try {
+ synchronized (this) {
+ loaded = true;
+ if (nextUpstream != null) {
+ // Just consumed the last record from in memory iterator
+ if(lastPage != null) {
+ // Do not free the page here, while we are locking `SpillableIterator`. The `freePage`
+ // method locks the `TaskMemoryManager`, and it's a bad idea to lock 2 objects in
+ // sequence. We may hit dead lock if another thread locks `TaskMemoryManager` and
+ // `SpillableIterator` in sequence, which may happen in
+ // `TaskMemoryManager.acquireExecutionMemory`.
+ pageToFree = lastPage;
+ lastPage = null;
+ }
+ upstream = nextUpstream;
+ nextUpstream = null;
}
- upstream = nextUpstream;
- nextUpstream = null;
+ numRecords--;
+ upstream.loadNext();
+ }
+ } finally {
+ if (pageToFree != null) {
+ freePage(pageToFree);
}
- numRecords--;
- upstream.loadNext();
}
}
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org