You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by viirya <gi...@git.apache.org> on 2018/12/10 08:16:06 UTC

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/23272

    [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

    ## What changes were proposed in this pull request?
    
    In `BytesToBytesMap.MapIterator.advanceToNextPage`, We will first lock this `MapIterator` and then `TaskMemoryManager` when going to free a memory page by calling `freePage`. At the same time, it is possibly that another memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it acquires memory and causes spilling on this `MapIterator`.
    
    So it ends with the `MapIterator` object holds lock to the `MapIterator` object and waits for lock on `TaskMemoryManager`, and the other consumer holds lock to `TaskMemoryManager` and waits for lock on the `MapIterator` object.
    
    To avoid deadlock here, this patch proposes to keep reference to the page to free and free it after releasing the lock of `MapIterator`.
    
    ## How was this patch tested?
    
    Added test and manually test by running the test 100 times to make sure there is no deadlock.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-26265

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23272
    
----
commit 25e8e068047b714f27706399e1e6c03c338ac178
Author: Liang-Chi Hsieh <vi...@...>
Date:   2018-12-10T07:59:09Z

    Fix deadlock in BytesToBytesMap.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5925/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5926/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    cc @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99900/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    **[Test build #99900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99900/testReport)** for PR 23272 at commit [`25e8e06`](https://github.com/apache/spark/commit/25e8e068047b714f27706399e1e6c03c338ac178).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240178993
  
    --- Diff: core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java ---
    @@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger) throws IOException {
         return used;
       }
     
    -  void use(long size) {
    +  @VisibleForTesting
    --- End diff --
    
    This is a test class, we can just make it public.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240189245
  
    --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
    @@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
         }
       }
     
    +  @Test
    +  public void avoidDeadlock() throws InterruptedException {
    +    memoryManager.limit(PAGE_SIZE_BYTES);
    +    MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
    +    TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
    +    BytesToBytesMap map =
    +      new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
    +
    +    Runnable memoryConsumer = new Runnable() {
    +      @Override
    +      public void run() {
    +        int i = 0;
    +        long used = 0;
    +        while (i < 10) {
    +          c1.use(10000000);
    +          used += 10000000;
    +          i++;
    +        }
    +        c1.free(used);
    +      }
    +    };
    +
    +    Thread thread = new Thread(memoryConsumer);
    +
    +    try {
    +      int i;
    +      for (i = 0; i < 1024; i++) {
    +        final long[] arr = new long[]{i};
    +        final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
    +        loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
    +      }
    +
    +      // Starts to require memory at another memory consumer.
    +      thread.start();
    +
    +      BytesToBytesMap.MapIterator iter = map.destructiveIterator();
    +      for (i = 0; i < 1024; i++) {
    +        iter.next();
    +      }
    +      assertFalse(iter.hasNext());
    +    } finally {
    +      map.free();
    +      thread.join();
    --- End diff --
    
    Is this line where the test hangs without the fix?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240204508
  
    --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
    @@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
         }
       }
     
    +  @Test
    +  public void avoidDeadlock() throws InterruptedException {
    +    memoryManager.limit(PAGE_SIZE_BYTES);
    +    MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
    +    TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
    +    BytesToBytesMap map =
    +      new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
    +
    +    Runnable memoryConsumer = new Runnable() {
    +      @Override
    +      public void run() {
    +        int i = 0;
    +        long used = 0;
    +        while (i < 10) {
    +          c1.use(10000000);
    +          used += 10000000;
    +          i++;
    +        }
    +        c1.free(used);
    +      }
    +    };
    +
    +    Thread thread = new Thread(memoryConsumer);
    +
    +    try {
    +      int i;
    +      for (i = 0; i < 1024; i++) {
    +        final long[] arr = new long[]{i};
    +        final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
    +        loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
    +      }
    +
    +      // Starts to require memory at another memory consumer.
    +      thread.start();
    +
    +      BytesToBytesMap.MapIterator iter = map.destructiveIterator();
    +      for (i = 0; i < 1024; i++) {
    +        iter.next();
    +      }
    +      assertFalse(iter.hasNext());
    +    } finally {
    +      map.free();
    +      thread.join();
    --- End diff --
    
    This line just makes sure `memoryConsumer` to end and free memory.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240204245
  
    --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
    @@ -667,4 +669,54 @@ public void testPeakMemoryUsed() {
         }
       }
     
    +  @Test
    +  public void avoidDeadlock() throws InterruptedException {
    +    memoryManager.limit(PAGE_SIZE_BYTES);
    +    MemoryMode mode = useOffHeapMemoryAllocator() ? MemoryMode.OFF_HEAP: MemoryMode.ON_HEAP;
    +    TestMemoryConsumer c1 = new TestMemoryConsumer(taskMemoryManager, mode);
    +    BytesToBytesMap map =
    +      new BytesToBytesMap(taskMemoryManager, blockManager, serializerManager, 1, 0.5, 1024);
    +
    +    Runnable memoryConsumer = new Runnable() {
    +      @Override
    +      public void run() {
    +        int i = 0;
    +        long used = 0;
    +        while (i < 10) {
    +          c1.use(10000000);
    +          used += 10000000;
    +          i++;
    +        }
    +        c1.free(used);
    +      }
    +    };
    +
    +    Thread thread = new Thread(memoryConsumer);
    +
    +    try {
    +      int i;
    +      for (i = 0; i < 1024; i++) {
    +        final long[] arr = new long[]{i};
    +        final BytesToBytesMap.Location loc = map.lookup(arr, Platform.LONG_ARRAY_OFFSET, 8);
    +        loc.append(arr, Platform.LONG_ARRAY_OFFSET, 8, arr, Platform.LONG_ARRAY_OFFSET, 8);
    +      }
    +
    +      // Starts to require memory at another memory consumer.
    +      thread.start();
    +
    +      BytesToBytesMap.MapIterator iter = map.destructiveIterator();
    +      for (i = 0; i < 1024; i++) {
    +        iter.next();
    +      }
    +      assertFalse(iter.hasNext());
    +    } finally {
    +      map.free();
    +      thread.join();
    --- End diff --
    
    When without this line, the test still hangs. The test thread hangs on the deadlock with the other thread of running `memoryConsumer`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    > have you seen any bug report caused by this dead lock?
    
    The original reporter of the JIRA ticket SPARK-26265 has hit with this bug in their workload.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    **[Test build #99900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99900/testReport)** for PR 23272 at commit [`25e8e06`](https://github.com/apache/spark/commit/25e8e068047b714f27706399e1e6c03c338ac178).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240170574
  
    --- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
    @@ -255,11 +255,18 @@ private MapIterator(int numRecords, Location loc, boolean destructive) {
         }
     
         private void advanceToNextPage() {
    +      // SPARK-26265: We will first lock this `MapIterator` and then `TaskMemoryManager` when going
    +      // to free a memory page by calling `freePage`. At the same time, it is possibly that another
    +      // memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it
    +      // acquires memory and causes spilling on this `MapIterator`. To avoid deadlock here, we keep
    +      // reference to the page to free and free it after releasing the lock of `MapIterator`.
    --- End diff --
    
    Yea, OffHead also suffers from this issue. One change is needed for this test case to cover it. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    have you seen any bug report caused by this dead lock?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    **[Test build #99915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99915/testReport)** for PR 23272 at commit [`9d52320`](https://github.com/apache/spark/commit/9d52320e24077a8c94639aad6b21a4af5d3e83d9).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240146031
  
    --- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
    @@ -255,11 +255,18 @@ private MapIterator(int numRecords, Location loc, boolean destructive) {
         }
     
         private void advanceToNextPage() {
    +      // SPARK-26265: We will first lock this `MapIterator` and then `TaskMemoryManager` when going
    +      // to free a memory page by calling `freePage`. At the same time, it is possibly that another
    +      // memory consumer first locks `TaskMemoryManager` and then this `MapIterator` when it
    +      // acquires memory and causes spilling on this `MapIterator`. To avoid deadlock here, we keep
    +      // reference to the page to free and free it after releasing the lock of `MapIterator`.
    --- End diff --
    
    Do we have a deadlock situation for OffHeap situation, too? Does the new test case cover that?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5914/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240144674
  
    --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
    @@ -667,4 +668,53 @@ public void testPeakMemoryUsed() {
         }
       }
     
    +  @Test
    +  public void avoidDeadlock() throws InterruptedException {
    --- End diff --
    
    Hi, @viirya . Since this test case reproduces `Deadlock` situation, we need a timeout logic. Otherwise, it will hang (instead of failures) when we hit this issue later.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240187678
  
    --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
    @@ -667,4 +668,53 @@ public void testPeakMemoryUsed() {
         }
       }
     
    +  @Test
    +  public void avoidDeadlock() throws InterruptedException {
    --- End diff --
    
    I've tried few ways to set a timeout logic, but don't work. The deadlock hangs the test thread.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23272#discussion_r240187707
  
    --- Diff: core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java ---
    @@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger) throws IOException {
         return used;
       }
     
    -  void use(long size) {
    +  @VisibleForTesting
    --- End diff --
    
    Ok.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    **[Test build #99914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99914/testReport)** for PR 23272 at commit [`4c621d2`](https://github.com/apache/spark/commit/4c621d2bd36c50a10591d93ccd77bd7c0432a873).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23272
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org