You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2016/08/31 14:26:20 UTC

[jira] [Created] (FLINK-4543) Race Deadlock in SpilledSubpartitionViewTest

Stephan Ewen created FLINK-4543:
-----------------------------------

             Summary: Race Deadlock in SpilledSubpartitionViewTest
                 Key: FLINK-4543
                 URL: https://issues.apache.org/jira/browse/FLINK-4543
             Project: Flink
          Issue Type: Improvement
          Components: Network
    Affects Versions: 1.1.2
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 1.2.0


The test deadlocked (Java level deadlock) with the following stack traces:

{code}
Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting to lock monitor 0x00007fec2c006168 (object 0x00000000ef661c20, a java.lang.Object),
  which is held by "IOManager reader thread #1"
"IOManager reader thread #1":
  waiting to lock monitor 0x00007fec2c005ea8 (object 0x00000000ef62c8a8, a java.lang.Object),
  which is held by "pool-1-thread-2"

Java stack information for the threads listed above:
===================================================
"pool-1-thread-2":
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.notifyError(SpilledSubpartitionViewAsyncIO.java:309)
        - waiting to lock <0x00000000ef661c20> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.onAvailableBuffer(SpilledSubpartitionViewAsyncIO.java:261)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$300(SpilledSubpartitionViewAsyncIO.java:42)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:380)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:366)
        at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:135)
        - locked <0x00000000ef62c8a8> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:118)
        - locked <0x00000000ef9597c0> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.util.TestConsumerCallback$RecyclingCallback.onBuffer(TestConsumerCallback.java:72)
        at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:87)
        at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:39)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
"IOManager reader thread #1":
        at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:126)
        - waiting to lock <0x00000000ef62c8a8> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:118)
        - locked <0x00000000efa016f0> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.returnBufferFromIOThread(SpilledSubpartitionViewAsyncIO.java:275)
        - locked <0x00000000ef661c20> (a java.lang.Object)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$100(SpilledSubpartitionViewAsyncIO.java:42)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:343)
        at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:333)
        at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.handleProcessedBuffer(AsynchronousFileIOChannel.java:199)
        at org.apache.flink.runtime.io.disk.iomanager.BufferReadRequest.requestDone(AsynchronousFileIOChannel.java:435)
        at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$ReaderThread.run(IOManagerAsync.java:408)

Found 1 deadlock.
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)