Posted to dev@flink.apache.org by "Ufuk Celebi (JIRA)" <ji...@apache.org> on 2016/12/02 11:07:59 UTC
[jira] [Created] (FLINK-5228) LocalInputChannel re-trigger request and release deadlock
Ufuk Celebi created FLINK-5228:
----------------------------------
Summary: LocalInputChannel re-trigger request and release deadlock
Key: FLINK-5228
URL: https://issues.apache.org/jira/browse/FLINK-5228
Project: Flink
Issue Type: Bug
Components: Network
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
Priority: Critical
Fix For: 1.2.0, 1.1.4
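Concurrent release and re-triggering of a partition request can lead to a deadlock. In the thread dump below, the two threads acquire the same two locks in opposite order: the Canceler thread holds the lock taken in SingleInputGate.releaseAllResources and waits for the lock requested in LocalInputChannel.releaseAllResources, while Timer-3 holds that channel lock (taken in LocalInputChannel.requestSubpartition) and waits for the gate lock in SingleInputGate.retriggerPartitionRequest.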
{code}
Found one Java-level deadlock:
=============================
"Canceler for Map -> Sink: Unnamed (1/4)":
waiting to lock monitor 0x0000000001e27bd8 (object 0x00000000ffa1f688, a java.lang.Object),
which is held by "Timer-3"
"Timer-3":
waiting to lock monitor 0x00007fdbd029ec48 (object 0x00000000ffa1f3a0, a java.lang.Object),
which is held by "Canceler for Map -> Sink: Unnamed (1/4)"
Java stack information for the threads listed above:
===================================================
"Canceler for Map -> Sink: Unnamed (1/4)":
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.releaseAllResources(LocalInputChannel.java:240)
- waiting to lock <0x00000000ffa1f688> (a java.lang.Object)
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.releaseAllResources(SingleInputGate.java:348)
- locked <0x00000000ffa1f3a0> (a java.lang.Object)
at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1280)
at java.lang.Thread.run(Thread.java:745)
"Timer-3":
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.retriggerPartitionRequest(SingleInputGate.java:307)
- waiting to lock <0x00000000ffa1f3a0> (a java.lang.Object)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.requestSubpartition(LocalInputChannel.java:128)
- locked <0x00000000ffa1f688> (a java.lang.Object)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel$1.run(LocalInputChannel.java:148)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
{code}
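The lock-order inversion in the dump can be illustrated with a small standalone sketch. This is not Flink code and not necessarily how the issue was fixed: Gate, Channel, LockOrderSketch and runOnce are hypothetical stand-ins for SingleInputGate and LocalInputChannel. It shows one common way to break such a cycle, assuming the re-trigger decision can be made under the channel lock while the callback into the gate happens after that lock is dropped.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for the input gate (not Flink's SingleInputGate). */
final class Gate {
    private final Object gateLock = new Object();
    private Channel channel;
    private boolean released;

    void setChannel(Channel channel) { this.channel = channel; }

    // Mirrors the Canceler path in the dump: gate lock first, then channel lock.
    void releaseAllResources() {
        synchronized (gateLock) {
            released = true;
            channel.releaseAllResources();
        }
    }

    // Mirrors the callback the timer path makes into the gate.
    void retriggerPartitionRequest() {
        synchronized (gateLock) {
            if (released) {
                return; // nothing to re-trigger once the gate is released
            }
            // (re-trigger work would go here)
        }
    }
}

/** Hypothetical stand-in for the local channel (not Flink's LocalInputChannel). */
final class Channel {
    private final Object channelLock = new Object();
    private final Gate gate;
    private boolean released;

    Channel(Gate gate) { this.gate = gate; }

    void releaseAllResources() {
        synchronized (channelLock) { released = true; }
    }

    // Timer path: decide under the channel lock, but call into the gate
    // only after leaving it, so this thread never holds the channel lock
    // while waiting for the gate lock.
    void requestSubpartition() {
        boolean retrigger;
        synchronized (channelLock) {
            retrigger = !released;
        }
        if (retrigger) {
            gate.retriggerPartitionRequest();
        }
    }
}

final class LockOrderSketch {
    /** Runs both paths concurrently; returns true if both threads finished. */
    static boolean runOnce() {
        Gate gate = new Gate();
        Channel channel = new Channel(gate);
        gate.setChannel(channel);
        CountDownLatch start = new CountDownLatch(1);
        Thread canceler = new Thread(() -> { await(start); gate.releaseAllResources(); });
        Thread timer = new Thread(() -> { await(start); channel.requestSubpartition(); });
        canceler.setDaemon(true); // daemon threads: a stuck run cannot block JVM exit
        timer.setDaemon(true);
        canceler.start();
        timer.start();
        start.countDown();
        try {
            canceler.join(2000);
            timer.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !canceler.isAlive() && !timer.isAlive();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            if (!runOnce()) {
                throw new IllegalStateException("threads did not finish");
            }
        }
        System.out.println("no deadlock observed");
    }
}
```

If the call to gate.retriggerPartitionRequest() is moved back inside the synchronized (channelLock) block, the sketch reproduces the cycle from the dump: one thread holds the gate lock and wants the channel lock, the other holds the channel lock and wants the gate lock.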