Posted to issues@flink.apache.org by "Yun Gao (JIRA)" <ji...@apache.org> on 2019/07/15 06:06:00 UTC

[jira] [Commented] (FLINK-13254) Task launching blocked due to pending on #waitForChannel

    [ https://issues.apache.org/jira/browse/FLINK-13254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884873#comment-16884873 ] 

Yun Gao commented on FLINK-13254:
---------------------------------

After looking further at the stacks of the blocked tasks, I think there is a deadlock:

!image-2019-07-15-14-02-53-877.png!

 

!image-2019-07-15-14-03-04-983.png!

 

The task thread is holding the _requestLock_ while waiting for the channel, but the Netty client thread needs the same lock to make progress, so it cannot finish building the connection that would unblock the waitForChannel operation.
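
To make the suspected pattern concrete, here is a minimal, self-contained sketch. It is not Flink's actual code: apart from the requestLock name, every class, method, and variable below is hypothetical, and a timeout is added only so that the demo terminates (as far as I can tell from the stacks, the real wait has no such escape).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; all names except requestLock are hypothetical.
class WaitForChannelSketch {

    // Returns true if the "task thread" (the caller) sees the channel become
    // ready within timeoutMillis, false if the wait times out.
    static boolean taskSeesChannel(boolean holdLockWhileWaiting, long timeoutMillis) {
        final Object requestLock = new Object();
        final CountDownLatch taskArrived = new CountDownLatch(1);
        final CountDownLatch channelReady = new CountDownLatch(1);

        // Stand-in for the Netty client thread: it needs requestLock before it
        // can finish the connection and signal that the channel is ready.
        Thread nettyClient = new Thread(() -> {
            try {
                taskArrived.await(); // start once the task thread has reached its wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (requestLock) {
                channelReady.countDown();
            }
        }, "netty-client-sketch");
        nettyClient.setDaemon(true);
        nettyClient.start();

        try {
            if (holdLockWhileWaiting) {
                // Suspected pattern: wait for the channel while owning the lock.
                synchronized (requestLock) {
                    taskArrived.countDown();
                    return channelReady.await(timeoutMillis, TimeUnit.MILLISECONDS);
                }
            }
            // Alternative: do the bookkeeping under the lock, then wait outside it.
            synchronized (requestLock) {
                // request bookkeeping would go here
            }
            taskArrived.countDown();
            return channelReady.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("wait while holding lock:   " + taskSeesChannel(true, 500));
        System.out.println("wait after releasing lock: " + taskSeesChannel(false, 5000));
    }
}
```

In the first call the wait can only end by timing out, because the other thread cannot enter the synchronized block until the caller leaves it. Without a timeout that wait would never return, which would match a task staying in DEPLOYING.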

 

> Task launching blocked due to pending on #waitForChannel
> --------------------------------------------------------
>
>                 Key: FLINK-13254
>                 URL: https://issues.apache.org/jira/browse/FLINK-13254
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.9.0
>            Reporter: Zhu Zhu
>            Priority: Major
>         Attachments: image-2019-07-15-14-02-53-877.png, image-2019-07-15-14-03-04-983.png, pendingOnDeploying.log, task_counts2.stack
>
>
> We observed that a task may stay in DEPLOYING state forever, stuck waiting for channels on the TM side.
> Sample log is attached, including the TM log and stack for task "Sink: counts2 (1/20)".
>  
> This case happens after a region failover, which might be related.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)