Posted to issues@flink.apache.org by "Till Rohrmann (Jira)" <ji...@apache.org> on 2020/04/23 16:42:00 UTC

[jira] [Commented] (FLINK-17330) Avoid scheduling deadlocks caused by cyclic input dependencies between regions

    [ https://issues.apache.org/jira/browse/FLINK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090753#comment-17090753 ] 

Till Rohrmann commented on FLINK-17330:
---------------------------------------

Thanks for reporting this issue [~zhuzh]. I think you are right that cyclic dependencies between pipelined regions are a problem we have not considered.

Would it work to say that in the first version we don't support pipelined regions which contain a blocking data exchange? Users with such a topology would be able to work around this problem by setting all of its data exchanges to blocking.

Once we have the first version of the pipelined region scheduler working, we could then address the problem of cyclic dependencies. I think we would have to detect cyclic dependencies between pipelined regions and merge all regions which are part of a cycle into the same pipelined region. This cycle detection should handle intra-region all-to-all blocking edges as well as any other kind of cyclic cross-region dependency.
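
To make the detect-and-merge idea concrete, here is a rough sketch, not a proposal for the actual implementation. It assumes regions are identified by integer indices and that we already know, for each region, which other regions produce the blocking results it consumes. The class name RegionCycleMerger, the method mergeCyclicRegions and the dependsOn map are made up for illustration and are not Flink APIs. It uses Tarjan's strongly connected components algorithm as one possible way to find the cycles; every component with more than one member would become a single merged region.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RegionCycleMerger {

    /**
     * Groups regions into strongly connected components (Tarjan's algorithm).
     * dependsOn.get(r) lists the regions whose blocking results region r consumes
     * (a hypothetical representation, chosen only for this sketch).
     * Regions that end up in the same set depend on each other cyclically.
     */
    static List<Set<Integer>> mergeCyclicRegions(
            int numRegions, Map<Integer, List<Integer>> dependsOn) {
        int[] index = new int[numRegions];
        int[] low = new int[numRegions];
        boolean[] onStack = new boolean[numRegions];
        Arrays.fill(index, -1); // -1 marks a region that has not been visited yet
        Deque<Integer> stack = new ArrayDeque<>();
        List<Set<Integer>> merged = new ArrayList<>();
        int[] counter = {0};
        for (int r = 0; r < numRegions; r++) {
            if (index[r] == -1) {
                visit(r, dependsOn, index, low, onStack, stack, counter, merged);
            }
        }
        return merged;
    }

    private static void visit(
            int r, Map<Integer, List<Integer>> dependsOn, int[] index, int[] low,
            boolean[] onStack, Deque<Integer> stack, int[] counter,
            List<Set<Integer>> merged) {
        index[r] = low[r] = counter[0]++;
        stack.push(r);
        onStack[r] = true;
        for (int input : dependsOn.getOrDefault(r, List.of())) {
            if (index[input] == -1) {
                visit(input, dependsOn, index, low, onStack, stack, counter, merged);
                low[r] = Math.min(low[r], low[input]);
            } else if (onStack[input]) {
                low[r] = Math.min(low[r], index[input]);
            }
        }
        if (low[r] == index[r]) {
            // r is the root of a component: pop all of its members off the stack.
            Set<Integer> component = new TreeSet<>();
            int member;
            do {
                member = stack.pop();
                onStack[member] = false;
                component.add(member);
            } while (member != r);
            merged.add(component);
        }
    }

    public static void main(String[] args) {
        // Regions from the issue description: index 0 stands for R1, index 1 for R2.
        Map<Integer, List<Integer>> dependsOn = new HashMap<>();
        dependsOn.put(0, List.of(1)); // R1 consumes the blocking edge B2 -> D1
        dependsOn.put(1, List.of(0)); // R2 consumes the blocking edge B1 -> D2
        System.out.println(mergeCyclicRegions(2, dependsOn)); // prints [[0, 1]]
    }
}
```

For the job in the description, R1 and R2 fall into one component, so they would be scheduled together as a single region. Regions that are not part of any cycle come back as singleton sets and would stay as they are.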

> Avoid scheduling deadlocks caused by cyclic input dependencies between regions
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-17330
>                 URL: https://issues.apache.org/jira/browse/FLINK-17330
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.0
>            Reporter: Zhu Zhu
>            Priority: Major
>             Fix For: 1.11.0
>
>
> Imagine a job like this:
> A -- (pipelined FORWARD) --> B -- (blocking ALL-to-ALL) --> D
> A -- (pipelined FORWARD) --> C -- (pipelined FORWARD) --> D
> parallelism=2 for all vertices.
> We will have 2 execution pipelined regions:
> R1 = {A1, B1, C1, D1}
> R2 = {A2, B2, C2, D2}
> R1 has a cross-region input edge (B2->D1).
> R2 has a cross-region input edge (B1->D2).
> A scheduling deadlock will occur because we schedule a region only when all of its inputs are consumable (i.e. the blocking partitions it consumes have finished): R1 can be scheduled only after R2 finishes, while R2 can be scheduled only after R1 finishes.
> To avoid this, one solution is to force a logical pipelined region with intra-region ALL-to-ALL blocking edges to form only one execution pipelined region, so that there is no cyclic input dependency between regions.
> Besides that, we should also take care to avoid cyclic cross-region POINTWISE blocking edges.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)