
Posted to dev@flink.apache.org by "Zhu Zhu (Jira)" <ji...@apache.org> on 2020/11/05 09:30:00 UTC

[jira] [Created] (FLINK-19994) All vertices in a DataSet iteration job will be eagerly scheduled

Zhu Zhu created FLINK-19994:
-------------------------------

             Summary: All vertices in a DataSet iteration job will be eagerly scheduled
                 Key: FLINK-19994
                 URL: https://issues.apache.org/jira/browse/FLINK-19994
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.12.0
            Reporter: Zhu Zhu
             Fix For: 1.12.0


After switching to pipelined region scheduling, all vertices in a DataSet iteration job are eagerly scheduled. As a result, consumers of BLOCKING results can be deployed before those results finish, which wastes resources. This happens because all vertices are put into a single pipelined region if the job contains a {{ColocationConstraint}}; see [PipelinedRegionComputeUtil|https://github.com/apache/flink/blob/c0f382f5f0072441ef8933f6993f1c34168004d6/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/PipelinedRegionComputeUtil.java#L52].

IIUC, this {{makeAllOneRegion()}} behavior was introduced to ensure that a co-located iteration head and tail are restarted together in pipelined region failover. However, given that edges within an iteration are always PIPELINED ([ref|https://github.com/apache/flink/blob/0523ef6451a93da450c6bdf5dd4757c3702f3962/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L1188]), a co-located iteration head and tail will always end up in the same region anyway. So I think we can drop the {{PipelinedRegionComputeUtil#makeAllOneRegion()}} code path and build regions in the same way regardless of whether there are co-location constraints.
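
To illustrate the argument, here is a minimal, self-contained sketch. It is not Flink's {{PipelinedRegionComputeUtil}}; the class {{RegionSketch}}, its {{Edge}} type, and the five-vertex topology are all made up for this example. It assumes a simplified model in which a region is just the set of vertices connected through PIPELINED edges, and it ignores everything else the real region computation may handle.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not Flink's actual implementation.
class RegionSketch {
    enum ResultType { PIPELINED, BLOCKING }

    // A directed edge between two named vertices, carrying the result type.
    static final class Edge {
        final String producer;
        final String consumer;
        final ResultType type;

        Edge(String producer, String consumer, ResultType type) {
            this.producer = producer;
            this.consumer = consumer;
            this.type = type;
        }
    }

    // Union-find lookup with path compression.
    static String find(Map<String, String> parent, String v) {
        String root = v;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        while (!parent.get(v).equals(root)) {
            String next = parent.get(v);
            parent.put(v, root);
            v = next;
        }
        return root;
    }

    // Maps every vertex to a representative vertex of its region.
    // Only PIPELINED edges merge regions; BLOCKING edges are region boundaries.
    static Map<String, String> computeRegions(List<String> vertices, List<Edge> edges) {
        Map<String, String> parent = new HashMap<>();
        for (String v : vertices) {
            parent.put(v, v);
        }
        for (Edge e : edges) {
            if (e.type == ResultType.PIPELINED) {
                parent.put(find(parent, e.producer), find(parent, e.consumer));
            }
        }
        Map<String, String> regionOf = new HashMap<>();
        for (String v : vertices) {
            regionOf.put(v, find(parent, v));
        }
        return regionOf;
    }

    // Made-up topology: an iteration (head -> body -> tail) with PIPELINED
    // internal edges, fed and drained through BLOCKING edges.
    static Map<String, String> demo() {
        List<String> vertices = Arrays.asList("source", "head", "body", "tail", "sink");
        List<Edge> edges = Arrays.asList(
                new Edge("source", "head", ResultType.BLOCKING),
                new Edge("head", "body", ResultType.PIPELINED),
                new Edge("body", "tail", ResultType.PIPELINED),
                new Edge("tail", "sink", ResultType.BLOCKING));
        return computeRegions(vertices, edges);
    }

    public static void main(String[] args) {
        Map<String, String> regionOf = demo();
        System.out.println("head and tail share a region: "
                + regionOf.get("head").equals(regionOf.get("tail")));
        System.out.println("source and head share a region: "
                + regionOf.get("source").equals(regionOf.get("head")));
    }
}
```

In this simplified model the head and tail land in the same region purely because of the PIPELINED edges between them, with no special handling for co-location, while the vertices on the far side of the BLOCKING edges stay in their own regions and would not need to be scheduled eagerly.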



--
This message was sent by Atlassian Jira
(v8.3.4#803005)