Posted to issues@flink.apache.org by "Yun Gao (Jira)" <ji...@apache.org> on 2022/04/13 06:28:05 UTC

[jira] [Updated] (FLINK-24343) Revisit Scheduler and Coordinator Startup Procedure

     [ https://issues.apache.org/jira/browse/FLINK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yun Gao updated FLINK-24343:
----------------------------
    Fix Version/s: 1.16.0

> Revisit Scheduler and Coordinator Startup Procedure
> ---------------------------------------------------
>
>                 Key: FLINK-24343
>                 URL: https://issues.apache.org/jira/browse/FLINK-24343
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Runtime / Coordination
>    Affects Versions: 1.14.0, 1.13.2
>            Reporter: Stephan Ewen
>            Priority: Major
>             Fix For: 1.15.0, 1.16.0
>
>
> We need to re-examine the startup procedure of the scheduler, and how it interacts with the startup of the operator coordinators.
> We need to make sure the following conditions are met:
>   - The Operator Coordinators are started before the first action that they need to be informed of. That includes a task being ready, a checkpoint happening, etc.
>   - The scheduler must be started far enough that it can handle "failGlobal()" calls, because a coordinator might trigger one during its own startup when an exception occurs in "start()".
> /cc [~chesnay]
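
The two conditions above amount to an ordering constraint, which the sketch below tries to make concrete. It is only an illustration and not Flink code: ToyScheduler, Coordinator, prepareFailureHandling(), startCoordinators() and startScheduling() are hypothetical names invented for this example, and only "failGlobal()" and "start()" are taken from the issue text. The assumed order is: make failure handling available first, then start the coordinators, and only then begin the actions (task deployment, checkpoints) that coordinators must be told about.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StartupOrderSketch {

    /** Hypothetical stand-in for an operator coordinator; not Flink's interface. */
    interface Coordinator {
        void start() throws Exception;
    }

    /** Hypothetical toy scheduler that records startup events in order. */
    static class ToyScheduler {
        final List<String> events = new ArrayList<>();
        private boolean canHandleFailures = false;
        private boolean failed = false;

        // Step 1: bring the scheduler up far enough to accept failGlobal().
        void prepareFailureHandling() {
            canHandleFailures = true;
            events.add("failure-handling-ready");
        }

        void failGlobal(Throwable cause) {
            if (!canHandleFailures) {
                throw new IllegalStateException("failGlobal() called before scheduler was ready");
            }
            failed = true;
            events.add("failGlobal: " + cause.getMessage());
        }

        // Step 2: start coordinators; an exception in start() is routed to failGlobal().
        void startCoordinators(List<Coordinator> coordinators) {
            for (Coordinator coordinator : coordinators) {
                try {
                    coordinator.start();
                    events.add("coordinator-started");
                } catch (Exception e) {
                    failGlobal(e);
                    return;
                }
            }
        }

        // Step 3: only now trigger actions the coordinators need to hear about.
        void startScheduling() {
            events.add(failed ? "scheduling-skipped" : "scheduling-started");
        }

        void startup(List<Coordinator> coordinators) {
            prepareFailureHandling();
            startCoordinators(coordinators);
            startScheduling();
        }
    }

    /** Runs the startup sequence with a coordinator whose start() succeeds. */
    static List<String> runHealthy() {
        ToyScheduler scheduler = new ToyScheduler();
        Coordinator ok = () -> { };
        scheduler.startup(Arrays.asList(ok));
        return scheduler.events;
    }

    /** Runs the startup sequence with a coordinator whose start() throws. */
    static List<String> runFailing() {
        ToyScheduler scheduler = new ToyScheduler();
        Coordinator broken = () -> { throw new Exception("boom"); };
        scheduler.startup(Arrays.asList(broken));
        return scheduler.events;
    }

    public static void main(String[] args) {
        System.out.println(runHealthy());
        System.out.println(runFailing());
    }
}
```

In this toy model, swapping the first two steps of startup() would make the failing coordinator hit the IllegalStateException instead of a clean global failure, which is the kind of race the issue asks to rule out.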



--
This message was sent by Atlassian Jira
(v8.20.1#820001)