Posted to issues@flink.apache.org by "Yun Gao (Jira)" <ji...@apache.org> on 2022/04/13 06:28:05 UTC
[jira] [Updated] (FLINK-24343) Revisit Scheduler and Coordinator Startup Procedure
[ https://issues.apache.org/jira/browse/FLINK-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yun Gao updated FLINK-24343:
----------------------------
Fix Version/s: 1.16.0
> Revisit Scheduler and Coordinator Startup Procedure
> ---------------------------------------------------
>
> Key: FLINK-24343
> URL: https://issues.apache.org/jira/browse/FLINK-24343
> Project: Flink
> Issue Type: Technical Debt
> Components: Runtime / Coordination
> Affects Versions: 1.14.0, 1.13.2
> Reporter: Stephan Ewen
> Priority: Major
> Fix For: 1.15.0, 1.16.0
>
>
> We need to re-examine the startup procedure of the scheduler, and how it interacts with the startup of the operator coordinators.
> We need to make sure the following conditions are met:
> - The Operator Coordinators are started before the first action happens that they need to be informed of. That includes a task being ready, a checkpoint happening, etc.
> - The scheduler must be started to the point that it can handle "failGlobal()" calls, because the coordinators might trigger that during their startup when an exception in "start()" occurs.
> /cc [~chesnay]
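One way to read the two conditions above is as a fixed startup order: first bring the scheduler to the point where it accepts failGlobal(), then start the coordinators, and only then trigger anything they need to be informed of. The following is a minimal sketch of that ordering only. It is not Flink's actual code: the Coordinator and Scheduler types, the events list, and the startScheduling() method are hypothetical names invented for illustration. Only start() and failGlobal() are taken from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

class StartupOrderSketch {

    /** Hypothetical stand-in for an operator coordinator; not Flink's interface. */
    public interface Coordinator {
        void start() throws Exception;
    }

    /** Hypothetical, heavily simplified scheduler used only to show the ordering. */
    public static class Scheduler {
        public final List<String> events = new ArrayList<>();
        private final List<Coordinator> coordinators;
        private boolean canHandleGlobalFailure = false;
        private boolean failed = false;

        public Scheduler(List<Coordinator> coordinators) {
            this.coordinators = coordinators;
        }

        /** Rejects calls that arrive before the scheduler is ready for them. */
        public void failGlobal(Throwable cause) {
            if (!canHandleGlobalFailure) {
                throw new IllegalStateException("failGlobal() called before the scheduler was ready");
            }
            failed = true;
            events.add("failGlobal: " + cause.getMessage());
        }

        public void startScheduling() {
            // Step 1: reach the point where failGlobal() can be handled.
            canHandleGlobalFailure = true;
            events.add("scheduler ready for failGlobal");

            // Step 2: start every coordinator; an exception in start() becomes a global failure.
            for (Coordinator coordinator : coordinators) {
                try {
                    coordinator.start();
                    events.add("coordinator started");
                } catch (Exception e) {
                    failGlobal(e);
                }
            }

            // Step 3: only now trigger actions the coordinators must hear about
            // (task deployment, checkpoints), and only if startup did not fail.
            if (!failed) {
                events.add("deploying tasks");
            }
        }
    }

    public static void main(String[] args) {
        Coordinator healthy = () -> { };
        Coordinator broken = () -> { throw new Exception("boom"); };

        Scheduler scheduler = new Scheduler(List.of(healthy, broken));
        scheduler.startScheduling();
        scheduler.events.forEach(System.out::println);
    }
}
```

In this sketch a failing start() is routed through failGlobal() without hitting the readiness check, and no task is deployed before all coordinators have started. How the real scheduler should recover after such a failure is left open here, as it is in the issue.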
--
This message was sent by Atlassian Jira
(v8.20.1#820001)