Posted to issues@flink.apache.org by "Yunfeng Zhou (Jira)" <ji...@apache.org> on 2022/07/19 07:23:00 UTC

[jira] [Updated] (FLINK-28606) Preserve distributed consistency of OperatorEvents from OperatorCoordinator to subtasks

     [ https://issues.apache.org/jira/browse/FLINK-28606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yunfeng Zhou updated FLINK-28606:
---------------------------------
    Description: 
This is one component of our solution to the consistency issue in the operator coordinator (OC) mechanism. In this step, we guarantee the consistency of communication in one direction only, from the OC to its subtasks. This takes less work than the full solution and should unblock the implementation of the CEP coordinator in FLIP-200.

Roughly, we would need to implement the following process in this step.
 # Let the OC finish processing all incoming OperatorEvents before taking the snapshot.
 # Close the gateway that sends operator events to the OC's subtasks once the OC completes the snapshot.
 # Wait until all outgoing OperatorEvents created before the snapshot have been sent and acknowledged.
 # Send checkpoint barriers to the source operators.
 # Reopen a subtask's gateway once the OC learns that the subtask has completed the checkpoint.

  was:
This is a component of our solution to the consistency issue in the operator coordinator mechanism. In this step, we would guarantee the consistency of all communications in one direction, from OC to subtasks. This would need less workload and should unblock the implementation of the CEP coordinator in FLIP-200.

Roughly, we would need to implement the following process in this step.
 # 
Let the OC finish processing all the incoming OperatorEvents before the snapshot.
 # 
Closes the gateway that sends operator events to its subtasks when the OC completes snapshot.
 # 
Wait until all the outgoing OperatorEvents before the snapshot are sent and acked.
 # 
Send checkpoint barriers to the Source operators.
 # 
Open the corresponding gateway of a subtask when the OC learned that the subtask has completed the checkpoint.


> Preserve distributed consistency of OperatorEvents from OperatorCoordinator to subtasks
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-28606
>                 URL: https://issues.apache.org/jira/browse/FLINK-28606
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.14.3
>            Reporter: Yunfeng Zhou
>            Priority: Major
>             Fix For: 1.16.0
>
>
> This is one component of our solution to the consistency issue in the operator coordinator (OC) mechanism. In this step, we guarantee the consistency of communication in one direction only, from the OC to its subtasks. This takes less work than the full solution and should unblock the implementation of the CEP coordinator in FLIP-200.
> Roughly, we would need to implement the following process in this step.
>  # Let the OC finish processing all incoming OperatorEvents before taking the snapshot.
>  # Close the gateway that sends operator events to the OC's subtasks once the OC completes the snapshot.
>  # Wait until all outgoing OperatorEvents created before the snapshot have been sent and acknowledged.
>  # Send checkpoint barriers to the source operators.
>  # Reopen a subtask's gateway once the OC learns that the subtask has completed the checkpoint.
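
The ordering of the five steps above can be illustrated with a small, self-contained toy model. This is only a sketch and not Flink's implementation: the class GatedCoordinatorSketch and all of its methods and fields are hypothetical names invented for illustration, and the choice to buffer events while a gateway is closed (rather than reject them) is an assumption that the issue text does not state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Toy model of the five steps; hypothetical names, not Flink's actual classes. */
public class GatedCoordinatorSketch {
    final List<String> log = new ArrayList<>();         // observable order of actions
    final Queue<String> incoming = new ArrayDeque<>();  // events from subtasks, not yet processed
    final List<CompletableFuture<Void>> inFlight = new ArrayList<>(); // sent, not yet acked
    final List<List<String>> buffered = new ArrayList<>(); // held back per closed gateway (assumption)
    final boolean[] gatewayOpen;

    GatedCoordinatorSketch(int parallelism) {
        gatewayOpen = new boolean[parallelism];
        Arrays.fill(gatewayOpen, true);
        for (int i = 0; i < parallelism; i++) {
            buffered.add(new ArrayList<>());
        }
    }

    /** Sends an event to a subtask, or buffers it while that subtask's gateway is closed. */
    void send(int subtask, String event) {
        if (!gatewayOpen[subtask]) {
            buffered.get(subtask).add(event);
            log.add("buffer:" + subtask + ":" + event);
            return;
        }
        log.add("send:" + subtask + ":" + event);
        inFlight.add(new CompletableFuture<>()); // completed later by ackAll()
    }

    /** Simulates the subtasks acknowledging every event that is currently in flight. */
    void ackAll() {
        List<CompletableFuture<Void>> acked = new ArrayList<>(inFlight);
        inFlight.clear();
        acked.forEach(f -> f.complete(null));
    }

    void checkpoint(long id) {
        // Step 1: finish processing all incoming events before the snapshot.
        while (!incoming.isEmpty()) {
            log.add("process:" + incoming.poll());
        }
        // Step 2: take the snapshot, then close the gateways.
        log.add("snapshot:" + id);
        Arrays.fill(gatewayOpen, false);
        log.add("close-gateways");
        // Steps 3 and 4: trigger barriers only after all pre-snapshot events are acked.
        CompletableFuture<?>[] pending = inFlight.toArray(new CompletableFuture<?>[0]);
        CompletableFuture.allOf(pending).thenRun(() -> log.add("barriers:" + id));
    }

    /** Step 5: reopen one subtask's gateway and flush the events buffered for it. */
    void onSubtaskCheckpointed(int subtask) {
        gatewayOpen[subtask] = true;
        log.add("open:" + subtask);
        List<String> held = new ArrayList<>(buffered.get(subtask));
        buffered.get(subtask).clear();
        held.forEach(e -> send(subtask, e));
    }

    public static void main(String[] args) {
        GatedCoordinatorSketch oc = new GatedCoordinatorSketch(2);
        oc.incoming.add("e0");
        oc.send(0, "a");             // sent before the snapshot, ack still pending
        oc.checkpoint(1L);           // barriers are held back until "a" is acked
        oc.send(1, "b");             // gateway closed: buffered, not sent
        oc.ackAll();                 // now the barriers are triggered
        oc.onSubtaskCheckpointed(1); // gateway 1 reopens and "b" is flushed
        oc.log.forEach(System.out::println);
    }
}
```

In this model an event created after the snapshot can never overtake the checkpoint barrier on its way to a subtask, which is the one-directional consistency property the steps aim for.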



--
This message was sent by Atlassian Jira
(v8.20.10#820010)