You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/11/06 16:23:15 UTC

[GitHub] [flink] ifndef-SleePy commented on a change in pull request #9885: [FLINK-14344][checkpointing] Snapshots master hook state asynchronously

ifndef-SleePy commented on a change in pull request #9885: [FLINK-14344][checkpointing] Snapshots master hook state asynchronously
URL: https://github.com/apache/flink/pull/9885#discussion_r343192744
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
 ##########
 @@ -659,10 +660,10 @@ else if (!props.forceCheckpoint()) {
 				}
 
 				// trigger the master hooks for the checkpoint
-				final List<MasterState> masterStates = MasterHooks.triggerMasterHooks(masterHooks.values(),
+				final Map<String, MasterState> masterStates = MasterHooks.triggerMasterHooks(masterHooks.values(),
 
 Review comment:
   Remember there are two optional semantics discussed in [JIRA ticket](https://issues.apache.org/jira/browse/FLINK-14344?focusedCommentId=16964751&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16964751) 
   > 1. hook is triggered asynchronously before the checkpoints starts, but checkpoint barriers are being sent and the checkpoints starts before async hook is completed
   > 2. hook is triggered asynchronously, but checkpoint is not started before the async action completes
   
   Actually I had thought these two choices before discussing with you. This change of `PendingCheckpoint` is designed originally to be able to support both of them. I always think there should be no sequence guarantee of master state and task state. The reason of keeping sequence here is just because the semantic of `MasterHook`. IMO, only `CheckpointCoordinator` should care about it.
   `PendingCheckpoint` doesn't know or care about the sequence of master state and task state. It just provides some basic semantics, `acknowledgeMaster/TaskState`, `areMaster/Task/FullyStatesAcknowledged`. The semantic of `PendingCheckpoint` is quite clear. In this way, `PendingCheckpoint` could also support other kind of master state easily (I guess there probably will be some in the future not far away).
   
   There is another reason of this change. It could support asynchronously snapshotting in a more elegant way than combining all master hook futures. Eventually, master hook state will be snapshotted asynchronously. The pseudo codes looks like:
   ```
   for (MasterTriggerRestoreHook<?> masterHook : masterHooks.values()) {
   	CompletableFuture<MasterState> future = MasterHooks.triggerHook(masterHook, checkpointID, timestamp, executor);
   	future.thenAccept(masterState -> {
   		pendingCheckpoint.acknowledgeMasterState(masterHook.getIdentifier(), masterState);
   		if (pendingCheckpoint.areMasterStatesFullyAcknowledged()) {
   			// launches next stage, for example, snapshotting task state
   		}
   	}
   }
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services