Posted to issues@flink.apache.org by "fanrui (Jira)" <ji...@apache.org> on 2022/07/09 14:24:00 UTC

[jira] [Created] (FLINK-28474) ChannelStateWriteResult may not fail after checkpoint abort

fanrui created FLINK-28474:
------------------------------

             Summary: ChannelStateWriteResult may not fail after checkpoint abort
                 Key: FLINK-28474
                 URL: https://issues.apache.org/jira/browse/FLINK-28474
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.15.1, 1.14.5
            Reporter: fanrui
             Fix For: 1.16.0, 1.15.2, 1.14.6
         Attachments: image-2022-07-09-22-21-24-417.png

After a checkpoint is aborted, its ChannelStateWriteResult should fail.

But if _channelStateWriter.start(id, checkpointOptions);_ is executed after the checkpoint has been aborted, the ChannelStateWriteResult will not fail.

 
h2. Cause Analysis:

When a checkpoint is aborted, channelStateWriter.start(id, checkpointOptions); may not have been executed yet. In that case the checkpointId is stored in the abortedCheckpointIds of SubtaskCheckpointCoordinatorImpl, and when checkpointState is called later, it checks whether that checkpointId should be aborted.

_ChannelStateWriter.abort(checkpointId, exception, true) should also be executed here._

!image-2022-07-09-22-21-24-417.png|width=803,height=307!
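
The ordering problem described above can be reproduced with a small, self-contained toy model. This is only a sketch and not Flink's actual code: ToyWriter, ToyCoordinator, startWriter, the abortWriterWhenSkipping flag and the helper resultFailsAfterLateStart are hypothetical names invented for illustration. The assumption is that aborting a checkpoint the writer has not started yet has nothing to fail, and that the later checkpointState call only skips the checkpoint unless it also aborts the writer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class AbortBeforeStartSketch {

    /** Hypothetical stand-in for the channel state writer: one result future per started checkpoint. */
    static class ToyWriter {
        private final Map<Long, CompletableFuture<Void>> results = new HashMap<>();

        void start(long checkpointId) {
            results.put(checkpointId, new CompletableFuture<>());
        }

        /** Fails the result if the checkpoint was started; otherwise there is nothing to fail. */
        void abort(long checkpointId, Throwable cause) {
            CompletableFuture<Void> result = results.get(checkpointId);
            if (result != null) {
                result.completeExceptionally(cause);
            }
        }

        boolean isFailed(long checkpointId) {
            CompletableFuture<Void> result = results.get(checkpointId);
            return result != null && result.isCompletedExceptionally();
        }
    }

    /** Hypothetical stand-in for the subtask checkpoint coordinator. */
    static class ToyCoordinator {
        final ToyWriter writer = new ToyWriter();
        private final Set<Long> abortedCheckpointIds = new HashSet<>();
        private final boolean abortWriterWhenSkipping;

        ToyCoordinator(boolean abortWriterWhenSkipping) {
            this.abortWriterWhenSkipping = abortWriterWhenSkipping;
        }

        /** The abort notification may arrive before the writer was started for this id. */
        void notifyCheckpointAborted(long checkpointId) {
            abortedCheckpointIds.add(checkpointId);
            writer.abort(checkpointId, new IllegalStateException("checkpoint aborted"));
        }

        void startWriter(long checkpointId) {
            writer.start(checkpointId);
        }

        /** Skips checkpoints that were aborted earlier; optionally aborts the writer as well. */
        void checkpointState(long checkpointId) {
            if (abortedCheckpointIds.remove(checkpointId)) {
                if (abortWriterWhenSkipping) {
                    writer.abort(checkpointId, new IllegalStateException("checkpoint aborted"));
                }
                return;
            }
            // A real implementation would snapshot state here.
        }
    }

    /** Replays the sequence abort -> start -> checkpointState and reports whether the result failed. */
    static boolean resultFailsAfterLateStart(boolean abortWriterWhenSkipping) {
        ToyCoordinator coordinator = new ToyCoordinator(abortWriterWhenSkipping);
        coordinator.notifyCheckpointAborted(1L);
        coordinator.startWriter(1L);
        coordinator.checkpointState(1L);
        return coordinator.writer.isFailed(1L);
    }

    public static void main(String[] args) {
        System.out.println("without extra abort: failed = " + resultFailsAfterLateStart(false));
        System.out.println("with extra abort:    failed = " + resultFailsAfterLateStart(true));
    }
}
```

In this model the result stays pending forever when the writer is started after the abort, and it only fails once the skip path in checkpointState also aborts the writer, which mirrors the change proposed in this issue.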

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)