Posted to issues@flink.apache.org by "Roman Khachatryan (Jira)" <ji...@apache.org> on 2022/04/11 22:35:00 UTC

[jira] [Updated] (FLINK-27132) CheckpointResourcesCleanupRunner might discard shared state of the initial checkpoint

     [ https://issues.apache.org/jira/browse/FLINK-27132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Khachatryan updated FLINK-27132:
--------------------------------------
    Description: 
Consider the following case:
 # A job starts from a checkpoint in NO_CLAIM mode, with incremental checkpoints enabled
 # It produces some new checkpoints and subsumes the original one (not discarding shared state - before FLINK-24611 or after FLINK-26985)
 # The job terminates abruptly
 # The cleaner (CheckpointResourcesCleanupRunner) is started for that job
 # ZooKeeper doesn't have the initial checkpoint, so the store will load only the new checkpoints (created in step 2). Shared state is registered
 # The store is shut down - discarding all the checkpoints and also any shared state

 
In the last step, if some of the new checkpoints use state of the initial checkpoint, that state will also be discarded
 
[~mapohl] could you please confirm this?
 
cc: [~yunta]

  was:
When considering the following case:
 # A job starts from a checkpoint in NO_CLAIM mode, with incremental checkpoints enabled
 # It produces some new checkpoints and subsumes the original one (not discarding shared state - before FLINK-24611 or after FLINK-26985)
 # Job terminates abruptly
 # The cleaner is started for that job
 # ZK doesn't have the initial checkpoint, so the store will load only the new checkpoints (created in 2). Shared state is registered
 # The store is shut down - discarding all the checkpoints and also any shared state

 
In 5, if some checkpoint uses the initial state, it will also be discarded
 
[~mapohl] could you please confirm this?
 
cc: [~yunta]


> CheckpointResourcesCleanupRunner might discard shared state of the initial checkpoint
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-27132
>                 URL: https://issues.apache.org/jira/browse/FLINK-27132
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.15.0, 1.16.0
>            Reporter: Roman Khachatryan
>            Priority: Major
>
> Consider the following case:
>  # A job starts from a checkpoint in NO_CLAIM mode, with incremental checkpoints enabled
>  # It produces some new checkpoints and subsumes the original one (not discarding shared state - before FLINK-24611 or after FLINK-26985)
>  # The job terminates abruptly
>  # The cleaner (CheckpointResourcesCleanupRunner) is started for that job
>  # ZooKeeper doesn't have the initial checkpoint, so the store will load only the new checkpoints (created in step 2). Shared state is registered
>  # The store is shut down - discarding all the checkpoints and also any shared state
>  
> In the last step, if some of the new checkpoints use state of the initial checkpoint, that state will also be discarded
>  
> [~mapohl] could you please confirm this?
>  
> cc: [~yunta]
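
The scenario described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `NoClaimCleanupSketch`, `Checkpoint`, `filesLeftAfterCleanup` and the file names are hypothetical and are not Flink classes or APIs. Plain collections stand in for durable storage, the ZooKeeper-backed checkpoint store and the shared state registry. It also assumes, as in the description, that some new incremental checkpoint still references a file of the initial checkpoint.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the reported scenario. None of these types are Flink classes. */
final class NoClaimCleanupSketch {

    /** Hypothetical checkpoint: an id plus the shared (incremental) files it references. */
    static final class Checkpoint {
        final long id;
        final Set<String> sharedFiles;

        Checkpoint(long id, String... sharedFiles) {
            this.id = id;
            this.sharedFiles = new HashSet<>(Arrays.asList(sharedFiles));
        }
    }

    /** Replays the six steps and returns the files left on storage afterwards. */
    static Set<String> filesLeftAfterCleanup() {
        // Step 1: the job starts from an initial checkpoint whose files
        // "initial/a" and "initial/b" already exist on storage.
        Set<String> storage = new TreeSet<>(Arrays.asList("initial/a", "initial/b"));

        // Step 2: new incremental checkpoints are created. Assumption: they
        // still reference "initial/a". The initial checkpoint is subsumed
        // without discarding its shared state.
        Checkpoint chk2 = new Checkpoint(2, "initial/a", "job/c");
        Checkpoint chk3 = new Checkpoint(3, "initial/a", "job/c", "job/d");
        storage.addAll(Arrays.asList("job/c", "job/d"));

        // Step 3: the job terminates abruptly. The metadata store (ZooKeeper
        // in the report) only knows the new checkpoints.
        List<Checkpoint> metadataStore = Arrays.asList(chk2, chk3);

        // Steps 4-5: the cleaner loads those checkpoints and registers every
        // shared file they reference, regardless of who created it.
        Set<String> registeredSharedState = new HashSet<>();
        for (Checkpoint checkpoint : metadataStore) {
            registeredSharedState.addAll(checkpoint.sharedFiles);
        }

        // Step 6: shutting the store down discards all registered shared state.
        storage.removeAll(registeredSharedState);
        return storage;
    }

    public static void main(String[] args) {
        System.out.println(filesLeftAfterCleanup());
    }
}
```

Running the sketch prints `[initial/b]`: the file `initial/a` is deleted even though it belongs to the initial checkpoint, which the job does not own in NO_CLAIM mode.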



--
This message was sent by Atlassian Jira
(v8.20.1#820001)