Posted to issues@flink.apache.org by "Roman Khachatryan (Jira)" <ji...@apache.org> on 2022/04/11 22:35:00 UTC
[jira] [Updated] (FLINK-27132) CheckpointResourcesCleanupRunner might discard shared state of the initial checkpoint
[ https://issues.apache.org/jira/browse/FLINK-27132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Khachatryan updated FLINK-27132:
--------------------------------------
Description:
When considering the following case:
# A job starts from a checkpoint in NO_CLAIM mode, with incremental checkpoints enabled
# It produces some new checkpoints and subsumes the original one (not discarding shared state - before FLINK-24611 or after FLINK-26985)
# Job terminates abruptly
# The cleaner is started for that job
# ZK doesn't have the initial checkpoint, so the store will load only the new checkpoints (created in 2). Shared state is registered
# The store is shut down - discarding all the checkpoints and also any shared state
In -6-5, if some new checkpoint still uses shared state of the initial checkpoint, that state will also be discarded
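The sequence above can be sketched with a toy model. This is not Flink code: `Checkpoint`, `ToySharedStateRegistry`, `simulate_cleanup` and the file names are hypothetical, and the sketch assumes that a new incremental checkpoint re-uses one shared file of the initial checkpoint. It only illustrates how a registry that sees just the new checkpoints could end up deleting a file that the initial checkpoint still needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    checkpoint_id: int
    shared_files: frozenset  # names of the shared state files this checkpoint references


class ToySharedStateRegistry:
    """Hypothetical stand-in for a shared state registry (not Flink's implementation)."""

    def __init__(self):
        self.registered = set()

    def register_all(self, checkpoints):
        # Every shared file of every loaded checkpoint is treated as owned by the job.
        for checkpoint in checkpoints:
            self.registered |= checkpoint.shared_files

    def discard_all(self, storage):
        # On shutdown, delete every registered file that exists in storage.
        deleted = self.registered & storage
        storage.difference_update(deleted)
        return deleted


def simulate_cleanup():
    # Step 1: the job starts from an initial checkpoint that it does not own (NO_CLAIM).
    initial = Checkpoint(1, frozenset({"shared/a", "shared/b"}))
    storage = set(initial.shared_files)

    # Step 2: new incremental checkpoints; assumed here to re-use "shared/a".
    chk2 = Checkpoint(2, frozenset({"shared/a", "shared/c"}))
    chk3 = Checkpoint(3, frozenset({"shared/a", "shared/c", "shared/d"}))
    storage |= chk2.shared_files | chk3.shared_files

    # Steps 3-5: after the abrupt termination, the cleaner loads only the new checkpoints.
    loaded_by_cleaner = [chk2, chk3]
    registry = ToySharedStateRegistry()
    registry.register_all(loaded_by_cleaner)

    # Step 6: the store is shut down and discards all registered shared state.
    deleted = registry.discard_all(storage)

    # Files of the initial checkpoint that no longer exist afterwards.
    lost_from_initial = initial.shared_files - storage
    return deleted, lost_from_initial
```

In this model the registry has no notion of which files belong to the initial checkpoint, so `shared/a` is deleted along with the job's own files while `shared/b` survives only because no new checkpoint referenced it.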
[~mapohl] could you please confirm this?
cc: [~yunta]
was:
When considering the following case:
# A job starts from a checkpoint in NO_CLAIM mode, with incremental checkpoints enabled
# It produces some new checkpoints and subsumes the original one (not discarding shared state - before FLINK-24611 or after FLINK-26985)
# Job terminates abruptly
# The cleaner is started for that job
# ZK doesn't have the initial checkpoint, so the store will load only the new checkpoints (created in 2). Shared state is registered
# The store is shut down - discarding all the checkpoints and also any shared state
In 5, if some checkpoint uses the initial state, it will also be discarded
[~mapohl] could you please confirm this?
cc: [~yunta]
> CheckpointResourcesCleanupRunner might discard shared state of the initial checkpoint
> -------------------------------------------------------------------------------------
>
> Key: FLINK-27132
> URL: https://issues.apache.org/jira/browse/FLINK-27132
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.15.0, 1.16.0
> Reporter: Roman Khachatryan
> Priority: Major