Posted to dev@flink.apache.org by "Lihe Ma (Jira)" <ji...@apache.org> on 2022/08/06 03:10:00 UTC

[jira] [Created] (FLINK-28843) Failed to restore from changelog checkpoint in claim mode

Lihe Ma created FLINK-28843:
-------------------------------

             Summary: Failed to restore from changelog checkpoint in claim mode
                 Key: FLINK-28843
                 URL: https://issues.apache.org/jira/browse/FLINK-28843
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends
    Affects Versions: 1.15.1, 1.15.0
            Reporter: Lihe Ma


# When native checkpoints and incremental checkpointing are enabled in the RocksDB state backend, state data larger than state.storage.fs.memory-threshold is stored in a data file (FileStateHandle, RelativeFileStateHandle, etc.), for example base-path1/chk-1/file1, rather than inlined in the checkpoint metadata as a ByteStreamStateHandle.
 # Restore the job from base-path1/chk-1 in claim mode using the changelog state backend, with the checkpoint path set to base-path2. The new checkpoint is saved in base-path2/chk-2, but it still needs the previous checkpoint file (base-path1/chk-1/file1).
 # Restore the job from base-path2/chk-2 with the changelog state backend. Flink tries to read base-path2/chk-2/file1 rather than the actual file location base-path1/chk-1/file1, which leads to a FileNotFoundException, and the job fails.
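
The path mix-up in the last step can be illustrated with a small, self-contained sketch. It assumes the file is referenced by a name relative to its checkpoint directory, so the reader resolves it against whichever checkpoint directory is being restored. This is an illustration of the symptom only, not Flink's actual code: the class RelativeHandleSketch and the method resolve are hypothetical, and only the paths come from the report.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativeHandleSketch {

    // Hypothetical stand-in for a relative file handle: only the file name is
    // stored, and the full path is rebuilt from the checkpoint directory that
    // is being read at restore time.
    static Path resolve(Path checkpointDir, String relativeName) {
        return checkpointDir.resolve(relativeName);
    }

    public static void main(String[] args) {
        Path writtenIn = Paths.get("base-path1", "chk-1");    // where file1 was written
        Path restoredFrom = Paths.get("base-path2", "chk-2"); // checkpoint being restored

        // Resolving against the original checkpoint gives the real location.
        System.out.println("actual:   " + resolve(writtenIn, "file1"));

        // Resolving against the new checkpoint gives a path where file1 was
        // never written, which matches the FileNotFoundException in the report.
        System.out.println("looked up: " + resolve(restoredFrom, "file1"));
    }
}
```

Under this assumption, a file carried over from a claimed checkpoint would have to be referenced by its absolute location (or copied into the new checkpoint directory) for the second restore to find it.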

 
How to reproduce?
 # Set state.storage.fs.memory-threshold to a small value, like '20b'.
 # Run {{org.apache.flink.test.checkpointing.ChangelogPeriodicMaterializationSwitchStateBackendITCase#testSwitchFromDisablingToEnablingInClaimMode}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)