Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2017/05/04 10:34:04 UTC

[jira] [Commented] (FLINK-6284) Incorrect sorting of completed checkpoints in ZooKeeperCompletedCheckpointStore

    [ https://issues.apache.org/jira/browse/FLINK-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996520#comment-15996520 ] 

Stephan Ewen commented on FLINK-6284:
-------------------------------------

Thanks for opening that issue. That is critical indeed.
Adding [~till.rohrmann] to this conversation.

> Incorrect sorting of completed checkpoints in ZooKeeperCompletedCheckpointStore
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-6284
>                 URL: https://issues.apache.org/jira/browse/FLINK-6284
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Xiaogang Shi
>            Priority: Blocker
>
> Currently, all completed checkpoints are sorted by their paths when they are recovered in {{ZooKeeperCompletedCheckpointStore}}. In cases where the latest checkpoint's id is not the largest in lexical order (e.g., "100" is smaller than "99" in lexical order), Flink will not recover from the latest completed checkpoint.
> The problem can be easily observed by setting the checkpoint ids in {{ZooKeeperCompletedCheckpointStoreITCase#testRecover()}} to 99, 100 and 101.
> To fix the problem, we should explicitly sort the found checkpoints by their checkpoint ids, without using {{ZooKeeperStateHandleStore#getAllSortedByName()}}.
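
The ordering problem described in the issue can be illustrated with a small, self-contained Java sketch. It does not use Flink or ZooKeeper code. The class name, the helper methods, and the path layout ("/99", "/100", "/101") are assumptions made only for illustration; the actual path format used by {{ZooKeeperCompletedCheckpointStore}} may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class CheckpointSortSketch {

    // Hypothetical helper: parses the numeric checkpoint id from a path such as "/100".
    static long checkpointIdOf(String path) {
        return Long.parseLong(path.substring(path.lastIndexOf('/') + 1));
    }

    // Sorts paths as plain strings, i.e. in lexical order (the problematic behavior).
    static List<String> sortByName(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        Collections.sort(sorted);
        return sorted;
    }

    // Sorts paths by their numeric checkpoint id (the behavior the issue asks for).
    static List<String> sortByCheckpointId(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparingLong(CheckpointSortSketch::checkpointIdOf));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/99", "/100", "/101");

        // Lexical order puts "/99" last, so it would be mistaken for the latest checkpoint.
        System.out.println(sortByName(paths));         // [/100, /101, /99]

        // Numeric order puts "/101" last, which is the actual latest checkpoint.
        System.out.println(sortByCheckpointId(paths)); // [/99, /100, /101]
    }
}
```

With lexical sorting, the last element is "/99", so a recovery that picks the last entry would restore checkpoint 99 instead of 101. Sorting by the parsed id avoids this.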



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)