Posted to dev@zookeeper.apache.org by "maoling (JIRA)" <ji...@apache.org> on 2019/01/08 12:33:00 UTC

[jira] [Issue Comment Deleted] (ZOOKEEPER-3231) Purge task may lost data when we have many invalid snapshots.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

maoling updated ZOOKEEPER-3231:
-------------------------------
    Comment: was deleted

(was: [~jiangjiafu]

Good issue, yep!
The data-loss situation can only happen when all of the retained snapshots are invalid (very unlucky, and of low probability) and, at that time, the zk server took any new snapshots.
The specific source code for the *restore* can be found in:
{code:java}
FileTxnSnapLog#restore--->snapLog.deserialize{code})

>  Purge task may lost data when we have many invalid snapshots.
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3231
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3231
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.4, 3.4.13
>            Reporter: Jiafu Jiang
>            Priority: Major
>
> I read the ZooKeeper source code and found that the purge task uses FileTxnSnapLog#findNRecentSnapshots to find snapshots, but that method does not check whether the snapshots are valid.
> Consider a worst case: a ZooKeeper server may have many invalid snapshots, and when a purge task begins, it will use the zxid in the last snapshot's name to purge old snapshots and transaction logs, so we may lose data.
> I think we should use FileSnap#findNValidSnapshots(int) instead of FileSnap#findNRecentSnapshots in FileTxnSnapLog#findNRecentSnapshots, but I am not sure.
>  
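The scenario in the issue description above can be illustrated with a small, self-contained sketch. It is not ZooKeeper's code: the class PurgeSketch, the type Snap, and the methods newest and purgeBelow are hypothetical stand-ins, and a boolean flag replaces whatever validity check the real server performs. It only shows how the purge threshold moves when the N retained snapshots are picked by zxid alone versus picked among valid snapshots only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PurgeSketch {
    // Hypothetical stand-in for a snapshot file: the zxid taken from its
    // file name plus a flag saying whether its contents are readable.
    static final class Snap {
        final long zxid;
        final boolean valid;
        Snap(long zxid, boolean valid) { this.zxid = zxid; this.valid = valid; }
    }

    // Pick up to n snapshots, newest zxid first.
    // With validOnly == false this mimics selecting by file name alone;
    // with validOnly == true invalid snapshots are skipped.
    static List<Snap> newest(List<Snap> all, int n, boolean validOnly) {
        List<Snap> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingLong((Snap s) -> s.zxid).reversed());
        List<Snap> picked = new ArrayList<>();
        for (Snap s : sorted) {
            if (picked.size() == n) break;
            if (!validOnly || s.valid) picked.add(s);
        }
        return picked;
    }

    // Assumed purge rule for this sketch: anything with a zxid below the
    // oldest retained snapshot may be deleted. Requires a non-empty list.
    static long purgeBelow(List<Snap> retained) {
        return retained.get(retained.size() - 1).zxid;
    }

    public static void main(String[] args) {
        // Two old valid snapshots followed by three newer invalid ones.
        List<Snap> snaps = Arrays.asList(
            new Snap(100, true), new Snap(200, true),
            new Snap(300, false), new Snap(400, false), new Snap(500, false));

        // Selecting by name only: threshold 300, so both valid snapshots go.
        System.out.println(purgeBelow(newest(snaps, 3, false))); // prints 300
        // Selecting valid snapshots only: threshold 100, nothing valid is lost.
        System.out.println(purgeBelow(newest(snaps, 3, true)));  // prints 100
    }
}
```

With a retain count of 3, the unchecked selection keeps only the three invalid snapshots and would allow everything below zxid 300 to be removed, including the only two valid snapshots. The checked selection keeps the valid ones, which is the behavior the reporter suggests.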



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)