Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/01/30 13:46:38 UTC

[GitHub] [zookeeper] maoling opened a new pull request #1079: ZOOKEEPER-3231:Purge task may lost data when the recent snapshots are all invalid

maoling opened a new pull request #1079: ZOOKEEPER-3231:Purge task may lost data when the recent snapshots are all invalid
URL: https://github.com/apache/zookeeper/pull/1079
 
 
   - The purge task uses `FileTxnSnapLog#findNRecentSnapshot`, which can lose data when the 3 most recent snapshots are all invalid (a new valid snapshot has not been generated yet) and, at the same time, the purge task (e.g. `./zkCleanup.sh -n 3`) starts a new round of work that cleans up the preceding snapshots. In that case we lose all the data. This is a low-probability event, but it is reproducible.
   - Overall, this change uses `snaplog.findNValidSnapshots` so that the purge task tries to retain N valid snapshots (rather than just N snapshots), which avoids the risk of data loss.
   - For the unit test, mocking is not a good fit, for the following reasons:
      - We want to test a `dataDir` in which some snapshots are valid and others are not. Simply writing a small amount of data to a snapshot file to make it valid or invalid is more flexible.
      - Mocking would require too many code changes in `PurgeTxnTest.java` (to keep the original UTs passing) and in `FileTxnSnapLog.java` (to add some handles).
   - More details are in [ZOOKEEPER-3231](https://issues.apache.org/jira/browse/ZOOKEEPER-3231).
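
The difference between retaining N recent snapshots and retaining N valid snapshots can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: the class `PurgeSketch`, the `Snap` type, and the methods `retainNRecent`, `retainNValid`, and `countValid` are all hypothetical names, and snapshot validity is modeled as a plain boolean instead of being read from a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PurgeSketch {
    // Hypothetical stand-in for a snapshot file: a zxid plus a validity flag.
    public static final class Snap {
        final long zxid;
        final boolean valid;
        public Snap(long zxid, boolean valid) { this.zxid = zxid; this.valid = valid; }
    }

    // Returns a copy of the input sorted newest first (highest zxid first).
    static List<Snap> newestFirst(List<Snap> snaps) {
        List<Snap> sorted = new ArrayList<>(snaps);
        sorted.sort(Comparator.comparingLong((Snap s) -> s.zxid).reversed());
        return sorted;
    }

    // Retains the n most recent snapshots, whether they are valid or not.
    public static List<Snap> retainNRecent(List<Snap> snaps, int n) {
        List<Snap> sorted = newestFirst(snaps);
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Retains the n most recent valid snapshots, skipping invalid ones.
    public static List<Snap> retainNValid(List<Snap> snaps, int n) {
        List<Snap> kept = new ArrayList<>();
        for (Snap s : newestFirst(snaps)) {
            if (kept.size() == n) break;
            if (s.valid) kept.add(s);
        }
        return kept;
    }

    // Counts how many of the given snapshots are valid.
    public static long countValid(List<Snap> snaps) {
        return snaps.stream().filter(s -> s.valid).count();
    }

    public static void main(String[] args) {
        // Two older valid snapshots followed by three recent invalid ones.
        List<Snap> dir = Arrays.asList(
            new Snap(1, true), new Snap(2, true),
            new Snap(3, false), new Snap(4, false), new Snap(5, false));

        // With n = 3, the "recent" policy keeps only invalid snapshots.
        System.out.println("recent: " + countValid(retainNRecent(dir, 3)) + " valid kept");
        // The "valid" policy keeps every valid snapshot that still exists.
        System.out.println("valid: " + countValid(retainNValid(dir, 3)) + " valid kept");
    }
}
```

Under these assumptions, the first policy retains zero valid snapshots in the scenario described above, while the second retains both remaining valid ones. The sketch only models which snapshots are selected for retention; it does not model how the purge task deletes the other files.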

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services