You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/10/08 03:17:46 UTC

[GitHub] [zookeeper] hanm opened a new pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

hanm opened a new pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492


   WIP / Don't merge please. This branch currently is not buildable. Submit for LinkedIn folks / get early community feedback.
   
   Implement in-process backup and restore for ZooKeeper. Support point in time backup and recovery. The idea is simple: each ZK server optionally can enable a backup process where it periodically sending the transaction log and snapshot required for recovery to a storage provider. Currently HDFS storage provider is implemented. This can also be used to support multiple version of ZooKeeper data tree for forensic / debug purposes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] hanm commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
hanm commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-705969415


   Convert this PR to draft status as it's not ready to merge (thanks @ctubbsii for the pointer).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] ctubbsii commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-705359566


   @hanm You can make a PR a "Draft" PR, if it's not ready to merge. See https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/about-pull-requests#draft-pull-requests


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-729188503


   @hanm 
   Thanks for the follow-up! This is helpful. Just so I don't leave you in the dark, I am actively reviewing this. There are a few things that I think would also be beneficial (both to the functionality and to make it more compatible to be integrated into open source):
   
   1. Generifying components such as the storage provider implementation in QuorumPeerMain.java and BackupConfig. I do see that an effort was made throughout the code, but perhaps they don't make it completely pluggable. We are considering adding support for NFS-based storage as well. 
   2. Adding support for interface that will allow timestamp-Zxid translation. Ideally, we want to have some variant of "point-in-time recovery" with some degree of automation. I think it will jibe well with the automatic backup feature. Will follow up on this in the near future.
   3. (Minor) Hoping to offer some alternatives to the components proprietary to Twitter.
   
   Hope to continue the discussion here or elsewhere (whichever medium is preferred for ZK development - should we move this discussion to ZooKeeper Atlassian wiki/JIRA?). I'm new to the community but eager to contribute. Keep me posted.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] hanm commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
hanm commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-705969415


   Convert this PR to draft status as it's not ready to merge (thanks @ctubbsii for the pointer).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-730194950


   @hanm thanks for the reply. I'll keep the discussion on this PR then. I realized from the email that the JIRA tickets were getting updated automatically as well.
   
   > Adding support for interface that will allow timestamp-Zxid translation
   I have some ideas around this - we have been discussing this on our end, and hopefully this is also something that we could contribute since it seems like we are going to need it in addition to the bare-bone backup/recovery feature.
   
   > yes the twitter specific internal stuff will be removed - which is required for this PR to be considered mergeable.
   We're working on generic data models/classes for metrics. We'll post a PR once they become available - hoping to contribute where we can to facilitate the process.
   
   > which specific "comment block" you were referring to? do you mind elaborate the use case of "leader-election protocol"?
   I'm referring to the idea of achieving high-availability and redundancy among the nodes that have backup enabled. I was initially referring to the JavaDoc (that mentions ZAB protocol) in `BackupStatus.java`, but I took another look and I may have interpreted it out of context - so please disregard the reference to the comment block.
   
   We want to achieve a setup where we have multiple nodes that have backup enabled (as observers), but only one of them actually runs the BackupManager with some sort of coordination among those nodes with backup enabled. This is to ensure that backup continues on in the case of the backup node failure.
   
   It might not be so trivial and might require more thought - could use ZK to coordinate but would that create a circular dependency, for example.
   
   Also, I was wondering if some parts of the change were left out of this draft PR intentionally? For example, `MetricsReceiver`, and changes to existing classes like `Util.getZxidRangeFromName()` and `FileTxnSnapLog.findValidSnapshots(long start, long end)` do not appear in this PR. I'm giving a try at filling these in but was wondering if there was a specific reason they were left out.
   
   Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots from the past iteration? And the returned list is sorted in descending order (most recent to oldest)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin up the `SnapBackup` process and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] hanm commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
hanm commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-730136643


   @narendly thank you for feedback. 
   
   >> Generifying components such as the storage provider implementation
   
   yes a more generalized storage provider interface is of my interest as well. something like org.apache.hadoop.fs.FileSystem would probably fit the needs here and then we can have various implementations including backup storage on s3, azure, and google cloud and so on which will be interesting.
   
   >> Adding support for interface that will allow timestamp-Zxid translation
   
   Interesting, i'll think about this.
   
   >> Hoping to offer some alternatives to the components proprietary to Twitter
   
   yes the twitter specific internal stuff will be removed - which is required for this PR to be considered mergeable.
   
   >> should we move this discussion to ZooKeeper Atlassian wiki/JIRA
   
   It's find to discuss here - the comments will be bridged automatically to Apache JIRA tickets (in work log tab IIRC).
   
   >> add some sort of leader-election protocol for redundancy among the nodes whose backup.enabled is set to true. I know there's a comment block in this change 
   
   which specific "comment block" you were referring to? do you mind elaborate the use case of "leader-election protocol"? 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-730194950


   @hanm thanks for the reply. I'll keep the discussion on this PR then. I realized from the email that the JIRA tickets were getting updated automatically as well.
   
   > Adding support for interface that will allow timestamp-Zxid translation
   
   I have some ideas around this - we have been discussing this on our end, and hopefully this is also something that we could contribute since it seems like we are going to need it in addition to the bare-bone backup/recovery feature.
   
   > yes the twitter specific internal stuff will be removed - which is required for this PR to be considered mergeable.
   
   We're working on generic data models/classes for metrics. We'll post a PR once they become available - hoping to contribute where we can to facilitate the process.
   
   > which specific "comment block" you were referring to? do you mind elaborate the use case of "leader-election protocol"?
   
   I'm referring to the idea of achieving high-availability and redundancy among the nodes that have backup enabled. I was initially referring to the JavaDoc (that mentions ZAB protocol) in `BackupStatus.java`, but I took another look and I may have interpreted it out of context - so please disregard the reference to the comment block.
   
   We want to achieve a setup where we have multiple nodes that have backup enabled (as observers), but only one of them actually runs the BackupManager with some sort of coordination among those nodes with backup enabled. This is to ensure that backup continues on in the case of the backup node failure.
   
   It might not be so trivial and might require more thought - could use ZK to coordinate but would that create a circular dependency, for example.
   
   Also, I was wondering if some parts of the change were left out of this draft PR intentionally? For example, `MetricsReceiver`, and changes to existing classes like `Util.getZxidRangeFromName()` and `FileTxnSnapLog.findValidSnapshots(long start, long end)` do not appear in this PR. I'm giving a try at filling these in but was wondering if there was a specific reason they were left out.
   
   Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots from the past iteration? And the returned list is sorted in descending order (most recent to oldest)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin up the `SnapBackup` process and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing both `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots from the past iteration? And the returned list is sorted in descending order (most recent to oldest)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin up the `SnapBackup` process and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing both `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   The approach that we (LinkedIn) want to take here is a little bit different - we actually want all the snapshots, not just the latest snapshot, so I'm trying to confirm with you in order to minimize the difference between the implementation you provided where we can but still achieve the outcome we want. Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] hanm edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
hanm edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-730136643


   @narendly thank you for feedback. 
   
   >> Generifying components such as the storage provider implementation
   
   yes a more generalized storage provider interface is of my interest as well. something like org.apache.hadoop.fs.FileSystem would probably fit the needs here and then we can have various implementations including backup storage on s3, azure, and google cloud and so on which will be interesting.
   
   >> Adding support for interface that will allow timestamp-Zxid translation
   
   Interesting, i'll think about this.
   
   >> Hoping to offer some alternatives to the components proprietary to Twitter
   
   yes the twitter specific internal stuff will be removed - which is required for this PR to be considered mergeable.
   
   >> should we move this discussion to ZooKeeper Atlassian wiki/JIRA
   
   It's fine to discuss here - the comments will be bridged automatically to Apache JIRA tickets (in work log tab IIRC).
   
   >> add some sort of leader-election protocol for redundancy among the nodes whose backup.enabled is set to true. I know there's a comment block in this change 
   
   which specific "comment block" you were referring to? do you mind elaborate the use case of "leader-election protocol"? 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] ctubbsii commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-705359566


   @hanm You can make a PR a "Draft" PR, if it's not ready to merge. See https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/about-pull-requests#draft-pull-requests


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots than what was backed up from the last iteration? And the returned list is sorted in descending order (most recent to oldest - which is why we sort it again)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin up the `SnapBackup` process and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing both `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   The approach that we (LinkedIn) want to take here is a little bit different - we actually want all the snapshots, not just the latest snapshot, so I'm trying to confirm with you in order to minimize the difference between the implementation you provided where we can but still achieve the outcome we want. Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] hanm commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
hanm commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-730867303


   >> This is to ensure that backup continues on in the case of the backup node failure.
   
   Currently we just rely on monitoring + alerts and let on call to handle the backup node failure. Not fully automatic but we are ok with the trade off because the current backup is not the source of truth for our use cases. The current backup implementation also can't guarantee consistency with quorum without making it part of quorum operation on the critical transaction path and we primarily rely on the backup as a form of multi-version data tree on top of which we can do interesting analysis (since ZK itself does not support multiple version of data). So we are ok with a much simpler system that occasionally fails that requires human intervention (rarely, though). Making backup automatic is an interesting use case though, we can discuss more moving forward.
   
   >>  I was wondering if some parts of the change were left out of this draft PR intentionally
   
   MetricsReceiver is twitter specific code (it's from Finagle, twitter's RPC library).
   Util.getZxidRangeFromName relies on 3rd party libraries we intend to remove, so was left out intentionally.
   Changes to FileTxnSnapLog was also left out intentionally as most of the changes unfortunately were intrusive and not upstream compatible.
   
   The plan is to rework on removing these in house specific dependencies and incompatible changes so this PR can build with master cleanly, at which point we can have a baseline to evolve the build more features on top of the basic backup sub system. I'll try to give an update to this PR in next few weeks (can't promise a hard date, have other priorities at the moment).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly edited a comment on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly edited a comment on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots than what was backed up from the last iteration? And the returned list is sorted in descending order (most recent to oldest)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin up the `SnapBackup` process and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing both `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   The approach that we (LinkedIn) want to take here is a little bit different - we actually want all the snapshots, not just the latest snapshot, so I'm trying to confirm with you in order to minimize the difference between the implementation you provided where we can but still achieve the outcome we want. Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-729841088


   One thing I forgot to mention is to add some sort of leader-election protocol for redundancy among the nodes whose `backup.enabled` is set to `true`. I know there's a comment block in this change and would be willing to explore that further.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] narendly commented on pull request #1492: ZOOKEEPER-3419: Backup and recovery support.

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1492:
URL: https://github.com/apache/zookeeper/pull/1492#issuecomment-745057466


   Thanks @hanm .
   
   Also a question here if you could clarify - some parts of the code are missing so I'm just making educated guesses.
   
   For `SnapBackup`, is my understanding correct that here this always stores the latest snapshot only (per iteration), and not all of the snapshots that might exist between iterations?
   
   Here are more detailed questions just to confirm:
   1. `List<File> candidateSnapshots = snapLog.findValidSnapshots(0, backedupSnapZxid);`
   -> What does this exactly do? If I'm guessing right, it finds all valid snapshots (say the snapshot still has the naming of only one Zxid - e.g.) snapshot.XXXX) _excluding_ the ones that fall in the interval [0, backedupSnapZxid]? so that the `candidateSnapshots` will only contain the newer snapshots from the past iteration? And the returned list is sorted in descending order (most recent to oldest)?
   2. In the very first iteration where `backedupSnapZxid == BackupUtil.INVALID_SNAP_ZXID` evaluates to `true`, and in the case that we find more `candidateSnapshots` than one, `candidateSnapshots.size() > 1`, is there a reason we end up storing two BackupFiles (the first and the last), instead of just the last one?
   For example, suppose you spin this thing up and that we have `snapshot.100, snapshot.110, snapshot.120, ..., snapshot.200` and that `lastLoggedZxid` is 205. Then we'll end up storing `snapshot.100-109` and `snapshot.200-205`. I might be mistaken here, but it seems a little inconsistent that we store 2 here instead of just going for the most recent snapshot.
   
   Let me know if I'm mistaken or if you could help me get this cleared up?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org