Posted to dev@geode.apache.org by "Dan Smith (JIRA)" <ji...@apache.org> on 2017/03/14 00:24:41 UTC

[jira] [Created] (GEODE-2654) Backups can capture different members from different points in time

Dan Smith created GEODE-2654:
--------------------------------

             Summary: Backups can capture different members from different points in time
                 Key: GEODE-2654
                 URL: https://issues.apache.org/jira/browse/GEODE-2654
             Project: Geode
          Issue Type: Bug
          Components: persistence
            Reporter: Dan Smith


Geode backups should behave the same as recovering from disk after killing all of the members.

Unfortunately, backups can instead capture data on different members at different points in time, resulting in application-level inconsistency. Here's an example of what goes wrong:

# Do a put in region A
# Do a put in region B
# Backup the system
# Recover from the backup
# You may see the put to region B but not the put to region A, even if the data is colocated (see the sketch below).
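For illustration, here is a minimal sketch of that sequence, assuming two persistent, colocated regions named regionA and regionB already defined in cache.xml (the region names and backup directory are made up), with the backup in step 3 taken from gfsh:

{code:java}
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;

public class BackupInconsistencyRepro {
  public static void main(String[] args) {
    // Assumes regionA and regionB are persistent regions defined in cache.xml.
    Cache cache = new CacheFactory().create();
    Region<String, String> regionA = cache.getRegion("regionA");
    Region<String, String> regionB = cache.getRegion("regionB");

    regionA.put("key", "valueA"); // step 1
    regionB.put("key", "valueB"); // step 2

    // Step 3: take an online backup, e.g. from gfsh:
    //   backup disk-store --dir=/backups/example
    // Steps 4-5: restore that backup and restart. Because each member copies
    // its own disk store files at its own time, the restored system can show
    // the put to regionB but not the put to regionA.
  }
}
{code}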

We ran into this with Lucene indexes - see GEODE-2643. We've worked around GEODE-2643 by putting all of the data into the same region, but we're worried that we still have a problem with the async event queue. With an async event listener that writes to another Geode region, it's possible to recover different points in time for the async event queue and the region, resulting in missed events.
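For illustration, a minimal sketch of such a listener, assuming the target is another persistent Geode region (the class name and region are hypothetical). Once a batch is acknowledged it is removed from the queue, so a backup that captures the queue's disk store after the removal but the target region's disk store from before the corresponding puts will drop those events on recovery:

{code:java}
import java.util.List;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.asyncqueue.AsyncEvent;
import org.apache.geode.cache.asyncqueue.AsyncEventListener;

public class CopyToRegionListener implements AsyncEventListener {
  private final Region<Object, Object> target;

  public CopyToRegionListener(Region<Object, Object> target) {
    this.target = target;
  }

  @Override
  public boolean processEvents(List<AsyncEvent> events) {
    for (AsyncEvent event : events) {
      // Write each queued event into the second region.
      target.put(event.getKey(), event.getDeserializedValue());
    }
    // Returning true acknowledges the batch, removing it from the queue.
    return true;
  }

  @Override
  public void close() {
    // Nothing to clean up in this sketch.
  }
}
{code}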

The issue is that there is no locking or other mechanism to prevent different members from backing up their data at different points in time. Colocating data does not avoid this problem, because when we recover from disk we may recover region A's bucket from one member and region B's bucket from another member.

The backup operation does have a mechanism for making sure that it gets a point-in-time snapshot of *metadata*. It sends a PrepareBackupRequest to all members, which causes them to lock their init file. Then it sends a FinishBackupRequest, which tells all members to back up their data and release the lock. This ensures that a backup doesn't completely miss a bucket or get corrupt metadata about which members host a bucket. See the comments in DiskStoreImpl.lockStoreBeforeBackup.
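For reference, a conceptual sketch of that two-phase flow as seen from the coordinator, using hypothetical helper types (Member, prepareBackup, finishBackup) rather than the real PrepareBackupRequest/FinishBackupRequest messages:

{code:java}
import java.util.Set;

public class BackupCoordinatorSketch {
  // Hypothetical stand-in for a remote member; the real protocol uses
  // PrepareBackupRequest and FinishBackupRequest messages.
  interface Member {
    void prepareBackup(); // member locks its init file (lockStoreBeforeBackup)
    void finishBackup();  // member copies its files, then releases the lock
  }

  public void backup(Set<Member> members) {
    // Phase 1: every member locks metadata before any member starts copying...
    for (Member m : members) {
      m.prepareBackup();
    }
    // Phase 2: ...so no member's copy misses a bucket or sees inconsistent metadata.
    for (Member m : members) {
      m.finishBackup();
    }
  }
}
{code}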

We should extend this Prepare/Finish mechanism to make sure we get a point-in-time snapshot of region data as well. One way to do this would be to acquire a lock on the *oplog* in lockStoreBeforeBackup to prevent writes, and hold it until releaseBackupLock is called.
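A minimal sketch of that idea, assuming a read-write lock guarding the oplog write path (the class and method bodies are illustrative, not the actual DiskStoreImpl code):

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class OplogBackupLockSketch {
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

  // Every oplog write acquires the read lock, so normal writes run concurrently.
  public void beforeOplogWrite() {
    rwLock.readLock().lock();
  }

  public void afterOplogWrite() {
    rwLock.readLock().unlock();
  }

  // Called during PrepareBackupRequest handling (lockStoreBeforeBackup):
  // waits for in-flight writes to finish, then blocks new ones.
  public void lockStoreBeforeBackup() {
    rwLock.writeLock().lock();
  }

  // Called once FinishBackupRequest has copied the data (releaseBackupLock).
  public void releaseBackupLock() {
    rwLock.writeLock().unlock();
  }
}
{code}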



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)