Posted to issues@geode.apache.org by "Anilkumar Gingade (Jira)" <ji...@apache.org> on 2020/08/10 18:36:00 UTC

[jira] [Updated] (GEODE-2682) Compaction with async disk writes may resurrect removed entries

     [ https://issues.apache.org/jira/browse/GEODE-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anilkumar Gingade updated GEODE-2682:
-------------------------------------
    Labels: GeodeOperationAPI storage_2  (was: storage_2)

> Compaction with async disk writes may resurrect removed entries
> ---------------------------------------------------------------
>
>                 Key: GEODE-2682
>                 URL: https://issues.apache.org/jira/browse/GEODE-2682
>             Project: Geode
>          Issue Type: Bug
>          Components: persistence
>    Affects Versions: 1.0.0-incubating, 1.1.0
>            Reporter: Jason Huynh
>            Priority: Major
>              Labels: GeodeOperationAPI, storage_2
>         Attachments: cache.xml, oplogs.tar.gz
>
>
> This can occur for persistent async event queues and for regions when concurrency checks are disabled.
> Current behavior:
> 1.) When rolling a crf we create a krf that is based on the current “live” region.
> 2.) If removes are being done at the same time, the krf will reflect the current state, so the removed keys are not part of the krf file.
> 3.) Due to the async disk write, the drf has yet to be updated.
> 4.) If the cluster gets shut down before the drf is written to, this can lead to the following scenarios:
>  * (No issue) the user recovers with the existing krf/drf/crf files.  This works just fine as the krf has reflected the change
>  * (Problem!) If the user compacts and then recovers, the removed entries are now resurrected and appear in the region due to the way compaction operates.  It ignores the krf and works only on the existing crf/drf files.  Because the drf does not reflect the removes, the entries are rolled forward from the crf.
> Attached is a set of oplogs for a single node prior to compaction and a cache.xml (need to fill in the correct location for the diskstore directory).
> Recovering from this set of oplogs recovers 0 entries for the async event queues.
> If you run offline compaction on the oplogs and recover, there are now entries in the async event queues.
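
The sequence described above can be sketched with a toy in-memory model. This is only an illustration of the reported behavior, not Geode's actual persistence code: the class ToyDiskStore, its fields, and its methods are all hypothetical stand-ins for the crf, drf, and krf files and for the async write buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the scenario in this issue; not Geode's real code or file formats.
class ToyDiskStore {
    final Map<String, String> liveRegion = new LinkedHashMap<>();
    final Map<String, String> crf = new LinkedHashMap<>();   // create/update records on disk
    final Set<String> drf = new LinkedHashSet<>();           // delete records already on disk
    final Set<String> krf = new LinkedHashSet<>();           // key snapshot taken when the crf rolls
    final Deque<String> pendingDeletes = new ArrayDeque<>(); // async deletes not yet written to the drf

    void put(String key, String value) {
        liveRegion.put(key, value);
        crf.put(key, value);
    }

    // The remove takes effect in memory at once; the drf write is only queued.
    void remove(String key) {
        liveRegion.remove(key);
        pendingDeletes.add(key);
    }

    // Rolling the crf snapshots the keys of the current live region into the krf.
    void rollCrf() {
        krf.clear();
        krf.addAll(liveRegion.keySet());
    }

    // Recovery with the existing files: the krf already reflects the removes.
    Set<String> recoverWithKrf() {
        return new LinkedHashSet<>(krf);
    }

    // Recovery after compaction: the krf is ignored, so only crf minus drf is used.
    Set<String> recoverAfterCompaction() {
        Set<String> keys = new LinkedHashSet<>(crf.keySet());
        keys.removeAll(drf);
        return keys;
    }

    public static void main(String[] args) {
        ToyDiskStore store = new ToyDiskStore();
        store.put("k1", "v1");
        store.put("k2", "v2");
        store.remove("k1");
        store.remove("k2");
        store.rollCrf();
        // Shutdown happens here: pendingDeletes is never flushed to the drf.
        System.out.println("recover with krf:         " + store.recoverWithKrf());         // []
        System.out.println("recover after compaction: " + store.recoverAfterCompaction()); // [k1, k2]
    }
}
```

In this model the two removed keys come back only on the compaction path, because the queued deletes never reached the drf and the krf is not consulted.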



--
This message was sent by Atlassian Jira
(v8.3.4#803005)