Posted to issues@geode.apache.org by "Bill Burcham (Jira)" <ji...@apache.org> on 2022/06/17 01:57:00 UTC

[jira] [Created] (GEODE-10391) Region Operation During Primary Change in P2P-only Configuration Results in Spurious Entry{NotFound|Exists}Exception

Bill Burcham created GEODE-10391:
------------------------------------

             Summary: Region Operation During Primary Change in P2P-only Configuration Results in Spurious Entry{NotFound|Exists}Exception
                 Key: GEODE-10391
                 URL: https://issues.apache.org/jira/browse/GEODE-10391
             Project: Geode
          Issue Type: Bug
          Components: regions
    Affects Versions: 1.16.0
            Reporter: Bill Burcham


When a primary moves while a region operation (e.g. create) is in flight, i.e. started but not yet acknowledged, the operation is retried automatically until it either succeeds or fails.

When a member notices another member has crashed, the surviving member requests (from the remaining members) data for which the crashed member had been primary (delta-GII/sync). This sync is necessary to regain consistency in case the (retrying) requester fails before it can re-issue the request to the new primary.

In GEODE-5055 we learned that we needed to delay that sync request long enough for the new primary to be chosen and for the original requester to make a new request against the new primary. If we didn't delay the sync, the primary could end up with the entry in the new state (as if the operation had completed) but without the corresponding event tracker data needed to conflate the retried event.
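
To make the ordering problem concrete, here is a minimal, self-contained sketch of the race. It is a toy model, not Geode code: the class RetrySketch, the Primary type, the EntryExists exception and the string event IDs are all hypothetical stand-ins, and the "event tracker" is reduced to a set of already-applied event IDs. It only illustrates why a sync that lands before the retry can turn a legitimate retry into a spurious failure.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the race; none of these names are Geode APIs. */
public class RetrySketch {

  /** Hypothetical stand-in for an "entry exists" failure. */
  static class EntryExists extends RuntimeException {
    EntryExists(String key) {
      super("entry already exists: " + key);
    }
  }

  /** Hypothetical new primary holding entries plus event tracker data. */
  static class Primary {
    final Map<String, String> entries = new HashMap<>();
    // Stand-in for event tracker data: IDs of events already applied.
    final Set<String> seenEventIds = new HashSet<>();

    /** Applies a create tagged with the requester's event ID. */
    void create(String eventId, String key, String value) {
      if (seenEventIds.contains(eventId)) {
        return; // known event: the retry is conflated, nothing to do
      }
      if (entries.containsKey(key)) {
        throw new EntryExists(key); // looks like a genuine conflict
      }
      entries.put(key, value);
      seenEventIds.add(eventId);
    }

    /** Models a sync that delivers entry state but no event tracker data. */
    void syncEntryOnly(String key, String value) {
      entries.put(key, value);
    }
  }

  /** Returns true if the requester's retried create fails spuriously. */
  static boolean retryFailsSpuriously(boolean syncBeforeRetry) {
    Primary newPrimary = new Primary();
    if (syncBeforeRetry) {
      // The entry arrives in its new state, without a record of event "e1".
      newPrimary.syncEntryOnly("k", "v");
    }
    try {
      newPrimary.create("e1", "k", "v"); // the original requester's retry
      return false;
    } catch (EntryExists e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("sync before retry -> spurious failure: "
        + retryFailsSpuriously(true));
    System.out.println("retry before sync -> spurious failure: "
        + retryFailsSpuriously(false));
  }
}
```

In this model the retry succeeds only when it reaches the new primary before the sync does, which is the ordering the delay is meant to favor.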

The fix for GEODE-5055 introduced a delay, but only for configurations where clients were present. In configurations with only peers (P2P-only) there is no delay. This ticket pertains to the P2P-only case.
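
The gap described above can be summarized as a small decision function. This is only a sketch of the behavior as stated in this ticket, not the actual Geode implementation: the class SyncDelaySketch, the method name, the clientsPresent flag and the 10000 ms value are all hypothetical and chosen for illustration.

```java
/** Sketch of the delay policy as described in this ticket; not Geode code. */
public class SyncDelaySketch {

  /**
   * Delay applied before requesting the sync, as the ticket describes the
   * GEODE-5055 behavior: a delay when clients are present, none otherwise.
   */
  static long syncDelayMillis(boolean clientsPresent, long configuredDelayMillis) {
    return clientsPresent ? configuredDelayMillis : 0L;
  }

  public static void main(String[] args) {
    long configured = 10_000L; // hypothetical value, for illustration only
    System.out.println("clients present: " + syncDelayMillis(true, configured) + " ms");
    System.out.println("P2P-only:        " + syncDelayMillis(false, configured) + " ms");
  }
}
```

With no delay in the P2P-only branch, the sync can reach the new primary before the retried operation does.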

--
This message was sent by Atlassian Jira
(v8.20.7#820007)