You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/11/05 20:26:00 UTC

[jira] [Commented] (GEODE-8686) Tombstone removal optimization during GII could cause deadlock

    [ https://issues.apache.org/jira/browse/GEODE-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226963#comment-17226963 ] 

ASF GitHub Bot commented on GEODE-8686:
---------------------------------------

DonalEvans opened a new pull request #5707:
URL: https://github.com/apache/geode/pull/5707


   - Do not call AbstractRegionMap.removeTombstone() outside of
   TombstoneService class
   - Add test to confirm that tombstones are correctly scheduled and
   collected with this change
   
   Authored-by: Donal Evans <do...@vmware.com>
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [x] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [x] Is your initial contribution a single, squashed commit?
   
   - [x] Does `gradlew build` run cleanly?
   
   - [x] Have you written or updated unit tests to verify your changes?
   
   - [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to dev@geode.apache.org.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Tombstone removal optimization during GII could cause deadlock
> --------------------------------------------------------------
>
>                 Key: GEODE-8686
>                 URL: https://issues.apache.org/jira/browse/GEODE-8686
>             Project: Geode
>          Issue Type: Improvement
>    Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0
>            Reporter: Donal Evans
>            Assignee: Donal Evans
>            Priority: Major
>
> Similar to the issue described in GEODE-6526, if the condition in the below if statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a call to {{AbstractRegionMap.removeTombstone()}} will be triggered that could lead to deadlock between the calling thread and a Tombstone GC thread calling {{TombstoneService.gcTombstones()}}. 
> {code:java}
> if (owner.getServerProxy() == null && owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), entryVersion.getRegionVersion())) { 
>   // the received tombstone has already been reaped, so don't retain it 
>   if (owner.getIndexManager() != null) { 
>     owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
>   } 
>   removeTombstone(oldRe, entryVersion, false, false); 
>   return false; 
> } else { 
>   owner.scheduleTombstone(oldRe, entryVersion); 
>   lruEntryDestroy(oldRe); 
> }
> {code}
> The proposed change is to remove this if statement and allow the old tombstone to be collected later by calling {{scheduleTombstone()}} in all cases. The call to {{AbstractRegionMap.removeTombstone()}} in {{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to allow immediate removal of tombstones that we know have already been collected on other members, but since the conditions to trigger it are rare (the old entry must be a tombstone, the new entry received during GII must be a tombstone with a newer version, and we must have already collected a tombstone with a newer version than the new entry) and the overhead of scheduling a tombstone to be collected is comparatively low, the performance impact of removing this optimization in favour of simply scheduling the tombstone to be collected in all cases should be insignificant.
> The solution to the deadlock observed in GEODE-6526 was also to remove the call to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be collected later and did not result in any unwanted behaviour, so the proposed fix should be similarly low-impact.
> Also of note is that with this proposed change, there will be no calls to {{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} class, which should ensure that other deadlocks involving this method are not possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)