You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/09/20 17:05:00 UTC

[jira] [Commented] (GEODE-10395) TXLockIdImpl memory leak after CommitConflictException from another node

    [ https://issues.apache.org/jira/browse/GEODE-10395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607318#comment-17607318 ] 

ASF subversion and git services commented on GEODE-10395:
---------------------------------------------------------

Commit 4cb75ae4848250606db2f4b14300601755586192 in geode's branch refs/heads/develop from Mario Kevo
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=4cb75ae484 ]

GEODE-10395 remove locks from List if dlock.acquireTryLocks return false (#7846)



> TXLockIdImpl memory leak after CommitConflictException from another node
> ------------------------------------------------------------------------
>
>                 Key: GEODE-10395
>                 URL: https://issues.apache.org/jira/browse/GEODE-10395
>             Project: Geode
>          Issue Type: Bug
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Eugene Nedzvetsky
>            Assignee: Mario Kevo
>            Priority: Major
>              Labels: pull-request-available
>
> org.apache.geode.internal.cache.locks.TXLockServiceImpl#txLock:120 adds TXLockIdImpl  objects to TXLockServiceImpl#txLockIdList. 
> {code:java}
> synchronized (txLockIdList) {
>         txLockId = new TXLockIdImpl(dlock.getDistributionManager().getId());
>         txLockIdList.add(txLockId);
>       }
> {code}
> These objects will not be removed from this list if dlock.acquireTryLocks returned false.
> {code:java}
>       gotLocks = dlock.acquireTryLocks(batch, TIMEOUT_MILLIS, LEASE_MILLIS, keyIfFail);
>       if (gotLocks) { // ...otherwise race can occur between tryLocks and readLock
>         acquireRecoveryReadLock();
>       } else if (keyIfFail[0] != null) {
>         throw new CommitConflictException(
>             String.format("Concurrent transaction commit detected %s",
>                 keyIfFail[0]));
>       } else {
>         throw new CommitConflictException(
>             String.format("Failed to request try locks from grantor: %s",
>                 dlock.getLockGrantorId()));
>       }
> {code}
> It throws CommitConflictException and after that system doesn't have a txLockId reference and this txLockId will be never removed from this list.
> It produces critical performance degradation. txLockIdList has a few hundred thousand txLocks after a few weeks and TXLockServiceImpl#release iterates this list 2 times on every tx commit and the same time "synchronized (txLockIdList)" locks other threads.
> TXLockIdImpl#equals works really slow because it checks bunch of variables in memberId.equals(that.memberId).
> {code:java}
> public void release(TXLockId txLockId) {
>     synchronized (txLockIdList) {
>       if (!txLockIdList.contains(txLockId)) {
>         // TXLockService.destroyServices can be invoked in cache.close().
>         // Other P2P threads could process message such as TXCommitMessage afterwards,
>         // and invoke TXLockService.createDTLS(). It could create a new TXLockService
>         // which will have a new empty list (txLockIdList) and it will not
>         // contain the originally added txLockId
>         throw new IllegalArgumentException(
>             String.format("Invalid txLockId not found: %s",
>                 txLockId));
>       }
>       dlock.releaseTryLocks(txLockId, () -> {
>         return recovering;
>       });
>       txLockIdList.remove(txLockId);
>       releaseRecoveryReadLock();
>     }
>   }
> {code}
> I think TXLockServiceImpl#txLock should remove this txLockId from TXLockServiceImpl#txLockIdList in case of CommitConflictException:
> {code:java}
>  if (gotLocks) { // ...otherwise race can occur between tryLocks and readLock
>                 acquireRecoveryReadLock();
>             } else if (keyIfFail[0] != null) {
>                 synchronized (this.txLockIdList) {
>                     this.txLockIdList.remove(txLockId);
>                 }
>                 throw new CommitConflictException(
>                         String.format("Concurrent transaction commit detected %s",
>                                 keyIfFail[0]));
>             } else {
>                 synchronized (this.txLockIdList) {
>                     this.txLockIdList.remove(txLockId);
>                 }
>                 throw new CommitConflictException(
>                         String.format("Failed to request try locks from grantor: %s",
>                                 this.dlock.getLockGrantorId()));
>             }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)