Posted to issues@ignite.apache.org by "Maksim Timonin (Jira)" <ji...@apache.org> on 2022/08/25 09:03:00 UTC

[jira] [Updated] (IGNITE-17385) Frequent commits of single cache transactions can lead GridCacheAdapter#asyncOpsSem permits overflow

     [ https://issues.apache.org/jira/browse/IGNITE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maksim Timonin updated IGNITE-17385:
------------------------------------
    Release Note: Fixed exceeding the maximum number of async operation permits for an explicit transaction with a single write entry

> Frequent commits of single cache transactions can lead GridCacheAdapter#asyncOpsSem permits overflow
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-17385
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17385
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.13
>            Reporter: Ilya Shishkov
>            Assignee: Maksim Timonin
>            Priority: Major
>              Labels: ise
>             Fix For: 2.14
>
>         Attachments: SemaphorePermitsExceeded.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When you commit a transaction that was _explicitly started over only a single cache_, {{GridCacheAdapter#asyncOpRelease}} is called without a preceding {{GridCacheAdapter#asyncOpAcquire}}. This can lead to continuous growth of the permit count in {{GridCacheAdapter#asyncOpsSem}}, then to its overflow and a subsequent failure of the node that started the transaction:
> {code}
> Critical system error detected. Will be handled accordingly to configured handler [hnd=o.a.i.i.processors.cache.transactions.TxAsyncOpsSemaphorePermitsExeededTest$$Lambda$42/1924582348@7379bebb, failureCtx=FailureContext [type=CRITICAL_ERROR, err=java.lang.Error: Maximum permit count exceeded]]
> {code}
> As you can see in [1], in the case of a single cache context the transaction is committed by calling {{GridCacheAdapter#commitTxAsync}}, which later invokes {{GridCacheAdapter#asyncOpRelease}}. But when multiple caches are affected by the transaction, {{GridNearTxLocal#commitNearTxLocalAsync}} is called to commit it, and no invocations of {{GridCacheAdapter#asyncOpRelease}} occur.
> So, the greater the load (RPS / TPS) of such single-cache transactions, the sooner the node will fail.
> Reproducer of the problem: [^SemaphorePermitsExceeded.patch]. It prints additional messages when the semaphore is released or acquired.
> Links:
> # https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/GridCacheSharedContext.java#L1122
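
The mechanism described above can be illustrated outside Ignite with a plain java.util.concurrent.Semaphore. This is only a minimal sketch, not the Ignite code and not the attached reproducer: the class and method names are hypothetical, and it assumes that asyncOpsSem behaves like a standard Semaphore and that each affected commit adds exactly one unmatched release. The semaphore is started close to Integer.MAX_VALUE so that the overflow shows up after a few calls instead of billions.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration class; nothing here is part of Ignite.
class PermitOverflowSketch {
    /**
     * Releases one permit without a matching acquire, mimicking the
     * unbalanced commit path. Returns null on success, or the message
     * of the Error thrown once the permit count would overflow.
     */
    static String releaseWithoutAcquire(Semaphore sem) {
        try {
            sem.release();
            return null;
        }
        catch (Error e) {
            return e.getMessage();
        }
    }

    /**
     * Rough estimate of the seconds until overflow, assuming one extra
     * release per committed transaction (an assumption of this sketch).
     */
    static long secondsUntilOverflow(int initialPermits, long extraReleasesPerSecond) {
        return (Integer.MAX_VALUE - (long) initialPermits) / extraReleasesPerSecond;
    }

    public static void main(String[] args) {
        // Start two permits below the maximum to reach the limit quickly.
        Semaphore sem = new Semaphore(Integer.MAX_VALUE - 2);

        for (int i = 1; i <= 3; i++) {
            String err = releaseWithoutAcquire(sem);
            System.out.println("release #" + i + ": "
                + (err == null ? "ok, permits=" + sem.availablePermits() : "Error: " + err));
        }

        // 500 initial permits and 1000 TPS are assumed values for illustration.
        System.out.println("estimated seconds until overflow: " + secondsUntilOverflow(500, 1000));
    }
}
```

In this sketch the first two releases succeed and the third one raises the same java.lang.Error as in the log above. The estimate helper only makes the load dependence concrete: the time to failure shrinks in proportion to the rate of unmatched releases.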


