Posted to issues@ignite.apache.org by "Sergey Korotkov (Jira)" <ji...@apache.org> on 2022/08/02 17:58:00 UTC

[jira] [Created] (IGNITE-17457) Cluster locks after the transaction recovery procedure if the tx primary node fail

Sergey Korotkov created IGNITE-17457:
----------------------------------------

             Summary: Cluster locks after the transaction recovery procedure if the tx primary node fail
                 Key: IGNITE-17457
                 URL: https://issues.apache.org/jira/browse/IGNITE-17457
             Project: Ignite
          Issue Type: Bug
            Reporter: Sergey Korotkov


An Ignite cluster may become locked (all client operations block) after the tx recovery procedure that runs when the tx primary node fails.

A prepared transaction may remain uncommitted on the backup node after the tx recovery. As a result, the partition exchange cannot complete and the cluster stays locked.

The immediate cause is a race condition in the method:
{code:java}
org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter::markFinalizing(RECOVERY_FINISH){code}
It may be called concurrently for the same transaction both from the recovery procedure:
{code:java}
IgniteTxManager::commitIfPrepared{code}
and from the tx recovery request handler:
{code:java}
IgniteTxHandler::processCheckPreparedTxRequest{code}
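
To illustrate the kind of race described above, here is a minimal, self-contained sketch. It is not Ignite code: the Tx class, the Status enum and the compare-and-set based markFinalizing below are hypothetical simplifications, assumed only for illustration. The sketch shows that when two paths call such a method concurrently for the same transaction, only one call succeeds, so any path that treats a failed call as "someone else will finish this tx" depends on the winner actually completing the commit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class FinalizingRaceSketch {
    /** Hypothetical stand-in for a transaction's finalization status. */
    enum Status { NONE, RECOVERY_FINISH }

    /** Hypothetical simplified transaction: only the finalization flag is modeled. */
    static class Tx {
        private final AtomicReference<Status> finalizing = new AtomicReference<>(Status.NONE);

        /** Assumed semantics: returns true only for the single caller that flips NONE to the given status. */
        boolean markFinalizing(Status status) {
            return finalizing.compareAndSet(Status.NONE, status);
        }
    }

    /** Runs the given number of concurrent callers against one tx and counts how many calls succeeded. */
    static int countWinners(int callers) throws InterruptedException {
        Tx tx = new Tx();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(callers);

        for (int i = 0; i < callers; i++) {
            pool.execute(() -> {
                try {
                    // Hold every caller at the latch so the calls overlap as much as possible.
                    start.await();

                    if (tx.markFinalizing(Status.RECOVERY_FINISH))
                        winners.incrementAndGet();
                    // A losing caller gets false here and, in this model, does nothing further.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        start.countDown();
        pool.shutdown();

        if (!pool.awaitTermination(10, TimeUnit.SECONDS))
            throw new IllegalStateException("Callers did not finish in time");

        return winners.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Two callers model the recovery procedure and the recovery request handler.
        System.out.println("winners = " + countWinners(2));
    }
}
```

In this model exactly one of the two callers wins. Whether the real defect follows this pattern is not stated here; the details and the reproducer are still to be provided.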
 

Details and reproducer {color:#ff0000}TBD{color}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)