
Posted to jira@kafka.apache.org by "Jason Gustafson (Jira)" <ji...@apache.org> on 2020/03/27 18:51:00 UTC

[jira] [Created] (KAFKA-9777) Purgatory locking bug can lead to hanging transaction

Jason Gustafson created KAFKA-9777:
--------------------------------------

Summary: Purgatory locking bug can lead to hanging transaction
Key: KAFKA-9777
URL: https://issues.apache.org/jira/browse/KAFKA-9777
Project: Kafka
Issue Type: Bug
Affects Versions: 2.4.1, 2.3.1, 2.2.2, 2.1.1, 2.0.1, 1.1.1
Reporter: Jason Gustafson
Assignee: Jason Gustafson

Once a transaction reaches the `PrepareCommit` or `PrepareAbort` state, the transaction coordinator must send markers to all partitions included in the transaction. After all markers have been sent, the transaction transitions to the corresponding completed state. Until this transition occurs, the producer cannot make any further progress.

The transaction coordinator uses a purgatory to track completion of the markers that need to be sent. Once all markers have been written, the `DelayedTxnMarker` task becomes completable. We depend on its completion in order to transition to the completed state.
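
To make the dependency concrete, here is a minimal Python sketch of a delayed task that becomes completable only once every marker has been written. It is a toy model, not the Kafka implementation: the class `DelayedMarkerTask`, its methods, the partition names, and the `state` dictionary are all hypothetical.

```python
class DelayedMarkerTask:
    """Toy stand-in for a delayed task that waits on transaction markers."""

    def __init__(self, partitions, on_complete):
        self.pending = set(partitions)  # partitions still awaiting a marker
        self.on_complete = on_complete  # callback that performs the state transition
        self.completed = False

    def marker_written(self, partition):
        # Record that the marker for this partition has been written.
        self.pending.discard(partition)

    def try_complete(self):
        # Completes only when no markers are outstanding.
        # Returns True only for the call that actually completes the task.
        if self.completed or self.pending:
            return False
        self.completed = True
        self.on_complete()
        return True


# Illustrative transaction state; the transition happens in the callback.
state = {"txn": "PrepareCommit"}
task = DelayedMarkerTask(["tp-0", "tp-1"], lambda: state.update(txn="CompleteCommit"))

task.marker_written("tp-0")
first_check = task.try_complete()   # one marker is still outstanding
task.marker_written("tp-1")
second_check = task.try_complete()  # all markers written, so the task completes
```

In this model the transition out of the prepare state happens only inside the completion callback, so a completion check that is skipped after the final marker leaves the transaction where it was.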

Related to KAFKA-8334, there is a bug in the locking protocol used to check completion of the `DelayedTxnMarker` task. The purgatory attempts to provide a "happens before" contract for task completion with `checkAndComplete`. Basically, if a task's completion condition is satisfied before `checkAndComplete` is called, then the task should be given an opportunity to complete as long as there is sufficient time remaining before expiration.

The bug in the locking protocol is that it expects that the operation lock is exclusive to the operation. See here: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedOperation.scala#L114. The logic assumes that if the lock cannot be acquired, then the other holder of the lock must be attempting completion of the same delayed operation. If that is not the case, then the "happens before" contract is broken and a task may not get completed until expiration even if it has been satisfied.

In the case of `DelayedTxnMarker`, the lock in use is the read side of a read-write lock which is used for partition loading: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L264. In fact, if the read lock cannot be acquired, it means that the write lock is being held in order to complete some loading operation, in which case the holder will definitely not attempt completion of the delayed operation. If this happens to occur on the last call to `checkAndComplete` after all markers have been written, then the transition to the completed state will never occur.
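
The failure mode can be reproduced in a small single-threaded Python model. This is a sketch under stated assumptions rather than the Kafka code: `DelayedOp` and its fields are hypothetical, the try-lock loop only loosely follows the shape of the protocol described above, a plain non-reentrant `threading.Lock` stands in for the read-write lock, and the loader's hold on the lock is simulated by acquiring it directly.

```python
import threading


class DelayedOp:
    """Toy delayed operation guarded by a lock that other code may also hold."""

    def __init__(self, lock):
        self.lock = lock                   # may be shared with unrelated code
        self.satisfied = False             # completion condition, e.g. all markers written
        self.completed = False
        self.try_complete_pending = False  # plain flag; real concurrent code would need an atomic

    def try_complete(self):
        if self.satisfied and not self.completed:
            self.completed = True
        return self.completed

    def maybe_try_complete(self):
        """Non-blocking completion check that relies on the lock holder to retry."""
        done = False
        retry = True
        while retry and not self.completed:
            if self.lock.acquire(blocking=False):
                try:
                    self.try_complete_pending = False
                    done = self.try_complete()
                finally:
                    self.lock.release()
                # Retry if another caller asked for a check while we held the lock.
                retry = self.try_complete_pending
            else:
                # Assumption baked into the protocol: whoever holds the lock is also
                # running maybe_try_complete and will see the flag. Retry here only
                # if this call is the first to set the flag.
                retry = not self.try_complete_pending
                self.try_complete_pending = True
        return done


shared = threading.Lock()
op = DelayedOp(shared)

shared.acquire()                # stands in for partition loading holding the lock
op.satisfied = True             # the last marker is written while loading is in progress
lost = op.maybe_try_complete()  # final completion check: the lock is busy, so it gives up
shared.release()                # the loader finishes but never checks the pending flag

stuck = op.satisfied and not op.completed

# A later check would succeed, but in the scenario above no later check arrives.
recovered = op.maybe_try_complete()
```

In the model, `lost` is `False` and `stuck` is `True`: the operation is satisfied, yet the only code that could have completed it deferred to a lock holder that has no knowledge of the operation.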

--
This message was sent by Atlassian Jira
(v8.3.4#803005)