Posted to dev@kafka.apache.org by "Jason Gustafson (Jira)" <ji...@apache.org> on 2022/07/07 19:42:00 UTC

[jira] [Created] (KAFKA-14055) Transaction markers may be lost during cleaning if data keys conflict with marker keys

Jason Gustafson created KAFKA-14055:
---------------------------------------

             Summary: Transaction markers may be lost during cleaning if data keys conflict with marker keys
                 Key: KAFKA-14055
                 URL: https://issues.apache.org/jira/browse/KAFKA-14055
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson
             Fix For: 3.3.0, 3.0.2, 3.1.2, 3.2.1


Recently we have been seeing hanging transactions on Streams changelog topics quite frequently. After investigation, we found that the keys used in the changelog topic conflict with the keys used in the transaction markers (the key schema used in control records is 4 bytes, which happens to be the same size as the keys in the changelog topics that we investigated). When we build the offset map prior to cleaning, we properly exclude the transaction marker keys, but we do not exclude them during the cleaning phase. As a result, when there is a user record with a conflicting key at a higher offset, the marker can be removed from the cleaned log before the corresponding data is removed. One side effect is a so-called "hanging" transaction, but the bigger problem is that we lose the atomicity of the transaction.
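
To illustrate the failure mode described above, here is a minimal Python sketch of a two-phase, key-based cleaner. It is a toy model, not the Kafka log cleaner: the names Record, control_key, build_offset_map, clean, and the skip_control_lookup flag are hypothetical and exist only for this example. It assumes the control record key is a 2-byte version followed by a 2-byte marker type (with type 1 standing for a commit marker), which gives the 4-byte key mentioned in the description. It also does not model how markers are eventually removed once the transaction's data is gone.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A toy log record (hypothetical; not a Kafka class)."""
    offset: int
    key: bytes
    is_control: bool = False


def control_key(version: int, marker_type: int) -> bytes:
    # Assumed layout: int16 version + int16 marker type = 4 bytes.
    return struct.pack(">hh", version, marker_type)


def build_offset_map(log):
    """Phase 1: map each data key to its latest offset, skipping markers."""
    offset_map = {}
    for record in log:
        if record.is_control:
            continue  # marker keys are correctly excluded here
        offset_map[record.key] = record.offset
    return offset_map


def clean(log, offset_map, skip_control_lookup):
    """Phase 2: drop records superseded by a later record with the same key.

    With skip_control_lookup=False, markers are looked up in the offset map
    like ordinary records, which is the behavior described as the bug.
    """
    retained = []
    for record in log:
        if record.is_control and skip_control_lookup:
            retained.append(record)  # never compare marker keys to data keys
            continue
        if offset_map.get(record.key, -1) <= record.offset:
            retained.append(record)
    return retained


log = [
    Record(0, b"user-a"),                          # transactional data
    Record(1, control_key(0, 1), is_control=True), # commit marker
    Record(2, struct.pack(">i", 1)),               # user key with the same 4 bytes
]

offset_map = build_offset_map(log)
buggy_offsets = [r.offset for r in clean(log, offset_map, skip_control_lookup=False)]
fixed_offsets = [r.offset for r in clean(log, offset_map, skip_control_lookup=True)]
print(buggy_offsets)  # [0, 2]: the marker at offset 1 is lost, its data at offset 0 remains
print(fixed_offsets)  # [0, 1, 2]
```

In the buggy run, the data record at offset 0 survives while its marker does not, which matches the loss of atomicity described above. A user key only collides when its bytes are identical to a marker key, so 4-byte keys with small integer values are the ones at risk in this model.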


