Posted to dev@kafka.apache.org by "Michael Jaschob (Jira)" <ji...@apache.org> on 2021/08/06 00:21:00 UTC

[jira] [Created] (KAFKA-13174) Log Compaction Blocked Forever by Unstable Offset/Unclosed Transaction

Michael Jaschob created KAFKA-13174:
---------------------------------------

             Summary: Log Compaction Blocked Forever by Unstable Offset/Unclosed Transaction
                 Key: KAFKA-13174
                 URL: https://issues.apache.org/jira/browse/KAFKA-13174
             Project: Kafka
          Issue Type: Bug
          Components: core, log cleaner
    Affects Versions: 2.5.1
            Reporter: Michael Jaschob


Our production cluster experienced a single __consumer_offsets partition that kept growing without ever being compacted. A closer inspection of the cleaner logs showed that the log was permanently uncleanable at an offset from July 28, written about seven days earlier:
{code:java}
[2021-08-02 19:08:39,650] DEBUG Finding range of cleanable offsets for log=__consumer_offsets-9. Last clean offset=Some(75795702964) now=1627956519650 => firstDirtyOffset=75795702964 firstUncleanableOffset=75868740168 activeSegment.baseOffset=76691318694 (kafka.log.LogCleanerManager$)
{code}
Using the log dumper tool, we examined the records/batches around this offset and determined that the proximate cause was an "open" transaction that was never committed or aborted. Around the uncleanable offset we saw:
 - a consumer group offset commit for group {{foo-group}} for topic-partition {{foo-topic-46}} from pid 1686325 (log offset 75868731509)
 - a transactional COMMIT marker from pid 1686325 (log offset 75868731579)
 - another consumer group offset commit for group {{foo-group}} for topic-partition {{foo-topic-46}} from pid 1686325 (log offset 75868740168, our first uncleanable offset)

Here's the raw log dumper output:
{code:java}
baseOffset: 75868731509 lastOffset: 75868731509 count: 1 baseSequence: 0 lastSequence: 0 producerId: 1686325 producerEpoch: 249 partitionLeaderEpoch: 320 isTransactional: true isControl: false position: 98725764 CreateTime: 1627495733656 size: 232 magic: 2 compresscodec: NONE crc: 485368943 isvalid: true
| offset: 75868731509 CreateTime: 1627495733656 keysize: 126 valuesize: 36 sequence: 0 headerKeys: [] key: offset_commit::group=foo-group,partition=foo-topic-46 payload: offset=59016695,metadata=AQAAAXruS8Fg
...

baseOffset: 75868731579 lastOffset: 75868731579 count: 1 baseSequence: -1 lastSequence: -1 producerId: 1686325 producerEpoch: 249 partitionLeaderEpoch: 320 isTransactional: true isControl: true position: 98732634 CreateTime: 1627495733700 size: 78 magic: 2 compresscodec: NONE crc: 785483064 isvalid: true
| offset: 75868731579 CreateTime: 1627495733700 keysize: 4 valuesize: 6 sequence: -1 headerKeys: [] endTxnMarker: COMMIT coordinatorEpoch: 143
...

baseOffset: 75868740168 lastOffset: 75868740168 count: 1 baseSequence: 0 lastSequence: 0 producerId: 1686325 producerEpoch: 249 partitionLeaderEpoch: 320 isTransactional: true isControl: false position: 99599843 CreateTime: 1627495737629 size: 232 magic: 2 compresscodec: NONE crc: 1222829008 isvalid: true
| offset: 75868740168 CreateTime: 1627495737629 keysize: 126 valuesize: 36 sequence: 0 headerKeys: [] key: offset_commit::group=foo-group,partition=foo-topic-46 payload: offset=59016695,metadata=AQAAAXruS8Fg
...
{code}
There was no further activity from pid 1686325. In fact, the KStream application in question stalled on that partition because of this unstable offset/open transaction, repeatedly logging: {{The following partitions still have unstable offsets which are not cleared on the broker side: [foo-topic-46], this could be either transactional offsets waiting for completion, or normal offsets waiting for replication after appending to local log}}

We then captured the producer snapshot file from the broker's data directory and wrote a quick tool to dump it as text. From its output, we found that the transactional producer in question (pid 1686325) was still considered alive, with its hanging transaction starting at offset 75868740168:
{code:java}
ArraySeq(ProducerStateEntry(producerId=1686325, producerEpoch=249, currentTxnFirstOffset=Some(75868740168), coordinatorEpoch=143, lastTimestamp=1627495737629, batchMetadata=Queue(BatchMetadata(firstSeq=0, lastSeq=0, firstOffset=75868740168, lastOffset=75868740168, timestamp=1627495737629)))
{code}
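For what it's worth, the "quick tool" amounts to only a few lines against the broker's own classes. A rough sketch is below; the object name and argument handling are purely illustrative, and it assumes the *.snapshot file has been copied out of the data directory and that {{ProducerStateManager.readSnapshot}} is reachable from the {{kafka.log}} package, as appeared to be the case in 2.5.1:
{code:java}
// Illustrative sketch only: build this inside the Kafka source tree so it can
// live in the kafka.log package and reach the package-private readSnapshot helper.
package kafka.log

import java.io.File

object ProducerSnapshotDumper {
  def main(args: Array[String]): Unit = {
    // Path to a copy of the partition's *.snapshot file, e.g. taken from the
    // __consumer_offsets-9 directory on the broker.
    val snapshotFile = new File(args(0))

    // readSnapshot decodes the binary snapshot into one ProducerStateEntry per
    // producer id the broker still tracks for this partition. An entry with a
    // defined currentTxnFirstOffset is a producer the broker believes still has
    // an open transaction in the log.
    val entries = ProducerStateManager.readSnapshot(snapshotFile)
    println(entries)
  }
}
{code}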
This was very perplexing. As far as we can tell, the code in both Apache Kafka 2.5.1 and on trunk treats an open transaction like ours as uncleanable, which in practice blocks the log from ever being compacted again. Once a pid has an open transaction - that is, a defined {{currentTxnFirstOffset}} - {{ProducerStateManager}} will [never expire the producer|https://github.com/apache/kafka/blob/2.5.1/core/src/main/scala/kafka/log/ProducerStateManager.scala#L576-L577], even after {{transactional.id.expiration.ms}} has passed. This obviously has severe repercussions on a topic like __consumer_offsets (long coordinator load times, ever-growing disk usage).
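For reference, the check behind that link boils down to the following (paraphrasing the 2.5.1 source; trunk looks essentially the same to us):
{code:java}
// kafka.log.ProducerStateManager (2.5.1), as we read it: a producer is only
// eligible for expiration while it has no open transaction, so a transaction
// that is never completed pins the producer state (and therefore the
// uncleanable offset) indefinitely.
private def isProducerExpired(currentTimeMs: Long, producerState: ProducerStateEntry): Boolean =
  producerState.currentTxnFirstOffset.isEmpty &&
    currentTimeMs - producerState.lastTimestamp >= maxProducerIdExpirationMs
{code}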

While we're not sure what led to this hanging transaction (we were running partition reassignments at the time, so perhaps the transaction coordinator moved?), it seems entirely possible to end up with a transaction that is started but never committed or aborted, as in our example, for any number of reasons: network issues, broker deaths, and so on.

Are we misunderstanding something here? How could a log in this state ever get compacted? If we're not misunderstanding, this looks like a potentially severe bug, and the log cleaner should be able to handle the situation. Perhaps the implementation of {{isProducerExpired}} should be an or-condition, not an and-condition?

Please let us know if we can provide any more details that would help shed light on what is going on here!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)