Posted to dev@kafka.apache.org by "Michael Bingham (JIRA)" <ji...@apache.org> on 2018/10/05 20:27:00 UTC

[jira] [Created] (KAFKA-7487) DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch

Michael Bingham created KAFKA-7487:
--------------------------------------

             Summary: DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch
                 Key: KAFKA-7487
                 URL: https://issues.apache.org/jira/browse/KAFKA-7487
             Project: Kafka
          Issue Type: Improvement
          Components: core
    Affects Versions: 2.0.0
            Reporter: Michael Bingham


When running {{DumpLogSegments}} against an {{.index}} file, mismatches may be reported when the indexed offset is not that of the first record in a batch. For example:

{code}
 Mismatches in :/var/lib/kafka/data/replicated-topic-0/00000000000000000000.index
 Index offset: 968, log offset: 966
{code}

And looking at the corresponding {{.log}} file:

{code}
baseOffset: 966 lastOffset: 968 count: 3 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false position: 3952771 CreateTime: 1538768639065 isvalid: true size: 12166 magic: 2 compresscodec: NONE crc: 294402254 
{code}

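In this case, the index entry records the last offset in the batch (968) rather than the first (966). Because the index has to map an offset to the physical position of the start of a batch, the offset read back at that position is 966, which leads to the reported mismatch.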

It seems like {{DumpLogSegments}} should not report these cases as mismatches, since users might interpret them as an error or problem.
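
One possible way to express the relaxed check is sketched below. This is only an illustration and not the actual {{DumpLogSegments}} code: the {{IndexCheckSketch}} class, the {{Batch}} type, and the {{strictMatch}} and {{rangeMatch}} methods are hypothetical names. The idea is to treat an index entry as consistent whenever its offset falls anywhere inside the batch located at the indexed position, rather than requiring it to equal the batch's first offset.

```java
public class IndexCheckSketch {

    /** Hypothetical, simplified view of a record batch header (not a Kafka class). */
    static final class Batch {
        final long baseOffset;
        final long lastOffset;

        Batch(long baseOffset, long lastOffset) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** Strict check: the indexed offset must equal the first offset of the batch. */
    static boolean strictMatch(long indexedOffset, Batch batch) {
        return indexedOffset == batch.baseOffset;
    }

    /** Relaxed check: the indexed offset may be any offset contained in the batch. */
    static boolean rangeMatch(long indexedOffset, Batch batch) {
        return indexedOffset >= batch.baseOffset && indexedOffset <= batch.lastOffset;
    }

    public static void main(String[] args) {
        // Values taken from the example above: batch 966..968, index entry for 968.
        Batch batch = new Batch(966L, 968L);
        long indexedOffset = 968L;

        System.out.println("strict: " + strictMatch(indexedOffset, batch)); // prints "strict: false"
        System.out.println("range:  " + rangeMatch(indexedOffset, batch));  // prints "range:  true"
    }
}
```

With the values from the example, the strict comparison flags the entry while the range comparison accepts it. An offset outside the batch (for example 969) would still be rejected by both.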



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)