Posted to dev@kafka.apache.org by "Lucas Wang (Jira)" <ji...@apache.org> on 2020/11/19 19:43:00 UTC

[jira] [Created] (KAFKA-10751) Generate log to help estimate messages lost during ULE

Lucas Wang created KAFKA-10751:
----------------------------------

             Summary: Generate log to help estimate messages lost during ULE
                 Key: KAFKA-10751
                 URL: https://issues.apache.org/jira/browse/KAFKA-10751
             Project: Kafka
          Issue Type: Improvement
            Reporter: Lucas Wang
            Assignee: Lucas Wang


During an Unclean Leader Election (ULE), there could be data loss due to truncation at the resigned leader.

Suppose there are 3 brokers that have replicas for a given partition:
Broker A (leader) with largest offset 9 (log end offset 10)
Broker B (follower) with largest offset 4 (log end offset 5)
Broker C (follower) with largest offset 1 (log end offset 2)

Only the leader A is in the ISR, with B and C lagging behind.
Now an unclean leader election causes the leadership to be transferred to C. Broker A would need to truncate 8 messages (offsets 2 through 9), and Broker B 3 messages (offsets 2 through 4).
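
The truncation counts above follow directly from the log end offsets. Here is a minimal sketch in Python, assuming each replica simply truncates down to the new leader's log end offset. The function name truncated_messages is hypothetical and is not a Kafka API.

```python
def truncated_messages(replica_leo: int, new_leader_leo: int) -> int:
    """Messages a replica must drop to match the new leader's log end offset.

    Assumption: the replica truncates to exactly the new leader's log end
    offset; a replica that is already behind truncates nothing.
    """
    return max(0, replica_leo - new_leader_leo)


# Log end offsets from the example: A=10, B=5, C=2; C becomes the leader.
log_end_offsets = {"A": 10, "B": 5, "C": 2}
new_leader = "C"

truncation = {
    broker: truncated_messages(leo, log_end_offsets[new_leader])
    for broker, leo in log_end_offsets.items()
}
print(truncation)  # {'A': 8, 'B': 3, 'C': 0}
```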

Case 1: if these messages were produced with acks=0 or acks=1, then clients would experience 8 lost messages.
Case 2: if the client is using acks=all and the partition's minISR setting is 2, and we further assume that broker B dropped out of the ISR after receiving the message with offset 4, then only the messages with offset <= 4 have been acked to the client. The truncation effectively causes the client to lose 3 messages (offsets 2 through 4).
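
Both cases reduce to the same arithmetic: the loss seen by the client is the number of acknowledged messages at or beyond the new leader's log end offset. Below is a rough sketch under the assumptions of this example. The helper lost_acked_messages and the idea of an "acked log end offset" are illustrative only, not something the broker exposes today.

```python
def lost_acked_messages(acked_leo: int, new_leader_leo: int) -> int:
    """Acknowledged messages that the new leader does not have.

    acked_leo is one past the largest offset acknowledged to the client
    (a hypothetical quantity used only for this illustration).
    """
    return max(0, acked_leo - new_leader_leo)


NEW_LEADER_LEO = 2  # Broker C's log end offset

# Case 1: acks=0 or acks=1. Everything on leader A (log end offset 10)
# is treated as acknowledged.
case1_loss = lost_acked_messages(10, NEW_LEADER_LEO)

# Case 2: acks=all with minISR 2. Only offsets <= 4 were acknowledged,
# so the acked log end offset is 5.
case2_loss = lost_acked_messages(5, NEW_LEADER_LEO)

print(case1_loss, case2_loss)  # 8 3
```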

Knowing the exact amount of data loss involves knowing the client's acks setting when the messages were produced, and also whether the messages had been sufficiently replicated according to the minISR setting.


If getting the exact data loss is too involved, at least there should be logs to help ESTIMATE the amount of data loss.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)