You are viewing a plain text version of this content. The canonical link for it is here.

Posted to jira@kafka.apache.org by "Guozhang Wang (Jira)" <ji...@apache.org> on 2020/06/16 00:21:00 UTC

[jira] [Created] (KAFKA-10167) Streams EOS-Beta should not try to get end-offsets as read-committed

Guozhang Wang created KAFKA-10167:
-------------------------------------

             Summary: Streams EOS-Beta should not try to get end-offsets as read-committed
                 Key: KAFKA-10167
                 URL: https://issues.apache.org/jira/browse/KAFKA-10167
             Project: Kafka
          Issue Type: Bug
            Reporter: Guozhang Wang


This is a bug discovered with the new EOS protocol (KIP-447), here's the context:

In Streams when we are assigned with the new active tasks, we would first try to restore the state from the changelog topic all the way to the log end offset, and then we can transit from the `restoring` to the `running` state to start processing the task.

Before KIP-447, the end-offset call is only triggered after we've passed the synchronization barrier at the txn-coordinator which would guarantee that the txn-marker has been sent and received (otherwise we would error with CONCURRENT_TRANSACTIONS and let the producer retry), and when the txn-marker is received, it also means that the marker has been fully replicated, which in turn guarantees that the data written before that marker has been fully replicated. As a result, when we send the list-offset with `read-committed` flag we are guaranteed that the returned offset == LSO == high-watermark.

After KIP-447 however, we do not fence on the txn-coordinator but on group-coordinator upon offset-fetch, and the group-coordinator would return the fetching offset right after it has received the replicated the txn-marker sent to it. However, since the txn-marker are sent to different brokers in parallel, and even within the same broker markers of different partitions are appended / replicated independently as well, so when the fetch-offset request returns it is NOT guaranteed that the LSO on other data partitions would have been advanced as well. And hence in that case the `endOffset` call may returned a smaller offset, causing data loss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)