Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/04/08 17:39:05 UTC

[GitHub] [kafka] hachikuji commented on a change in pull request #10481: KAFKA-12619; Raft leader should expose hw only after committing LeaderChange

hachikuji commented on a change in pull request #10481:
URL: https://github.com/apache/kafka/pull/10481#discussion_r609943203



##########
File path: raft/src/main/java/org/apache/kafka/raft/LeaderState.java
##########
@@ -113,13 +113,18 @@ private boolean updateHighWatermark() {
         Optional<LogOffsetMetadata> highWatermarkUpdateOpt = followersByDescendingFetchOffset.get(indexOfHw).endOffset;
 
         if (highWatermarkUpdateOpt.isPresent()) {
-            // When a leader is first elected, it cannot know the high watermark of the previous
-            // leader. In order to avoid exposing a non-monotonically increasing value, we have
-            // to wait for followers to catch up to the start of the leader's epoch.
+
+            // The KRaft protocol requires an extra condition on commitment after a leader
+            // election. The leader must commit one record from its own epoch before it is
+            // allowed to expose records from any previous epoch. This guarantees that its
+            // log will contain the largest record (in terms of epoch/offset) in any log
+            // which ensures that any future leader will have replicated this record as well
+            // as all records from previous epochs that the current leader has committed.
+
             LogOffsetMetadata highWatermarkUpdateMetadata = highWatermarkUpdateOpt.get();
             long highWatermarkUpdateOffset = highWatermarkUpdateMetadata.offset;
 
-            if (highWatermarkUpdateOffset >= epochStartOffset) {
+            if (highWatermarkUpdateOffset > epochStartOffset) {

Review comment:
       That makes sense. I guess it could save a round-trip in some cases, but I'm also inclined to leave it for potential future investigation.
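
To make the changed condition in the hunk above concrete, here is a minimal, self-contained sketch. It is not the actual LeaderState code: the class name, the method exposedHighWatermark, and the plain long[] of voter end offsets are hypothetical stand-ins. It also assumes that end offsets are exclusive (the next offset to be written) and that epochStartOffset is the offset of the first record the leader wrote in its own epoch.

```java
import java.util.Arrays;
import java.util.OptionalLong;

// Hypothetical illustration only; names and types are not taken from LeaderState.
final class HighWatermarkSketch {

    // voterEndOffsets: exclusive end offset of every voter, including the leader (assumption).
    // epochStartOffset: offset of the first record written in the leader's own epoch (assumption).
    static OptionalLong exposedHighWatermark(long[] voterEndOffsets, long epochStartOffset) {
        long[] sorted = voterEndOffsets.clone();
        Arrays.sort(sorted); // ascending order

        // The largest end offset reached by a majority of voters. This is the
        // same element as index n / 2 in a descending sort.
        int n = sorted.length;
        long candidate = sorted[n - 1 - n / 2];

        // Strictly greater: a majority must have replicated at least one record
        // from the current epoch before any high watermark is exposed.
        if (candidate > epochStartOffset) {
            return OptionalLong.of(candidate);
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // A majority has only caught up to the start of the epoch: nothing is exposed.
        System.out.println(exposedHighWatermark(new long[] {10, 10, 5}, 10));
        // A majority has replicated the first record of the epoch: 11 is exposed.
        System.out.println(exposedHighWatermark(new long[] {11, 11, 5}, 10));
    }
}
```

With the previous `>=` check, the first call would have exposed offset 10, that is, records from earlier epochs, before any record from the new epoch had been committed.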




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org