Posted to jira@kafka.apache.org by "jsancio (via GitHub)" <gi...@apache.org> on 2023/06/09 21:09:33 UTC

[GitHub] [kafka] jsancio commented on a diff in pull request #13834: KAFKA-15076; KRaft should prefer latest snapshot

jsancio commented on code in PR #13834:
URL: https://github.com/apache/kafka/pull/13834#discussion_r1224779841


##########
raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java:
##########
@@ -308,7 +308,12 @@ private void onUpdateLeaderHighWatermark(
     private void updateListenersProgress(long highWatermark) {
         for (ListenerContext listenerContext : listenerContexts.values()) {
             listenerContext.nextExpectedOffset().ifPresent(nextExpectedOffset -> {
-                if (nextExpectedOffset < log.startOffset() && nextExpectedOffset < highWatermark) {
+                // Send snapshot if the listener is at the beginning of the log and there is a snapshot, or
+                // the listener is trying to read an offset for which there isn't a segment in the log.
+                if (nextExpectedOffset < highWatermark &&
+                    ((nextExpectedOffset == 0 && latestSnapshot().isPresent()) ||

Review Comment:
   Thanks for the comments @mumrah @hachikuji .
   
   Yes. This PR only addresses the contract between the `RaftClient.Listener` and the `RaftClient`. Before this PR, the `KafkaRaftClient` never called `Listener.handleLoadSnapshot` if the log segments started at offset 0.
   
   Regarding replication between the leader and the followers (voters or observers), I agree that the leader could force the follower to send a FETCH_SNAPSHOT request even if the log segments contain the fetched offset. The heuristic could be something like: if the number of bytes between the fetch offset and the end offset of the latest snapshot is greater than the size of the latest snapshot, then the leader responds to the FETCH request by asking the follower to send FETCH_SNAPSHOT instead.
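   This heuristic could be sketched roughly as below. This is a hypothetical helper for illustration only, not code from `KafkaRaftClient`; the name `shouldSuggestSnapshot` and all parameters are assumptions:

   ```java
   // Hypothetical sketch of the heuristic discussed above; not actual Kafka code.
   final class SnapshotFetchHeuristic {
       /**
        * Decide whether the leader should answer a FETCH at fetchOffset by
        * redirecting the follower to FETCH_SNAPSHOT instead.
        *
        * @param fetchOffset        offset the follower is trying to fetch
        * @param snapshotEndOffset  end offset of the leader's latest snapshot
        * @param logBytesToSnapshot bytes of log records between fetchOffset and snapshotEndOffset
        * @param snapshotSizeBytes  size of the latest snapshot in bytes
        */
       static boolean shouldSuggestSnapshot(
           long fetchOffset,
           long snapshotEndOffset,
           long logBytesToSnapshot,
           long snapshotSizeBytes
       ) {
           // Redirecting only helps if the snapshot moves the follower forward.
           if (fetchOffset >= snapshotEndOffset) {
               return false;
           }
           // Replicating the snapshot is cheaper than replaying the covered log.
           return logBytesToSnapshot > snapshotSizeBytes;
       }
   }
   ```

   The point of the comparison is that the follower pays either the snapshot's bytes or the log's bytes, so the leader picks whichever transfer is smaller.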
   
   I thought about creating an issue for this, but I am not sure how likely this scenario is. I think the most common case would be a user expanding the cluster by adding a new broker. In that case, the leader can ask the follower to FETCH_SNAPSHOT if the fetch offset is 0 and the leader has generated a snapshot.
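   The bootstrap case above reduces to a much simpler check. Again a hypothetical sketch, with assumed names, just to make the condition concrete:

   ```java
   import java.util.Optional;

   // Hypothetical sketch of the new-broker bootstrap case; not actual Kafka code.
   final class BootstrapFetchCheck {
       // A brand-new follower fetches from offset 0. If the leader already has a
       // snapshot, answering with FETCH_SNAPSHOT avoids replaying the entire log.
       static boolean shouldSuggestSnapshot(long fetchOffset, Optional<Long> latestSnapshotEndOffset) {
           return fetchOffset == 0 && latestSnapshotEndOffset.isPresent();
       }
   }
   ```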
   
   Another case would be a broker or inactive controller that was offline for a long time. There the optimization above wouldn't work, and we would need the heuristic we have been discussing. I am just not sure how common this would be for the cluster metadata log partition. We could use the offset index for a rough computation of the number of bytes between the two offsets, but I am not sure it is worth the code complexity.
   
   I created this issue for the case when the follower is fetching offset 0: https://issues.apache.org/jira/browse/KAFKA-15078


