Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/11/24 01:26:09 UTC

[GitHub] [kafka] jsancio commented on a change in pull request #9631: KAFKA-9672: Leader with ISR as a superset of replicas

jsancio commented on a change in pull request #9631:
URL: https://github.com/apache/kafka/pull/9631#discussion_r529105990



##########
File path: core/src/main/scala/kafka/cluster/Partition.scala
##########
@@ -947,9 +947,10 @@ class Partition(val topicPartition: TopicPartition,
                                   leaderEndOffset: Long,
                                   currentTimeMs: Long,
                                   maxLagMs: Long): Boolean = {
-    val followerReplica = getReplicaOrException(replicaId)
-    followerReplica.logEndOffset != leaderEndOffset &&
-      (currentTimeMs - followerReplica.lastCaughtUpTimeMs) > maxLagMs
+    getReplica(replicaId).fold(true) { followerReplica =>

Review comment:
       Thanks for the review!
   
   > This might be ok, but is unnecessary work since the controller will be doing that soon.
   
   According to some users and the report in KAFKA-9672, it looks like under some conditions the controller writes state to ZK in which the replica has been removed from the assignment but not from the ISR. I have been unable to reproduce this, or to work out from the code how it can happen.
   
   I was thinking of defensively letting the leader also remove the replica from the ISR so that Kafka can recover from this case. If the leader is not allowed to do this, then `acks=all` produce requests will continue to fail.
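   
   To make the intended behaviour concrete, here is a minimal, self-contained sketch of the check in the diff above. `Replica` and `PartitionSketch` are hypothetical, simplified stand-ins and not the real `kafka.cluster` classes; the only assumption carried over from the diff is that `getReplica` returns an `Option`, so a replica that is missing from the assignment is treated as out of sync and becomes a candidate for ISR shrinking.

```scala
// Hypothetical, simplified stand-ins; NOT the real kafka.cluster classes.
final case class Replica(brokerId: Int, logEndOffset: Long, lastCaughtUpTimeMs: Long)

final class PartitionSketch(assigned: Map[Int, Replica]) {

  // Returns None when the broker is no longer part of the assignment.
  def getReplica(replicaId: Int): Option[Replica] = assigned.get(replicaId)

  def isFollowerOutOfSync(replicaId: Int,
                          leaderEndOffset: Long,
                          currentTimeMs: Long,
                          maxLagMs: Long): Boolean =
    // fold(true): an unknown replica is considered out of sync instead of
    // throwing, which lets the leader drop it from the ISR.
    getReplica(replicaId).fold(true) { followerReplica =>
      followerReplica.logEndOffset != leaderEndOffset &&
        (currentTimeMs - followerReplica.lastCaughtUpTimeMs) > maxLagMs
    }
}
```

   With this shape, a broker id that is still listed in the ISR but absent from `assigned` yields `true`, whereas the previous `getReplicaOrException` call would have thrown for it.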
   
   What do you think @junrao?



