Posted to users@kafka.apache.org by Zach Thornton <zt...@hubspot.com.INVALID> on 2023/03/28 17:03:16 UTC

ISR expansion vs. shrink eligibility

Hello!

I was reading through the partition code, and I noticed that the criteria
for expanding an ISR
<https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/Partition.scala#L926-L958>
differ from the criteria for shrinking an ISR
<https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/Partition.scala#L1167-L1178>.

To summarize: a replica is considered eligible for expansion if its local
end offset is >= the leader's high watermark, but it is considered "out of
sync" if its local end offset != the leader's local end offset. It was a
bit surprising to me that the criteria would be different. Is there some
part of the picture that I'm missing?

Thanks in advance!

Zach

Re: ISR expansion vs. shrink eligibility

Posted by Haruki Okada <oc...@gmail.com>.

Hi.

So the question is about the difference between the leader LEO (shrink
criteria) and the leader HW (expand criteria), right?

1. Why the shrink criterion uses the leader LEO
Since HW is defined as "the latest offset that is replicated to all ISRs",
it can't be used to kick a replica out of the ISR set. (By that
definition, if we used HW here, a replica would never be out-of-sync even
when it lags, because HW is not advanced while an ISR member lags.)
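
To make point 1 concrete, here is a minimal sketch in Python. It is a simplified model, not Kafka's actual Scala code: FollowerState, is_out_of_sync and is_out_of_sync_if_hw_were_used are hypothetical names, and the lag-time condition is my reading of the linked shrink check.

```python
from dataclasses import dataclass


@dataclass
class FollowerState:
    log_end_offset: int          # follower's LEO as tracked by the leader
    last_caught_up_time_ms: int  # last time the follower had reached the leader's LEO


def is_out_of_sync(f: FollowerState, leader_leo: int, now_ms: int, max_lag_ms: int) -> bool:
    """Shrink check modeled on the leader LEO plus a lag-time allowance (assumed)."""
    caught_up = (f.log_end_offset == leader_leo
                 or now_ms - f.last_caught_up_time_ms <= max_lag_ms)
    return not caught_up


def is_out_of_sync_if_hw_were_used(f: FollowerState, high_watermark: int) -> bool:
    """Hypothetical alternative: compare the follower against HW instead."""
    return f.log_end_offset < high_watermark


# A follower that is still in the ISR but stopped fetching at offset 100.
stuck = FollowerState(log_end_offset=100, last_caught_up_time_ms=0)
leader_leo = 500
# HW is bounded by the slowest ISR member, so it stays at 100.
high_watermark = min(leader_leo, stuck.log_end_offset)

lagging_by_leo = is_out_of_sync(stuck, leader_leo, now_ms=60_000, max_lag_ms=30_000)
lagging_by_hw = is_out_of_sync_if_hw_were_used(stuck, high_watermark)
```

Here lagging_by_leo is true while lagging_by_hw is false: the HW-based check can never fire, because the stuck follower itself is what pins HW.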

2. Why the expand criterion uses HW
The expand criterion does not take replicaLagTime into consideration (
https://github.com/apache/kafka/blob/e28e0bf0f2c21206abccfffb280605dd02404678/core/src/main/scala/kafka/cluster/Partition.scala#L934-L936
).
So if we used the leader LEO here, an out-of-sync replica could only join
the ISR by catching up to the leader instantaneously after a message is
appended to the leader, which is almost impossible.
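
A small sketch of point 2, again as a simplified Python model with hypothetical function names. The real check in Partition.scala may include further conditions that are not modeled here.

```python
def is_eligible_by_hw(follower_leo: int, leader_hw: int) -> bool:
    """Expand check as described above: follower has reached the leader's HW."""
    return follower_leo >= leader_hw


def is_eligible_by_leo(follower_leo: int, leader_leo: int) -> bool:
    """Hypothetical stricter check: follower must match the leader's LEO exactly."""
    return follower_leo >= leader_leo


# The follower has fetched everything up to HW, but the leader has already
# appended five more messages since that fetch.
leader_hw = 1000
leader_leo = 1005
follower_leo = 1000

can_join_with_hw_check = is_eligible_by_hw(follower_leo, leader_hw)
can_join_with_leo_check = is_eligible_by_leo(follower_leo, leader_leo)
```

With a busy producer the leader LEO keeps moving between fetches, so the LEO-based check would keep failing even for a follower that is only one fetch behind.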

... Then I came up with another question: "Let's say min.insync.replicas =
1. In this case the leader can increment HW on its own, so won't the other
replicas never become in-sync?" => But I found that the leader holds back
HW for any replica that is "caught-up" (i.e. one that has caught up to the
leader within replicaLagTime):
https://github.com/apache/kafka/blob/e28e0bf0f2c21206abccfffb280605dd02404678/core/src/main/scala/kafka/cluster/Partition.scala#L1048-L1075
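
The HW behaviour described here can be sketched as follows. This is a simplified Python model under my own assumptions, not the linked Scala code: Follower and next_high_watermark are hypothetical names, and replica eligibility details are left out.

```python
from dataclasses import dataclass


@dataclass
class Follower:
    leo: int                     # follower's log end offset as tracked by the leader
    last_caught_up_time_ms: int  # last time it had reached the leader's LEO


def next_high_watermark(current_hw, leader_leo, followers, isr, now_ms, max_lag_ms):
    """Return the new HW: the minimum LEO over the leader, ISR members, and
    followers that are outside the ISR but caught up within max_lag_ms."""
    candidates = [leader_leo]
    for replica_id, f in followers.items():
        caught_up = (f.leo == leader_leo
                     or now_ms - f.last_caught_up_time_ms <= max_lag_ms)
        if replica_id in isr or caught_up:
            candidates.append(f.leo)
    # HW never moves backwards.
    return max(current_hw, min(candidates))


# Leader is broker 1 and the only ISR member; broker 2 is catching up.
isr = {1}
catching_up = {2: Follower(leo=90, last_caught_up_time_ms=95_000)}
stale = {2: Follower(leo=10, last_caught_up_time_ms=0)}

hw_with_catching_up = next_high_watermark(80, 100, catching_up, isr, 100_000, 30_000)
hw_with_stale = next_high_watermark(80, 100, stale, isr, 100_000, 30_000)
```

In this model the recently caught-up follower holds HW at 90, which gives it a chance to satisfy the expand check, while the stale follower is ignored and HW advances to the leader LEO of 100.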


-- 
========================
Okada Haruki
ocadaruma@gmail.com
========================