You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Calvin Liu <ca...@confluent.io.INVALID> on 2023/02/10 19:02:40 UTC

[VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Hi all,

I'd like to call for a vote on KIP-903, which proposes a fix to the broker
reboot data loss KAFKA-14139
<https://issues.apache.org/jira/browse/KAFKA-14139>
It changes the Fetch and AlterPartition requests to include the broker
epochs. Then the controller can use these epochs to help reject the stale
AlterPartition request.

KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR

Discussion thread:
https://lists.apache.org/thread/vk9h68qpqqq69nlzbvxqx2yfbmzcd0mo

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Calvin Liu <ca...@confluent.io.INVALID>.
The KIP is accepted with 3 binding votes (David, Jun, Jason) and 1
non-binding vote (Alexandre).
Thanks!

On Tue, Feb 28, 2023 at 3:06 PM Jason Gustafson <ja...@confluent.io.invalid>
wrote:

> Thanks Calvin, +1 from me.
>
> On Mon, Feb 27, 2023 at 9:41 AM Calvin Liu <ca...@confluent.io.invalid>
> wrote:
>
> > Hi Jason,
> > Updated the fields accordingly. Also, rename the BrokerState to
> > ReplicaState.
> > Thanks.
> >
> > On Wed, Feb 22, 2023 at 4:38 PM Jason Gustafson
> <jason@confluent.io.invalid
> > >
> > wrote:
> >
> > > Hi Calvin,
> > >
> > > The `BrokerState` struct I suggested would replace the `BrokerId` field
> > in
> > > older versions.
> > >
> > >     { "name": "ReplicaId", "type": "int32", "versions": "0-13",
> > > "entityType": "brokerId",
> > >       "about": "The broker ID of the follower, of -1 if this request is
> > > from a consumer." },
> > >     { "name": "BrokerState", "type": "BrokerState", "taggedVersions":
> > > "14+", "tag": 1, "fields": [
> > >       { "name": "BrokerId", "type": "int32", "versions": "14+",
> > > "entityType": "brokerId",
> > >         "about": "The broker ID of the follower, of -1 if this request
> is
> > > from a consumer." },
> > >       { "name": "BrokerEpoch", "type": "int64", "versions": "14+",
> > "about":
> > > "The epoch of this follower." }
> > >     ]},
> > >
> > > Note that the version range of `ReplicaId` is set to 0-13. Version 14
> > > onward would not include it.
> > >
> > > -Jason
> > >
> > > On Wed, Feb 22, 2023 at 12:07 PM Calvin Liu <caliu@confluent.io.invalid
> >
> > > wrote:
> > >
> > > > To Jose:
> > > > 1. Actually I have a second thoughts about the naming ReplicaEpoch.
> The
> > > > BrokerEpoch only applies to the replication protocol between the
> > brokers.
> > > > For others like the KRaft controller, this field can be ignored. So
> can
> > > we
> > > > change the name to ReplicaEpoch when we really use it in other
> > scenarios?
> > > >
> > > > On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io>
> > wrote:
> > > >
> > > > > To Jason:
> > > > > 1. Related to the Fetch Request fields change, previously you
> > suggested
> > > > > deprecating the ReplicaId and moving it into a BrokerState field.
> How
> > > > about
> > > > > we just make the BrokerEpoch a tag field?
> > > > > - The ReplicaId is currently in use and is filled every time. So
> that
> > > we
> > > > > can keep the way simple.
> > > > > - We can still make the optional BrokerEpoch out of the request
> when
> > it
> > > > is
> > > > > not needed.
> > > > >
> > > > > On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io>
> > > wrote:
> > > > >
> > > > >> To Jason:
> > > > >> 1. We can make the BrokerEpoch a tagged field. But I am not sure
> > about
> > > > >> your proposed metadata structure. In the BrokerState, do we need
> to
> > > > store
> > > > >> the BrokerId again? It would duplicate with ReplicaId.
> > > > >> 2. Considering that the broker reboot data loss case is rare and
> > Kraft
> > > > is
> > > > >> coming soon. Plus we need extra effort to
> > > > >> - Simply asking the controller to compare the epoch with its best
> > > > >> knowledge is not enough, because the ZK controller may not know
> the
> > > > latest
> > > > >> broker epoch,
> > > > >> - The current design only helps with the delayed AlterPartition
> > issue
> > > > >> when the broker reboots. In ZK mode, we also need to cover the
> > broker
> > > > >> reboot + controller reboot scenario. If the reboot broker is in
> ISR
> > > > >> already, the controller also crashes during the broker reboot, the
> > new
> > > > >> controller can be completely unaware of the bounced broker and
> > select
> > > > this
> > > > >> broker as the leader.
> > > > >> - Create a test framework to simulate the event sequence of broker
> > > > reboot
> > > > >> and registration, delayed AlterPartition request.
> > > > >>
> > > > >> To Jose:
> > > > >> 1. Thanks for the renaming advice. I will update the KIP later.
> > > > >> 2. Ack, will update.
> > > > >>
> > > > >>
> > > > >> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
> > > > >> <js...@confluent.io.invalid> wrote:
> > > > >>
> > > > >>> Hi Calvin,
> > > > >>>
> > > > >>> Thanks for the improvement.
> > > > >>>
> > > > >>> 1. In the KIP, you suggest changing the Fetch request to "Rename
> > the
> > > > >>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The
> Fetch
> > > > >>> RPC is used by replicas that are not brokers, for example
> > controllers
> > > > >>> in KRaft.
> > > > >>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both
> KRaft
> > > > >>> and ISR partitions have the concept of replica id and replica
> epoch
> > > > >>> but not necessarily the concept of a broker.
> > > > >>>
> > > > >>> 2. Since the new field "BrokerEpoch '' is ignorable, should it
> also
> > > > >>> have a default value? How about -1 since that is what you use in
> > > > >>> AlterPartittion RPC.
> > > > >>>
> > > > >>> --
> > > > >>> -José
> > > > >>>
> > > > >>
> > > >
> > >
> >
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Jason Gustafson <ja...@confluent.io.INVALID>.
Thanks Calvin, +1 from me.

On Mon, Feb 27, 2023 at 9:41 AM Calvin Liu <ca...@confluent.io.invalid>
wrote:

> Hi Jason,
> Updated the fields accordingly. Also, rename the BrokerState to
> ReplicaState.
> Thanks.
>
> On Wed, Feb 22, 2023 at 4:38 PM Jason Gustafson <jason@confluent.io.invalid
> >
> wrote:
>
> > Hi Calvin,
> >
> > The `BrokerState` struct I suggested would replace the `BrokerId` field
> in
> > older versions.
> >
> >     { "name": "ReplicaId", "type": "int32", "versions": "0-13",
> > "entityType": "brokerId",
> >       "about": "The broker ID of the follower, of -1 if this request is
> > from a consumer." },
> >     { "name": "BrokerState", "type": "BrokerState", "taggedVersions":
> > "14+", "tag": 1, "fields": [
> >       { "name": "BrokerId", "type": "int32", "versions": "14+",
> > "entityType": "brokerId",
> >         "about": "The broker ID of the follower, of -1 if this request is
> > from a consumer." },
> >       { "name": "BrokerEpoch", "type": "int64", "versions": "14+",
> "about":
> > "The epoch of this follower." }
> >     ]},
> >
> > Note that the version range of `ReplicaId` is set to 0-13. Version 14
> > onward would not include it.
> >
> > -Jason
> >
> > On Wed, Feb 22, 2023 at 12:07 PM Calvin Liu <ca...@confluent.io.invalid>
> > wrote:
> >
> > > To Jose:
> > > 1. Actually I have a second thoughts about the naming ReplicaEpoch. The
> > > BrokerEpoch only applies to the replication protocol between the
> brokers.
> > > For others like the KRaft controller, this field can be ignored. So can
> > we
> > > change the name to ReplicaEpoch when we really use it in other
> scenarios?
> > >
> > > On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io>
> wrote:
> > >
> > > > To Jason:
> > > > 1. Related to the Fetch Request fields change, previously you
> suggested
> > > > deprecating the ReplicaId and moving it into a BrokerState field. How
> > > about
> > > > we just make the BrokerEpoch a tag field?
> > > > - The ReplicaId is currently in use and is filled every time. So that
> > we
> > > > can keep the way simple.
> > > > - We can still make the optional BrokerEpoch out of the request when
> it
> > > is
> > > > not needed.
> > > >
> > > > On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io>
> > wrote:
> > > >
> > > >> To Jason:
> > > >> 1. We can make the BrokerEpoch a tagged field. But I am not sure
> about
> > > >> your proposed metadata structure. In the BrokerState, do we need to
> > > store
> > > >> the BrokerId again? It would duplicate with ReplicaId.
> > > >> 2. Considering that the broker reboot data loss case is rare and
> Kraft
> > > is
> > > >> coming soon. Plus we need extra effort to
> > > >> - Simply asking the controller to compare the epoch with its best
> > > >> knowledge is not enough, because the ZK controller may not know the
> > > latest
> > > >> broker epoch,
> > > >> - The current design only helps with the delayed AlterPartition
> issue
> > > >> when the broker reboots. In ZK mode, we also need to cover the
> broker
> > > >> reboot + controller reboot scenario. If the reboot broker is in ISR
> > > >> already, the controller also crashes during the broker reboot, the
> new
> > > >> controller can be completely unaware of the bounced broker and
> select
> > > this
> > > >> broker as the leader.
> > > >> - Create a test framework to simulate the event sequence of broker
> > > reboot
> > > >> and registration, delayed AlterPartition request.
> > > >>
> > > >> To Jose:
> > > >> 1. Thanks for the renaming advice. I will update the KIP later.
> > > >> 2. Ack, will update.
> > > >>
> > > >>
> > > >> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
> > > >> <js...@confluent.io.invalid> wrote:
> > > >>
> > > >>> Hi Calvin,
> > > >>>
> > > >>> Thanks for the improvement.
> > > >>>
> > > >>> 1. In the KIP, you suggest changing the Fetch request to "Rename
> the
> > > >>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
> > > >>> RPC is used by replicas that are not brokers, for example
> controllers
> > > >>> in KRaft.
> > > >>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
> > > >>> and ISR partitions have the concept of replica id and replica epoch
> > > >>> but not necessarily the concept of a broker.
> > > >>>
> > > >>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
> > > >>> have a default value? How about -1 since that is what you use in
> > > >>> AlterPartittion RPC.
> > > >>>
> > > >>> --
> > > >>> -José
> > > >>>
> > > >>
> > >
> >
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Calvin Liu <ca...@confluent.io.INVALID>.
Hi Jason,
Updated the fields accordingly. Also, rename the BrokerState to
ReplicaState.
Thanks.

On Wed, Feb 22, 2023 at 4:38 PM Jason Gustafson <ja...@confluent.io.invalid>
wrote:

> Hi Calvin,
>
> The `BrokerState` struct I suggested would replace the `BrokerId` field in
> older versions.
>
>     { "name": "ReplicaId", "type": "int32", "versions": "0-13",
> "entityType": "brokerId",
>       "about": "The broker ID of the follower, of -1 if this request is
> from a consumer." },
>     { "name": "BrokerState", "type": "BrokerState", "taggedVersions":
> "14+", "tag": 1, "fields": [
>       { "name": "BrokerId", "type": "int32", "versions": "14+",
> "entityType": "brokerId",
>         "about": "The broker ID of the follower, of -1 if this request is
> from a consumer." },
>       { "name": "BrokerEpoch", "type": "int64", "versions": "14+", "about":
> "The epoch of this follower." }
>     ]},
>
> Note that the version range of `ReplicaId` is set to 0-13. Version 14
> onward would not include it.
>
> -Jason
>
> On Wed, Feb 22, 2023 at 12:07 PM Calvin Liu <ca...@confluent.io.invalid>
> wrote:
>
> > To Jose:
> > 1. Actually I have a second thoughts about the naming ReplicaEpoch. The
> > BrokerEpoch only applies to the replication protocol between the brokers.
> > For others like the KRaft controller, this field can be ignored. So can
> we
> > change the name to ReplicaEpoch when we really use it in other scenarios?
> >
> > On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io> wrote:
> >
> > > To Jason:
> > > 1. Related to the Fetch Request fields change, previously you suggested
> > > deprecating the ReplicaId and moving it into a BrokerState field. How
> > about
> > > we just make the BrokerEpoch a tag field?
> > > - The ReplicaId is currently in use and is filled every time. So that
> we
> > > can keep the way simple.
> > > - We can still make the optional BrokerEpoch out of the request when it
> > is
> > > not needed.
> > >
> > > On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io>
> wrote:
> > >
> > >> To Jason:
> > >> 1. We can make the BrokerEpoch a tagged field. But I am not sure about
> > >> your proposed metadata structure. In the BrokerState, do we need to
> > store
> > >> the BrokerId again? It would duplicate with ReplicaId.
> > >> 2. Considering that the broker reboot data loss case is rare and Kraft
> > is
> > >> coming soon. Plus we need extra effort to
> > >> - Simply asking the controller to compare the epoch with its best
> > >> knowledge is not enough, because the ZK controller may not know the
> > latest
> > >> broker epoch,
> > >> - The current design only helps with the delayed AlterPartition issue
> > >> when the broker reboots. In ZK mode, we also need to cover the broker
> > >> reboot + controller reboot scenario. If the reboot broker is in ISR
> > >> already, the controller also crashes during the broker reboot, the new
> > >> controller can be completely unaware of the bounced broker and select
> > this
> > >> broker as the leader.
> > >> - Create a test framework to simulate the event sequence of broker
> > reboot
> > >> and registration, delayed AlterPartition request.
> > >>
> > >> To Jose:
> > >> 1. Thanks for the renaming advice. I will update the KIP later.
> > >> 2. Ack, will update.
> > >>
> > >>
> > >> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
> > >> <js...@confluent.io.invalid> wrote:
> > >>
> > >>> Hi Calvin,
> > >>>
> > >>> Thanks for the improvement.
> > >>>
> > >>> 1. In the KIP, you suggest changing the Fetch request to "Rename the
> > >>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
> > >>> RPC is used by replicas that are not brokers, for example controllers
> > >>> in KRaft.
> > >>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
> > >>> and ISR partitions have the concept of replica id and replica epoch
> > >>> but not necessarily the concept of a broker.
> > >>>
> > >>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
> > >>> have a default value? How about -1 since that is what you use in
> > >>> AlterPartittion RPC.
> > >>>
> > >>> --
> > >>> -José
> > >>>
> > >>
> >
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Jason Gustafson <ja...@confluent.io.INVALID>.
Hi Calvin,

The `BrokerState` struct I suggested would replace the `BrokerId` field in
older versions.

    { "name": "ReplicaId", "type": "int32", "versions": "0-13",
"entityType": "brokerId",
      "about": "The broker ID of the follower, of -1 if this request is
from a consumer." },
    { "name": "BrokerState", "type": "BrokerState", "taggedVersions":
"14+", "tag": 1, "fields": [
      { "name": "BrokerId", "type": "int32", "versions": "14+",
"entityType": "brokerId",
        "about": "The broker ID of the follower, of -1 if this request is
from a consumer." },
      { "name": "BrokerEpoch", "type": "int64", "versions": "14+", "about":
"The epoch of this follower." }
    ]},

Note that the version range of `ReplicaId` is set to 0-13. Version 14
onward would not include it.

-Jason

On Wed, Feb 22, 2023 at 12:07 PM Calvin Liu <ca...@confluent.io.invalid>
wrote:

> To Jose:
> 1. Actually I have a second thoughts about the naming ReplicaEpoch. The
> BrokerEpoch only applies to the replication protocol between the brokers.
> For others like the KRaft controller, this field can be ignored. So can we
> change the name to ReplicaEpoch when we really use it in other scenarios?
>
> On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io> wrote:
>
> > To Jason:
> > 1. Related to the Fetch Request fields change, previously you suggested
> > deprecating the ReplicaId and moving it into a BrokerState field. How
> about
> > we just make the BrokerEpoch a tag field?
> > - The ReplicaId is currently in use and is filled every time. So that we
> > can keep the way simple.
> > - We can still make the optional BrokerEpoch out of the request when it
> is
> > not needed.
> >
> > On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io> wrote:
> >
> >> To Jason:
> >> 1. We can make the BrokerEpoch a tagged field. But I am not sure about
> >> your proposed metadata structure. In the BrokerState, do we need to
> store
> >> the BrokerId again? It would duplicate with ReplicaId.
> >> 2. Considering that the broker reboot data loss case is rare and Kraft
> is
> >> coming soon. Plus we need extra effort to
> >> - Simply asking the controller to compare the epoch with its best
> >> knowledge is not enough, because the ZK controller may not know the
> latest
> >> broker epoch,
> >> - The current design only helps with the delayed AlterPartition issue
> >> when the broker reboots. In ZK mode, we also need to cover the broker
> >> reboot + controller reboot scenario. If the reboot broker is in ISR
> >> already, the controller also crashes during the broker reboot, the new
> >> controller can be completely unaware of the bounced broker and select
> this
> >> broker as the leader.
> >> - Create a test framework to simulate the event sequence of broker
> reboot
> >> and registration, delayed AlterPartition request.
> >>
> >> To Jose:
> >> 1. Thanks for the renaming advice. I will update the KIP later.
> >> 2. Ack, will update.
> >>
> >>
> >> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
> >> <js...@confluent.io.invalid> wrote:
> >>
> >>> Hi Calvin,
> >>>
> >>> Thanks for the improvement.
> >>>
> >>> 1. In the KIP, you suggest changing the Fetch request to "Rename the
> >>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
> >>> RPC is used by replicas that are not brokers, for example controllers
> >>> in KRaft.
> >>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
> >>> and ISR partitions have the concept of replica id and replica epoch
> >>> but not necessarily the concept of a broker.
> >>>
> >>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
> >>> have a default value? How about -1 since that is what you use in
> >>> AlterPartittion RPC.
> >>>
> >>> --
> >>> -José
> >>>
> >>
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Calvin Liu <ca...@confluent.io.INVALID>.
To Jose:
1. Actually I have a second thoughts about the naming ReplicaEpoch. The
BrokerEpoch only applies to the replication protocol between the brokers.
For others like the KRaft controller, this field can be ignored. So can we
change the name to ReplicaEpoch when we really use it in other scenarios?

On Wed, Feb 22, 2023 at 11:08 AM Calvin Liu <ca...@confluent.io> wrote:

> To Jason:
> 1. Related to the Fetch Request fields change, previously you suggested
> deprecating the ReplicaId and moving it into a BrokerState field. How about
> we just make the BrokerEpoch a tag field?
> - The ReplicaId is currently in use and is filled every time. So that we
> can keep the way simple.
> - We can still make the optional BrokerEpoch out of the request when it is
> not needed.
>
> On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io> wrote:
>
>> To Jason:
>> 1. We can make the BrokerEpoch a tagged field. But I am not sure about
>> your proposed metadata structure. In the BrokerState, do we need to store
>> the BrokerId again? It would duplicate with ReplicaId.
>> 2. Considering that the broker reboot data loss case is rare and Kraft is
>> coming soon. Plus we need extra effort to
>> - Simply asking the controller to compare the epoch with its best
>> knowledge is not enough, because the ZK controller may not know the latest
>> broker epoch,
>> - The current design only helps with the delayed AlterPartition issue
>> when the broker reboots. In ZK mode, we also need to cover the broker
>> reboot + controller reboot scenario. If the reboot broker is in ISR
>> already, the controller also crashes during the broker reboot, the new
>> controller can be completely unaware of the bounced broker and select this
>> broker as the leader.
>> - Create a test framework to simulate the event sequence of broker reboot
>> and registration, delayed AlterPartition request.
>>
>> To Jose:
>> 1. Thanks for the renaming advice. I will update the KIP later.
>> 2. Ack, will update.
>>
>>
>> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
>> <js...@confluent.io.invalid> wrote:
>>
>>> Hi Calvin,
>>>
>>> Thanks for the improvement.
>>>
>>> 1. In the KIP, you suggest changing the Fetch request to "Rename the
>>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
>>> RPC is used by replicas that are not brokers, for example controllers
>>> in KRaft.
>>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
>>> and ISR partitions have the concept of replica id and replica epoch
>>> but not necessarily the concept of a broker.
>>>
>>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
>>> have a default value? How about -1 since that is what you use in
>>> AlterPartittion RPC.
>>>
>>> --
>>> -José
>>>
>>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Calvin Liu <ca...@confluent.io.INVALID>.
To Jason:
1. Related to the Fetch Request fields change, previously you suggested
deprecating the ReplicaId and moving it into a BrokerState field. How about
we just make the BrokerEpoch a tag field?
- The ReplicaId is currently in use and is filled every time. So that we
can keep the way simple.
- We can still make the optional BrokerEpoch out of the request when it is
not needed.

On Tue, Feb 21, 2023 at 10:39 PM Calvin Liu <ca...@confluent.io> wrote:

> To Jason:
> 1. We can make the BrokerEpoch a tagged field. But I am not sure about
> your proposed metadata structure. In the BrokerState, do we need to store
> the BrokerId again? It would duplicate with ReplicaId.
> 2. Considering that the broker reboot data loss case is rare and Kraft is
> coming soon. Plus we need extra effort to
> - Simply asking the controller to compare the epoch with its best
> knowledge is not enough, because the ZK controller may not know the latest
> broker epoch,
> - The current design only helps with the delayed AlterPartition issue when
> the broker reboots. In ZK mode, we also need to cover the broker reboot +
> controller reboot scenario. If the reboot broker is in ISR already, the
> controller also crashes during the broker reboot, the new controller can be
> completely unaware of the bounced broker and select this broker as the
> leader.
> - Create a test framework to simulate the event sequence of broker reboot
> and registration, delayed AlterPartition request.
>
> To Jose:
> 1. Thanks for the renaming advice. I will update the KIP later.
> 2. Ack, will update.
>
>
> On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
> <js...@confluent.io.invalid> wrote:
>
>> Hi Calvin,
>>
>> Thanks for the improvement.
>>
>> 1. In the KIP, you suggest changing the Fetch request to "Rename the
>> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
>> RPC is used by replicas that are not brokers, for example controllers
>> in KRaft.
>> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
>> and ISR partitions have the concept of replica id and replica epoch
>> but not necessarily the concept of a broker.
>>
>> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
>> have a default value? How about -1 since that is what you use in
>> AlterPartittion RPC.
>>
>> --
>> -José
>>
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Calvin Liu <ca...@confluent.io.INVALID>.
To Jason:
1. We can make the BrokerEpoch a tagged field. But I am not sure about your
proposed metadata structure. In the BrokerState, do we need to store the
BrokerId again? It would duplicate with ReplicaId.
2. Considering that the broker reboot data loss case is rare and Kraft is
coming soon. Plus we need extra effort to
- Simply asking the controller to compare the epoch with its best knowledge
is not enough, because the ZK controller may not know the latest broker
epoch,
- The current design only helps with the delayed AlterPartition issue when
the broker reboots. In ZK mode, we also need to cover the broker reboot +
controller reboot scenario. If the reboot broker is in ISR already, the
controller also crashes during the broker reboot, the new controller can be
completely unaware of the bounced broker and select this broker as the
leader.
- Create a test framework to simulate the event sequence of broker reboot
and registration, delayed AlterPartition request.

To Jose:
1. Thanks for the renaming advice. I will update the KIP later.
2. Ack, will update.


On Tue, Feb 21, 2023 at 2:49 PM José Armando García Sancio
<js...@confluent.io.invalid> wrote:

> Hi Calvin,
>
> Thanks for the improvement.
>
> 1. In the KIP, you suggest changing the Fetch request to "Rename the
> ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
> RPC is used by replicas that are not brokers, for example controllers
> in KRaft.
> Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
> and ISR partitions have the concept of replica id and replica epoch
> but not necessarily the concept of a broker.
>
> 2. Since the new field "BrokerEpoch '' is ignorable, should it also
> have a default value? How about -1 since that is what you use in
> AlterPartittion RPC.
>
> --
> -José
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by José Armando García Sancio <js...@confluent.io.INVALID>.
Hi Calvin,

Thanks for the improvement.

1. In the KIP, you suggest changing the Fetch request to "Rename the
ReplicaId to BrokerId" and "Add a new Field BrokerEpoch". The Fetch
RPC is used by replicas that are not brokers, for example controllers
in KRaft.
Can we keep the name "ReplicaId" and use "ReplicaEpoch". Both KRaft
and ISR partitions have the concept of replica id and replica epoch
but not necessarily the concept of a broker.

2. Since the new field "BrokerEpoch '' is ignorable, should it also
have a default value? How about -1 since that is what you use in
AlterPartittion RPC.

-- 
-José

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Jason Gustafson <ja...@confluent.io.INVALID>.
Hi Calvin, thanks for the KIP! A couple questions below:

- Would it make sense to make the broker epoch a tagged field inside the
Fetch request? It is a little annoying to increase the size of consumer
fetch requests for a field that they will not use. Perhaps we could factor
out broker id and broker epoch into a tagged struct:

    { "name": "ReplicaId", "type": "int32", "versions": "0-13",
"entityType": "brokerId",
      "about": "The broker ID of the follower, of -1 if this request is
from a consumer." },
    { "name": "BrokerState", "type": "BrokerState", "taggedVersions":
"14+", "tag": 1, "fields": [
      { "name": "BrokerId", "type": "int32", "versions": "14+",
"entityType": "brokerId",
        "about": "The broker ID of the follower, of -1 if this request is
from a consumer." },
      { "name": "BrokerEpoch", "type": "int64", "versions": "14+", "about":
"The epoch of this follower." }
    ]},

- The KIP mentions that this will only be implemented for kraft. I guess
it's probably easier to send along the broker epoch in any case in both
Fetch and AlterPartition? Are we saving much by not implementing the logic
in the old controller? I don't think it involves any new state.

Thanks,
Jason

On Tue, Feb 14, 2023 at 12:40 AM David Jacot <dj...@confluent.io.invalid>
wrote:

> +1 (binding). Thanks for the KIP, Calvin!
>
> On Tue, Feb 14, 2023 at 1:10 AM Jun Rao <ju...@confluent.io.invalid> wrote:
>
> > Hi, Calvin,
> >
> > Thanks for the KIP. +1
> >
> > Jun
> >
> > On Fri, Feb 10, 2023 at 11:03 AM Alexandre Dupriez <
> > alexandre.dupriez@gmail.com> wrote:
> >
> > > +1 (non-binding).
> > >
> > > Le ven. 10 févr. 2023 à 19:02, Calvin Liu <ca...@confluent.io.invalid>
> a
> > > écrit :
> > > >
> > > > Hi all,
> > > >
> > > > I'd like to call for a vote on KIP-903, which proposes a fix to the
> > > broker
> > > > reboot data loss KAFKA-14139
> > > > <https://issues.apache.org/jira/browse/KAFKA-14139>
> > > > It changes the Fetch and AlterPartition requests to include the
> broker
> > > > epochs. Then the controller can use these epochs to help reject the
> > stale
> > > > AlterPartition request.
> > > >
> > > > KIP:
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR
> > > >
> > > > Discussion thread:
> > > > https://lists.apache.org/thread/vk9h68qpqqq69nlzbvxqx2yfbmzcd0mo
> > >
> >
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by David Jacot <dj...@confluent.io.INVALID>.
+1 (binding). Thanks for the KIP, Calvin!

On Tue, Feb 14, 2023 at 1:10 AM Jun Rao <ju...@confluent.io.invalid> wrote:

> Hi, Calvin,
>
> Thanks for the KIP. +1
>
> Jun
>
> On Fri, Feb 10, 2023 at 11:03 AM Alexandre Dupriez <
> alexandre.dupriez@gmail.com> wrote:
>
> > +1 (non-binding).
> >
> > Le ven. 10 févr. 2023 à 19:02, Calvin Liu <ca...@confluent.io.invalid> a
> > écrit :
> > >
> > > Hi all,
> > >
> > > I'd like to call for a vote on KIP-903, which proposes a fix to the
> > broker
> > > reboot data loss KAFKA-14139
> > > <https://issues.apache.org/jira/browse/KAFKA-14139>
> > > It changes the Fetch and AlterPartition requests to include the broker
> > > epochs. Then the controller can use these epochs to help reject the
> stale
> > > AlterPartition request.
> > >
> > > KIP:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR
> > >
> > > Discussion thread:
> > > https://lists.apache.org/thread/vk9h68qpqqq69nlzbvxqx2yfbmzcd0mo
> >
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Jun Rao <ju...@confluent.io.INVALID>.
Hi, Calvin,

Thanks for the KIP. +1

Jun

On Fri, Feb 10, 2023 at 11:03 AM Alexandre Dupriez <
alexandre.dupriez@gmail.com> wrote:

> +1 (non-binding).
>
> Le ven. 10 févr. 2023 à 19:02, Calvin Liu <ca...@confluent.io.invalid> a
> écrit :
> >
> > Hi all,
> >
> > I'd like to call for a vote on KIP-903, which proposes a fix to the
> broker
> > reboot data loss KAFKA-14139
> > <https://issues.apache.org/jira/browse/KAFKA-14139>
> > It changes the Fetch and AlterPartition requests to include the broker
> > epochs. Then the controller can use these epochs to help reject the stale
> > AlterPartition request.
> >
> > KIP:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR
> >
> > Discussion thread:
> > https://lists.apache.org/thread/vk9h68qpqqq69nlzbvxqx2yfbmzcd0mo
>

Re: [VOTE] KIP-903: Replicas with stale broker epoch should not be allowed to join the ISR

Posted by Alexandre Dupriez <al...@gmail.com>.
+1 (non-binding).

Le ven. 10 févr. 2023 à 19:02, Calvin Liu <ca...@confluent.io.invalid> a écrit :
>
> Hi all,
>
> I'd like to call for a vote on KIP-903, which proposes a fix to the broker
> reboot data loss KAFKA-14139
> <https://issues.apache.org/jira/browse/KAFKA-14139>
> It changes the Fetch and AlterPartition requests to include the broker
> epochs. Then the controller can use these epochs to help reject the stale
> AlterPartition request.
>
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR
>
> Discussion thread:
> https://lists.apache.org/thread/vk9h68qpqqq69nlzbvxqx2yfbmzcd0mo