You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Luke Chen <sh...@gmail.com> on 2021/12/01 00:36:32 UTC

[VOTE] KIP-792: Add "generation" field into consumer protocol

Hi all,

I'd like to start the vote for KIP-792: Add "generation" field into
consumer protocol.

The goal of this KIP is to allow the assignor/consumer coordinator to have
a way to identify the out-of-date members/assignments, to avoid rebalance
stuck issues in current protocol.

Detailed description can be found here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614

Any feedback is welcome.

Thank you.
Luke

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Thanks David!

So, with 3 binding +1 votes (David, Tom, Guozhang), the vote passes.

Thanks everyone!

Luke

On Thu, Mar 3, 2022 at 4:34 PM David Jacot <dj...@confluent.io.invalid>
wrote:

> +1 (binding). Thanks for the KIP!
>
> On Fri, Jan 28, 2022 at 6:13 PM Tom Bentley <tb...@redhat.com> wrote:
> >
> > Hi Luke,
> >
> > Thanks for the KIP, +1 (binding).
> >
> > Kind regards,
> >
> > Tom
> >
> > On Wed, 19 Jan 2022 at 13:16, Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Bump this thread to see if there are other comments to this KIP.
> > > So far, I have one +1 vote (binding), and need more votes.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Dec 21, 2021 at 10:33 AM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Bump this thread to see if there are other comments to this KIP.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Fri, Dec 17, 2021 at 1:27 AM Colin McCabe <cm...@apache.org>
> wrote:
> > > >
> > > >> Thanks for the explanation, Luke. That makes sense.
> > > >>
> > > >> best,
> > > >> Colin
> > > >>
> > > >> On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
> > > >> > Thanks Luke, in that case I'm +1 on this KIP.
> > > >> >
> > > >> > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com>
> wrote:
> > > >> >
> > > >> >> Hi Guozhang,
> > > >> >>
> > > >> >> Thanks for your comment.
> > > >> >>
> > > >> >> > we need to make sure the old-versioned leader would be able to
> > > >> ignore the
> > > >> >> new
> > > >> >> field during an upgrade e.g. without crashing.
> > > >> >>
> > > >> >> Yes, I understand. I'll be careful to make sure it won't crash
> the
> > > old
> > > >> >> versioned leader.
> > > >> >> But basically, it won't, because we appended the new field into
> the
> > > >> last of
> > > >> >> the ConsumerProtocolSubscription, which means, when
> read/deserialize
> > > >> the
> > > >> >> Subscription metadata, the old versioned leader will just read
> the
> > > head
> > > >> >> part of the data.
> > > >> >>
> > > >> >> Thanks for the reminder!
> > > >> >>
> > > >> >> Luke
> > > >> >>
> > > >> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wangguoz@gmail.com
> >
> > > >> wrote:
> > > >> >>
> > > >> >> > Hi Luke,
> > > >> >> >
> > > >> >> > Thanks for the KIP.
> > > >> >> >
> > > >> >> > One thing I'd like to double check is that, since the
> > > >> >> > ConsumerProtocolSubscription is not auto generated from the
> json
> > > >> file, we
> > > >> >> > need to make sure the old-versioned leader would be able to
> ignore
> > > >> the
> > > >> >> new
> > > >> >> > field during an upgrade e.g. without crashing. Other than
> that, the
> > > >> KIP
> > > >> >> > lgtm.
> > > >> >> >
> > > >> >> > Guozhang
> > > >> >> >
> > > >> >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com>
> > > wrote:
> > > >> >> >
> > > >> >> > > Hi Colin,
> > > >> >> > >
> > > >> >> > > I'm not quite sure if I understand your thoughts correctly.
> > > >> >> > > If I was wrong, please let me know.
> > > >> >> > >
> > > >> >> > > Also, I'm not quite sure how I could lock this feature to a
> new
> > > IBP
> > > >> >> > > version.
> > > >> >> > > I saw "KIP-584: Versioning scheme for features" is still
> under
> > > >> >> > development.
> > > >> >> > > Not sure if I need to lock the IBP version, how should I do?
> > > >> >> > >
> > > >> >> > > Thank you.
> > > >> >> > > Luke
> > > >> >> > >
> > > >> >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com>
> > > >> wrote:
> > > >> >> > >
> > > >> >> > > > Hi Colin,
> > > >> >> > > >
> > > >> >> > > > Thanks for your comments. I've updated the KIP to mention
> about
> > > >> the
> > > >> >> KIP
> > > >> >> > > > won't affect current broker side behavior.
> > > >> >> > > >
> > > >> >> > > > > One scenario that we need to consider is what happens
> during
> > > a
> > > >> >> > rolling
> > > >> >> > > > upgrade. If the coordinator moves back and forth between
> > > brokers
> > > >> with
> > > >> >> > > > different IBPs, it seems that the same epoch numbers could
> be
> > > >> reused
> > > >> >> > for
> > > >> >> > > a
> > > >> >> > > > group, if things are done in the obvious manner (old IBP =
> > > don't
> > > >> read
> > > >> >> > or
> > > >> >> > > > write epoch, new IBP = do)
> > > >> >> > > >
> > > >> >> > > > I think this KIP doesn't care about the group epoch number
> at
> > > >> all.
> > > >> >> The
> > > >> >> > > > subscription metadata is passed from each member to group
> > > >> >> coordinator,
> > > >> >> > > and
> > > >> >> > > > then the group coordinator pass all of them back to the
> > > consumer
> > > >> >> lead.
> > > >> >> > So
> > > >> >> > > > even if the epoch number is reused in a group, it should be
> > > >> fine. On
> > > >> >> > the
> > > >> >> > > > other hand, the group coordinator will have no idea if the
> join
> > > >> group
> > > >> >> > > > request sent from consumer containing the new subscription
> > > >> >> "generation"
> > > >> >> > > > field or not, because group coordinator won't deserialize
> the
> > > >> >> metadata.
> > > >> >> > > >
> > > >> >> > > > I've added also added them into the KIP.
> > > >> >> > > >
> > > >> >> > > > Thank you.
> > > >> >> > > > Luke
> > > >> >> > > >
> > > >> >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <
> > > cmccabe@apache.org
> > > >> >
> > > >> >> > wrote:
> > > >> >> > > >
> > > >> >> > > >> Hi Luke,
> > > >> >> > > >>
> > > >> >> > > >> Thanks for the explanation.
> > > >> >> > > >>
> > > >> >> > > >> I don't see any description of how the broker decides to
> use
> > > >> the new
> > > >> >> > > >> version of ConsumerProtocolSubscription or not. This
> probably
> > > >> needs
> > > >> >> to
> > > >> >> > > be
> > > >> >> > > >> locked to a new IBP version.
> > > >> >> > > >>
> > > >> >> > > >> One scenario that we need to consider is what happens
> during a
> > > >> >> rolling
> > > >> >> > > >> upgrade. If the coordinator moves back and forth between
> > > brokers
> > > >> >> with
> > > >> >> > > >> different IBPs, it seems that the same epoch numbers
> could be
> > > >> reused
> > > >> >> > > for a
> > > >> >> > > >> group, if things are done in the obvious manner (old IBP =
> > > don't
> > > >> >> read
> > > >> >> > or
> > > >> >> > > >> write epoch, new IBP = do).
> > > >> >> > > >>
> > > >> >> > > >> best,
> > > >> >> > > >> Colin
> > > >> >> > > >>
> > > >> >> > > >>
> > > >> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > > >> >> > > >> > Hi Colin,
> > > >> >> > > >> > Thanks for your comment.
> > > >> >> > > >> >
> > > >> >> > > >> >> How are we going to avoid the situation where the
> broker
> > > >> >> restarts,
> > > >> >> > > and
> > > >> >> > > >> > the same generation number is reused?
> > > >> >> > > >> >
> > > >> >> > > >> > Actually, this KIP doesn't have anything to do with the
> > > >> brokers.
> > > >> >> The
> > > >> >> > > >> > "generation" field I added, is in the subscription
> metadata,
> > > >> which
> > > >> >> > > will
> > > >> >> > > >> not
> > > >> >> > > >> > be deserialized by brokers. The metadata is only
> > > deserialized
> > > >> by
> > > >> >> > > >> consumer
> > > >> >> > > >> > lead. And for the consumer lead, the only thing the lead
> > > cared
> > > >> >> > about,
> > > >> >> > > is
> > > >> >> > > >> > the highest generation of the ownedPartitions among all
> the
> > > >> >> > consumers.
> > > >> >> > > >> With
> > > >> >> > > >> > the highest generation of the ownedPartitions, the
> consumer
> > > >> lead
> > > >> >> can
> > > >> >> > > >> > distribute the partitions as sticky as possible, and
> most
> > > >> >> > importantly,
> > > >> >> > > >> > without errors.
> > > >> >> > > >> >
> > > >> >> > > >> > That is, after this KIP, if the broker restarts, and the
> > > same
> > > >> >> > > generation
> > > >> >> > > >> > number is reused, it won't break current rebalance
> behavior.
> > > >> But
> > > >> >> > it'll
> > > >> >> > > >> help
> > > >> >> > > >> > the consumer lead do the sticky assignments correctly.
> > > >> >> > > >> >
> > > >> >> > > >> > Thank you.
> > > >> >> > > >> > Luke
> > > >> >> > > >> >
> > > >> >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <
> > > >> colin@cmccabe.xyz>
> > > >> >> > > wrote:
> > > >> >> > > >> >
> > > >> >> > > >> >> How are we going to avoid the situation where the
> broker
> > > >> >> restarts,
> > > >> >> > > and
> > > >> >> > > >> the
> > > >> >> > > >> >> same generation number is reused?
> > > >> >> > > >> >>
> > > >> >> > > >> >> best,
> > > >> >> > > >> >> Colin
> > > >> >> > > >> >>
> > > >> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> > > >> >> > > >> >> > Hi all,
> > > >> >> > > >> >> >
> > > >> >> > > >> >> > I'd like to start the vote for KIP-792: Add
> "generation"
> > > >> field
> > > >> >> > into
> > > >> >> > > >> >> > consumer protocol.
> > > >> >> > > >> >> >
> > > >> >> > > >> >> > The goal of this KIP is to allow the
> assignor/consumer
> > > >> >> > coordinator
> > > >> >> > > to
> > > >> >> > > >> >> have
> > > >> >> > > >> >> > a way to identify the out-of-date
> members/assignments, to
> > > >> avoid
> > > >> >> > > >> rebalance
> > > >> >> > > >> >> > stuck issues in current protocol.
> > > >> >> > > >> >> >
> > > >> >> > > >> >> > Detailed description can be found here:
> > > >> >> > > >> >> >
> > > >> >> > > >> >>
> > > >> >> > > >>
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > >> >> > > >> >> >
> > > >> >> > > >> >> > Any feedback is welcome.
> > > >> >> > > >> >> >
> > > >> >> > > >> >> > Thank you.
> > > >> >> > > >> >> > Luke
> > > >> >> > > >> >>
> > > >> >> > > >>
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >> >
> > > >> >> > --
> > > >> >> > -- Guozhang
> > > >> >> >
> > > >> >>
> > > >> >
> > > >> >
> > > >> > --
> > > >> > -- Guozhang
> > > >>
> > > >
> > >
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
+1 (binding). Thanks for the KIP!

On Fri, Jan 28, 2022 at 6:13 PM Tom Bentley <tb...@redhat.com> wrote:
>
> Hi Luke,
>
> Thanks for the KIP, +1 (binding).
>
> Kind regards,
>
> Tom
>
> On Wed, 19 Jan 2022 at 13:16, Luke Chen <sh...@gmail.com> wrote:
>
> > Hi all,
> >
> > Bump this thread to see if there are other comments to this KIP.
> > So far, I have one +1 vote (binding), and need more votes.
> >
> > Thank you.
> > Luke
> >
> > On Tue, Dec 21, 2021 at 10:33 AM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Bump this thread to see if there are other comments to this KIP.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Fri, Dec 17, 2021 at 1:27 AM Colin McCabe <cm...@apache.org> wrote:
> > >
> > >> Thanks for the explanation, Luke. That makes sense.
> > >>
> > >> best,
> > >> Colin
> > >>
> > >> On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
> > >> > Thanks Luke, in that case I'm +1 on this KIP.
> > >> >
> > >> > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:
> > >> >
> > >> >> Hi Guozhang,
> > >> >>
> > >> >> Thanks for your comment.
> > >> >>
> > >> >> > we need to make sure the old-versioned leader would be able to
> > >> ignore the
> > >> >> new
> > >> >> field during an upgrade e.g. without crashing.
> > >> >>
> > >> >> Yes, I understand. I'll be careful to make sure it won't crash the
> > old
> > >> >> versioned leader.
> > >> >> But basically, it won't, because we appended the new field into the
> > >> last of
> > >> >> the ConsumerProtocolSubscription, which means, when read/deserialize
> > >> the
> > >> >> Subscription metadata, the old versioned leader will just read the
> > head
> > >> >> part of the data.
> > >> >>
> > >> >> Thanks for the reminder!
> > >> >>
> > >> >> Luke
> > >> >>
> > >> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com>
> > >> wrote:
> > >> >>
> > >> >> > Hi Luke,
> > >> >> >
> > >> >> > Thanks for the KIP.
> > >> >> >
> > >> >> > One thing I'd like to double check is that, since the
> > >> >> > ConsumerProtocolSubscription is not auto generated from the json
> > >> file, we
> > >> >> > need to make sure the old-versioned leader would be able to ignore
> > >> the
> > >> >> new
> > >> >> > field during an upgrade e.g. without crashing. Other than that, the
> > >> KIP
> > >> >> > lgtm.
> > >> >> >
> > >> >> > Guozhang
> > >> >> >
> > >> >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com>
> > wrote:
> > >> >> >
> > >> >> > > Hi Colin,
> > >> >> > >
> > >> >> > > I'm not quite sure if I understand your thoughts correctly.
> > >> >> > > If I was wrong, please let me know.
> > >> >> > >
> > >> >> > > Also, I'm not quite sure how I could lock this feature to a new
> > IBP
> > >> >> > > version.
> > >> >> > > I saw "KIP-584: Versioning scheme for features" is still under
> > >> >> > development.
> > >> >> > > Not sure if I need to lock the IBP version, how should I do?
> > >> >> > >
> > >> >> > > Thank you.
> > >> >> > > Luke
> > >> >> > >
> > >> >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com>
> > >> wrote:
> > >> >> > >
> > >> >> > > > Hi Colin,
> > >> >> > > >
> > >> >> > > > Thanks for your comments. I've updated the KIP to mention about
> > >> the
> > >> >> KIP
> > >> >> > > > won't affect current broker side behavior.
> > >> >> > > >
> > >> >> > > > > One scenario that we need to consider is what happens during
> > a
> > >> >> > rolling
> > >> >> > > > upgrade. If the coordinator moves back and forth between
> > brokers
> > >> with
> > >> >> > > > different IBPs, it seems that the same epoch numbers could be
> > >> reused
> > >> >> > for
> > >> >> > > a
> > >> >> > > > group, if things are done in the obvious manner (old IBP =
> > don't
> > >> read
> > >> >> > or
> > >> >> > > > write epoch, new IBP = do)
> > >> >> > > >
> > >> >> > > > I think this KIP doesn't care about the group epoch number at
> > >> all.
> > >> >> The
> > >> >> > > > subscription metadata is passed from each member to group
> > >> >> coordinator,
> > >> >> > > and
> > >> >> > > > then the group coordinator pass all of them back to the
> > consumer
> > >> >> lead.
> > >> >> > So
> > >> >> > > > even if the epoch number is reused in a group, it should be
> > >> fine. On
> > >> >> > the
> > >> >> > > > other hand, the group coordinator will have no idea if the join
> > >> group
> > >> >> > > > request sent from consumer containing the new subscription
> > >> >> "generation"
> > >> >> > > > field or not, because group coordinator won't deserialize the
> > >> >> metadata.
> > >> >> > > >
> > >> >> > > > I've added also added them into the KIP.
> > >> >> > > >
> > >> >> > > > Thank you.
> > >> >> > > > Luke
> > >> >> > > >
> > >> >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <
> > cmccabe@apache.org
> > >> >
> > >> >> > wrote:
> > >> >> > > >
> > >> >> > > >> Hi Luke,
> > >> >> > > >>
> > >> >> > > >> Thanks for the explanation.
> > >> >> > > >>
> > >> >> > > >> I don't see any description of how the broker decides to use
> > >> the new
> > >> >> > > >> version of ConsumerProtocolSubscription or not. This probably
> > >> needs
> > >> >> to
> > >> >> > > be
> > >> >> > > >> locked to a new IBP version.
> > >> >> > > >>
> > >> >> > > >> One scenario that we need to consider is what happens during a
> > >> >> rolling
> > >> >> > > >> upgrade. If the coordinator moves back and forth between
> > brokers
> > >> >> with
> > >> >> > > >> different IBPs, it seems that the same epoch numbers could be
> > >> reused
> > >> >> > > for a
> > >> >> > > >> group, if things are done in the obvious manner (old IBP =
> > don't
> > >> >> read
> > >> >> > or
> > >> >> > > >> write epoch, new IBP = do).
> > >> >> > > >>
> > >> >> > > >> best,
> > >> >> > > >> Colin
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > >> >> > > >> > Hi Colin,
> > >> >> > > >> > Thanks for your comment.
> > >> >> > > >> >
> > >> >> > > >> >> How are we going to avoid the situation where the broker
> > >> >> restarts,
> > >> >> > > and
> > >> >> > > >> > the same generation number is reused?
> > >> >> > > >> >
> > >> >> > > >> > Actually, this KIP doesn't have anything to do with the
> > >> brokers.
> > >> >> The
> > >> >> > > >> > "generation" field I added, is in the subscription metadata,
> > >> which
> > >> >> > > will
> > >> >> > > >> not
> > >> >> > > >> > be deserialized by brokers. The metadata is only
> > deserialized
> > >> by
> > >> >> > > >> consumer
> > >> >> > > >> > lead. And for the consumer lead, the only thing the lead
> > cared
> > >> >> > about,
> > >> >> > > is
> > >> >> > > >> > the highest generation of the ownedPartitions among all the
> > >> >> > consumers.
> > >> >> > > >> With
> > >> >> > > >> > the highest generation of the ownedPartitions, the consumer
> > >> lead
> > >> >> can
> > >> >> > > >> > distribute the partitions as sticky as possible, and most
> > >> >> > importantly,
> > >> >> > > >> > without errors.
> > >> >> > > >> >
> > >> >> > > >> > That is, after this KIP, if the broker restarts, and the
> > same
> > >> >> > > generation
> > >> >> > > >> > number is reused, it won't break current rebalance behavior.
> > >> But
> > >> >> > it'll
> > >> >> > > >> help
> > >> >> > > >> > the consumer lead do the sticky assignments correctly.
> > >> >> > > >> >
> > >> >> > > >> > Thank you.
> > >> >> > > >> > Luke
> > >> >> > > >> >
> > >> >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <
> > >> colin@cmccabe.xyz>
> > >> >> > > wrote:
> > >> >> > > >> >
> > >> >> > > >> >> How are we going to avoid the situation where the broker
> > >> >> restarts,
> > >> >> > > and
> > >> >> > > >> the
> > >> >> > > >> >> same generation number is reused?
> > >> >> > > >> >>
> > >> >> > > >> >> best,
> > >> >> > > >> >> Colin
> > >> >> > > >> >>
> > >> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> > >> >> > > >> >> > Hi all,
> > >> >> > > >> >> >
> > >> >> > > >> >> > I'd like to start the vote for KIP-792: Add "generation"
> > >> field
> > >> >> > into
> > >> >> > > >> >> > consumer protocol.
> > >> >> > > >> >> >
> > >> >> > > >> >> > The goal of this KIP is to allow the assignor/consumer
> > >> >> > coordinator
> > >> >> > > to
> > >> >> > > >> >> have
> > >> >> > > >> >> > a way to identify the out-of-date members/assignments, to
> > >> avoid
> > >> >> > > >> rebalance
> > >> >> > > >> >> > stuck issues in current protocol.
> > >> >> > > >> >> >
> > >> >> > > >> >> > Detailed description can be found here:
> > >> >> > > >> >> >
> > >> >> > > >> >>
> > >> >> > > >>
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > >> >> > > >> >> >
> > >> >> > > >> >> > Any feedback is welcome.
> > >> >> > > >> >> >
> > >> >> > > >> >> > Thank you.
> > >> >> > > >> >> > Luke
> > >> >> > > >> >>
> > >> >> > > >>
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >> >
> > >> >> > --
> > >> >> > -- Guozhang
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >> > --
> > >> > -- Guozhang
> > >>
> > >
> >

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Tom Bentley <tb...@redhat.com>.
Hi Luke,

Thanks for the KIP, +1 (binding).

Kind regards,

Tom

On Wed, 19 Jan 2022 at 13:16, Luke Chen <sh...@gmail.com> wrote:

> Hi all,
>
> Bump this thread to see if there are other comments to this KIP.
> So far, I have one +1 vote (binding), and need more votes.
>
> Thank you.
> Luke
>
> On Tue, Dec 21, 2021 at 10:33 AM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi all,
> >
> > Bump this thread to see if there are other comments to this KIP.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Dec 17, 2021 at 1:27 AM Colin McCabe <cm...@apache.org> wrote:
> >
> >> Thanks for the explanation, Luke. That makes sense.
> >>
> >> best,
> >> Colin
> >>
> >> On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
> >> > Thanks Luke, in that case I'm +1 on this KIP.
> >> >
> >> > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:
> >> >
> >> >> Hi Guozhang,
> >> >>
> >> >> Thanks for your comment.
> >> >>
> >> >> > we need to make sure the old-versioned leader would be able to
> >> ignore the
> >> >> new
> >> >> field during an upgrade e.g. without crashing.
> >> >>
> >> >> Yes, I understand. I'll be careful to make sure it won't crash the
> old
> >> >> versioned leader.
> >> >> But basically, it won't, because we appended the new field into the
> >> last of
> >> >> the ConsumerProtocolSubscription, which means, when read/deserialize
> >> the
> >> >> Subscription metadata, the old versioned leader will just read the
> head
> >> >> part of the data.
> >> >>
> >> >> Thanks for the reminder!
> >> >>
> >> >> Luke
> >> >>
> >> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com>
> >> wrote:
> >> >>
> >> >> > Hi Luke,
> >> >> >
> >> >> > Thanks for the KIP.
> >> >> >
> >> >> > One thing I'd like to double check is that, since the
> >> >> > ConsumerProtocolSubscription is not auto generated from the json
> >> file, we
> >> >> > need to make sure the old-versioned leader would be able to ignore
> >> the
> >> >> new
> >> >> > field during an upgrade e.g. without crashing. Other than that, the
> >> KIP
> >> >> > lgtm.
> >> >> >
> >> >> > Guozhang
> >> >> >
> >> >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com>
> wrote:
> >> >> >
> >> >> > > Hi Colin,
> >> >> > >
> >> >> > > I'm not quite sure if I understand your thoughts correctly.
> >> >> > > If I was wrong, please let me know.
> >> >> > >
> >> >> > > Also, I'm not quite sure how I could lock this feature to a new
> IBP
> >> >> > > version.
> >> >> > > I saw "KIP-584: Versioning scheme for features" is still under
> >> >> > development.
> >> >> > > Not sure if I need to lock the IBP version, how should I do?
> >> >> > >
> >> >> > > Thank you.
> >> >> > > Luke
> >> >> > >
> >> >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com>
> >> wrote:
> >> >> > >
> >> >> > > > Hi Colin,
> >> >> > > >
> >> >> > > > Thanks for your comments. I've updated the KIP to mention about
> >> the
> >> >> KIP
> >> >> > > > won't affect current broker side behavior.
> >> >> > > >
> >> >> > > > > One scenario that we need to consider is what happens during
> a
> >> >> > rolling
> >> >> > > > upgrade. If the coordinator moves back and forth between
> brokers
> >> with
> >> >> > > > different IBPs, it seems that the same epoch numbers could be
> >> reused
> >> >> > for
> >> >> > > a
> >> >> > > > group, if things are done in the obvious manner (old IBP =
> don't
> >> read
> >> >> > or
> >> >> > > > write epoch, new IBP = do)
> >> >> > > >
> >> >> > > > I think this KIP doesn't care about the group epoch number at
> >> all.
> >> >> The
> >> >> > > > subscription metadata is passed from each member to group
> >> >> coordinator,
> >> >> > > and
> >> >> > > > then the group coordinator pass all of them back to the
> consumer
> >> >> lead.
> >> >> > So
> >> >> > > > even if the epoch number is reused in a group, it should be
> >> fine. On
> >> >> > the
> >> >> > > > other hand, the group coordinator will have no idea if the join
> >> group
> >> >> > > > request sent from consumer containing the new subscription
> >> >> "generation"
> >> >> > > > field or not, because group coordinator won't deserialize the
> >> >> metadata.
> >> >> > > >
> >> >> > > > I've added also added them into the KIP.
> >> >> > > >
> >> >> > > > Thank you.
> >> >> > > > Luke
> >> >> > > >
> >> >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <
> cmccabe@apache.org
> >> >
> >> >> > wrote:
> >> >> > > >
> >> >> > > >> Hi Luke,
> >> >> > > >>
> >> >> > > >> Thanks for the explanation.
> >> >> > > >>
> >> >> > > >> I don't see any description of how the broker decides to use
> >> the new
> >> >> > > >> version of ConsumerProtocolSubscription or not. This probably
> >> needs
> >> >> to
> >> >> > > be
> >> >> > > >> locked to a new IBP version.
> >> >> > > >>
> >> >> > > >> One scenario that we need to consider is what happens during a
> >> >> rolling
> >> >> > > >> upgrade. If the coordinator moves back and forth between
> brokers
> >> >> with
> >> >> > > >> different IBPs, it seems that the same epoch numbers could be
> >> reused
> >> >> > > for a
> >> >> > > >> group, if things are done in the obvious manner (old IBP =
> don't
> >> >> read
> >> >> > or
> >> >> > > >> write epoch, new IBP = do).
> >> >> > > >>
> >> >> > > >> best,
> >> >> > > >> Colin
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> >> >> > > >> > Hi Colin,
> >> >> > > >> > Thanks for your comment.
> >> >> > > >> >
> >> >> > > >> >> How are we going to avoid the situation where the broker
> >> >> restarts,
> >> >> > > and
> >> >> > > >> > the same generation number is reused?
> >> >> > > >> >
> >> >> > > >> > Actually, this KIP doesn't have anything to do with the
> >> brokers.
> >> >> The
> >> >> > > >> > "generation" field I added, is in the subscription metadata,
> >> which
> >> >> > > will
> >> >> > > >> not
> >> >> > > >> > be deserialized by brokers. The metadata is only
> deserialized
> >> by
> >> >> > > >> consumer
> >> >> > > >> > lead. And for the consumer lead, the only thing the lead
> cared
> >> >> > about,
> >> >> > > is
> >> >> > > >> > the highest generation of the ownedPartitions among all the
> >> >> > consumers.
> >> >> > > >> With
> >> >> > > >> > the highest generation of the ownedPartitions, the consumer
> >> lead
> >> >> can
> >> >> > > >> > distribute the partitions as sticky as possible, and most
> >> >> > importantly,
> >> >> > > >> > without errors.
> >> >> > > >> >
> >> >> > > >> > That is, after this KIP, if the broker restarts, and the
> same
> >> >> > > generation
> >> >> > > >> > number is reused, it won't break current rebalance behavior.
> >> But
> >> >> > it'll
> >> >> > > >> help
> >> >> > > >> > the consumer lead do the sticky assignments correctly.
> >> >> > > >> >
> >> >> > > >> > Thank you.
> >> >> > > >> > Luke
> >> >> > > >> >
> >> >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <
> >> colin@cmccabe.xyz>
> >> >> > > wrote:
> >> >> > > >> >
> >> >> > > >> >> How are we going to avoid the situation where the broker
> >> >> restarts,
> >> >> > > and
> >> >> > > >> the
> >> >> > > >> >> same generation number is reused?
> >> >> > > >> >>
> >> >> > > >> >> best,
> >> >> > > >> >> Colin
> >> >> > > >> >>
> >> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> >> >> > > >> >> > Hi all,
> >> >> > > >> >> >
> >> >> > > >> >> > I'd like to start the vote for KIP-792: Add "generation"
> >> field
> >> >> > into
> >> >> > > >> >> > consumer protocol.
> >> >> > > >> >> >
> >> >> > > >> >> > The goal of this KIP is to allow the assignor/consumer
> >> >> > coordinator
> >> >> > > to
> >> >> > > >> >> have
> >> >> > > >> >> > a way to identify the out-of-date members/assignments, to
> >> avoid
> >> >> > > >> rebalance
> >> >> > > >> >> > stuck issues in current protocol.
> >> >> > > >> >> >
> >> >> > > >> >> > Detailed description can be found here:
> >> >> > > >> >> >
> >> >> > > >> >>
> >> >> > > >>
> >> >> > >
> >> >> >
> >> >>
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >> >> > > >> >> >
> >> >> > > >> >> > Any feedback is welcome.
> >> >> > > >> >> >
> >> >> > > >> >> > Thank you.
> >> >> > > >> >> > Luke
> >> >> > > >> >>
> >> >> > > >>
> >> >> > > >
> >> >> > >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > -- Guozhang
> >> >> >
> >> >>
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >>
> >
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi all,

Bump this thread to see if there are other comments to this KIP.
So far, I have one +1 vote (binding), and need more votes.

Thank you.
Luke

On Tue, Dec 21, 2021 at 10:33 AM Luke Chen <sh...@gmail.com> wrote:

> Hi all,
>
> Bump this thread to see if there are other comments to this KIP.
>
> Thank you.
> Luke
>
> On Fri, Dec 17, 2021 at 1:27 AM Colin McCabe <cm...@apache.org> wrote:
>
>> Thanks for the explanation, Luke. That makes sense.
>>
>> best,
>> Colin
>>
>> On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
>> > Thanks Luke, in that case I'm +1 on this KIP.
>> >
>> > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:
>> >
>> >> Hi Guozhang,
>> >>
>> >> Thanks for your comment.
>> >>
>> >> > we need to make sure the old-versioned leader would be able to
>> ignore the
>> >> new
>> >> field during an upgrade e.g. without crashing.
>> >>
>> >> Yes, I understand. I'll be careful to make sure it won't crash the old
>> >> versioned leader.
>> >> But basically, it won't, because we appended the new field into the
>> last of
>> >> the ConsumerProtocolSubscription, which means, when read/deserialize
>> the
>> >> Subscription metadata, the old versioned leader will just read the head
>> >> part of the data.
>> >>
>> >> Thanks for the reminder!
>> >>
>> >> Luke
>> >>
>> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com>
>> wrote:
>> >>
>> >> > Hi Luke,
>> >> >
>> >> > Thanks for the KIP.
>> >> >
>> >> > One thing I'd like to double check is that, since the
>> >> > ConsumerProtocolSubscription is not auto generated from the json
>> file, we
>> >> > need to make sure the old-versioned leader would be able to ignore
>> the
>> >> new
>> >> > field during an upgrade e.g. without crashing. Other than that, the
>> KIP
>> >> > lgtm.
>> >> >
>> >> > Guozhang
>> >> >
>> >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:
>> >> >
>> >> > > Hi Colin,
>> >> > >
>> >> > > I'm not quite sure if I understand your thoughts correctly.
>> >> > > If I was wrong, please let me know.
>> >> > >
>> >> > > Also, I'm not quite sure how I could lock this feature to a new IBP
>> >> > > version.
>> >> > > I saw "KIP-584: Versioning scheme for features" is still under
>> >> > development.
>> >> > > Not sure if I need to lock the IBP version, how should I do?
>> >> > >
>> >> > > Thank you.
>> >> > > Luke
>> >> > >
>> >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com>
>> wrote:
>> >> > >
>> >> > > > Hi Colin,
>> >> > > >
>> >> > > > Thanks for your comments. I've updated the KIP to mention about
>> the
>> >> KIP
>> >> > > > won't affect current broker side behavior.
>> >> > > >
>> >> > > > > One scenario that we need to consider is what happens during a
>> >> > rolling
>> >> > > > upgrade. If the coordinator moves back and forth between brokers
>> with
>> >> > > > different IBPs, it seems that the same epoch numbers could be
>> reused
>> >> > for
>> >> > > a
>> >> > > > group, if things are done in the obvious manner (old IBP = don't
>> read
>> >> > or
>> >> > > > write epoch, new IBP = do)
>> >> > > >
>> >> > > > I think this KIP doesn't care about the group epoch number at
>> all.
>> >> The
>> >> > > > subscription metadata is passed from each member to group
>> >> coordinator,
>> >> > > and
>> >> > > > then the group coordinator pass all of them back to the consumer
>> >> lead.
>> >> > So
>> >> > > > even if the epoch number is reused in a group, it should be
>> fine. On
>> >> > the
>> >> > > > other hand, the group coordinator will have no idea if the join
>> group
>> >> > > > request sent from consumer containing the new subscription
>> >> "generation"
>> >> > > > field or not, because group coordinator won't deserialize the
>> >> metadata.
>> >> > > >
>> >> > > > I've added also added them into the KIP.
>> >> > > >
>> >> > > > Thank you.
>> >> > > > Luke
>> >> > > >
>> >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cmccabe@apache.org
>> >
>> >> > wrote:
>> >> > > >
>> >> > > >> Hi Luke,
>> >> > > >>
>> >> > > >> Thanks for the explanation.
>> >> > > >>
>> >> > > >> I don't see any description of how the broker decides to use
>> the new
>> >> > > >> version of ConsumerProtocolSubscription or not. This probably
>> needs
>> >> to
>> >> > > be
>> >> > > >> locked to a new IBP version.
>> >> > > >>
>> >> > > >> One scenario that we need to consider is what happens during a
>> >> rolling
>> >> > > >> upgrade. If the coordinator moves back and forth between brokers
>> >> with
>> >> > > >> different IBPs, it seems that the same epoch numbers could be
>> reused
>> >> > > for a
>> >> > > >> group, if things are done in the obvious manner (old IBP = don't
>> >> read
>> >> > or
>> >> > > >> write epoch, new IBP = do).
>> >> > > >>
>> >> > > >> best,
>> >> > > >> Colin
>> >> > > >>
>> >> > > >>
>> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
>> >> > > >> > Hi Colin,
>> >> > > >> > Thanks for your comment.
>> >> > > >> >
>> >> > > >> >> How are we going to avoid the situation where the broker
>> >> restarts,
>> >> > > and
>> >> > > >> > the same generation number is reused?
>> >> > > >> >
>> >> > > >> > Actually, this KIP doesn't have anything to do with the
>> brokers.
>> >> The
>> >> > > >> > "generation" field I added, is in the subscription metadata,
>> which
>> >> > > will
>> >> > > >> not
>> >> > > >> > be deserialized by brokers. The metadata is only deserialized
>> by
>> >> > > >> consumer
>> >> > > >> > lead. And for the consumer lead, the only thing the lead cared
>> >> > about,
>> >> > > is
>> >> > > >> > the highest generation of the ownedPartitions among all the
>> >> > consumers.
>> >> > > >> With
>> >> > > >> > the highest generation of the ownedPartitions, the consumer
>> lead
>> >> can
>> >> > > >> > distribute the partitions as sticky as possible, and most
>> >> > importantly,
>> >> > > >> > without errors.
>> >> > > >> >
>> >> > > >> > That is, after this KIP, if the broker restarts, and the same
>> >> > > generation
>> >> > > >> > number is reused, it won't break current rebalance behavior.
>> But
>> >> > it'll
>> >> > > >> help
>> >> > > >> > the consumer lead do the sticky assignments correctly.
>> >> > > >> >
>> >> > > >> > Thank you.
>> >> > > >> > Luke
>> >> > > >> >
>> >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <
>> colin@cmccabe.xyz>
>> >> > > wrote:
>> >> > > >> >
>> >> > > >> >> How are we going to avoid the situation where the broker
>> >> restarts,
>> >> > > and
>> >> > > >> the
>> >> > > >> >> same generation number is reused?
>> >> > > >> >>
>> >> > > >> >> best,
>> >> > > >> >> Colin
>> >> > > >> >>
>> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> >> > > >> >> > Hi all,
>> >> > > >> >> >
>> >> > > >> >> > I'd like to start the vote for KIP-792: Add "generation"
>> field
>> >> > into
>> >> > > >> >> > consumer protocol.
>> >> > > >> >> >
>> >> > > >> >> > The goal of this KIP is to allow the assignor/consumer
>> >> > coordinator
>> >> > > to
>> >> > > >> >> have
>> >> > > >> >> > a way to identify the out-of-date members/assignments, to
>> avoid
>> >> > > >> rebalance
>> >> > > >> >> > stuck issues in current protocol.
>> >> > > >> >> >
>> >> > > >> >> > Detailed description can be found here:
>> >> > > >> >> >
>> >> > > >> >>
>> >> > > >>
>> >> > >
>> >> >
>> >>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >> > > >> >> >
>> >> > > >> >> > Any feedback is welcome.
>> >> > > >> >> >
>> >> > > >> >> > Thank you.
>> >> > > >> >> > Luke
>> >> > > >> >>
>> >> > > >>
>> >> > > >
>> >> > >
>> >> >
>> >> >
>> >> > --
>> >> > -- Guozhang
>> >> >
>> >>
>> >
>> >
>> > --
>> > -- Guozhang
>>
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi all,

Bump this thread to see if there are other comments to this KIP.

Thank you.
Luke

On Fri, Dec 17, 2021 at 1:27 AM Colin McCabe <cm...@apache.org> wrote:

> Thanks for the explanation, Luke. That makes sense.
>
> best,
> Colin
>
> On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
> > Thanks Luke, in that case I'm +1 on this KIP.
> >
> > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:
> >
> >> Hi Guozhang,
> >>
> >> Thanks for your comment.
> >>
> >> > we need to make sure the old-versioned leader would be able to ignore
> the
> >> new
> >> field during an upgrade e.g. without crashing.
> >>
> >> Yes, I understand. I'll be careful to make sure it won't crash the old
> >> versioned leader.
> >> But basically, it won't, because we appended the new field into the
> last of
> >> the ConsumerProtocolSubscription, which means, when read/deserialize the
> >> Subscription metadata, the old versioned leader will just read the head
> >> part of the data.
> >>
> >> Thanks for the reminder!
> >>
> >> Luke
> >>
> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com>
> wrote:
> >>
> >> > Hi Luke,
> >> >
> >> > Thanks for the KIP.
> >> >
> >> > One thing I'd like to double check is that, since the
> >> > ConsumerProtocolSubscription is not auto generated from the json
> file, we
> >> > need to make sure the old-versioned leader would be able to ignore the
> >> new
> >> > field during an upgrade e.g. without crashing. Other than that, the
> KIP
> >> > lgtm.
> >> >
> >> > Guozhang
> >> >
> >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:
> >> >
> >> > > Hi Colin,
> >> > >
> >> > > I'm not quite sure if I understand your thoughts correctly.
> >> > > If I was wrong, please let me know.
> >> > >
> >> > > Also, I'm not quite sure how I could lock this feature to a new IBP
> >> > > version.
> >> > > I saw "KIP-584: Versioning scheme for features" is still under
> >> > development.
> >> > > Not sure if I need to lock the IBP version, how should I do?
> >> > >
> >> > > Thank you.
> >> > > Luke
> >> > >
> >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:
> >> > >
> >> > > > Hi Colin,
> >> > > >
> >> > > > Thanks for your comments. I've updated the KIP to mention about
> the
> >> KIP
> >> > > > won't affect current broker side behavior.
> >> > > >
> >> > > > > One scenario that we need to consider is what happens during a
> >> > rolling
> >> > > > upgrade. If the coordinator moves back and forth between brokers
> with
> >> > > > different IBPs, it seems that the same epoch numbers could be
> reused
> >> > for
> >> > > a
> >> > > > group, if things are done in the obvious manner (old IBP = don't
> read
> >> > or
> >> > > > write epoch, new IBP = do)
> >> > > >
> >> > > > I think this KIP doesn't care about the group epoch number at all.
> >> The
> >> > > > subscription metadata is passed from each member to group
> >> coordinator,
> >> > > and
> >> > > > then the group coordinator pass all of them back to the consumer
> >> lead.
> >> > So
> >> > > > even if the epoch number is reused in a group, it should be fine.
> On
> >> > the
> >> > > > other hand, the group coordinator will have no idea if the join
> group
> >> > > > request sent from consumer containing the new subscription
> >> "generation"
> >> > > > field or not, because group coordinator won't deserialize the
> >> metadata.
> >> > > >
> >> > > > I've added also added them into the KIP.
> >> > > >
> >> > > > Thank you.
> >> > > > Luke
> >> > > >
> >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org>
> >> > wrote:
> >> > > >
> >> > > >> Hi Luke,
> >> > > >>
> >> > > >> Thanks for the explanation.
> >> > > >>
> >> > > >> I don't see any description of how the broker decides to use the
> new
> >> > > >> version of ConsumerProtocolSubscription or not. This probably
> needs
> >> to
> >> > > be
> >> > > >> locked to a new IBP version.
> >> > > >>
> >> > > >> One scenario that we need to consider is what happens during a
> >> rolling
> >> > > >> upgrade. If the coordinator moves back and forth between brokers
> >> with
> >> > > >> different IBPs, it seems that the same epoch numbers could be
> reused
> >> > > for a
> >> > > >> group, if things are done in the obvious manner (old IBP = don't
> >> read
> >> > or
> >> > > >> write epoch, new IBP = do).
> >> > > >>
> >> > > >> best,
> >> > > >> Colin
> >> > > >>
> >> > > >>
> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> >> > > >> > Hi Colin,
> >> > > >> > Thanks for your comment.
> >> > > >> >
> >> > > >> >> How are we going to avoid the situation where the broker
> >> restarts,
> >> > > and
> >> > > >> > the same generation number is reused?
> >> > > >> >
> >> > > >> > Actually, this KIP doesn't have anything to do with the
> brokers.
> >> The
> >> > > >> > "generation" field I added, is in the subscription metadata,
> which
> >> > > will
> >> > > >> not
> >> > > >> > be deserialized by brokers. The metadata is only deserialized
> by
> >> > > >> consumer
> >> > > >> > lead. And for the consumer lead, the only thing the lead cared
> >> > about,
> >> > > is
> >> > > >> > the highest generation of the ownedPartitions among all the
> >> > consumers.
> >> > > >> With
> >> > > >> > the highest generation of the ownedPartitions, the consumer
> lead
> >> can
> >> > > >> > distribute the partitions as sticky as possible, and most
> >> > importantly,
> >> > > >> > without errors.
> >> > > >> >
> >> > > >> > That is, after this KIP, if the broker restarts, and the same
> >> > > generation
> >> > > >> > number is reused, it won't break current rebalance behavior.
> But
> >> > it'll
> >> > > >> help
> >> > > >> > the consumer lead do the sticky assignments correctly.
> >> > > >> >
> >> > > >> > Thank you.
> >> > > >> > Luke
> >> > > >> >
> >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <colin@cmccabe.xyz
> >
> >> > > wrote:
> >> > > >> >
> >> > > >> >> How are we going to avoid the situation where the broker
> >> restarts,
> >> > > and
> >> > > >> the
> >> > > >> >> same generation number is reused?
> >> > > >> >>
> >> > > >> >> best,
> >> > > >> >> Colin
> >> > > >> >>
> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> >> > > >> >> > Hi all,
> >> > > >> >> >
> >> > > >> >> > I'd like to start the vote for KIP-792: Add "generation"
> field
> >> > into
> >> > > >> >> > consumer protocol.
> >> > > >> >> >
> >> > > >> >> > The goal of this KIP is to allow the assignor/consumer
> >> > coordinator
> >> > > to
> >> > > >> >> have
> >> > > >> >> > a way to identify the out-of-date members/assignments, to
> avoid
> >> > > >> rebalance
> >> > > >> >> > stuck issues in current protocol.
> >> > > >> >> >
> >> > > >> >> > Detailed description can be found here:
> >> > > >> >> >
> >> > > >> >>
> >> > > >>
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >> > > >> >> >
> >> > > >> >> > Any feedback is welcome.
> >> > > >> >> >
> >> > > >> >> > Thank you.
> >> > > >> >> > Luke
> >> > > >> >>
> >> > > >>
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >> >
> >>
> >
> >
> > --
> > -- Guozhang
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Colin McCabe <cm...@apache.org>.
Thanks for the explanation, Luke. That makes sense.

best,
Colin

On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote:
> Thanks Luke, in that case I'm +1 on this KIP.
>
> On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:
>
>> Hi Guozhang,
>>
>> Thanks for your comment.
>>
>> > we need to make sure the old-versioned leader would be able to ignore the
>> new
>> field during an upgrade e.g. without crashing.
>>
>> Yes, I understand. I'll be careful to make sure it won't crash the old
>> versioned leader.
>> But basically, it won't, because we appended the new field into the last of
>> the ConsumerProtocolSubscription, which means, when read/deserialize the
>> Subscription metadata, the old versioned leader will just read the head
>> part of the data.
>>
>> Thanks for the reminder!
>>
>> Luke
>>
>> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com> wrote:
>>
>> > Hi Luke,
>> >
>> > Thanks for the KIP.
>> >
>> > One thing I'd like to double check is that, since the
>> > ConsumerProtocolSubscription is not auto generated from the json file, we
>> > need to make sure the old-versioned leader would be able to ignore the
>> new
>> > field during an upgrade e.g. without crashing. Other than that, the KIP
>> > lgtm.
>> >
>> > Guozhang
>> >
>> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:
>> >
>> > > Hi Colin,
>> > >
>> > > I'm not quite sure if I understand your thoughts correctly.
>> > > If I was wrong, please let me know.
>> > >
>> > > Also, I'm not quite sure how I could lock this feature to a new IBP
>> > > version.
>> > > I saw "KIP-584: Versioning scheme for features" is still under
>> > development.
>> > > Not sure if I need to lock the IBP version, how should I do?
>> > >
>> > > Thank you.
>> > > Luke
>> > >
>> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:
>> > >
>> > > > Hi Colin,
>> > > >
>> > > > Thanks for your comments. I've updated the KIP to mention about the
>> KIP
>> > > > won't affect current broker side behavior.
>> > > >
>> > > > > One scenario that we need to consider is what happens during a
>> > rolling
>> > > > upgrade. If the coordinator moves back and forth between brokers with
>> > > > different IBPs, it seems that the same epoch numbers could be reused
>> > for
>> > > a
>> > > > group, if things are done in the obvious manner (old IBP = don't read
>> > or
>> > > > write epoch, new IBP = do)
>> > > >
>> > > > I think this KIP doesn't care about the group epoch number at all.
>> The
>> > > > subscription metadata is passed from each member to group
>> coordinator,
>> > > and
>> > > > then the group coordinator pass all of them back to the consumer
>> lead.
>> > So
>> > > > even if the epoch number is reused in a group, it should be fine. On
>> > the
>> > > > other hand, the group coordinator will have no idea if the join group
>> > > > request sent from consumer containing the new subscription
>> "generation"
>> > > > field or not, because group coordinator won't deserialize the
>> metadata.
>> > > >
>> > > > I've added also added them into the KIP.
>> > > >
>> > > > Thank you.
>> > > > Luke
>> > > >
>> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org>
>> > wrote:
>> > > >
>> > > >> Hi Luke,
>> > > >>
>> > > >> Thanks for the explanation.
>> > > >>
>> > > >> I don't see any description of how the broker decides to use the new
>> > > >> version of ConsumerProtocolSubscription or not. This probably needs
>> to
>> > > be
>> > > >> locked to a new IBP version.
>> > > >>
>> > > >> One scenario that we need to consider is what happens during a
>> rolling
>> > > >> upgrade. If the coordinator moves back and forth between brokers
>> with
>> > > >> different IBPs, it seems that the same epoch numbers could be reused
>> > > for a
>> > > >> group, if things are done in the obvious manner (old IBP = don't
>> read
>> > or
>> > > >> write epoch, new IBP = do).
>> > > >>
>> > > >> best,
>> > > >> Colin
>> > > >>
>> > > >>
>> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
>> > > >> > Hi Colin,
>> > > >> > Thanks for your comment.
>> > > >> >
>> > > >> >> How are we going to avoid the situation where the broker
>> restarts,
>> > > and
>> > > >> > the same generation number is reused?
>> > > >> >
>> > > >> > Actually, this KIP doesn't have anything to do with the brokers.
>> The
>> > > >> > "generation" field I added, is in the subscription metadata, which
>> > > will
>> > > >> not
>> > > >> > be deserialized by brokers. The metadata is only deserialized by
>> > > >> consumer
>> > > >> > lead. And for the consumer lead, the only thing the lead cared
>> > about,
>> > > is
>> > > >> > the highest generation of the ownedPartitions among all the
>> > consumers.
>> > > >> With
>> > > >> > the highest generation of the ownedPartitions, the consumer lead
>> can
>> > > >> > distribute the partitions as sticky as possible, and most
>> > importantly,
>> > > >> > without errors.
>> > > >> >
>> > > >> > That is, after this KIP, if the broker restarts, and the same
>> > > generation
>> > > >> > number is reused, it won't break current rebalance behavior. But
>> > it'll
>> > > >> help
>> > > >> > the consumer lead do the sticky assignments correctly.
>> > > >> >
>> > > >> > Thank you.
>> > > >> > Luke
>> > > >> >
>> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz>
>> > > wrote:
>> > > >> >
>> > > >> >> How are we going to avoid the situation where the broker
>> restarts,
>> > > and
>> > > >> the
>> > > >> >> same generation number is reused?
>> > > >> >>
>> > > >> >> best,
>> > > >> >> Colin
>> > > >> >>
>> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> > > >> >> > Hi all,
>> > > >> >> >
>> > > >> >> > I'd like to start the vote for KIP-792: Add "generation" field
>> > into
>> > > >> >> > consumer protocol.
>> > > >> >> >
>> > > >> >> > The goal of this KIP is to allow the assignor/consumer
>> > coordinator
>> > > to
>> > > >> >> have
>> > > >> >> > a way to identify the out-of-date members/assignments, to avoid
>> > > >> rebalance
>> > > >> >> > stuck issues in current protocol.
>> > > >> >> >
>> > > >> >> > Detailed description can be found here:
>> > > >> >> >
>> > > >> >>
>> > > >>
>> > >
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> > > >> >> >
>> > > >> >> > Any feedback is welcome.
>> > > >> >> >
>> > > >> >> > Thank you.
>> > > >> >> > Luke
>> > > >> >>
>> > > >>
>> > > >
>> > >
>> >
>> >
>> > --
>> > -- Guozhang
>> >
>>
>
>
> -- 
> -- Guozhang

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Luke, in that case I'm +1 on this KIP.

On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <sh...@gmail.com> wrote:

> Hi Guozhang,
>
> Thanks for your comment.
>
> > we need to make sure the old-versioned leader would be able to ignore the
> new
> field during an upgrade e.g. without crashing.
>
> Yes, I understand. I'll be careful to make sure it won't crash the old
> versioned leader.
> But basically, it won't, because we appended the new field into the last of
> the ConsumerProtocolSubscription, which means, when read/deserialize the
> Subscription metadata, the old versioned leader will just read the head
> part of the data.
>
> Thanks for the reminder!
>
> Luke
>
> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com> wrote:
>
> > Hi Luke,
> >
> > Thanks for the KIP.
> >
> > One thing I'd like to double check is that, since the
> > ConsumerProtocolSubscription is not auto generated from the json file, we
> > need to make sure the old-versioned leader would be able to ignore the
> new
> > field during an upgrade e.g. without crashing. Other than that, the KIP
> > lgtm.
> >
> > Guozhang
> >
> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi Colin,
> > >
> > > I'm not quite sure if I understand your thoughts correctly.
> > > If I was wrong, please let me know.
> > >
> > > Also, I'm not quite sure how I could lock this feature to a new IBP
> > > version.
> > > I saw "KIP-584: Versioning scheme for features" is still under
> > development.
> > > Not sure if I need to lock the IBP version, how should I do?
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > > Hi Colin,
> > > >
> > > > Thanks for your comments. I've updated the KIP to mention about the
> KIP
> > > > won't affect current broker side behavior.
> > > >
> > > > > One scenario that we need to consider is what happens during a
> > rolling
> > > > upgrade. If the coordinator moves back and forth between brokers with
> > > > different IBPs, it seems that the same epoch numbers could be reused
> > for
> > > a
> > > > group, if things are done in the obvious manner (old IBP = don't read
> > or
> > > > write epoch, new IBP = do)
> > > >
> > > > I think this KIP doesn't care about the group epoch number at all.
> The
> > > > subscription metadata is passed from each member to group
> coordinator,
> > > and
> > > > then the group coordinator pass all of them back to the consumer
> lead.
> > So
> > > > even if the epoch number is reused in a group, it should be fine. On
> > the
> > > > other hand, the group coordinator will have no idea if the join group
> > > > request sent from consumer containing the new subscription
> "generation"
> > > > field or not, because group coordinator won't deserialize the
> metadata.
> > > >
> > > > I've added also added them into the KIP.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org>
> > wrote:
> > > >
> > > >> Hi Luke,
> > > >>
> > > >> Thanks for the explanation.
> > > >>
> > > >> I don't see any description of how the broker decides to use the new
> > > >> version of ConsumerProtocolSubscription or not. This probably needs
> to
> > > be
> > > >> locked to a new IBP version.
> > > >>
> > > >> One scenario that we need to consider is what happens during a
> rolling
> > > >> upgrade. If the coordinator moves back and forth between brokers
> with
> > > >> different IBPs, it seems that the same epoch numbers could be reused
> > > for a
> > > >> group, if things are done in the obvious manner (old IBP = don't
> read
> > or
> > > >> write epoch, new IBP = do).
> > > >>
> > > >> best,
> > > >> Colin
> > > >>
> > > >>
> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > > >> > Hi Colin,
> > > >> > Thanks for your comment.
> > > >> >
> > > >> >> How are we going to avoid the situation where the broker
> restarts,
> > > and
> > > >> > the same generation number is reused?
> > > >> >
> > > >> > Actually, this KIP doesn't have anything to do with the brokers.
> The
> > > >> > "generation" field I added, is in the subscription metadata, which
> > > will
> > > >> not
> > > >> > be deserialized by brokers. The metadata is only deserialized by
> > > >> consumer
> > > >> > lead. And for the consumer lead, the only thing the lead cared
> > about,
> > > is
> > > >> > the highest generation of the ownedPartitions among all the
> > consumers.
> > > >> With
> > > >> > the highest generation of the ownedPartitions, the consumer lead
> can
> > > >> > distribute the partitions as sticky as possible, and most
> > importantly,
> > > >> > without errors.
> > > >> >
> > > >> > That is, after this KIP, if the broker restarts, and the same
> > > generation
> > > >> > number is reused, it won't break current rebalance behavior. But
> > it'll
> > > >> help
> > > >> > the consumer lead do the sticky assignments correctly.
> > > >> >
> > > >> > Thank you.
> > > >> > Luke
> > > >> >
> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz>
> > > wrote:
> > > >> >
> > > >> >> How are we going to avoid the situation where the broker
> restarts,
> > > and
> > > >> the
> > > >> >> same generation number is reused?
> > > >> >>
> > > >> >> best,
> > > >> >> Colin
> > > >> >>
> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> > > >> >> > Hi all,
> > > >> >> >
> > > >> >> > I'd like to start the vote for KIP-792: Add "generation" field
> > into
> > > >> >> > consumer protocol.
> > > >> >> >
> > > >> >> > The goal of this KIP is to allow the assignor/consumer
> > coordinator
> > > to
> > > >> >> have
> > > >> >> > a way to identify the out-of-date members/assignments, to avoid
> > > >> rebalance
> > > >> >> > stuck issues in current protocol.
> > > >> >> >
> > > >> >> > Detailed description can be found here:
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > >> >> >
> > > >> >> > Any feedback is welcome.
> > > >> >> >
> > > >> >> > Thank you.
> > > >> >> > Luke
> > > >> >>
> > > >>
> > > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


-- 
-- Guozhang

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi Guozhang,

Thanks for your comment.

> we need to make sure the old-versioned leader would be able to ignore the
new
field during an upgrade e.g. without crashing.

Yes, I understand. I'll be careful to make sure it won't crash the old
versioned leader.
But basically, it won't, because we appended the new field into the last of
the ConsumerProtocolSubscription, which means, when read/deserialize the
Subscription metadata, the old versioned leader will just read the head
part of the data.

Thanks for the reminder!

Luke

On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wa...@gmail.com> wrote:

> Hi Luke,
>
> Thanks for the KIP.
>
> One thing I'd like to double check is that, since the
> ConsumerProtocolSubscription is not auto generated from the json file, we
> need to make sure the old-versioned leader would be able to ignore the new
> field during an upgrade e.g. without crashing. Other than that, the KIP
> lgtm.
>
> Guozhang
>
> On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi Colin,
> >
> > I'm not quite sure if I understand your thoughts correctly.
> > If I was wrong, please let me know.
> >
> > Also, I'm not quite sure how I could lock this feature to a new IBP
> > version.
> > I saw "KIP-584: Versioning scheme for features" is still under
> development.
> > Not sure if I need to lock the IBP version, how should I do?
> >
> > Thank you.
> > Luke
> >
> > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi Colin,
> > >
> > > Thanks for your comments. I've updated the KIP to mention about the KIP
> > > won't affect current broker side behavior.
> > >
> > > > One scenario that we need to consider is what happens during a
> rolling
> > > upgrade. If the coordinator moves back and forth between brokers with
> > > different IBPs, it seems that the same epoch numbers could be reused
> for
> > a
> > > group, if things are done in the obvious manner (old IBP = don't read
> or
> > > write epoch, new IBP = do)
> > >
> > > I think this KIP doesn't care about the group epoch number at all. The
> > > subscription metadata is passed from each member to group coordinator,
> > and
> > > then the group coordinator pass all of them back to the consumer lead.
> So
> > > even if the epoch number is reused in a group, it should be fine. On
> the
> > > other hand, the group coordinator will have no idea if the join group
> > > request sent from consumer containing the new subscription "generation"
> > > field or not, because group coordinator won't deserialize the metadata.
> > >
> > > I've added also added them into the KIP.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org>
> wrote:
> > >
> > >> Hi Luke,
> > >>
> > >> Thanks for the explanation.
> > >>
> > >> I don't see any description of how the broker decides to use the new
> > >> version of ConsumerProtocolSubscription or not. This probably needs to
> > be
> > >> locked to a new IBP version.
> > >>
> > >> One scenario that we need to consider is what happens during a rolling
> > >> upgrade. If the coordinator moves back and forth between brokers with
> > >> different IBPs, it seems that the same epoch numbers could be reused
> > for a
> > >> group, if things are done in the obvious manner (old IBP = don't read
> or
> > >> write epoch, new IBP = do).
> > >>
> > >> best,
> > >> Colin
> > >>
> > >>
> > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > >> > Hi Colin,
> > >> > Thanks for your comment.
> > >> >
> > >> >> How are we going to avoid the situation where the broker restarts,
> > and
> > >> > the same generation number is reused?
> > >> >
> > >> > Actually, this KIP doesn't have anything to do with the brokers. The
> > >> > "generation" field I added, is in the subscription metadata, which
> > will
> > >> not
> > >> > be deserialized by brokers. The metadata is only deserialized by
> > >> consumer
> > >> > lead. And for the consumer lead, the only thing the lead cared
> about,
> > is
> > >> > the highest generation of the ownedPartitions among all the
> consumers.
> > >> With
> > >> > the highest generation of the ownedPartitions, the consumer lead can
> > >> > distribute the partitions as sticky as possible, and most
> importantly,
> > >> > without errors.
> > >> >
> > >> > That is, after this KIP, if the broker restarts, and the same
> > generation
> > >> > number is reused, it won't break current rebalance behavior. But
> it'll
> > >> help
> > >> > the consumer lead do the sticky assignments correctly.
> > >> >
> > >> > Thank you.
> > >> > Luke
> > >> >
> > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz>
> > wrote:
> > >> >
> > >> >> How are we going to avoid the situation where the broker restarts,
> > and
> > >> the
> > >> >> same generation number is reused?
> > >> >>
> > >> >> best,
> > >> >> Colin
> > >> >>
> > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> > >> >> > Hi all,
> > >> >> >
> > >> >> > I'd like to start the vote for KIP-792: Add "generation" field
> into
> > >> >> > consumer protocol.
> > >> >> >
> > >> >> > The goal of this KIP is to allow the assignor/consumer
> coordinator
> > to
> > >> >> have
> > >> >> > a way to identify the out-of-date members/assignments, to avoid
> > >> rebalance
> > >> >> > stuck issues in current protocol.
> > >> >> >
> > >> >> > Detailed description can be found here:
> > >> >> >
> > >> >>
> > >>
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > >> >> >
> > >> >> > Any feedback is welcome.
> > >> >> >
> > >> >> > Thank you.
> > >> >> > Luke
> > >> >>
> > >>
> > >
> >
>
>
> --
> -- Guozhang
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Luke,

Thanks for the KIP.

One thing I'd like to double check is that, since the
ConsumerProtocolSubscription is not auto generated from the json file, we
need to make sure the old-versioned leader would be able to ignore the new
field during an upgrade e.g. without crashing. Other than that, the KIP
lgtm.

Guozhang

On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <sh...@gmail.com> wrote:

> Hi Colin,
>
> I'm not quite sure if I understand your thoughts correctly.
> If I was wrong, please let me know.
>
> Also, I'm not quite sure how I could lock this feature to a new IBP
> version.
> I saw "KIP-584: Versioning scheme for features" is still under development.
> Not sure if I need to lock the IBP version, how should I do?
>
> Thank you.
> Luke
>
> On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi Colin,
> >
> > Thanks for your comments. I've updated the KIP to mention about the KIP
> > won't affect current broker side behavior.
> >
> > > One scenario that we need to consider is what happens during a rolling
> > upgrade. If the coordinator moves back and forth between brokers with
> > different IBPs, it seems that the same epoch numbers could be reused for
> a
> > group, if things are done in the obvious manner (old IBP = don't read or
> > write epoch, new IBP = do)
> >
> > I think this KIP doesn't care about the group epoch number at all. The
> > subscription metadata is passed from each member to group coordinator,
> and
> > then the group coordinator pass all of them back to the consumer lead. So
> > even if the epoch number is reused in a group, it should be fine. On the
> > other hand, the group coordinator will have no idea if the join group
> > request sent from consumer containing the new subscription "generation"
> > field or not, because group coordinator won't deserialize the metadata.
> >
> > I've added also added them into the KIP.
> >
> > Thank you.
> > Luke
> >
> > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org> wrote:
> >
> >> Hi Luke,
> >>
> >> Thanks for the explanation.
> >>
> >> I don't see any description of how the broker decides to use the new
> >> version of ConsumerProtocolSubscription or not. This probably needs to
> be
> >> locked to a new IBP version.
> >>
> >> One scenario that we need to consider is what happens during a rolling
> >> upgrade. If the coordinator moves back and forth between brokers with
> >> different IBPs, it seems that the same epoch numbers could be reused
> for a
> >> group, if things are done in the obvious manner (old IBP = don't read or
> >> write epoch, new IBP = do).
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> >> > Hi Colin,
> >> > Thanks for your comment.
> >> >
> >> >> How are we going to avoid the situation where the broker restarts,
> and
> >> > the same generation number is reused?
> >> >
> >> > Actually, this KIP doesn't have anything to do with the brokers. The
> >> > "generation" field I added, is in the subscription metadata, which
> will
> >> not
> >> > be deserialized by brokers. The metadata is only deserialized by
> >> consumer
> >> > lead. And for the consumer lead, the only thing the lead cared about,
> is
> >> > the highest generation of the ownedPartitions among all the consumers.
> >> With
> >> > the highest generation of the ownedPartitions, the consumer lead can
> >> > distribute the partitions as sticky as possible, and most importantly,
> >> > without errors.
> >> >
> >> > That is, after this KIP, if the broker restarts, and the same
> generation
> >> > number is reused, it won't break current rebalance behavior. But it'll
> >> help
> >> > the consumer lead do the sticky assignments correctly.
> >> >
> >> > Thank you.
> >> > Luke
> >> >
> >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz>
> wrote:
> >> >
> >> >> How are we going to avoid the situation where the broker restarts,
> and
> >> the
> >> >> same generation number is reused?
> >> >>
> >> >> best,
> >> >> Colin
> >> >>
> >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I'd like to start the vote for KIP-792: Add "generation" field into
> >> >> > consumer protocol.
> >> >> >
> >> >> > The goal of this KIP is to allow the assignor/consumer coordinator
> to
> >> >> have
> >> >> > a way to identify the out-of-date members/assignments, to avoid
> >> rebalance
> >> >> > stuck issues in current protocol.
> >> >> >
> >> >> > Detailed description can be found here:
> >> >> >
> >> >>
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >> >> >
> >> >> > Any feedback is welcome.
> >> >> >
> >> >> > Thank you.
> >> >> > Luke
> >> >>
> >>
> >
>


-- 
-- Guozhang

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi Colin,

I'm not quite sure if I understand your thoughts correctly.
If I was wrong, please let me know.

Also, I'm not quite sure how I could lock this feature to a new IBP
version.
I saw "KIP-584: Versioning scheme for features" is still under development.
Not sure if I need to lock the IBP version, how should I do?

Thank you.
Luke

On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <sh...@gmail.com> wrote:

> Hi Colin,
>
> Thanks for your comments. I've updated the KIP to mention about the KIP
> won't affect current broker side behavior.
>
> > One scenario that we need to consider is what happens during a rolling
> upgrade. If the coordinator moves back and forth between brokers with
> different IBPs, it seems that the same epoch numbers could be reused for a
> group, if things are done in the obvious manner (old IBP = don't read or
> write epoch, new IBP = do)
>
> I think this KIP doesn't care about the group epoch number at all. The
> subscription metadata is passed from each member to group coordinator, and
> then the group coordinator pass all of them back to the consumer lead. So
> even if the epoch number is reused in a group, it should be fine. On the
> other hand, the group coordinator will have no idea if the join group
> request sent from consumer containing the new subscription "generation"
> field or not, because group coordinator won't deserialize the metadata.
>
> I've added also added them into the KIP.
>
> Thank you.
> Luke
>
> On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org> wrote:
>
>> Hi Luke,
>>
>> Thanks for the explanation.
>>
>> I don't see any description of how the broker decides to use the new
>> version of ConsumerProtocolSubscription or not. This probably needs to be
>> locked to a new IBP version.
>>
>> One scenario that we need to consider is what happens during a rolling
>> upgrade. If the coordinator moves back and forth between brokers with
>> different IBPs, it seems that the same epoch numbers could be reused for a
>> group, if things are done in the obvious manner (old IBP = don't read or
>> write epoch, new IBP = do).
>>
>> best,
>> Colin
>>
>>
>> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
>> > Hi Colin,
>> > Thanks for your comment.
>> >
>> >> How are we going to avoid the situation where the broker restarts, and
>> > the same generation number is reused?
>> >
>> > Actually, this KIP doesn't have anything to do with the brokers. The
>> > "generation" field I added, is in the subscription metadata, which will
>> not
>> > be deserialized by brokers. The metadata is only deserialized by
>> consumer
>> > lead. And for the consumer lead, the only thing the lead cared about, is
>> > the highest generation of the ownedPartitions among all the consumers.
>> With
>> > the highest generation of the ownedPartitions, the consumer lead can
>> > distribute the partitions as sticky as possible, and most importantly,
>> > without errors.
>> >
>> > That is, after this KIP, if the broker restarts, and the same generation
>> > number is reused, it won't break current rebalance behavior. But it'll
>> help
>> > the consumer lead do the sticky assignments correctly.
>> >
>> > Thank you.
>> > Luke
>> >
>> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> wrote:
>> >
>> >> How are we going to avoid the situation where the broker restarts, and
>> the
>> >> same generation number is reused?
>> >>
>> >> best,
>> >> Colin
>> >>
>> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> >> > Hi all,
>> >> >
>> >> > I'd like to start the vote for KIP-792: Add "generation" field into
>> >> > consumer protocol.
>> >> >
>> >> > The goal of this KIP is to allow the assignor/consumer coordinator to
>> >> have
>> >> > a way to identify the out-of-date members/assignments, to avoid
>> rebalance
>> >> > stuck issues in current protocol.
>> >> >
>> >> > Detailed description can be found here:
>> >> >
>> >>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >> >
>> >> > Any feedback is welcome.
>> >> >
>> >> > Thank you.
>> >> > Luke
>> >>
>>
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi Colin,

Thanks for your comments. I've updated the KIP to mention about the KIP
won't affect current broker side behavior.

> One scenario that we need to consider is what happens during a rolling
upgrade. If the coordinator moves back and forth between brokers with
different IBPs, it seems that the same epoch numbers could be reused for a
group, if things are done in the obvious manner (old IBP = don't read or
write epoch, new IBP = do)

I think this KIP doesn't care about the group epoch number at all. The
subscription metadata is passed from each member to group coordinator, and
then the group coordinator pass all of them back to the consumer lead. So
even if the epoch number is reused in a group, it should be fine. On the
other hand, the group coordinator will have no idea if the join group
request sent from consumer containing the new subscription "generation"
field or not, because group coordinator won't deserialize the metadata.

I've added also added them into the KIP.

Thank you.
Luke

On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cm...@apache.org> wrote:

> Hi Luke,
>
> Thanks for the explanation.
>
> I don't see any description of how the broker decides to use the new
> version of ConsumerProtocolSubscription or not. This probably needs to be
> locked to a new IBP version.
>
> One scenario that we need to consider is what happens during a rolling
> upgrade. If the coordinator moves back and forth between brokers with
> different IBPs, it seems that the same epoch numbers could be reused for a
> group, if things are done in the obvious manner (old IBP = don't read or
> write epoch, new IBP = do).
>
> best,
> Colin
>
>
> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > Hi Colin,
> > Thanks for your comment.
> >
> >> How are we going to avoid the situation where the broker restarts, and
> > the same generation number is reused?
> >
> > Actually, this KIP doesn't have anything to do with the brokers. The
> > "generation" field I added, is in the subscription metadata, which will
> not
> > be deserialized by brokers. The metadata is only deserialized by consumer
> > lead. And for the consumer lead, the only thing the lead cared about, is
> > the highest generation of the ownedPartitions among all the consumers.
> With
> > the highest generation of the ownedPartitions, the consumer lead can
> > distribute the partitions as sticky as possible, and most importantly,
> > without errors.
> >
> > That is, after this KIP, if the broker restarts, and the same generation
> > number is reused, it won't break current rebalance behavior. But it'll
> help
> > the consumer lead do the sticky assignments correctly.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> wrote:
> >
> >> How are we going to avoid the situation where the broker restarts, and
> the
> >> same generation number is reused?
> >>
> >> best,
> >> Colin
> >>
> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> >> > Hi all,
> >> >
> >> > I'd like to start the vote for KIP-792: Add "generation" field into
> >> > consumer protocol.
> >> >
> >> > The goal of this KIP is to allow the assignor/consumer coordinator to
> >> have
> >> > a way to identify the out-of-date members/assignments, to avoid
> rebalance
> >> > stuck issues in current protocol.
> >> >
> >> > Detailed description can be found here:
> >> >
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >> >
> >> > Any feedback is welcome.
> >> >
> >> > Thank you.
> >> > Luke
> >>
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Colin McCabe <cm...@apache.org>.
Hi Luke,

Thanks for the explanation.

I don't see any description of how the broker decides to use the new version of ConsumerProtocolSubscription or not. This probably needs to be locked to a new IBP version.

One scenario that we need to consider is what happens during a rolling upgrade. If the coordinator moves back and forth between brokers with different IBPs, it seems that the same epoch numbers could be reused for a group, if things are done in the obvious manner (old IBP = don't read or write epoch, new IBP = do).

best,
Colin


On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> Hi Colin,
> Thanks for your comment.
>
>> How are we going to avoid the situation where the broker restarts, and
> the same generation number is reused?
>
> Actually, this KIP doesn't have anything to do with the brokers. The
> "generation" field I added, is in the subscription metadata, which will not
> be deserialized by brokers. The metadata is only deserialized by consumer
> lead. And for the consumer lead, the only thing the lead cared about, is
> the highest generation of the ownedPartitions among all the consumers. With
> the highest generation of the ownedPartitions, the consumer lead can
> distribute the partitions as sticky as possible, and most importantly,
> without errors.
>
> That is, after this KIP, if the broker restarts, and the same generation
> number is reused, it won't break current rebalance behavior. But it'll help
> the consumer lead do the sticky assignments correctly.
>
> Thank you.
> Luke
>
> On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> wrote:
>
>> How are we going to avoid the situation where the broker restarts, and the
>> same generation number is reused?
>>
>> best,
>> Colin
>>
>> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> > Hi all,
>> >
>> > I'd like to start the vote for KIP-792: Add "generation" field into
>> > consumer protocol.
>> >
>> > The goal of this KIP is to allow the assignor/consumer coordinator to
>> have
>> > a way to identify the out-of-date members/assignments, to avoid rebalance
>> > stuck issues in current protocol.
>> >
>> > Detailed description can be found here:
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >
>> > Any feedback is welcome.
>> >
>> > Thank you.
>> > Luke
>>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi Colin,
Thanks for your comment.

> How are we going to avoid the situation where the broker restarts, and
the same generation number is reused?

Actually, this KIP doesn't have anything to do with the brokers. The
"generation" field I added, is in the subscription metadata, which will not
be deserialized by brokers. The metadata is only deserialized by consumer
lead. And for the consumer lead, the only thing the lead cared about, is
the highest generation of the ownedPartitions among all the consumers. With
the highest generation of the ownedPartitions, the consumer lead can
distribute the partitions as sticky as possible, and most importantly,
without errors.

That is, after this KIP, if the broker restarts, and the same generation
number is reused, it won't break current rebalance behavior. But it'll help
the consumer lead do the sticky assignments correctly.

Thank you.
Luke

On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> wrote:

> How are we going to avoid the situation where the broker restarts, and the
> same generation number is reused?
>
> best,
> Colin
>
> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> > Hi all,
> >
> > I'd like to start the vote for KIP-792: Add "generation" field into
> > consumer protocol.
> >
> > The goal of this KIP is to allow the assignor/consumer coordinator to
> have
> > a way to identify the out-of-date members/assignments, to avoid rebalance
> > stuck issues in current protocol.
> >
> > Detailed description can be found here:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >
> > Any feedback is welcome.
> >
> > Thank you.
> > Luke
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Colin McCabe <co...@cmccabe.xyz>.
How are we going to avoid the situation where the broker restarts, and the same generation number is reused?

best,
Colin

On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> Hi all,
>
> I'd like to start the vote for KIP-792: Add "generation" field into
> consumer protocol.
>
> The goal of this KIP is to allow the assignor/consumer coordinator to have
> a way to identify the out-of-date members/assignments, to avoid rebalance
> stuck issues in current protocol.
>
> Detailed description can be found here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>
> Any feedback is welcome.
>
> Thank you.
> Luke

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
Thanks, Luke. Feel free to ping me for reviews. I am happy to help on this one.

Cheers,
David

On Thu, Sep 22, 2022 at 11:00 AM Luke Chen <sh...@gmail.com> wrote:
>
> Hi David,
>
> Sorry for the delay.
> I'll complete it in v3.4.0.
>
> Thank you.
> Luke
>
> On Thu, Sep 22, 2022 at 3:52 PM David Jacot <dj...@confluent.io.invalid>
> wrote:
>
> > Hi Luke,
> >
> > Are you still interested in implementing this KIP? We need it for
> > KIP-848. If you are not, we could find someone to take it over.
> >
> > Thanks,
> > David
> >
> > On Thu, Mar 3, 2022 at 10:04 AM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > Thanks, Ziming!
> > >
> > > So, now, this KIP vote passed with 3 binding +1 votes (David, Tom,
> > > Guozhang) and 1 non-binding +1 vote (Ziming).
> > > The vote will be closed.
> > >
> > > Thanks again.
> > > Luke
> > >
> > > On Thu, Mar 3, 2022 at 5:00 PM deng ziming <de...@gmail.com>
> > wrote:
> > >
> > > > Thank you Luke for this work,
> > > > I’m +1(non-binding)
> > > >
> > > > --
> > > > Best,
> > > > Ziming Deng
> > > >
> > > > > On Dec 1, 2021, at 8:36 AM, Luke Chen <sh...@gmail.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to start the vote for KIP-792: Add "generation" field into
> > > > > consumer protocol.
> > > > >
> > > > > The goal of this KIP is to allow the assignor/consumer coordinator to
> > > > have
> > > > > a way to identify the out-of-date members/assignments, to avoid
> > rebalance
> > > > > stuck issues in current protocol.
> > > > >
> > > > > Detailed description can be found here:
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > >
> > > > > Any feedback is welcome.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > >
> > > >
> >

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi David,

Sorry for the delay.
I'll complete it in v3.4.0.

Thank you.
Luke

On Thu, Sep 22, 2022 at 3:52 PM David Jacot <dj...@confluent.io.invalid>
wrote:

> Hi Luke,
>
> Are you still interested in implementing this KIP? We need it for
> KIP-848. If you are not, we could find someone to take it over.
>
> Thanks,
> David
>
> On Thu, Mar 3, 2022 at 10:04 AM Luke Chen <sh...@gmail.com> wrote:
> >
> > Thanks, Ziming!
> >
> > So, now, this KIP vote passed with 3 binding +1 votes (David, Tom,
> > Guozhang) and 1 non-binding +1 vote (Ziming).
> > The vote will be closed.
> >
> > Thanks again.
> > Luke
> >
> > On Thu, Mar 3, 2022 at 5:00 PM deng ziming <de...@gmail.com>
> wrote:
> >
> > > Thank you Luke for this work,
> > > I’m +1(non-binding)
> > >
> > > --
> > > Best,
> > > Ziming Deng
> > >
> > > > On Dec 1, 2021, at 8:36 AM, Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I'd like to start the vote for KIP-792: Add "generation" field into
> > > > consumer protocol.
> > > >
> > > > The goal of this KIP is to allow the assignor/consumer coordinator to
> > > have
> > > > a way to identify the out-of-date members/assignments, to avoid
> rebalance
> > > > stuck issues in current protocol.
> > > >
> > > > Detailed description can be found here:
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > >
> > > > Any feedback is welcome.
> > > >
> > > > Thank you.
> > > > Luke
> > >
> > >
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
Hi Luke,

Are you still interested in implementing this KIP? We need it for
KIP-848. If you are not, we could find someone to take it over.

Thanks,
David

On Thu, Mar 3, 2022 at 10:04 AM Luke Chen <sh...@gmail.com> wrote:
>
> Thanks, Ziming!
>
> So, now, this KIP vote passed with 3 binding +1 votes (David, Tom,
> Guozhang) and 1 non-binding +1 vote (Ziming).
> The vote will be closed.
>
> Thanks again.
> Luke
>
> On Thu, Mar 3, 2022 at 5:00 PM deng ziming <de...@gmail.com> wrote:
>
> > Thank you Luke for this work,
> > I’m +1(non-binding)
> >
> > --
> > Best,
> > Ziming Deng
> >
> > > On Dec 1, 2021, at 8:36 AM, Luke Chen <sh...@gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to start the vote for KIP-792: Add "generation" field into
> > > consumer protocol.
> > >
> > > The goal of this KIP is to allow the assignor/consumer coordinator to
> > have
> > > a way to identify the out-of-date members/assignments, to avoid rebalance
> > > stuck issues in current protocol.
> > >
> > > Detailed description can be found here:
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > >
> > > Any feedback is welcome.
> > >
> > > Thank you.
> > > Luke
> >
> >

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Thanks, Ziming!

So, now, this KIP vote passed with 3 binding +1 votes (David, Tom,
Guozhang) and 1 non-binding +1 vote (Ziming).
The vote will be closed.

Thanks again.
Luke

On Thu, Mar 3, 2022 at 5:00 PM deng ziming <de...@gmail.com> wrote:

> Thank you Luke for this work,
> I’m +1(non-binding)
>
> --
> Best,
> Ziming Deng
>
> > On Dec 1, 2021, at 8:36 AM, Luke Chen <sh...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I'd like to start the vote for KIP-792: Add "generation" field into
> > consumer protocol.
> >
> > The goal of this KIP is to allow the assignor/consumer coordinator to
> have
> > a way to identify the out-of-date members/assignments, to avoid rebalance
> > stuck issues in current protocol.
> >
> > Detailed description can be found here:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >
> > Any feedback is welcome.
> >
> > Thank you.
> > Luke
>
>

Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

Posted by deng ziming <de...@gmail.com>.
Thank you Luke for this work,
I’m +1(non-binding)

--
Best,
Ziming Deng

> On Dec 1, 2021, at 8:36 AM, Luke Chen <sh...@gmail.com> wrote:
> 
> Hi all,
> 
> I'd like to start the vote for KIP-792: Add "generation" field into
> consumer protocol.
> 
> The goal of this KIP is to allow the assignor/consumer coordinator to have
> a way to identify the out-of-date members/assignments, to avoid rebalance
> stuck issues in current protocol.
> 
> Detailed description can be found here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> 
> Any feedback is welcome.
> 
> Thank you.
> Luke