You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Luke Chen <sh...@gmail.com> on 2021/11/12 02:30:46 UTC

[DISCUSS] KIP-792: Add "generation" field into consumer protocol

Hi all,

I'd like to start the discussion for KIP-792: Add "generation" field into
consumer protocol.

The goal of this KIP is to allow assignor/consumer coordinator/group
coordinator to have a way to identify the out-of-date members/assignments.

Detailed description can be found here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614

Any feedback is welcome.

Thank you.
Luke

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi David,

If you don't have other comments, would you vote for the KIP?

Thank you.
Luke

On Tue, Feb 22, 2022 at 3:13 PM David Jacot <da...@gmail.com> wrote:

> Thanks, Luke!
>
> Le mar. 22 févr. 2022 à 08:02, Luke Chen <sh...@gmail.com> a écrit :
>
> > Hi David,
> >
> > Thanks for the comment.
> > I've updated the KIP, to add the method will be added into `Subscription`
> > class:
> >
> > // new added, the generationId getter
> > public int generationId() {
> >     return generationId;
> > }
> >
> > Thank you.
> > Luke
> >
> > On Mon, Feb 21, 2022 at 5:24 PM David Jacot <djacot@confluent.io.invalid
> >
> > wrote:
> >
> > > Hi Luke,
> > >
> > > I apologize for my late reply. I was out for a while.
> > >
> > > Coming back to my previous point, could you also
> > > spell out the new method(s) that we need to add to
> > > the Subscription class?
> > >
> > > Thanks,
> > > David
> > >
> > > On Mon, Feb 14, 2022 at 6:28 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > > >
> > > > Thanks Luke, no more comments from me, nice work!
> > > >
> > > > On Mon, Feb 14, 2022 at 5:22 AM Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > > > Hi Guozhang,
> > > > >
> > > > > Thanks for your comments. I've updated the KIP.
> > > > > Here's what I've updated:
> > > > >
> > > > > * In the motivation section, I've added this paragraph after
> > > > > cooperativeStickyAssignor like this:
> > > > >
> > > > > *On the other hand,  `StickyAssignor` is also adding "generation"
> > field
> > > > > plus the "ownedPartitions" into subscription userData bytes. the
> > > difference
> > > > > is that the `StickyAssignor`'s user bytes also encode the
> prev-owned
> > > > > partitions while the `CooperativeStickyAssignor` relies on the
> > > prev-owned
> > > > > partitions on the subscription protocol directly.*
> > > > >
> > > > > * In the proposed change section, I've updated the paragraph as:
> > > > >
> > > > >
> > > > > *For built-in CooperativeStickyAssignor, if there are consumers in
> > old
> > > > > bytecode and some in the new bytecode, it's totally fine, because
> the
> > > > > subscription data from old consumers will contain \[empty
> > > ownedPartitions +
> > > > > default generation(-1)] in V0, or \[current ownedPartitions +
> default
> > > > > generation(-1)] in V1. For V0 case, it's quite simple, because
> we'll
> > > just
> > > > > ignore the info since they are empty. For V1 case, we'll get the
> > > > > "ownedPartitions" data, and then decode the "generation" info in
> the
> > > > > subscription userData bytes. So that we can continue to do
> assignment
> > > with
> > > > > these information.*
> > > > > * Also, after the "cooperativeStickyAssignor paragraph, I've also
> > > mentioned
> > > > > stickyAssignor:
> > > > >
> > > > >
> > > > > *For built-in StickyAssignor, if there are consumers in old
> bytecode
> > > and
> > > > > some in the new bytecode, it's also fine, because the subscription
> > data
> > > > > from old consumers will contain \[empty ownedPartitions + default
> > > > > generation(-1)] in V0, or \[current ownedPartitions + default
> > > > > generation(-1)] in V1. For both V0 and V1 case, we'll directly use
> > the
> > > > > ownedPartition and generation info in the subscription userData
> > bytes.
> > > *
> > > > >
> > > > > Please let me know if you have other comments.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > > On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hello Luke,
> > > > > >
> > > > > > Thanks for the updated KIP, I've taken a look at it and still
> LGTM.
> > > Just
> > > > > a
> > > > > > couple minor comments in the wiki:
> > > > > >
> > > > > > * Both `StickyAssignor` and `CooperativeStickyAssignor` that
> > there's
> > > > > > already generation is encoded in user-data bytes, the difference
> is
> > > that
> > > > > > the `StickyAssignor`'s user bytes also encode the prev-owned
> > > partitions
> > > > > > while the `CooperativeStickyAssignor` relies on the prev-owned
> > > partitions
> > > > > > on the subscription protocol directly. So we can add the
> > > `StickyAssignor`
> > > > > > in your paragraph talking about `CooperativeStickyAssignor` as
> > well.
> > > > > >
> > > > > > * This sentence: "otherwise, we'll take the ownedPartitions as
> > > default
> > > > > > generation(-1)." does not read right to me, maybe need to
> rephrase
> > a
> > > bit?
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Hi David,
> > > > > > >
> > > > > > > Thanks for your comments.
> > > > > > > I've updated the KIP to add changes in Subscription class.
> > > > > > >
> > > > > > > Thank you.
> > > > > > > Luke
> > > > > > >
> > > > > > > On Fri, Feb 4, 2022 at 11:43 PM David Jacot
> > > > > <djacot@confluent.io.invalid
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Luke,
> > > > > > > >
> > > > > > > > Thanks for updating the KIP. I just have a minor request.
> > > > > > > > Could you fully describe the changes to the Subscription
> > > > > > > > public class in the KIP? I think that it would be better than
> > > > > > > > just saying that the generation will be added to id.
> > > > > > > >
> > > > > > > > Otherwise, the KIP LGTM.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > David
> > > > > > > >
> > > > > > > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <showuon@gmail.com
> >
> > > wrote:
> > > > > > > > >
> > > > > > > > > Hi devs,
> > > > > > > > > Welcome to provide feedback.
> > > > > > > > >
> > > > > > > > > If there are no other comments, I'll start a vote tomorrow.
> > > > > > > > >
> > > > > > > > > Thank you.
> > > > > > > > > Luke
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <
> showuon@gmail.com
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello David,
> > > > > > > > > >
> > > > > > > > > > For (3):
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > * I suppose that we could add a `generation` field to the
> > > > > > > > JoinGroupRequest
> > > > > > > > > > instead to do the fencing that you describe while
> handling
> > > the
> > > > > > > > sentinel in
> > > > > > > > > > the assignor directly. If we would add the `generation`
> to
> > > the
> > > > > > > request
> > > > > > > > > > itself, would we need the `generation` in the
> subscription
> > > > > protocol
> > > > > > > as
> > > > > > > > > > well?*
> > > > > > > > > >
> > > > > > > > > > On second thought, I think this is not better than adding
> > > > > > > `generation`
> > > > > > > > > > field in the subscription protocol, because I think we
> > don't
> > > have
> > > > > > to
> > > > > > > > do any
> > > > > > > > > > generation validation on joinGroup request. The purpose
> of
> > > > > > > > > > `joinGroupRequest` is to accept any members to join this
> > > group,
> > > > > > even
> > > > > > > > if the
> > > > > > > > > > member is new or ever joined or what. As long as we have
> > the
> > > > > > > > generationId
> > > > > > > > > > in the subscription metadata, the consumer lead can
> > leverage
> > > the
> > > > > > info
> > > > > > > > to
> > > > > > > > > > ignore the old ownedPartitions (or do other handling),
> and
> > > the
> > > > > > > > rebalance
> > > > > > > > > > can still complete successfully and correctly. On the
> other
> > > hand,
> > > > > > if
> > > > > > > > we did
> > > > > > > > > > the generation check on JoinGroupRequest, and return
> > > > > > > > `ILLEGAL_GENERATION`
> > > > > > > > > > back to consumer, the consumer needs to clear its
> > generation
> > > info
> > > > > > and
> > > > > > > > > > rejoin the group to continue the rebalance. It needs more
> > > > > > > > request/response
> > > > > > > > > > network and slow down the rebalance.
> > > > > > > > > >
> > > > > > > > > > So I think we should add the `generationId` field into
> > > > > Subscription
> > > > > > > > > > protocol to achieve what we want.
> > > > > > > > > >
> > > > > > > > > > Thank you.
> > > > > > > > > > Luke
> > > > > > > > > >
> > > > > > > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <
> > showuon@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> Hi David,
> > > > > > > > > >> Thanks for your feedback.
> > > > > > > > > >>
> > > > > > > > > >> I've updated the KIP for your comments (1)(2).
> > > > > > > > > >> For (3), it's a good point! Yes, we didn't deserialize
> the
> > > > > > > > subscription
> > > > > > > > > >> metadata on broker side, and it's not necessary to add
> > > overhead
> > > > > on
> > > > > > > > broker
> > > > > > > > > >> side. And, yes, I think we can fix the original issue by
> > > adding
> > > > > a
> > > > > > > > > >> "generation" field into `JoinGroupRequest` instead, and
> > also
> > > > > add a
> > > > > > > > field
> > > > > > > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember`
> > field.
> > > > > That
> > > > > > > > way, the
> > > > > > > > > >> broker can identify the old member from
> > `JoinGroupRequest`.
> > > And
> > > > > > the
> > > > > > > > > >> assignor can also get the "generation" info via the
> > > > > `Subscription`
> > > > > > > > instance.
> > > > > > > > > >>
> > > > > > > > > >> I'll update the KIP to add "generation" field into
> > > > > > > `JoinGroupRequest`
> > > > > > > > and
> > > > > > > > > >> `JoinGroupResponse`, if there is no other options.
> > > > > > > > > >>
> > > > > > > > > >> Thank you.
> > > > > > > > > >> Luke
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > > > > > > <dj...@confluent.io.invalid>
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >>> Hi Luke,
> > > > > > > > > >>>
> > > > > > > > > >>> Thanks for the KIP. Overall, I think that the
> motivation
> > > makes
> > > > > > > > sense. I
> > > > > > > > > >>> have a couple of comments/questions:
> > > > > > > > > >>>
> > > > > > > > > >>> 1. In the Public Interfaces section, it would be great
> if
> > > you
> > > > > > could
> > > > > > > > put
> > > > > > > > > >>> the
> > > > > > > > > >>> end state not the current one.
> > > > > > > > > >>>
> > > > > > > > > >>> 2. Do we need to update the Subscription class to
> expose
> > > the
> > > > > > > > > >>> generation? If so, it would be great to mention it in
> the
> > > > > Public
> > > > > > > > > >>> Interfaces section as well.
> > > > > > > > > >>>
> > > > > > > > > >>> 3. You mention that the broker will set the generation
> if
> > > the
> > > > > > > > > >>> subscription
> > > > > > > > > >>> contains a sentinel value (-1). As of today, the broker
> > > does
> > > > > not
> > > > > > > > parse
> > > > > > > > > >>> the subscription so I am not sure how/why we would do
> > > this. I
> > > > > > > suppose
> > > > > > > > > >>> that we could add a `generation` field to the
> > > JoinGroupRequest
> > > > > > > > instead
> > > > > > > > > >>> to do the fencing that you describe while handling the
> > > sentinel
> > > > > > in
> > > > > > > > the
> > > > > > > > > >>> assignor directly. If we would add the `generation` to
> > the
> > > > > > request
> > > > > > > > > >>> itself,
> > > > > > > > > >>> would we need the `generation` in the subscription
> > > protocol as
> > > > > > > well?
> > > > > > > > > >>>
> > > > > > > > > >>> Best,
> > > > > > > > > >>> David
> > > > > > > > > >>>
> > > > > > > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <
> > > showuon@gmail.com>
> > > > > > > wrote:
> > > > > > > > > >>> >
> > > > > > > > > >>> > Hi all,
> > > > > > > > > >>> >
> > > > > > > > > >>> > I'd like to start the discussion for KIP-792: Add
> > > > > "generation"
> > > > > > > > field
> > > > > > > > > >>> into
> > > > > > > > > >>> > consumer protocol.
> > > > > > > > > >>> >
> > > > > > > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > > > > > > coordinator/group
> > > > > > > > > >>> > coordinator to have a way to identify the out-of-date
> > > > > > > > > >>> members/assignments.
> > > > > > > > > >>> >
> > > > > > > > > >>> > Detailed description can be found here:
> > > > > > > > > >>> >
> > > > > > > > > >>>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > > > > > > >>> >
> > > > > > > > > >>> > Any feedback is welcome.
> > > > > > > > > >>> >
> > > > > > > > > >>> > Thank you.
> > > > > > > > > >>> > Luke
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > >
> >
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <da...@gmail.com>.
Thanks, Luke!

Le mar. 22 févr. 2022 à 08:02, Luke Chen <sh...@gmail.com> a écrit :

> Hi David,
>
> Thanks for the comment.
> I've updated the KIP, to add the method will be added into `Subscription`
> class:
>
> // new added, the generationId getter
> public int generationId() {
>     return generationId;
> }
>
> Thank you.
> Luke
>
> On Mon, Feb 21, 2022 at 5:24 PM David Jacot <dj...@confluent.io.invalid>
> wrote:
>
> > Hi Luke,
> >
> > I apologize for my late reply. I was out for a while.
> >
> > Coming back to my previous point, could you also
> > spell out the new method(s) that we need to add to
> > the Subscription class?
> >
> > Thanks,
> > David
> >
> > On Mon, Feb 14, 2022 at 6:28 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> > >
> > > Thanks Luke, no more comments from me, nice work!
> > >
> > > On Mon, Feb 14, 2022 at 5:22 AM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > > Hi Guozhang,
> > > >
> > > > Thanks for your comments. I've updated the KIP.
> > > > Here's what I've updated:
> > > >
> > > > * In the motivation section, I've added this paragraph after
> > > > cooperativeStickyAssignor like this:
> > > >
> > > > *On the other hand,  `StickyAssignor` is also adding "generation"
> field
> > > > plus the "ownedPartitions" into subscription userData bytes. the
> > difference
> > > > is that the `StickyAssignor`'s user bytes also encode the prev-owned
> > > > partitions while the `CooperativeStickyAssignor` relies on the
> > prev-owned
> > > > partitions on the subscription protocol directly.*
> > > >
> > > > * In the proposed change section, I've updated the paragraph as:
> > > >
> > > >
> > > > *For built-in CooperativeStickyAssignor, if there are consumers in
> old
> > > > bytecode and some in the new bytecode, it's totally fine, because the
> > > > subscription data from old consumers will contain \[empty
> > ownedPartitions +
> > > > default generation(-1)] in V0, or \[current ownedPartitions + default
> > > > generation(-1)] in V1. For V0 case, it's quite simple, because we'll
> > just
> > > > ignore the info since they are empty. For V1 case, we'll get the
> > > > "ownedPartitions" data, and then decode the "generation" info in the
> > > > subscription userData bytes. So that we can continue to do assignment
> > with
> > > > these information.*
> > > > * Also, after the "cooperativeStickyAssignor paragraph, I've also
> > mentioned
> > > > stickyAssignor:
> > > >
> > > >
> > > > *For built-in StickyAssignor, if there are consumers in old bytecode
> > and
> > > > some in the new bytecode, it's also fine, because the subscription
> data
> > > > from old consumers will contain \[empty ownedPartitions + default
> > > > generation(-1)] in V0, or \[current ownedPartitions + default
> > > > generation(-1)] in V1. For both V0 and V1 case, we'll directly use
> the
> > > > ownedPartition and generation info in the subscription userData
> bytes.
> > *
> > > >
> > > > Please let me know if you have other comments.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > > >
> > > > > Hello Luke,
> > > > >
> > > > > Thanks for the updated KIP, I've taken a look at it and still LGTM.
> > Just
> > > > a
> > > > > couple minor comments in the wiki:
> > > > >
> > > > > * Both `StickyAssignor` and `CooperativeStickyAssignor` that
> there's
> > > > > already generation is encoded in user-data bytes, the difference is
> > that
> > > > > the `StickyAssignor`'s user bytes also encode the prev-owned
> > partitions
> > > > > while the `CooperativeStickyAssignor` relies on the prev-owned
> > partitions
> > > > > on the subscription protocol directly. So we can add the
> > `StickyAssignor`
> > > > > in your paragraph talking about `CooperativeStickyAssignor` as
> well.
> > > > >
> > > > > * This sentence: "otherwise, we'll take the ownedPartitions as
> > default
> > > > > generation(-1)." does not read right to me, maybe need to rephrase
> a
> > bit?
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com>
> wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > Thanks for your comments.
> > > > > > I've updated the KIP to add changes in Subscription class.
> > > > > >
> > > > > > Thank you.
> > > > > > Luke
> > > > > >
> > > > > > On Fri, Feb 4, 2022 at 11:43 PM David Jacot
> > > > <djacot@confluent.io.invalid
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Luke,
> > > > > > >
> > > > > > > Thanks for updating the KIP. I just have a minor request.
> > > > > > > Could you fully describe the changes to the Subscription
> > > > > > > public class in the KIP? I think that it would be better than
> > > > > > > just saying that the generation will be added to id.
> > > > > > >
> > > > > > > Otherwise, the KIP LGTM.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > David
> > > > > > >
> > > > > > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com>
> > wrote:
> > > > > > > >
> > > > > > > > Hi devs,
> > > > > > > > Welcome to provide feedback.
> > > > > > > >
> > > > > > > > If there are no other comments, I'll start a vote tomorrow.
> > > > > > > >
> > > > > > > > Thank you.
> > > > > > > > Luke
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <showuon@gmail.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hello David,
> > > > > > > > >
> > > > > > > > > For (3):
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > * I suppose that we could add a `generation` field to the
> > > > > > > JoinGroupRequest
> > > > > > > > > instead to do the fencing that you describe while handling
> > the
> > > > > > > sentinel in
> > > > > > > > > the assignor directly. If we would add the `generation` to
> > the
> > > > > > request
> > > > > > > > > itself, would we need the `generation` in the subscription
> > > > protocol
> > > > > > as
> > > > > > > > > well?*
> > > > > > > > >
> > > > > > > > > On second thought, I think this is not better than adding
> > > > > > `generation`
> > > > > > > > > field in the subscription protocol, because I think we
> don't
> > have
> > > > > to
> > > > > > > do any
> > > > > > > > > generation validation on joinGroup request. The purpose of
> > > > > > > > > `joinGroupRequest` is to accept any members to join this
> > group,
> > > > > even
> > > > > > > if the
> > > > > > > > > member is new or ever joined or what. As long as we have
> the
> > > > > > > generationId
> > > > > > > > > in the subscription metadata, the consumer lead can
> leverage
> > the
> > > > > info
> > > > > > > to
> > > > > > > > > ignore the old ownedPartitions (or do other handling), and
> > the
> > > > > > > rebalance
> > > > > > > > > can still complete successfully and correctly. On the other
> > hand,
> > > > > if
> > > > > > > we did
> > > > > > > > > the generation check on JoinGroupRequest, and return
> > > > > > > `ILLEGAL_GENERATION`
> > > > > > > > > back to consumer, the consumer needs to clear its
> generation
> > info
> > > > > and
> > > > > > > > > rejoin the group to continue the rebalance. It needs more
> > > > > > > request/response
> > > > > > > > > network and slow down the rebalance.
> > > > > > > > >
> > > > > > > > > So I think we should add the `generationId` field into
> > > > Subscription
> > > > > > > > > protocol to achieve what we want.
> > > > > > > > >
> > > > > > > > > Thank you.
> > > > > > > > > Luke
> > > > > > > > >
> > > > > > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <
> showuon@gmail.com
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Hi David,
> > > > > > > > >> Thanks for your feedback.
> > > > > > > > >>
> > > > > > > > >> I've updated the KIP for your comments (1)(2).
> > > > > > > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > > > > > > subscription
> > > > > > > > >> metadata on broker side, and it's not necessary to add
> > overhead
> > > > on
> > > > > > > broker
> > > > > > > > >> side. And, yes, I think we can fix the original issue by
> > adding
> > > > a
> > > > > > > > >> "generation" field into `JoinGroupRequest` instead, and
> also
> > > > add a
> > > > > > > field
> > > > > > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember`
> field.
> > > > That
> > > > > > > way, the
> > > > > > > > >> broker can identify the old member from
> `JoinGroupRequest`.
> > And
> > > > > the
> > > > > > > > >> assignor can also get the "generation" info via the
> > > > `Subscription`
> > > > > > > instance.
> > > > > > > > >>
> > > > > > > > >> I'll update the KIP to add "generation" field into
> > > > > > `JoinGroupRequest`
> > > > > > > and
> > > > > > > > >> `JoinGroupResponse`, if there is no other options.
> > > > > > > > >>
> > > > > > > > >> Thank you.
> > > > > > > > >> Luke
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > > > > > <dj...@confluent.io.invalid>
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >>> Hi Luke,
> > > > > > > > >>>
> > > > > > > > >>> Thanks for the KIP. Overall, I think that the motivation
> > makes
> > > > > > > sense. I
> > > > > > > > >>> have a couple of comments/questions:
> > > > > > > > >>>
> > > > > > > > >>> 1. In the Public Interfaces section, it would be great if
> > you
> > > > > could
> > > > > > > put
> > > > > > > > >>> the
> > > > > > > > >>> end state not the current one.
> > > > > > > > >>>
> > > > > > > > >>> 2. Do we need to update the Subscription class to expose
> > the
> > > > > > > > >>> generation? If so, it would be great to mention it in the
> > > > Public
> > > > > > > > >>> Interfaces section as well.
> > > > > > > > >>>
> > > > > > > > >>> 3. You mention that the broker will set the generation if
> > the
> > > > > > > > >>> subscription
> > > > > > > > >>> contains a sentinel value (-1). As of today, the broker
> > does
> > > > not
> > > > > > > parse
> > > > > > > > >>> the subscription so I am not sure how/why we would do
> > this. I
> > > > > > suppose
> > > > > > > > >>> that we could add a `generation` field to the
> > JoinGroupRequest
> > > > > > > instead
> > > > > > > > >>> to do the fencing that you describe while handling the
> > sentinel
> > > > > in
> > > > > > > the
> > > > > > > > >>> assignor directly. If we would add the `generation` to
> the
> > > > > request
> > > > > > > > >>> itself,
> > > > > > > > >>> would we need the `generation` in the subscription
> > protocol as
> > > > > > well?
> > > > > > > > >>>
> > > > > > > > >>> Best,
> > > > > > > > >>> David
> > > > > > > > >>>
> > > > > > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <
> > showuon@gmail.com>
> > > > > > wrote:
> > > > > > > > >>> >
> > > > > > > > >>> > Hi all,
> > > > > > > > >>> >
> > > > > > > > >>> > I'd like to start the discussion for KIP-792: Add
> > > > "generation"
> > > > > > > field
> > > > > > > > >>> into
> > > > > > > > >>> > consumer protocol.
> > > > > > > > >>> >
> > > > > > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > > > > > coordinator/group
> > > > > > > > >>> > coordinator to have a way to identify the out-of-date
> > > > > > > > >>> members/assignments.
> > > > > > > > >>> >
> > > > > > > > >>> > Detailed description can be found here:
> > > > > > > > >>> >
> > > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > > > > > >>> >
> > > > > > > > >>> > Any feedback is welcome.
> > > > > > > > >>> >
> > > > > > > > >>> > Thank you.
> > > > > > > > >>> > Luke
> > > > > > > > >>>
> > > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> >
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi David,

Thanks for the comment.
I've updated the KIP, to add the method will be added into `Subscription`
class:

// new added, the generationId getter
public int generationId() {
    return generationId;
}

Thank you.
Luke

On Mon, Feb 21, 2022 at 5:24 PM David Jacot <dj...@confluent.io.invalid>
wrote:

> Hi Luke,
>
> I apologize for my late reply. I was out for a while.
>
> Coming back to my previous point, could you also
> spell out the new method(s) that we need to add to
> the Subscription class?
>
> Thanks,
> David
>
> On Mon, Feb 14, 2022 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> > Thanks Luke, no more comments from me, nice work!
> >
> > On Mon, Feb 14, 2022 at 5:22 AM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi Guozhang,
> > >
> > > Thanks for your comments. I've updated the KIP.
> > > Here's what I've updated:
> > >
> > > * In the motivation section, I've added this paragraph after
> > > cooperativeStickyAssignor like this:
> > >
> > > *On the other hand,  `StickyAssignor` is also adding "generation" field
> > > plus the "ownedPartitions" into subscription userData bytes. the
> difference
> > > is that the `StickyAssignor`'s user bytes also encode the prev-owned
> > > partitions while the `CooperativeStickyAssignor` relies on the
> prev-owned
> > > partitions on the subscription protocol directly.*
> > >
> > > * In the proposed change section, I've updated the paragraph as:
> > >
> > >
> > > *For built-in CooperativeStickyAssignor, if there are consumers in old
> > > bytecode and some in the new bytecode, it's totally fine, because the
> > > subscription data from old consumers will contain \[empty
> ownedPartitions +
> > > default generation(-1)] in V0, or \[current ownedPartitions + default
> > > generation(-1)] in V1. For V0 case, it's quite simple, because we'll
> just
> > > ignore the info since they are empty. For V1 case, we'll get the
> > > "ownedPartitions" data, and then decode the "generation" info in the
> > > subscription userData bytes. So that we can continue to do assignment
> with
> > > these information.*
> > > * Also, after the "cooperativeStickyAssignor paragraph, I've also
> mentioned
> > > stickyAssignor:
> > >
> > >
> > > *For built-in StickyAssignor, if there are consumers in old bytecode
> and
> > > some in the new bytecode, it's also fine, because the subscription data
> > > from old consumers will contain \[empty ownedPartitions + default
> > > generation(-1)] in V0, or \[current ownedPartitions + default
> > > generation(-1)] in V1. For both V0 and V1 case, we'll directly use the
> > > ownedPartition and generation info in the subscription userData bytes.
> *
> > >
> > > Please let me know if you have other comments.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> > >
> > > > Hello Luke,
> > > >
> > > > Thanks for the updated KIP, I've taken a look at it and still LGTM.
> Just
> > > a
> > > > couple minor comments in the wiki:
> > > >
> > > > * Both `StickyAssignor` and `CooperativeStickyAssignor` that there's
> > > > already generation is encoded in user-data bytes, the difference is
> that
> > > > the `StickyAssignor`'s user bytes also encode the prev-owned
> partitions
> > > > while the `CooperativeStickyAssignor` relies on the prev-owned
> partitions
> > > > on the subscription protocol directly. So we can add the
> `StickyAssignor`
> > > > in your paragraph talking about `CooperativeStickyAssignor` as well.
> > > >
> > > > * This sentence: "otherwise, we'll take the ownedPartitions as
> default
> > > > generation(-1)." does not read right to me, maybe need to rephrase a
> bit?
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > > > Hi David,
> > > > >
> > > > > Thanks for your comments.
> > > > > I've updated the KIP to add changes in Subscription class.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > > On Fri, Feb 4, 2022 at 11:43 PM David Jacot
> > > <djacot@confluent.io.invalid
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Luke,
> > > > > >
> > > > > > Thanks for updating the KIP. I just have a minor request.
> > > > > > Could you fully describe the changes to the Subscription
> > > > > > public class in the KIP? I think that it would be better than
> > > > > > just saying that the generation will be added to id.
> > > > > >
> > > > > > Otherwise, the KIP LGTM.
> > > > > >
> > > > > > Thanks,
> > > > > > David
> > > > > >
> > > > > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > Hi devs,
> > > > > > > Welcome to provide feedback.
> > > > > > >
> > > > > > > If there are no other comments, I'll start a vote tomorrow.
> > > > > > >
> > > > > > > Thank you.
> > > > > > > Luke
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hello David,
> > > > > > > >
> > > > > > > > For (3):
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > * I suppose that we could add a `generation` field to the
> > > > > > JoinGroupRequest
> > > > > > > > instead to do the fencing that you describe while handling
> the
> > > > > > sentinel in
> > > > > > > > the assignor directly. If we would add the `generation` to
> the
> > > > > request
> > > > > > > > itself, would we need the `generation` in the subscription
> > > protocol
> > > > > as
> > > > > > > > well?*
> > > > > > > >
> > > > > > > > On second thought, I think this is not better than adding
> > > > > `generation`
> > > > > > > > field in the subscription protocol, because I think we don't
> have
> > > > to
> > > > > > do any
> > > > > > > > generation validation on joinGroup request. The purpose of
> > > > > > > > `joinGroupRequest` is to accept any members to join this
> group,
> > > > even
> > > > > > if the
> > > > > > > > member is new or ever joined or what. As long as we have the
> > > > > > generationId
> > > > > > > > in the subscription metadata, the consumer lead can leverage
> the
> > > > info
> > > > > > to
> > > > > > > > ignore the old ownedPartitions (or do other handling), and
> the
> > > > > > rebalance
> > > > > > > > can still complete successfully and correctly. On the other
> hand,
> > > > if
> > > > > > we did
> > > > > > > > the generation check on JoinGroupRequest, and return
> > > > > > `ILLEGAL_GENERATION`
> > > > > > > > back to consumer, the consumer needs to clear its generation
> info
> > > > and
> > > > > > > > rejoin the group to continue the rebalance. It needs more
> > > > > > request/response
> > > > > > > > network and slow down the rebalance.
> > > > > > > >
> > > > > > > > So I think we should add the `generationId` field into
> > > Subscription
> > > > > > > > protocol to achieve what we want.
> > > > > > > >
> > > > > > > > Thank you.
> > > > > > > > Luke
> > > > > > > >
> > > > > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <showuon@gmail.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > >> Hi David,
> > > > > > > >> Thanks for your feedback.
> > > > > > > >>
> > > > > > > >> I've updated the KIP for your comments (1)(2).
> > > > > > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > > > > > subscription
> > > > > > > >> metadata on broker side, and it's not necessary to add
> overhead
> > > on
> > > > > > broker
> > > > > > > >> side. And, yes, I think we can fix the original issue by
> adding
> > > a
> > > > > > > >> "generation" field into `JoinGroupRequest` instead, and also
> > > add a
> > > > > > field
> > > > > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field.
> > > That
> > > > > > way, the
> > > > > > > >> broker can identify the old member from `JoinGroupRequest`.
> And
> > > > the
> > > > > > > >> assignor can also get the "generation" info via the
> > > `Subscription`
> > > > > > instance.
> > > > > > > >>
> > > > > > > >> I'll update the KIP to add "generation" field into
> > > > > `JoinGroupRequest`
> > > > > > and
> > > > > > > >> `JoinGroupResponse`, if there is no other options.
> > > > > > > >>
> > > > > > > >> Thank you.
> > > > > > > >> Luke
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > > > > <dj...@confluent.io.invalid>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >>> Hi Luke,
> > > > > > > >>>
> > > > > > > >>> Thanks for the KIP. Overall, I think that the motivation
> makes
> > > > > > sense. I
> > > > > > > >>> have a couple of comments/questions:
> > > > > > > >>>
> > > > > > > >>> 1. In the Public Interfaces section, it would be great if
> you
> > > > could
> > > > > > put
> > > > > > > >>> the
> > > > > > > >>> end state not the current one.
> > > > > > > >>>
> > > > > > > >>> 2. Do we need to update the Subscription class to expose
> the
> > > > > > > >>> generation? If so, it would be great to mention it in the
> > > Public
> > > > > > > >>> Interfaces section as well.
> > > > > > > >>>
> > > > > > > >>> 3. You mention that the broker will set the generation if
> the
> > > > > > > >>> subscription
> > > > > > > >>> contains a sentinel value (-1). As of today, the broker
> does
> > > not
> > > > > > parse
> > > > > > > >>> the subscription so I am not sure how/why we would do
> this. I
> > > > > suppose
> > > > > > > >>> that we could add a `generation` field to the
> JoinGroupRequest
> > > > > > instead
> > > > > > > >>> to do the fencing that you describe while handling the
> sentinel
> > > > in
> > > > > > the
> > > > > > > >>> assignor directly. If we would add the `generation` to the
> > > > request
> > > > > > > >>> itself,
> > > > > > > >>> would we need the `generation` in the subscription
> protocol as
> > > > > well?
> > > > > > > >>>
> > > > > > > >>> Best,
> > > > > > > >>> David
> > > > > > > >>>
> > > > > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <
> showuon@gmail.com>
> > > > > wrote:
> > > > > > > >>> >
> > > > > > > >>> > Hi all,
> > > > > > > >>> >
> > > > > > > >>> > I'd like to start the discussion for KIP-792: Add
> > > "generation"
> > > > > > field
> > > > > > > >>> into
> > > > > > > >>> > consumer protocol.
> > > > > > > >>> >
> > > > > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > > > > coordinator/group
> > > > > > > >>> > coordinator to have a way to identify the out-of-date
> > > > > > > >>> members/assignments.
> > > > > > > >>> >
> > > > > > > >>> > Detailed description can be found here:
> > > > > > > >>> >
> > > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > > > > >>> >
> > > > > > > >>> > Any feedback is welcome.
> > > > > > > >>> >
> > > > > > > >>> > Thank you.
> > > > > > > >>> > Luke
> > > > > > > >>>
> > > > > > > >>
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> > --
> > -- Guozhang
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
Hi Luke,

I apologize for my late reply. I was out for a while.

Coming back to my previous point, could you also
spell out the new method(s) that we need to add to
the Subscription class?

Thanks,
David

On Mon, Feb 14, 2022 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> Thanks Luke, no more comments from me, nice work!
>
> On Mon, Feb 14, 2022 at 5:22 AM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for your comments. I've updated the KIP.
> > Here's what I've updated:
> >
> > * In the motivation section, I've added this paragraph after
> > cooperativeStickyAssignor like this:
> >
> > *On the other hand,  `StickyAssignor` is also adding "generation" field
> > plus the "ownedPartitions" into subscription userData bytes. the difference
> > is that the `StickyAssignor`'s user bytes also encode the prev-owned
> > partitions while the `CooperativeStickyAssignor` relies on the prev-owned
> > partitions on the subscription protocol directly.*
> >
> > * In the proposed change section, I've updated the paragraph as:
> >
> >
> > *For built-in CooperativeStickyAssignor, if there are consumers in old
> > bytecode and some in the new bytecode, it's totally fine, because the
> > subscription data from old consumers will contain \[empty ownedPartitions +
> > default generation(-1)] in V0, or \[current ownedPartitions + default
> > generation(-1)] in V1. For V0 case, it's quite simple, because we'll just
> > ignore the info since they are empty. For V1 case, we'll get the
> > "ownedPartitions" data, and then decode the "generation" info in the
> > subscription userData bytes. So that we can continue to do assignment with
> > these information.*
> > * Also, after the "cooperativeStickyAssignor paragraph, I've also mentioned
> > stickyAssignor:
> >
> >
> > *For built-in StickyAssignor, if there are consumers in old bytecode and
> > some in the new bytecode, it's also fine, because the subscription data
> > from old consumers will contain \[empty ownedPartitions + default
> > generation(-1)] in V0, or \[current ownedPartitions + default
> > generation(-1)] in V1. For both V0 and V1 case, we'll directly use the
> > ownedPartition and generation info in the subscription userData bytes. *
> >
> > Please let me know if you have other comments.
> >
> > Thank you.
> > Luke
> >
> > On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> > > Hello Luke,
> > >
> > > Thanks for the updated KIP, I've taken a look at it and still LGTM. Just
> > a
> > > couple minor comments in the wiki:
> > >
> > > * Both `StickyAssignor` and `CooperativeStickyAssignor` that there's
> > > already generation is encoded in user-data bytes, the difference is that
> > > the `StickyAssignor`'s user bytes also encode the prev-owned partitions
> > > while the `CooperativeStickyAssignor` relies on the prev-owned partitions
> > > on the subscription protocol directly. So we can add the `StickyAssignor`
> > > in your paragraph talking about `CooperativeStickyAssignor` as well.
> > >
> > > * This sentence: "otherwise, we'll take the ownedPartitions as default
> > > generation(-1)." does not read right to me, maybe need to rephrase a bit?
> > >
> > >
> > > Guozhang
> > >
> > > On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > > Hi David,
> > > >
> > > > Thanks for your comments.
> > > > I've updated the KIP to add changes in Subscription class.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Fri, Feb 4, 2022 at 11:43 PM David Jacot
> > <djacot@confluent.io.invalid
> > > >
> > > > wrote:
> > > >
> > > > > Hi Luke,
> > > > >
> > > > > Thanks for updating the KIP. I just have a minor request.
> > > > > Could you fully describe the changes to the Subscription
> > > > > public class in the KIP? I think that it would be better than
> > > > > just saying that the generation will be added to id.
> > > > >
> > > > > Otherwise, the KIP LGTM.
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
> > > > > >
> > > > > > Hi devs,
> > > > > > Welcome to provide feedback.
> > > > > >
> > > > > > If there are no other comments, I'll start a vote tomorrow.
> > > > > >
> > > > > > Thank you.
> > > > > > Luke
> > > > > >
> > > > > >
> > > > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Hello David,
> > > > > > >
> > > > > > > For (3):
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > * I suppose that we could add a `generation` field to the
> > > > > JoinGroupRequest
> > > > > > > instead to do the fencing that you describe while handling the
> > > > > sentinel in
> > > > > > > the assignor directly. If we would add the `generation` to the
> > > > request
> > > > > > > itself, would we need the `generation` in the subscription
> > protocol
> > > > as
> > > > > > > well?*
> > > > > > >
> > > > > > > On second thought, I think this is not better than adding
> > > > `generation`
> > > > > > > field in the subscription protocol, because I think we don't have
> > > to
> > > > > do any
> > > > > > > generation validation on joinGroup request. The purpose of
> > > > > > > `joinGroupRequest` is to accept any members to join this group,
> > > even
> > > > > if the
> > > > > > > member is new or ever joined or what. As long as we have the
> > > > > generationId
> > > > > > > in the subscription metadata, the consumer lead can leverage the
> > > info
> > > > > to
> > > > > > > ignore the old ownedPartitions (or do other handling), and the
> > > > > rebalance
> > > > > > > can still complete successfully and correctly. On the other hand,
> > > if
> > > > > we did
> > > > > > > the generation check on JoinGroupRequest, and return
> > > > > `ILLEGAL_GENERATION`
> > > > > > > back to consumer, the consumer needs to clear its generation info
> > > and
> > > > > > > rejoin the group to continue the rebalance. It needs more
> > > > > request/response
> > > > > > > network and slow down the rebalance.
> > > > > > >
> > > > > > > So I think we should add the `generationId` field into
> > Subscription
> > > > > > > protocol to achieve what we want.
> > > > > > >
> > > > > > > Thank you.
> > > > > > > Luke
> > > > > > >
> > > > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > >> Hi David,
> > > > > > >> Thanks for your feedback.
> > > > > > >>
> > > > > > >> I've updated the KIP for your comments (1)(2).
> > > > > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > > > > subscription
> > > > > > >> metadata on broker side, and it's not necessary to add overhead
> > on
> > > > > broker
> > > > > > >> side. And, yes, I think we can fix the original issue by adding
> > a
> > > > > > >> "generation" field into `JoinGroupRequest` instead, and also
> > add a
> > > > > field
> > > > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field.
> > That
> > > > > way, the
> > > > > > >> broker can identify the old member from `JoinGroupRequest`. And
> > > the
> > > > > > >> assignor can also get the "generation" info via the
> > `Subscription`
> > > > > instance.
> > > > > > >>
> > > > > > >> I'll update the KIP to add "generation" field into
> > > > `JoinGroupRequest`
> > > > > and
> > > > > > >> `JoinGroupResponse`, if there is no other options.
> > > > > > >>
> > > > > > >> Thank you.
> > > > > > >> Luke
> > > > > > >>
> > > > > > >>
> > > > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > > > <dj...@confluent.io.invalid>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Hi Luke,
> > > > > > >>>
> > > > > > >>> Thanks for the KIP. Overall, I think that the motivation makes
> > > > > sense. I
> > > > > > >>> have a couple of comments/questions:
> > > > > > >>>
> > > > > > >>> 1. In the Public Interfaces section, it would be great if you
> > > could
> > > > > put
> > > > > > >>> the
> > > > > > >>> end state not the current one.
> > > > > > >>>
> > > > > > >>> 2. Do we need to update the Subscription class to expose the
> > > > > > >>> generation? If so, it would be great to mention it in the
> > Public
> > > > > > >>> Interfaces section as well.
> > > > > > >>>
> > > > > > >>> 3. You mention that the broker will set the generation if the
> > > > > > >>> subscription
> > > > > > >>> contains a sentinel value (-1). As of today, the broker does
> > not
> > > > > parse
> > > > > > >>> the subscription so I am not sure how/why we would do this. I
> > > > suppose
> > > > > > >>> that we could add a `generation` field to the JoinGroupRequest
> > > > > instead
> > > > > > >>> to do the fencing that you describe while handling the sentinel
> > > in
> > > > > the
> > > > > > >>> assignor directly. If we would add the `generation` to the
> > > request
> > > > > > >>> itself,
> > > > > > >>> would we need the `generation` in the subscription protocol as
> > > > well?
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> David
> > > > > > >>>
> > > > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com>
> > > > wrote:
> > > > > > >>> >
> > > > > > >>> > Hi all,
> > > > > > >>> >
> > > > > > >>> > I'd like to start the discussion for KIP-792: Add
> > "generation"
> > > > > field
> > > > > > >>> into
> > > > > > >>> > consumer protocol.
> > > > > > >>> >
> > > > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > > > coordinator/group
> > > > > > >>> > coordinator to have a way to identify the out-of-date
> > > > > > >>> members/assignments.
> > > > > > >>> >
> > > > > > >>> > Detailed description can be found here:
> > > > > > >>> >
> > > > > > >>>
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > > > >>> >
> > > > > > >>> > Any feedback is welcome.
> > > > > > >>> >
> > > > > > >>> > Thank you.
> > > > > > >>> > Luke
> > > > > > >>>
> > > > > > >>
> > > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
> --
> -- Guozhang

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Luke, no more comments from me, nice work!

On Mon, Feb 14, 2022 at 5:22 AM Luke Chen <sh...@gmail.com> wrote:

> Hi Guozhang,
>
> Thanks for your comments. I've updated the KIP.
> Here's what I've updated:
>
> * In the motivation section, I've added this paragraph after
> cooperativeStickyAssignor like this:
>
> *On the other hand,  `StickyAssignor` is also adding "generation" field
> plus the "ownedPartitions" into subscription userData bytes. the difference
> is that the `StickyAssignor`'s user bytes also encode the prev-owned
> partitions while the `CooperativeStickyAssignor` relies on the prev-owned
> partitions on the subscription protocol directly.*
>
> * In the proposed change section, I've updated the paragraph as:
>
>
> *For built-in CooperativeStickyAssignor, if there are consumers in old
> bytecode and some in the new bytecode, it's totally fine, because the
> subscription data from old consumers will contain \[empty ownedPartitions +
> default generation(-1)] in V0, or \[current ownedPartitions + default
> generation(-1)] in V1. For V0 case, it's quite simple, because we'll just
> ignore the info since they are empty. For V1 case, we'll get the
> "ownedPartitions" data, and then decode the "generation" info in the
> subscription userData bytes. So that we can continue to do assignment with
> these information.*
> * Also, after the "cooperativeStickyAssignor paragraph, I've also mentioned
> stickyAssignor:
>
>
> *For built-in StickyAssignor, if there are consumers in old bytecode and
> some in the new bytecode, it's also fine, because the subscription data
> from old consumers will contain \[empty ownedPartitions + default
> generation(-1)] in V0, or \[current ownedPartitions + default
> generation(-1)] in V1. For both V0 and V1 case, we'll directly use the
> ownedPartition and generation info in the subscription userData bytes. *
>
> Please let me know if you have other comments.
>
> Thank you.
> Luke
>
> On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > Hello Luke,
> >
> > Thanks for the updated KIP, I've taken a look at it and still LGTM. Just
> a
> > couple minor comments in the wiki:
> >
> > * Both `StickyAssignor` and `CooperativeStickyAssignor` that there's
> > already generation is encoded in user-data bytes, the difference is that
> > the `StickyAssignor`'s user bytes also encode the prev-owned partitions
> > while the `CooperativeStickyAssignor` relies on the prev-owned partitions
> > on the subscription protocol directly. So we can add the `StickyAssignor`
> > in your paragraph talking about `CooperativeStickyAssignor` as well.
> >
> > * This sentence: "otherwise, we'll take the ownedPartitions as default
> > generation(-1)." does not read right to me, maybe need to rephrase a bit?
> >
> >
> > Guozhang
> >
> > On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi David,
> > >
> > > Thanks for your comments.
> > > I've updated the KIP to add changes in Subscription class.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Fri, Feb 4, 2022 at 11:43 PM David Jacot
> <djacot@confluent.io.invalid
> > >
> > > wrote:
> > >
> > > > Hi Luke,
> > > >
> > > > Thanks for updating the KIP. I just have a minor request.
> > > > Could you fully describe the changes to the Subscription
> > > > public class in the KIP? I think that it would be better than
> > > > just saying that the generation will be added to id.
> > > >
> > > > Otherwise, the KIP LGTM.
> > > >
> > > > Thanks,
> > > > David
> > > >
> > > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
> > > > >
> > > > > Hi devs,
> > > > > Welcome to provide feedback.
> > > > >
> > > > > If there are no other comments, I'll start a vote tomorrow.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > >
> > > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com>
> wrote:
> > > > >
> > > > > > Hello David,
> > > > > >
> > > > > > For (3):
> > > > > >
> > > > > >
> > > > > >
> > > > > > * I suppose that we could add a `generation` field to the
> > > > JoinGroupRequest
> > > > > > instead to do the fencing that you describe while handling the
> > > > sentinel in
> > > > > > the assignor directly. If we would add the `generation` to the
> > > request
> > > > > > itself, would we need the `generation` in the subscription
> protocol
> > > as
> > > > > > well?*
> > > > > >
> > > > > > On second thought, I think this is not better than adding
> > > `generation`
> > > > > > field in the subscription protocol, because I think we don't have
> > to
> > > > do any
> > > > > > generation validation on joinGroup request. The purpose of
> > > > > > `joinGroupRequest` is to accept any members to join this group,
> > even
> > > > if the
> > > > > > member is new or ever joined or what. As long as we have the
> > > > generationId
> > > > > > in the subscription metadata, the consumer lead can leverage the
> > info
> > > > to
> > > > > > ignore the old ownedPartitions (or do other handling), and the
> > > > rebalance
> > > > > > can still complete successfully and correctly. On the other hand,
> > if
> > > > we did
> > > > > > the generation check on JoinGroupRequest, and return
> > > > `ILLEGAL_GENERATION`
> > > > > > back to consumer, the consumer needs to clear its generation info
> > and
> > > > > > rejoin the group to continue the rebalance. It needs more
> > > > request/response
> > > > > > network and slow down the rebalance.
> > > > > >
> > > > > > So I think we should add the `generationId` field into
> Subscription
> > > > > > protocol to achieve what we want.
> > > > > >
> > > > > > Thank you.
> > > > > > Luke
> > > > > >
> > > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com>
> > wrote:
> > > > > >
> > > > > >> Hi David,
> > > > > >> Thanks for your feedback.
> > > > > >>
> > > > > >> I've updated the KIP for your comments (1)(2).
> > > > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > > > subscription
> > > > > >> metadata on broker side, and it's not necessary to add overhead
> on
> > > > broker
> > > > > >> side. And, yes, I think we can fix the original issue by adding
> a
> > > > > >> "generation" field into `JoinGroupRequest` instead, and also
> add a
> > > > field
> > > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field.
> That
> > > > way, the
> > > > > >> broker can identify the old member from `JoinGroupRequest`. And
> > the
> > > > > >> assignor can also get the "generation" info via the
> `Subscription`
> > > > instance.
> > > > > >>
> > > > > >> I'll update the KIP to add "generation" field into
> > > `JoinGroupRequest`
> > > > and
> > > > > >> `JoinGroupResponse`, if there is no other options.
> > > > > >>
> > > > > >> Thank you.
> > > > > >> Luke
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > > <dj...@confluent.io.invalid>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hi Luke,
> > > > > >>>
> > > > > >>> Thanks for the KIP. Overall, I think that the motivation makes
> > > > sense. I
> > > > > >>> have a couple of comments/questions:
> > > > > >>>
> > > > > >>> 1. In the Public Interfaces section, it would be great if you
> > could
> > > > put
> > > > > >>> the
> > > > > >>> end state not the current one.
> > > > > >>>
> > > > > >>> 2. Do we need to update the Subscription class to expose the
> > > > > >>> generation? If so, it would be great to mention it in the
> Public
> > > > > >>> Interfaces section as well.
> > > > > >>>
> > > > > >>> 3. You mention that the broker will set the generation if the
> > > > > >>> subscription
> > > > > >>> contains a sentinel value (-1). As of today, the broker does
> not
> > > > parse
> > > > > >>> the subscription so I am not sure how/why we would do this. I
> > > suppose
> > > > > >>> that we could add a `generation` field to the JoinGroupRequest
> > > > instead
> > > > > >>> to do the fencing that you describe while handling the sentinel
> > in
> > > > the
> > > > > >>> assignor directly. If we would add the `generation` to the
> > request
> > > > > >>> itself,
> > > > > >>> would we need the `generation` in the subscription protocol as
> > > well?
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> David
> > > > > >>>
> > > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com>
> > > wrote:
> > > > > >>> >
> > > > > >>> > Hi all,
> > > > > >>> >
> > > > > >>> > I'd like to start the discussion for KIP-792: Add
> "generation"
> > > > field
> > > > > >>> into
> > > > > >>> > consumer protocol.
> > > > > >>> >
> > > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > > coordinator/group
> > > > > >>> > coordinator to have a way to identify the out-of-date
> > > > > >>> members/assignments.
> > > > > >>> >
> > > > > >>> > Detailed description can be found here:
> > > > > >>> >
> > > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > > >>> >
> > > > > >>> > Any feedback is welcome.
> > > > > >>> >
> > > > > >>> > Thank you.
> > > > > >>> > Luke
> > > > > >>>
> > > > > >>
> > > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi Guozhang,

Thanks for your comments. I've updated the KIP.
Here's what I've updated:

* In the motivation section, I've added this paragraph after
cooperativeStickyAssignor like this:

*On the other hand,  `StickyAssignor` is also adding "generation" field
plus the "ownedPartitions" into subscription userData bytes. the difference
is that the `StickyAssignor`'s user bytes also encode the prev-owned
partitions while the `CooperativeStickyAssignor` relies on the prev-owned
partitions on the subscription protocol directly.*

* In the proposed change section, I've updated the paragraph as:


*For built-in CooperativeStickyAssignor, if there are consumers in old
bytecode and some in the new bytecode, it's totally fine, because the
subscription data from old consumers will contain \[empty ownedPartitions +
default generation(-1)] in V0, or \[current ownedPartitions + default
generation(-1)] in V1. For V0 case, it's quite simple, because we'll just
ignore the info since they are empty. For V1 case, we'll get the
"ownedPartitions" data, and then decode the "generation" info in the
subscription userData bytes. So that we can continue to do assignment with
these information.*
* Also, after the "cooperativeStickyAssignor paragraph, I've also mentioned
stickyAssignor:


*For built-in StickyAssignor, if there are consumers in old bytecode and
some in the new bytecode, it's also fine, because the subscription data
from old consumers will contain \[empty ownedPartitions + default
generation(-1)] in V0, or \[current ownedPartitions + default
generation(-1)] in V1. For both V0 and V1 case, we'll directly use the
ownedPartition and generation info in the subscription userData bytes. *

Please let me know if you have other comments.

Thank you.
Luke

On Wed, Feb 9, 2022 at 2:57 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hello Luke,
>
> Thanks for the updated KIP, I've taken a look at it and still LGTM. Just a
> couple minor comments in the wiki:
>
> * Both `StickyAssignor` and `CooperativeStickyAssignor` that there's
> already generation is encoded in user-data bytes, the difference is that
> the `StickyAssignor`'s user bytes also encode the prev-owned partitions
> while the `CooperativeStickyAssignor` relies on the prev-owned partitions
> on the subscription protocol directly. So we can add the `StickyAssignor`
> in your paragraph talking about `CooperativeStickyAssignor` as well.
>
> * This sentence: "otherwise, we'll take the ownedPartitions as default
> generation(-1)." does not read right to me, maybe need to rephrase a bit?
>
>
> Guozhang
>
> On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi David,
> >
> > Thanks for your comments.
> > I've updated the KIP to add changes in Subscription class.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Feb 4, 2022 at 11:43 PM David Jacot <djacot@confluent.io.invalid
> >
> > wrote:
> >
> > > Hi Luke,
> > >
> > > Thanks for updating the KIP. I just have a minor request.
> > > Could you fully describe the changes to the Subscription
> > > public class in the KIP? I think that it would be better than
> > > just saying that the generation will be added to id.
> > >
> > > Otherwise, the KIP LGTM.
> > >
> > > Thanks,
> > > David
> > >
> > > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > > Hi devs,
> > > > Welcome to provide feedback.
> > > >
> > > > If there are no other comments, I'll start a vote tomorrow.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > >
> > > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > > > Hello David,
> > > > >
> > > > > For (3):
> > > > >
> > > > >
> > > > >
> > > > > * I suppose that we could add a `generation` field to the
> > > JoinGroupRequest
> > > > > instead to do the fencing that you describe while handling the
> > > sentinel in
> > > > > the assignor directly. If we would add the `generation` to the
> > request
> > > > > itself, would we need the `generation` in the subscription protocol
> > as
> > > > > well?*
> > > > >
> > > > > On second thought, I think this is not better than adding
> > `generation`
> > > > > field in the subscription protocol, because I think we don't have
> to
> > > do any
> > > > > generation validation on joinGroup request. The purpose of
> > > > > `joinGroupRequest` is to accept any members to join this group,
> even
> > > if the
> > > > > member is new or ever joined or what. As long as we have the
> > > generationId
> > > > > in the subscription metadata, the consumer lead can leverage the
> info
> > > to
> > > > > ignore the old ownedPartitions (or do other handling), and the
> > > rebalance
> > > > > can still complete successfully and correctly. On the other hand,
> if
> > > we did
> > > > > the generation check on JoinGroupRequest, and return
> > > `ILLEGAL_GENERATION`
> > > > > back to consumer, the consumer needs to clear its generation info
> and
> > > > > rejoin the group to continue the rebalance. It needs more
> > > request/response
> > > > > network and slow down the rebalance.
> > > > >
> > > > > So I think we should add the `generationId` field into Subscription
> > > > > protocol to achieve what we want.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com>
> wrote:
> > > > >
> > > > >> Hi David,
> > > > >> Thanks for your feedback.
> > > > >>
> > > > >> I've updated the KIP for your comments (1)(2).
> > > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > > subscription
> > > > >> metadata on broker side, and it's not necessary to add overhead on
> > > broker
> > > > >> side. And, yes, I think we can fix the original issue by adding a
> > > > >> "generation" field into `JoinGroupRequest` instead, and also add a
> > > field
> > > > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That
> > > way, the
> > > > >> broker can identify the old member from `JoinGroupRequest`. And
> the
> > > > >> assignor can also get the "generation" info via the `Subscription`
> > > instance.
> > > > >>
> > > > >> I'll update the KIP to add "generation" field into
> > `JoinGroupRequest`
> > > and
> > > > >> `JoinGroupResponse`, if there is no other options.
> > > > >>
> > > > >> Thank you.
> > > > >> Luke
> > > > >>
> > > > >>
> > > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > > <dj...@confluent.io.invalid>
> > > > >> wrote:
> > > > >>
> > > > >>> Hi Luke,
> > > > >>>
> > > > >>> Thanks for the KIP. Overall, I think that the motivation makes
> > > sense. I
> > > > >>> have a couple of comments/questions:
> > > > >>>
> > > > >>> 1. In the Public Interfaces section, it would be great if you
> could
> > > put
> > > > >>> the
> > > > >>> end state not the current one.
> > > > >>>
> > > > >>> 2. Do we need to update the Subscription class to expose the
> > > > >>> generation? If so, it would be great to mention it in the Public
> > > > >>> Interfaces section as well.
> > > > >>>
> > > > >>> 3. You mention that the broker will set the generation if the
> > > > >>> subscription
> > > > >>> contains a sentinel value (-1). As of today, the broker does not
> > > parse
> > > > >>> the subscription so I am not sure how/why we would do this. I
> > suppose
> > > > >>> that we could add a `generation` field to the JoinGroupRequest
> > > instead
> > > > >>> to do the fencing that you describe while handling the sentinel
> in
> > > the
> > > > >>> assignor directly. If we would add the `generation` to the
> request
> > > > >>> itself,
> > > > >>> would we need the `generation` in the subscription protocol as
> > well?
> > > > >>>
> > > > >>> Best,
> > > > >>> David
> > > > >>>
> > > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com>
> > wrote:
> > > > >>> >
> > > > >>> > Hi all,
> > > > >>> >
> > > > >>> > I'd like to start the discussion for KIP-792: Add "generation"
> > > field
> > > > >>> into
> > > > >>> > consumer protocol.
> > > > >>> >
> > > > >>> > The goal of this KIP is to allow assignor/consumer
> > > coordinator/group
> > > > >>> > coordinator to have a way to identify the out-of-date
> > > > >>> members/assignments.
> > > > >>> >
> > > > >>> > Detailed description can be found here:
> > > > >>> >
> > > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > > >>> >
> > > > >>> > Any feedback is welcome.
> > > > >>> >
> > > > >>> > Thank you.
> > > > >>> > Luke
> > > > >>>
> > > > >>
> > >
> >
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Luke,

Thanks for the updated KIP, I've taken a look at it and still LGTM. Just a
couple minor comments in the wiki:

* Both `StickyAssignor` and `CooperativeStickyAssignor` that there's
already generation is encoded in user-data bytes, the difference is that
the `StickyAssignor`'s user bytes also encode the prev-owned partitions
while the `CooperativeStickyAssignor` relies on the prev-owned partitions
on the subscription protocol directly. So we can add the `StickyAssignor`
in your paragraph talking about `CooperativeStickyAssignor` as well.

* This sentence: "otherwise, we'll take the ownedPartitions as default
generation(-1)." does not read right to me, maybe need to rephrase a bit?


Guozhang

On Mon, Feb 7, 2022 at 7:36 PM Luke Chen <sh...@gmail.com> wrote:

> Hi David,
>
> Thanks for your comments.
> I've updated the KIP to add changes in Subscription class.
>
> Thank you.
> Luke
>
> On Fri, Feb 4, 2022 at 11:43 PM David Jacot <dj...@confluent.io.invalid>
> wrote:
>
> > Hi Luke,
> >
> > Thanks for updating the KIP. I just have a minor request.
> > Could you fully describe the changes to the Subscription
> > public class in the KIP? I think that it would be better than
> > just saying that the generation will be added to id.
> >
> > Otherwise, the KIP LGTM.
> >
> > Thanks,
> > David
> >
> > On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > Hi devs,
> > > Welcome to provide feedback.
> > >
> > > If there are no other comments, I'll start a vote tomorrow.
> > >
> > > Thank you.
> > > Luke
> > >
> > >
> > > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com> wrote:
> > >
> > > > Hello David,
> > > >
> > > > For (3):
> > > >
> > > >
> > > >
> > > > * I suppose that we could add a `generation` field to the
> > JoinGroupRequest
> > > > instead to do the fencing that you describe while handling the
> > sentinel in
> > > > the assignor directly. If we would add the `generation` to the
> request
> > > > itself, would we need the `generation` in the subscription protocol
> as
> > > > well?*
> > > >
> > > > On second thought, I think this is not better than adding
> `generation`
> > > > field in the subscription protocol, because I think we don't have to
> > do any
> > > > generation validation on joinGroup request. The purpose of
> > > > `joinGroupRequest` is to accept any members to join this group, even
> > if the
> > > > member is new or ever joined or what. As long as we have the
> > generationId
> > > > in the subscription metadata, the consumer lead can leverage the info
> > to
> > > > ignore the old ownedPartitions (or do other handling), and the
> > rebalance
> > > > can still complete successfully and correctly. On the other hand, if
> > we did
> > > > the generation check on JoinGroupRequest, and return
> > `ILLEGAL_GENERATION`
> > > > back to consumer, the consumer needs to clear its generation info and
> > > > rejoin the group to continue the rebalance. It needs more
> > request/response
> > > > network and slow down the rebalance.
> > > >
> > > > So I think we should add the `generationId` field into Subscription
> > > > protocol to achieve what we want.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com> wrote:
> > > >
> > > >> Hi David,
> > > >> Thanks for your feedback.
> > > >>
> > > >> I've updated the KIP for your comments (1)(2).
> > > >> For (3), it's a good point! Yes, we didn't deserialize the
> > subscription
> > > >> metadata on broker side, and it's not necessary to add overhead on
> > broker
> > > >> side. And, yes, I think we can fix the original issue by adding a
> > > >> "generation" field into `JoinGroupRequest` instead, and also add a
> > field
> > > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That
> > way, the
> > > >> broker can identify the old member from `JoinGroupRequest`. And the
> > > >> assignor can also get the "generation" info via the `Subscription`
> > instance.
> > > >>
> > > >> I'll update the KIP to add "generation" field into
> `JoinGroupRequest`
> > and
> > > >> `JoinGroupResponse`, if there is no other options.
> > > >>
> > > >> Thank you.
> > > >> Luke
> > > >>
> > > >>
> > > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> > <dj...@confluent.io.invalid>
> > > >> wrote:
> > > >>
> > > >>> Hi Luke,
> > > >>>
> > > >>> Thanks for the KIP. Overall, I think that the motivation makes
> > sense. I
> > > >>> have a couple of comments/questions:
> > > >>>
> > > >>> 1. In the Public Interfaces section, it would be great if you could
> > put
> > > >>> the
> > > >>> end state not the current one.
> > > >>>
> > > >>> 2. Do we need to update the Subscription class to expose the
> > > >>> generation? If so, it would be great to mention it in the Public
> > > >>> Interfaces section as well.
> > > >>>
> > > >>> 3. You mention that the broker will set the generation if the
> > > >>> subscription
> > > >>> contains a sentinel value (-1). As of today, the broker does not
> > parse
> > > >>> the subscription so I am not sure how/why we would do this. I
> suppose
> > > >>> that we could add a `generation` field to the JoinGroupRequest
> > instead
> > > >>> to do the fencing that you describe while handling the sentinel in
> > the
> > > >>> assignor directly. If we would add the `generation` to the request
> > > >>> itself,
> > > >>> would we need the `generation` in the subscription protocol as
> well?
> > > >>>
> > > >>> Best,
> > > >>> David
> > > >>>
> > > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com>
> wrote:
> > > >>> >
> > > >>> > Hi all,
> > > >>> >
> > > >>> > I'd like to start the discussion for KIP-792: Add "generation"
> > field
> > > >>> into
> > > >>> > consumer protocol.
> > > >>> >
> > > >>> > The goal of this KIP is to allow assignor/consumer
> > coordinator/group
> > > >>> > coordinator to have a way to identify the out-of-date
> > > >>> members/assignments.
> > > >>> >
> > > >>> > Detailed description can be found here:
> > > >>> >
> > > >>>
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > > >>> >
> > > >>> > Any feedback is welcome.
> > > >>> >
> > > >>> > Thank you.
> > > >>> > Luke
> > > >>>
> > > >>
> >
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi David,

Thanks for your comments.
I've updated the KIP to add changes in Subscription class.

Thank you.
Luke

On Fri, Feb 4, 2022 at 11:43 PM David Jacot <dj...@confluent.io.invalid>
wrote:

> Hi Luke,
>
> Thanks for updating the KIP. I just have a minor request.
> Could you fully describe the changes to the Subscription
> public class in the KIP? I think that it would be better than
> just saying that the generation will be added to id.
>
> Otherwise, the KIP LGTM.
>
> Thanks,
> David
>
> On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
> >
> > Hi devs,
> > Welcome to provide feedback.
> >
> > If there are no other comments, I'll start a vote tomorrow.
> >
> > Thank you.
> > Luke
> >
> >
> > On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hello David,
> > >
> > > For (3):
> > >
> > >
> > >
> > > * I suppose that we could add a `generation` field to the
> JoinGroupRequest
> > > instead to do the fencing that you describe while handling the
> sentinel in
> > > the assignor directly. If we would add the `generation` to the request
> > > itself, would we need the `generation` in the subscription protocol as
> > > well?*
> > >
> > > On second thought, I think this is not better than adding `generation`
> > > field in the subscription protocol, because I think we don't have to
> do any
> > > generation validation on joinGroup request. The purpose of
> > > `joinGroupRequest` is to accept any members to join this group, even
> if the
> > > member is new or ever joined or what. As long as we have the
> generationId
> > > in the subscription metadata, the consumer lead can leverage the info
> to
> > > ignore the old ownedPartitions (or do other handling), and the
> rebalance
> > > can still complete successfully and correctly. On the other hand, if
> we did
> > > the generation check on JoinGroupRequest, and return
> `ILLEGAL_GENERATION`
> > > back to consumer, the consumer needs to clear its generation info and
> > > rejoin the group to continue the rebalance. It needs more
> request/response
> > > network and slow down the rebalance.
> > >
> > > So I think we should add the `generationId` field into Subscription
> > > protocol to achieve what we want.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com> wrote:
> > >
> > >> Hi David,
> > >> Thanks for your feedback.
> > >>
> > >> I've updated the KIP for your comments (1)(2).
> > >> For (3), it's a good point! Yes, we didn't deserialize the
> subscription
> > >> metadata on broker side, and it's not necessary to add overhead on
> broker
> > >> side. And, yes, I think we can fix the original issue by adding a
> > >> "generation" field into `JoinGroupRequest` instead, and also add a
> field
> > >> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That
> way, the
> > >> broker can identify the old member from `JoinGroupRequest`. And the
> > >> assignor can also get the "generation" info via the `Subscription`
> instance.
> > >>
> > >> I'll update the KIP to add "generation" field into `JoinGroupRequest`
> and
> > >> `JoinGroupResponse`, if there is no other options.
> > >>
> > >> Thank you.
> > >> Luke
> > >>
> > >>
> > >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot
> <dj...@confluent.io.invalid>
> > >> wrote:
> > >>
> > >>> Hi Luke,
> > >>>
> > >>> Thanks for the KIP. Overall, I think that the motivation makes
> sense. I
> > >>> have a couple of comments/questions:
> > >>>
> > >>> 1. In the Public Interfaces section, it would be great if you could
> put
> > >>> the
> > >>> end state not the current one.
> > >>>
> > >>> 2. Do we need to update the Subscription class to expose the
> > >>> generation? If so, it would be great to mention it in the Public
> > >>> Interfaces section as well.
> > >>>
> > >>> 3. You mention that the broker will set the generation if the
> > >>> subscription
> > >>> contains a sentinel value (-1). As of today, the broker does not
> parse
> > >>> the subscription so I am not sure how/why we would do this. I suppose
> > >>> that we could add a `generation` field to the JoinGroupRequest
> instead
> > >>> to do the fencing that you describe while handling the sentinel in
> the
> > >>> assignor directly. If we would add the `generation` to the request
> > >>> itself,
> > >>> would we need the `generation` in the subscription protocol as well?
> > >>>
> > >>> Best,
> > >>> David
> > >>>
> > >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
> > >>> >
> > >>> > Hi all,
> > >>> >
> > >>> > I'd like to start the discussion for KIP-792: Add "generation"
> field
> > >>> into
> > >>> > consumer protocol.
> > >>> >
> > >>> > The goal of this KIP is to allow assignor/consumer
> coordinator/group
> > >>> > coordinator to have a way to identify the out-of-date
> > >>> members/assignments.
> > >>> >
> > >>> > Detailed description can be found here:
> > >>> >
> > >>>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> > >>> >
> > >>> > Any feedback is welcome.
> > >>> >
> > >>> > Thank you.
> > >>> > Luke
> > >>>
> > >>
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
Hi Luke,

Thanks for updating the KIP. I just have a minor request.
Could you fully describe the changes to the Subscription
public class in the KIP? I think that it would be better than
just saying that the generation will be added to id.

Otherwise, the KIP LGTM.

Thanks,
David

On Mon, Nov 29, 2021 at 4:29 AM Luke Chen <sh...@gmail.com> wrote:
>
> Hi devs,
> Welcome to provide feedback.
>
> If there are no other comments, I'll start a vote tomorrow.
>
> Thank you.
> Luke
>
>
> On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hello David,
> >
> > For (3):
> >
> >
> >
> > * I suppose that we could add a `generation` field to the JoinGroupRequest
> > instead to do the fencing that you describe while handling the sentinel in
> > the assignor directly. If we would add the `generation` to the request
> > itself, would we need the `generation` in the subscription protocol as
> > well?*
> >
> > On second thought, I think this is not better than adding `generation`
> > field in the subscription protocol, because I think we don't have to do any
> > generation validation on joinGroup request. The purpose of
> > `joinGroupRequest` is to accept any members to join this group, even if the
> > member is new or ever joined or what. As long as we have the generationId
> > in the subscription metadata, the consumer lead can leverage the info to
> > ignore the old ownedPartitions (or do other handling), and the rebalance
> > can still complete successfully and correctly. On the other hand, if we did
> > the generation check on JoinGroupRequest, and return `ILLEGAL_GENERATION`
> > back to consumer, the consumer needs to clear its generation info and
> > rejoin the group to continue the rebalance. It needs more request/response
> > network and slow down the rebalance.
> >
> > So I think we should add the `generationId` field into Subscription
> > protocol to achieve what we want.
> >
> > Thank you.
> > Luke
> >
> > On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com> wrote:
> >
> >> Hi David,
> >> Thanks for your feedback.
> >>
> >> I've updated the KIP for your comments (1)(2).
> >> For (3), it's a good point! Yes, we didn't deserialize the subscription
> >> metadata on broker side, and it's not necessary to add overhead on broker
> >> side. And, yes, I think we can fix the original issue by adding a
> >> "generation" field into `JoinGroupRequest` instead, and also add a field
> >> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That way, the
> >> broker can identify the old member from `JoinGroupRequest`. And the
> >> assignor can also get the "generation" info via the `Subscription` instance.
> >>
> >> I'll update the KIP to add "generation" field into `JoinGroupRequest` and
> >> `JoinGroupResponse`, if there is no other options.
> >>
> >> Thank you.
> >> Luke
> >>
> >>
> >> On Tue, Nov 16, 2021 at 12:31 AM David Jacot <dj...@confluent.io.invalid>
> >> wrote:
> >>
> >>> Hi Luke,
> >>>
> >>> Thanks for the KIP. Overall, I think that the motivation makes sense. I
> >>> have a couple of comments/questions:
> >>>
> >>> 1. In the Public Interfaces section, it would be great if you could put
> >>> the
> >>> end state not the current one.
> >>>
> >>> 2. Do we need to update the Subscription class to expose the
> >>> generation? If so, it would be great to mention it in the Public
> >>> Interfaces section as well.
> >>>
> >>> 3. You mention that the broker will set the generation if the
> >>> subscription
> >>> contains a sentinel value (-1). As of today, the broker does not parse
> >>> the subscription so I am not sure how/why we would do this. I suppose
> >>> that we could add a `generation` field to the JoinGroupRequest instead
> >>> to do the fencing that you describe while handling the sentinel in the
> >>> assignor directly. If we would add the `generation` to the request
> >>> itself,
> >>> would we need the `generation` in the subscription protocol as well?
> >>>
> >>> Best,
> >>> David
> >>>
> >>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
> >>> >
> >>> > Hi all,
> >>> >
> >>> > I'd like to start the discussion for KIP-792: Add "generation" field
> >>> into
> >>> > consumer protocol.
> >>> >
> >>> > The goal of this KIP is to allow assignor/consumer coordinator/group
> >>> > coordinator to have a way to identify the out-of-date
> >>> members/assignments.
> >>> >
> >>> > Detailed description can be found here:
> >>> >
> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >>> >
> >>> > Any feedback is welcome.
> >>> >
> >>> > Thank you.
> >>> > Luke
> >>>
> >>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi devs,
Welcome to provide feedback.

If there are no other comments, I'll start a vote tomorrow.

Thank you.
Luke


On Mon, Nov 22, 2021 at 4:16 PM Luke Chen <sh...@gmail.com> wrote:

> Hello David,
>
> For (3):
>
>
>
> * I suppose that we could add a `generation` field to the JoinGroupRequest
> instead to do the fencing that you describe while handling the sentinel in
> the assignor directly. If we would add the `generation` to the request
> itself, would we need the `generation` in the subscription protocol as
> well?*
>
> On second thought, I think this is not better than adding `generation`
> field in the subscription protocol, because I think we don't have to do any
> generation validation on joinGroup request. The purpose of
> `joinGroupRequest` is to accept any members to join this group, even if the
> member is new or ever joined or what. As long as we have the generationId
> in the subscription metadata, the consumer lead can leverage the info to
> ignore the old ownedPartitions (or do other handling), and the rebalance
> can still complete successfully and correctly. On the other hand, if we did
> the generation check on JoinGroupRequest, and return `ILLEGAL_GENERATION`
> back to consumer, the consumer needs to clear its generation info and
> rejoin the group to continue the rebalance. It needs more request/response
> network and slow down the rebalance.
>
> So I think we should add the `generationId` field into Subscription
> protocol to achieve what we want.
>
> Thank you.
> Luke
>
> On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com> wrote:
>
>> Hi David,
>> Thanks for your feedback.
>>
>> I've updated the KIP for your comments (1)(2).
>> For (3), it's a good point! Yes, we didn't deserialize the subscription
>> metadata on broker side, and it's not necessary to add overhead on broker
>> side. And, yes, I think we can fix the original issue by adding a
>> "generation" field into `JoinGroupRequest` instead, and also add a field
>> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That way, the
>> broker can identify the old member from `JoinGroupRequest`. And the
>> assignor can also get the "generation" info via the `Subscription` instance.
>>
>> I'll update the KIP to add "generation" field into `JoinGroupRequest` and
>> `JoinGroupResponse`, if there is no other options.
>>
>> Thank you.
>> Luke
>>
>>
>> On Tue, Nov 16, 2021 at 12:31 AM David Jacot <dj...@confluent.io.invalid>
>> wrote:
>>
>>> Hi Luke,
>>>
>>> Thanks for the KIP. Overall, I think that the motivation makes sense. I
>>> have a couple of comments/questions:
>>>
>>> 1. In the Public Interfaces section, it would be great if you could put
>>> the
>>> end state not the current one.
>>>
>>> 2. Do we need to update the Subscription class to expose the
>>> generation? If so, it would be great to mention it in the Public
>>> Interfaces section as well.
>>>
>>> 3. You mention that the broker will set the generation if the
>>> subscription
>>> contains a sentinel value (-1). As of today, the broker does not parse
>>> the subscription so I am not sure how/why we would do this. I suppose
>>> that we could add a `generation` field to the JoinGroupRequest instead
>>> to do the fencing that you describe while handling the sentinel in the
>>> assignor directly. If we would add the `generation` to the request
>>> itself,
>>> would we need the `generation` in the subscription protocol as well?
>>>
>>> Best,
>>> David
>>>
>>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I'd like to start the discussion for KIP-792: Add "generation" field
>>> into
>>> > consumer protocol.
>>> >
>>> > The goal of this KIP is to allow assignor/consumer coordinator/group
>>> > coordinator to have a way to identify the out-of-date
>>> members/assignments.
>>> >
>>> > Detailed description can be found here:
>>> >
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>>> >
>>> > Any feedback is welcome.
>>> >
>>> > Thank you.
>>> > Luke
>>>
>>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hello David,

For (3):



* I suppose that we could add a `generation` field to the JoinGroupRequest
instead to do the fencing that you describe while handling the sentinel in
the assignor directly. If we would add the `generation` to the request
itself, would we need the `generation` in the subscription protocol as
well?*

On second thought, I think this is not better than adding `generation`
field in the subscription protocol, because I think we don't have to do any
generation validation on joinGroup request. The purpose of
`joinGroupRequest` is to accept any members to join this group, even if the
member is new or ever joined or what. As long as we have the generationId
in the subscription metadata, the consumer lead can leverage the info to
ignore the old ownedPartitions (or do other handling), and the rebalance
can still complete successfully and correctly. On the other hand, if we did
the generation check on JoinGroupRequest, and return `ILLEGAL_GENERATION`
back to consumer, the consumer needs to clear its generation info and
rejoin the group to continue the rebalance. It needs more request/response
network and slow down the rebalance.

So I think we should add the `generationId` field into Subscription
protocol to achieve what we want.

Thank you.
Luke

On Thu, Nov 18, 2021 at 8:51 PM Luke Chen <sh...@gmail.com> wrote:

> Hi David,
> Thanks for your feedback.
>
> I've updated the KIP for your comments (1)(2).
> For (3), it's a good point! Yes, we didn't deserialize the subscription
> metadata on broker side, and it's not necessary to add overhead on broker
> side. And, yes, I think we can fix the original issue by adding a
> "generation" field into `JoinGroupRequest` instead, and also add a field
> into `JoinGroupResponse` in `JoinGroupResponseMember` field. That way, the
> broker can identify the old member from `JoinGroupRequest`. And the
> assignor can also get the "generation" info via the `Subscription` instance.
>
> I'll update the KIP to add "generation" field into `JoinGroupRequest` and
> `JoinGroupResponse`, if there is no other options.
>
> Thank you.
> Luke
>
>
> On Tue, Nov 16, 2021 at 12:31 AM David Jacot <dj...@confluent.io.invalid>
> wrote:
>
>> Hi Luke,
>>
>> Thanks for the KIP. Overall, I think that the motivation makes sense. I
>> have a couple of comments/questions:
>>
>> 1. In the Public Interfaces section, it would be great if you could put
>> the
>> end state not the current one.
>>
>> 2. Do we need to update the Subscription class to expose the
>> generation? If so, it would be great to mention it in the Public
>> Interfaces section as well.
>>
>> 3. You mention that the broker will set the generation if the subscription
>> contains a sentinel value (-1). As of today, the broker does not parse
>> the subscription so I am not sure how/why we would do this. I suppose
>> that we could add a `generation` field to the JoinGroupRequest instead
>> to do the fencing that you describe while handling the sentinel in the
>> assignor directly. If we would add the `generation` to the request itself,
>> would we need the `generation` in the subscription protocol as well?
>>
>> Best,
>> David
>>
>> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > I'd like to start the discussion for KIP-792: Add "generation" field
>> into
>> > consumer protocol.
>> >
>> > The goal of this KIP is to allow assignor/consumer coordinator/group
>> > coordinator to have a way to identify the out-of-date
>> members/assignments.
>> >
>> > Detailed description can be found here:
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >
>> > Any feedback is welcome.
>> >
>> > Thank you.
>> > Luke
>>
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by Luke Chen <sh...@gmail.com>.
Hi David,
Thanks for your feedback.

I've updated the KIP for your comments (1)(2).
For (3), it's a good point! Yes, we didn't deserialize the subscription
metadata on broker side, and it's not necessary to add overhead on broker
side. And, yes, I think we can fix the original issue by adding a
"generation" field into `JoinGroupRequest` instead, and also add a field
into `JoinGroupResponse` in `JoinGroupResponseMember` field. That way, the
broker can identify the old member from `JoinGroupRequest`. And the
assignor can also get the "generation" info via the `Subscription` instance.

I'll update the KIP to add "generation" field into `JoinGroupRequest` and
`JoinGroupResponse`, if there is no other options.

Thank you.
Luke


On Tue, Nov 16, 2021 at 12:31 AM David Jacot <dj...@confluent.io.invalid>
wrote:

> Hi Luke,
>
> Thanks for the KIP. Overall, I think that the motivation makes sense. I
> have a couple of comments/questions:
>
> 1. In the Public Interfaces section, it would be great if you could put the
> end state not the current one.
>
> 2. Do we need to update the Subscription class to expose the
> generation? If so, it would be great to mention it in the Public
> Interfaces section as well.
>
> 3. You mention that the broker will set the generation if the subscription
> contains a sentinel value (-1). As of today, the broker does not parse
> the subscription so I am not sure how/why we would do this. I suppose
> that we could add a `generation` field to the JoinGroupRequest instead
> to do the fencing that you describe while handling the sentinel in the
> assignor directly. If we would add the `generation` to the request itself,
> would we need the `generation` in the subscription protocol as well?
>
> Best,
> David
>
> On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
> >
> > Hi all,
> >
> > I'd like to start the discussion for KIP-792: Add "generation" field into
> > consumer protocol.
> >
> > The goal of this KIP is to allow assignor/consumer coordinator/group
> > coordinator to have a way to identify the out-of-date
> members/assignments.
> >
> > Detailed description can be found here:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >
> > Any feedback is welcome.
> >
> > Thank you.
> > Luke
>

Re: [DISCUSS] KIP-792: Add "generation" field into consumer protocol

Posted by David Jacot <dj...@confluent.io.INVALID>.
Hi Luke,

Thanks for the KIP. Overall, I think that the motivation makes sense. I
have a couple of comments/questions:

1. In the Public Interfaces section, it would be great if you could put the
end state not the current one.

2. Do we need to update the Subscription class to expose the
generation? If so, it would be great to mention it in the Public
Interfaces section as well.

3. You mention that the broker will set the generation if the subscription
contains a sentinel value (-1). As of today, the broker does not parse
the subscription so I am not sure how/why we would do this. I suppose
that we could add a `generation` field to the JoinGroupRequest instead
to do the fencing that you describe while handling the sentinel in the
assignor directly. If we would add the `generation` to the request itself,
would we need the `generation` in the subscription protocol as well?

Best,
David

On Fri, Nov 12, 2021 at 3:31 AM Luke Chen <sh...@gmail.com> wrote:
>
> Hi all,
>
> I'd like to start the discussion for KIP-792: Add "generation" field into
> consumer protocol.
>
> The goal of this KIP is to allow assignor/consumer coordinator/group
> coordinator to have a way to identify the out-of-date members/assignments.
>
> Detailed description can be found here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>
> Any feedback is welcome.
>
> Thank you.
> Luke