Posted to dev@kafka.apache.org by Boyang Chen <bc...@outlook.com> on 2019/04/26 06:13:32 UTC

[DISCUSS] KIP-462 : Use local thread id for KStreams

Hey friends,

I would like to start a discussion on a very small KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-462%3A+Use+local+thread+id+for+KStreams

It tries to avoid sharing the thread-id increment between multiple stream instances configured in one JVM. This is an important fix for static membership<https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances> to be effective for KStreams in edge cases such as `group.instance.id` changing across restarts due to thread-id interleaving.
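
As a rough illustration of the interleaving, the sketch below models the two numbering schemes in plain Java. It is a simplified model under stated assumptions, not the actual Streams code: the classes `SharedCounterClient` and `LocalCounterClient` are hypothetical, and the `<client.id>-StreamThread-<n>` name format is assumed for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model only; these classes are hypothetical, not Kafka Streams code.
class SharedCounterClient {
    // JVM-wide counter: every instance draws thread ids from the same sequence.
    static final AtomicInteger NEXT_ID = new AtomicInteger(1);
    final String clientId;
    final List<String> threadNames = new ArrayList<>();

    SharedCounterClient(String clientId) { this.clientId = clientId; }

    void addThread() {
        threadNames.add(clientId + "-StreamThread-" + NEXT_ID.getAndIncrement());
    }
}

class LocalCounterClient {
    // Per-instance counter: numbering depends only on this instance.
    final AtomicInteger nextId = new AtomicInteger(1);
    final String clientId;
    final List<String> threadNames = new ArrayList<>();

    LocalCounterClient(String clientId) { this.clientId = clientId; }

    void addThread() {
        threadNames.add(clientId + "-StreamThread-" + nextId.getAndIncrement());
    }
}

class InterleavingSketch {
    public static void main(String[] args) {
        SharedCounterClient a = new SharedCounterClient("app-a");
        SharedCounterClient b = new SharedCounterClient("app-b");
        // Thread creation interleaves: a, b, a, b. A different order on the
        // next restart would hand different ids to the same instance.
        a.addThread(); b.addThread(); a.addThread(); b.addThread();
        System.out.println(a.threadNames + " " + b.threadNames);

        LocalCounterClient c = new LocalCounterClient("app-a");
        LocalCounterClient d = new LocalCounterClient("app-b");
        c.addThread(); d.addThread(); c.addThread(); d.addThread();
        System.out.println(c.threadNames + " " + d.threadNames);
    }
}
```

In a fresh JVM the shared counter gives app-a the ids 1 and 3 and app-b the ids 2 and 4, while the local counters give both instances 1 and 2 regardless of creation order. Any per-thread identity derived from the thread number would presumably inherit the same instability under the shared scheme.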

I will open the vote thread in the meantime, since this is a very small fix. Feel free to continue the discussion on this thread. Thank you!

Boyang

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Thanks Bill!

________________________________
From: Bill Bejeck <bb...@gmail.com>
Sent: Tuesday, April 30, 2019 6:44:17 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Thanks for the KIP, Boyang. I have no additional comments beyond the ones
already presented.

+1(binding)

-Bill

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Bill Bejeck <bb...@gmail.com>.
Thanks for the KIP, Boyang. I have no additional comments beyond the ones
already presented.

+1(binding)

-Bill

On Tue, Apr 30, 2019 at 4:35 PM Boyang Chen <bc...@outlook.com> wrote:

> Thank you Guozhang!
>
> ________________________________
> From: Guozhang Wang <gu...@apache.org>
> Sent: Wednesday, May 1, 2019 3:54 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams
>
> +1 (binding)
>
> Guozhang

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Thank you Guozhang!

________________________________
From: Guozhang Wang <gu...@apache.org>
Sent: Wednesday, May 1, 2019 3:54 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

+1 (binding)

Guozhang

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Guozhang Wang <gu...@apache.org>.
+1 (binding)

Guozhang

On 2019/04/26 07:42:12, "Matthias J. Sax" <ma...@confluent.io> wrote: 
> Thanks for the KIP!
> 
> I agree that the change makes sense, and not only for the static group
> membership case.
> 
> For example, if a user calls `close()` on a `KafkaStreams` client and creates a
> new one (for example to recover failed threads) while the JVM is still
> running, it is more intuitive that the thread names are numbered from 1 to
> X again, and not from X+1 to 2*X on restart.
> 
> Also, the original idea of making thread names unique across
> applications is itself non-intuitive. It might make sense if there are
> two instances of the same application within one JVM -- however, this
> seems to be a rather rare case. Also, the only pattern for this use case
> seems to be dynamic scaling, and I believe we should actually void this
> pattern by adding `stopThread()` and `addThread()` methods to
> `KafkaStreams` directly.
> 
> 
> -Matthias

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Thank you Sophie! Added the case Matthias described in the Compatibility section.

________________________________
From: Sophie Blee-Goldman <so...@confluent.io>
Sent: Wednesday, May 1, 2019 1:30 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Hey Boyang,

I think this sounds great but one thing you might want to update is the
"Compatibility, Deprecation, and Migration Plan" -- I agree having two
instances in the same JVM is probably a rare occurrence but the (presumably
less rare) situation Matthias described would also be affected in case of
exposed thread ids. Just a small note

Sophie

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Sophie Blee-Goldman <so...@confluent.io>.
Hey Boyang,

I think this sounds great but one thing you might want to update is the
"Compatibility, Deprecation, and Migration Plan" -- I agree having two
instances in the same JVM is probably a rare occurrence but the (presumably
less rare) situation Matthias described would also be affected in case of
exposed thread ids. Just a small note

Sophie

On Tue, Apr 30, 2019 at 8:25 AM Boyang Chen <bc...@outlook.com> wrote:

> Hey Bruno,
>
> "throttling purpose" means that we use `client.id` to track the request
> quota and do the throttling based on that. So in that context, it is
> expected to use the same `client.id` for a certain set of consumers.
>
> Also thank you Guozhang for the comment! Merged with the KIP.
>
> Boyang

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Hey Bruno,

"throttling purpose" means that we use `client.id` to track the request quota and do the throttling based on that. So in that context, it is expected to use the same `client.id` for a certain set of consumers.

Also thank you Guozhang for the comment! Merged with the KIP.

Boyang
________________________________
From: Bruno Cadonna <br...@confluent.io>
Sent: Tuesday, April 30, 2019 4:15 PM
To: dev@kafka.apache.org
Subject: Re: Fw: [DISCUSS] KIP-462 : Use local thread id for KStreams

Hi Guozhang,

What exactly do you mean by "throttling purposes"?

@Boyang: Thank you for the KIP!

Best,
Bruno

Re: Fw: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Bruno Cadonna <br...@confluent.io>.
Hi Guozhang,

What exactly do you mean by "throttling purposes"?

@Boyang: Thank you for the KIP!

Best,
Bruno

On Tue, Apr 30, 2019 at 1:15 AM Guozhang Wang <wa...@gmail.com> wrote:

> Hi Boyang,
>
> Thanks for the KIP. I think it makes sense.
>
> Just following up on the documentation part: since we are effectively
> removing this guard against identical client.ids across instances --- and btw,
> semantically we would not forbid users from setting the same client.id anyway,
> for throttling purposes for example --- it's worth augmenting the client.id
> config description by stating how users should expect client.id to be
> propagated to the internal embedded clients, and therefore what the expected
> outcome is if they choose to set the same client.id for different Streams
> clients.
>
>
> Otherwise, I've no further comments.
>
> Guozhang

Re: Fw: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Boyang,

Thanks for the KIP. I think it makes sense.

Just following up on the documentation part: since we are effectively
removing this guard against identical client.ids across instances --- and btw,
semantically we would not forbid users from setting the same client.id anyway,
for throttling purposes for example --- it's worth augmenting the client.id
config description by stating how users should expect client.id to be
propagated to the internal embedded clients, and therefore what the expected
outcome is if they choose to set the same client.id for different Streams
clients.
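
To picture what such a description might cover, here is a minimal sketch in plain Java. It only manipulates strings and does not call any Kafka API. The method `embeddedClientIds` is hypothetical, and the suffixes it uses are an assumed naming convention that should be checked against the Streams documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: the suffixes below are an assumed naming convention,
// not something stated in this thread.
class ClientIdPropagationSketch {
    // Derives the ids of the clients embedded in one stream thread.
    static Map<String, String> embeddedClientIds(String clientId, int threadIdx) {
        String thread = clientId + "-StreamThread-" + threadIdx;
        Map<String, String> ids = new LinkedHashMap<>();
        ids.put("consumer", thread + "-consumer");
        ids.put("restore consumer", thread + "-restore-consumer");
        ids.put("producer", thread + "-producer");
        return ids;
    }

    public static void main(String[] args) {
        // Two Streams clients configured with the same client.id, each
        // numbering its threads locally starting at 1.
        Map<String, String> first = embeddedClientIds("shared-id", 1);
        Map<String, String> second = embeddedClientIds("shared-id", 1);
        // Both end up with identical embedded client ids.
        System.out.println(first.equals(second));
        System.out.println(first);
    }
}
```

Under these assumptions, two Streams clients sharing one client.id become indistinguishable to anything keyed on the embedded client ids, which is the outcome the config description would need to spell out.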


Otherwise, I've no further comments.

Guozhang

On Mon, Apr 29, 2019 at 3:42 PM Boyang Chen <bc...@outlook.com> wrote:

> FYI
>
>
> ________________________________________
> From: Boyang Chen <bc...@outlook.com>
> Sent: Tuesday, April 30, 2019 4:32 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams
>
> Could we get more discussions on this thread?
>
> Boyang

-- 
-- Guozhang

Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Could we get more discussion on this thread?

Boyang

________________________________
From: Boyang Chen <bc...@outlook.com>
Sent: Friday, April 26, 2019 10:51 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Thanks for the explanation Matthias! Will enhance the KIP motivation with your example.


________________________________
From: Matthias J. Sax <ma...@confluent.io>
Sent: Friday, April 26, 2019 3:42 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Thanks for the KIP!

I agree that the change makes sense, and not only for the static group
membership case.

For example, if a user calls `close()` on a `KafkaStreams` client and creates a
new one (for example to recover failed threads) while the JVM is still
running, it is more intuitive that the thread names are numbered from 1 to
X again, and not from X+1 to 2*X on restart.

Also, the original idea about making thread names unique across
application is non-intuitive itself. It might make sense if there are
two instances of the same application within one JVM -- however, this
seems to be a rather rare case. Also, the only pattern for this use case
seems to by dynamic scaling, and I believe we should actually void this
pattern by adding a `stopThread()` and `addThread()` method to
`KafkaStreams` directly.


-Matthias


On 4/25/19 11:13 PM, Boyang Chen wrote:
> Hey friends,
>
> I would like to start discussion for a very small KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-462%3A+Use+local+thread+id+for+KStreams
>
> it is trying to avoid sharing thread-id increment between multiple stream instances configured in one JVM. This is an important fix for static membership<https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances> to be effective for KStreams in edge case like changing `group.instance.id` throughout restarts due to thread-id interleaving.
>
> I will open the vote thread in the main while, since this is a very small fix. Feel free to continue the discussion on this thread, thank you!
>
> Boyang
>


Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by Boyang Chen <bc...@outlook.com>.
Thanks for the explanation, Matthias! I will enhance the KIP motivation with your example.


________________________________
From: Matthias J. Sax <ma...@confluent.io>
Sent: Friday, April 26, 2019 3:42 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Thanks for the KIP!

I agree that the change makes sense, and not only for the static group
membership case.

For example, if a user calls `close()` on a `KafkaStreams` client and creates
a new one (for example, to recover failed threads) while the JVM is still
running, it is more intuitive that the thread names are numbered from 1 to
X again, and not from X+1 to 2*X on restart.

Also, the original idea of making thread names unique across
applications is non-intuitive in itself. It might make sense if there are
two instances of the same application within one JVM -- however, this
seems to be a rather rare case. Also, the only pattern for this use case
seems to be dynamic scaling, and I believe we should actually make this
pattern unnecessary by adding `stopThread()` and `addThread()` methods to
`KafkaStreams` directly.


-Matthias


On 4/25/19 11:13 PM, Boyang Chen wrote:
> Hey friends,
>
> I would like to start discussion for a very small KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-462%3A+Use+local+thread+id+for+KStreams
>
> It tries to avoid sharing the thread-id increment between multiple stream instances configured in one JVM. This is an important fix for static membership<https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances> to be effective for KStreams in edge cases like `group.instance.id` changing across restarts due to thread-id interleaving.
>
> I will open the vote thread in the meantime, since this is a very small fix. Feel free to continue the discussion on this thread, thank you!
>
> Boyang
>


Re: [DISCUSS] KIP-462 : Use local thread id for KStreams

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Thanks for the KIP!

I agree that the change makes sense, and not only for the static group
membership case.

For example, if a user calls `close()` on a `KafkaStreams` client and creates
a new one (for example, to recover failed threads) while the JVM is still
running, it is more intuitive that the thread names are numbered from 1 to
X again, and not from X+1 to 2*X on restart.
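
To make the numbering difference concrete, here is a minimal sketch. It is
not the actual Kafka Streams code: `SharedCounterClient` and
`LocalCounterClient` are hypothetical stand-ins, and the
`<clientId>-StreamThread-<n>` name pattern is only used for illustration.
The first class draws thread ids from a JVM-wide static counter, the second
from a counter owned by each client instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadNamingSketch {

    // Hypothetical client whose thread ids come from one counter shared by
    // every instance in the JVM (the behavior the KIP wants to move away from).
    static final class SharedCounterClient {
        private static final AtomicInteger THREAD_ID = new AtomicInteger(1);
        final List<String> threadNames = new ArrayList<>();

        SharedCounterClient(String clientId, int numThreads) {
            for (int i = 0; i < numThreads; i++) {
                threadNames.add(clientId + "-StreamThread-" + THREAD_ID.getAndIncrement());
            }
        }
    }

    // Hypothetical client whose thread ids come from a counter local to the
    // instance, so every new client starts numbering at 1 again.
    static final class LocalCounterClient {
        private final AtomicInteger threadId = new AtomicInteger(1);
        final List<String> threadNames = new ArrayList<>();

        LocalCounterClient(String clientId, int numThreads) {
            for (int i = 0; i < numThreads; i++) {
                threadNames.add(clientId + "-StreamThread-" + threadId.getAndIncrement());
            }
        }
    }

    public static void main(String[] args) {
        // Simulate "close the client and create a new one" in the same JVM.
        System.out.println("shared, first client:  " + new SharedCounterClient("app", 2).threadNames);
        System.out.println("shared, second client: " + new SharedCounterClient("app", 2).threadNames);
        System.out.println("local, first client:   " + new LocalCounterClient("app", 2).threadNames);
        System.out.println("local, second client:  " + new LocalCounterClient("app", 2).threadNames);
    }
}
```

With the shared counter, the second client's names continue where the first
one stopped; with the local counter, both clients get the same names, which
is what a per-thread `group.instance.id` needs in order to stay stable.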

Also, the original idea of making thread names unique across
applications is non-intuitive in itself. It might make sense if there are
two instances of the same application within one JVM -- however, this
seems to be a rather rare case. Also, the only pattern for this use case
seems to be dynamic scaling, and I believe we should actually make this
pattern unnecessary by adding `stopThread()` and `addThread()` methods to
`KafkaStreams` directly.


-Matthias


On 4/25/19 11:13 PM, Boyang Chen wrote:
> Hey friends,
> 
> I would like to start discussion for a very small KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-462%3A+Use+local+thread+id+for+KStreams
> 
> It tries to avoid sharing the thread-id increment between multiple stream instances configured in one JVM. This is an important fix for static membership<https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances> to be effective for KStreams in edge cases like `group.instance.id` changing across restarts due to thread-id interleaving.
>
> I will open the vote thread in the meantime, since this is a very small fix. Feel free to continue the discussion on this thread, thank you!
> 
> Boyang
>