You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Leah Thomas <lt...@confluent.io> on 2021/01/13 15:46:52 UTC

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Hey all,

Bringing this back up for discussion.

It seems like the next steps are to:
1. rename the config "window.size.ms"
2. ensure that users set window size EITHER through the config OR through
the constructor. On this note, it may make sense to remove the default for
the `window.size.ms` config, so that there won't be a fall back if the
window size isn't set in either spot. WDYT? This could also address the
issue of multiple window sizes within a streams app.

I see what Sophie is saying about the `default.windowed.key.serde.inner`
config, but I do think deprecating and moving those configs would require a
larger discussion. I'm open to looping them into this KIP if we feel like
it's vital (or incredibly convenient with low cost to users), but my
initial reaction is to leave that out for now and work within the current
set-up for window size.

Thanks for all the comments so far,
Leah

On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <so...@confluent.io>
wrote:

> There are two cases where you need to specify the window size -- directly
> using a
> Consumer (eg the console consumer) or reading as an input topic within
> Streams.
> We need a config for the first case, since you can't pass a Deserializer
> object to the
> console consumer. In the Streams case, the reverse is true, and you have to
> pass in
> an actual Serde object.
>
> Imo we should keep these two cases separate and not use the config for the
> Streams
> case at all. But that's hard to enforce (we'd have to strip the config out
> of the user's
> StreamsConfig if they tried to use it within Streams, for example) and it
> also puts us
> in an awkward position due to the  default.windowed.inner.serde.class
> configs. If
> they can specify the inner serde class through their Streams app config,
> they
> should be able to specify the window size through config as well. Otherwise
> we
> either force a mix-and-match as Matthias described, or you just always have
> to
> specify both the inner class and the window size in the constructor, at
> which
> point, why even have the default.windowed.inner.serde.class config at all?
>
> ...
> that's not a rhetorical question, actually. Personally I do think we should
> deprecate the default.windowed.serde.inner.class and replace it with
> separate
> windowed.serializer.inner.class/windowed.deserializer.inner.class configs.
> This
> way it's immediately obvious that the configs are only for the
> Consumer/Producer,
> and you should construct your own TimeWindowedSerde with all the necessary
> parameters for use in your Streams app.
>
> That might be too radical, and maybe the problem isn't worth the burden of
> forcing users to change their code and replace the config with actual Serde
> objects. But it should be an easy change to make, and if it isn't, that's
> probably a good sign that you're using the serde incorrectly somewhere.
>
> If we don't deprecate the default.windowed.serde.inner.class, then it's
> less clear
> to me how to proceed. The only really consistent thing to do seems to be to
> name and position the new window size config as a default config and allow
> it to be used similar to the default inner class configs. Which, as
> established
> throughout this discussion, seems very very wrong
>
> So yes, I think we should just stick with the original name window.size.ms
> .
> Or
> better yet, call it windowed.deserializer.window.size.ms and name the
> default.windowed.serde.inner.class replacements
> windowed.deserializer.inner.class and windowed.serializer.inner.class
>
> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> wrote:
>
> > From my understanding, the KIP aims for the case when a user does not
> > control the code, eg, when using the command line consumer (or similar
> > tools).
> >
> > If the user writes code, we should always encourage her to instantiate
> > the deserializer explicitly and not relying on reflection+config.
> >
> > I also don't think that the `default` prefix does make sense, as it
> > indicates that there might be a non-default. However, IMHO, we should
> > not allow "overwrite semantics" but rather throw an exception if the
> > config is set and a window size is provided via the constructor. We
> > should not allow to mix-and-match both and should stick to a strict
> > either-or pattern.
> >
> >
> > -Matthias
> >
> > On 9/8/20 11:52 AM, Guozhang Wang wrote:
> > > Hi Sophie,
> > >
> > > Seems I do have some mis-understanding of the KIP's motivation here :)
> > Just
> > > for clarification my reasoning is that:
> > >
> > > 1) today Streams itself never uses a windowed deserializer itself since
> > its
> > > built-in operators only need the serializer and users do not need to
> > > override it, plus standby / restore active tasks would just copy the
> > bytes
> > > directly. So this KIP's motivation is not for Stream's own code
> anyways.
> > >
> > > 2) It is only when user specified serde is missing the window size,
> which
> > > is either when a) one is trying to read a source topic as windowed
> > records
> > > in Streams, this is a big blocker for KIP-300, and when b) one is
> trying
> > to
> > > read a topic as windowed records in Consumer, we would have issues if
> > users
> > > fail to use the appropriate serde constructs.
> > >
> > > I thought the main motivation of this KIP is for 2.a), in which we
> would
> > > just encourage the users to use the right constructor with the window
> > size
> > > by deprecating the other constructs. But I'm not sure how this would
> help
> > > with 2.b) since the proposal is on adding to StreamsConfig. If it is
> the
> > > case, then I agree that probably we can just not add an extra config
> but
> > > just deprecating the constructs.
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
> sophie@confluent.io
> > >
> > > wrote:
> > >
> > >> Hey Guozhang & Leah,
> > >>
> > >> I want to push back a bit on the assumption that we would fall back on
> > this
> > >> config
> > >> in the case of an unspecified window size in a Streams serde. I don't
> > think
> > >> it should
> > >> be a default at all, either in name or in effect. To borrow the
> > rhetorical
> > >> question that
> > >> John raised earlier: what is the default window size of an
> application?
> > >>
> > >> Personally, I agree that that doesn't make much sense. Sure, if you
> only
> > >> have a single
> > >> windowed operation in your app then you could just specify the window
> > size
> > >> by config,
> > >> but how is that any more ergonomic than specifying the window size in
> > the
> > >> Serde's
> > >> constructor? If anything, it seems worse to put physical and mental
> > >> distance between
> > >> the specification and the actual usage of such parameters. What if you
> > add
> > >> another
> > >> windowed operation later, with a different size, and forget to specify
> > the
> > >> new size in
> > >> the new Serde? Or what if you never specify a default window size
> > config at
> > >> all and
> > >> accidentally end up using the default config value of Long.MAX_VALUE?
> > >> Avoiding this
> > >> possibility was/is one of the main motivations of this KIP, and the
> > whole
> > >> point of
> > >> deprecating the TimeWindowedSerde(innerClass) constructor.
> > >>
> > >> I actually would have advocated to remove this config entirely, but as
> > John
> > >> pointed
> > >> out, we still need it to configure things like the console consumer
> > >>
> > >> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lt...@confluent.io>
> > wrote:
> > >>
> > >>> Hi Guozhang,
> > >>>
> > >>> Yes, the config would read them as a single window size. I think this
> > >>> relates to John's comments about having variably sized windows, which
> > >> this
> > >>> config doesn't handle. I like the name change and updated the wiki to
> > >>> reflect that, and to clarify that the default value will still be
> > >>> Long.MAX_VALUE.
> > >>>
> > >>> Thanks for your feedback!
> > >>> Leah
> > >>>
> > >>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <wa...@gmail.com>
> > >> wrote:
> > >>>
> > >>>> Hello Leah,
> > >>>>
> > >>>> Thanks for initiating this. I just have one minor clarification
> > >> question
> > >>>> here: the config "window.size.ms" seems to be used as the default
> > >> window
> > >>>> size when reading from a topic that represents windowed records
> right?
> > >>> I.e.
> > >>>> if there are multiple topics that represent windowed records but
> their
> > >>>> window sizes are different, with this config we can only read them
> > >> with a
> > >>>> single window size? If yes, could we rename the config as "
> > >>>> default.window.size.ms" and make that clear in the description as
> > >> well?
> > >>>> Also we'd better also include its default value which I think would
> > >> still
> > >>>> be MAX_VALUE for compatibility.
> > >>>>
> > >>>>
> > >>>> Guozhang
> > >>>>
> > >>>>
> > >>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <lt...@confluent.io>
> > >> wrote:
> > >>>>
> > >>>>> Hey all,
> > >>>>>
> > >>>>> We should be good to wrap up voting now that the discussion has
> been
> > >>>>> resolved.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Leah
> > >>>>>
> > >>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <mj...@apache.org>
> > >>> wrote:
> > >>>>>
> > >>>>>> +1 (binding)
> > >>>>>>
> > >>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I've just sent a new message to the DISCUSS thread. We
> > >>>>>>> forgot to include the Scala API in the proposal.
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> -John
> > >>>>>>>
> > >>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> > >>>>>>> wrote:
> > >>>>>>>> Thanks for the KIP! +1 (non-binding)
> > >>>>>>>>
> > >>>>>>>> Sophie
> > >>>>>>>>
> > >>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> > >> vvcephei@apache.org
> > >>>>
> > >>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks Leah,
> > >>>>>>>>> I’m +1 (binding)
> > >>>>>>>>>
> > >>>>>>>>> -John
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> > >>>>>>>>>> Hi everyone,
> > >>>>>>>>>>
> > >>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
> > >>>>>>>>>> TimeWindowedDeserializer
> > >>>>>>>>>> and TimeWindowedSerde to handle window size.
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Leah
> > >>>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> -- Guozhang
> > >>>>
> > >>>
> > >>
> > >
> > >
> >
> >
>

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Posted by Leah Thomas <lt...@confluent.io>.
Thanks everyone, with that I'll go ahead and close this KIP with 4 binding
votes (Sophie, Guozhang, John, Matthias). Thanks for the lively discussion!

Leah

On Wed, Jan 20, 2021 at 6:00 PM Matthias J. Sax <mj...@apache.org> wrote:

> Thanks for reviving this KIP, Leah.
>
> I agree that we should not extend the scope of this KIP to potentially
> deprecate/rename the `default.windowed.[key|value].serde.inner` configs.
>
> @Sophie: if you feel strong about it, let's do a separate KIP.
>
>
> +1 (binding)
>
>
> -Matthias
>
> On 1/19/21 3:00 PM, John Roesler wrote:
> > Hi all,
> >
> > I've just caught up on the thread, and FWIW, I'm still +1.
> >
> > Thanks,
> > -John
> >
> > On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote:
> >> Read the above updates and the KIP's scope. Makes sense to me. +1 still
> >> counts :)
> >>
> >> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <
> sophie@confluent.io>
> >> wrote:
> >>
> >>> That sounds good to me. Thanks for reviving this
> >>>
> >>> Sophie
> >>>
> >>> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <lt...@confluent.io>
> wrote:
> >>>
> >>>> Hey all,
> >>>>
> >>>> Bringing this back up for discussion.
> >>>>
> >>>> It seems like the next steps are to:
> >>>> 1. rename the config "window.size.ms"
> >>>> 2. ensure that users set window size EITHER through the config OR
> through
> >>>> the constructor. On this note, it may make sense to remove the default
> >>> for
> >>>> the `window.size.ms` config, so that there won't be a fall back if
> the
> >>>> window size isn't set in either spot. WDYT? This could also address
> the
> >>>> issue of multiple window sizes within a streams app.
> >>>>
> >>>> I see what Sophie is saying about the
> `default.windowed.key.serde.inner`
> >>>> config, but I do think deprecating and moving those configs would
> >>> require a
> >>>> larger discussion. I'm open to looping them into this KIP if we feel
> like
> >>>> it's vital (or incredibly convenient with low cost to users), but my
> >>>> initial reaction is to leave that out for now and work within the
> current
> >>>> set-up for window size.
> >>>>
> >>>> Thanks for all the comments so far,
> >>>> Leah
> >>>>
> >>>> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <
> >>> sophie@confluent.io>
> >>>> wrote:
> >>>>
> >>>>> There are two cases where you need to specify the window size --
> >>> directly
> >>>>> using a
> >>>>> Consumer (eg the console consumer) or reading as an input topic
> within
> >>>>> Streams.
> >>>>> We need a config for the first case, since you can't pass a
> >>> Deserializer
> >>>>> object to the
> >>>>> console consumer. In the Streams case, the reverse is true, and you
> >>> have
> >>>> to
> >>>>> pass in
> >>>>> an actual Serde object.
> >>>>>
> >>>>> Imo we should keep these two cases separate and not use the config
> for
> >>>> the
> >>>>> Streams
> >>>>> case at all. But that's hard to enforce (we'd have to strip the
> config
> >>>> out
> >>>>> of the user's
> >>>>> StreamsConfig if they tried to use it within Streams, for example)
> and
> >>> it
> >>>>> also puts us
> >>>>> in an awkward position due to the  default.windowed.inner.serde.class
> >>>>> configs. If
> >>>>> they can specify the inner serde class through their Streams app
> >>> config,
> >>>>> they
> >>>>> should be able to specify the window size through config as well.
> >>>> Otherwise
> >>>>> we
> >>>>> either force a mix-and-match as Matthias described, or you just
> always
> >>>> have
> >>>>> to
> >>>>> specify both the inner class and the window size in the constructor,
> at
> >>>>> which
> >>>>> point, why even have the default.windowed.inner.serde.class config at
> >>>> all?
> >>>>>
> >>>>> ...
> >>>>> that's not a rhetorical question, actually. Personally I do think we
> >>>> should
> >>>>> deprecate the default.windowed.serde.inner.class and replace it with
> >>>>> separate
> >>>>> windowed.serializer.inner.class/windowed.deserializer.inner.class
> >>>> configs.
> >>>>> This
> >>>>> way it's immediately obvious that the configs are only for the
> >>>>> Consumer/Producer,
> >>>>> and you should construct your own TimeWindowedSerde with all the
> >>>> necessary
> >>>>> parameters for use in your Streams app.
> >>>>>
> >>>>> That might be too radical, and maybe the problem isn't worth the
> burden
> >>>> of
> >>>>> forcing users to change their code and replace the config with actual
> >>>> Serde
> >>>>> objects. But it should be an easy change to make, and if it isn't,
> >>> that's
> >>>>> probably a good sign that you're using the serde incorrectly
> somewhere.
> >>>>>
> >>>>> If we don't deprecate the default.windowed.serde.inner.class, then
> it's
> >>>>> less clear
> >>>>> to me how to proceed. The only really consistent thing to do seems to
> >>> be
> >>>> to
> >>>>> name and position the new window size config as a default config and
> >>>> allow
> >>>>> it to be used similar to the default inner class configs. Which, as
> >>>>> established
> >>>>> throughout this discussion, seems very very wrong
> >>>>>
> >>>>> So yes, I think we should just stick with the original name
> >>>> window.size.ms
> >>>>> .
> >>>>> Or
> >>>>> better yet, call it windowed.deserializer.window.size.ms and name
> the
> >>>>> default.windowed.serde.inner.class replacements
> >>>>> windowed.deserializer.inner.class and windowed.serializer.inner.class
> >>>>>
> >>>>> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org>
> >>> wrote:
> >>>>>
> >>>>>> From my understanding, the KIP aims for the case when a user does
> not
> >>>>>> control the code, eg, when using the command line consumer (or
> >>> similar
> >>>>>> tools).
> >>>>>>
> >>>>>> If the user writes code, we should always encourage her to
> >>> instantiate
> >>>>>> the deserializer explicitly and not relying on reflection+config.
> >>>>>>
> >>>>>> I also don't think that the `default` prefix does make sense, as it
> >>>>>> indicates that there might be a non-default. However, IMHO, we
> should
> >>>>>> not allow "overwrite semantics" but rather throw an exception if the
> >>>>>> config is set and a window size is provided via the constructor. We
> >>>>>> should not allow to mix-and-match both and should stick to a strict
> >>>>>> either-or pattern.
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>> On 9/8/20 11:52 AM, Guozhang Wang wrote:
> >>>>>>> Hi Sophie,
> >>>>>>>
> >>>>>>> Seems I do have some mis-understanding of the KIP's motivation here
> >>>> :)
> >>>>>> Just
> >>>>>>> for clarification my reasoning is that:
> >>>>>>>
> >>>>>>> 1) today Streams itself never uses a windowed deserializer itself
> >>>> since
> >>>>>> its
> >>>>>>> built-in operators only need the serializer and users do not need
> >>> to
> >>>>>>> override it, plus standby / restore active tasks would just copy
> >>> the
> >>>>>> bytes
> >>>>>>> directly. So this KIP's motivation is not for Stream's own code
> >>>>> anyways.
> >>>>>>>
> >>>>>>> 2) It is only when user specified serde is missing the window size,
> >>>>> which
> >>>>>>> is either when a) one is trying to read a source topic as windowed
> >>>>>> records
> >>>>>>> in Streams, this is a big blocker for KIP-300, and when b) one is
> >>>>> trying
> >>>>>> to
> >>>>>>> read a topic as windowed records in Consumer, we would have issues
> >>> if
> >>>>>> users
> >>>>>>> fail to use the appropriate serde constructs.
> >>>>>>>
> >>>>>>> I thought the main motivation of this KIP is for 2.a), in which we
> >>>>> would
> >>>>>>> just encourage the users to use the right constructor with the
> >>> window
> >>>>>> size
> >>>>>>> by deprecating the other constructs. But I'm not sure how this
> >>> would
> >>>>> help
> >>>>>>> with 2.b) since the proposal is on adding to StreamsConfig. If it
> >>> is
> >>>>> the
> >>>>>>> case, then I agree that probably we can just not add an extra
> >>> config
> >>>>> but
> >>>>>>> just deprecating the constructs.
> >>>>>>>
> >>>>>>>
> >>>>>>> Guozhang
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
> >>>>> sophie@confluent.io
> >>>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hey Guozhang & Leah,
> >>>>>>>>
> >>>>>>>> I want to push back a bit on the assumption that we would fall
> >>> back
> >>>> on
> >>>>>> this
> >>>>>>>> config
> >>>>>>>> in the case of an unspecified window size in a Streams serde. I
> >>>> don't
> >>>>>> think
> >>>>>>>> it should
> >>>>>>>> be a default at all, either in name or in effect. To borrow the
> >>>>>> rhetorical
> >>>>>>>> question that
> >>>>>>>> John raised earlier: what is the default window size of an
> >>>>> application?
> >>>>>>>>
> >>>>>>>> Personally, I agree that that doesn't make much sense. Sure, if
> >>> you
> >>>>> only
> >>>>>>>> have a single
> >>>>>>>> windowed operation in your app then you could just specify the
> >>>> window
> >>>>>> size
> >>>>>>>> by config,
> >>>>>>>> but how is that any more ergonomic than specifying the window size
> >>>> in
> >>>>>> the
> >>>>>>>> Serde's
> >>>>>>>> constructor? If anything, it seems worse to put physical and
> >>> mental
> >>>>>>>> distance between
> >>>>>>>> the specification and the actual usage of such parameters. What if
> >>>> you
> >>>>>> add
> >>>>>>>> another
> >>>>>>>> windowed operation later, with a different size, and forget to
> >>>> specify
> >>>>>> the
> >>>>>>>> new size in
> >>>>>>>> the new Serde? Or what if you never specify a default window size
> >>>>>> config at
> >>>>>>>> all and
> >>>>>>>> accidentally end up using the default config value of
> >>>> Long.MAX_VALUE?
> >>>>>>>> Avoiding this
> >>>>>>>> possibility was/is one of the main motivations of this KIP, and
> >>> the
> >>>>>> whole
> >>>>>>>> point of
> >>>>>>>> deprecating the TimeWindowedSerde(innerClass) constructor.
> >>>>>>>>
> >>>>>>>> I actually would have advocated to remove this config entirely,
> >>> but
> >>>> as
> >>>>>> John
> >>>>>>>> pointed
> >>>>>>>> out, we still need it to configure things like the console
> >>> consumer
> >>>>>>>>
> >>>>>>>> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lthomas@confluent.io
> >>>>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Guozhang,
> >>>>>>>>>
> >>>>>>>>> Yes, the config would read them as a single window size. I think
> >>>> this
> >>>>>>>>> relates to John's comments about having variably sized windows,
> >>>> which
> >>>>>>>> this
> >>>>>>>>> config doesn't handle. I like the name change and updated the
> >>> wiki
> >>>> to
> >>>>>>>>> reflect that, and to clarify that the default value will still be
> >>>>>>>>> Long.MAX_VALUE.
> >>>>>>>>>
> >>>>>>>>> Thanks for your feedback!
> >>>>>>>>> Leah
> >>>>>>>>>
> >>>>>>>>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <
> >>> wangguoz@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello Leah,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for initiating this. I just have one minor clarification
> >>>>>>>> question
> >>>>>>>>>> here: the config "window.size.ms" seems to be used as the
> >>> default
> >>>>>>>> window
> >>>>>>>>>> size when reading from a topic that represents windowed records
> >>>>> right?
> >>>>>>>>> I.e.
> >>>>>>>>>> if there are multiple topics that represent windowed records but
> >>>>> their
> >>>>>>>>>> window sizes are different, with this config we can only read
> >>> them
> >>>>>>>> with a
> >>>>>>>>>> single window size? If yes, could we rename the config as "
> >>>>>>>>>> default.window.size.ms" and make that clear in the description
> >>> as
> >>>>>>>> well?
> >>>>>>>>>> Also we'd better also include its default value which I think
> >>>> would
> >>>>>>>> still
> >>>>>>>>>> be MAX_VALUE for compatibility.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Guozhang
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <
> >>> lthomas@confluent.io>
> >>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hey all,
> >>>>>>>>>>>
> >>>>>>>>>>> We should be good to wrap up voting now that the discussion has
> >>>>> been
> >>>>>>>>>>> resolved.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> Leah
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <
> >>> mjsax@apache.org
> >>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> +1 (binding)
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've just sent a new message to the DISCUSS thread. We
> >>>>>>>>>>>>> forgot to include the Scala API in the proposal.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> -John
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Thanks for the KIP! +1 (non-binding)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Sophie
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> >>>>>>>> vvcephei@apache.org
> >>>>>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks Leah,
> >>>>>>>>>>>>>>> I’m +1 (binding)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> >>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
> >>>>>>>>>>>>>>>> TimeWindowedDeserializer
> >>>>>>>>>>>>>>>> and TimeWindowedSerde to handle window size.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Leah
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> -- Guozhang
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
> >
>

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Posted by "Matthias J. Sax" <mj...@apache.org>.
Thanks for reviving this KIP, Leah.

I agree that we should not extend the scope of this KIP to potentially
deprecate/rename the `default.windowed.[key|value].serde.inner` configs.

@Sophie: if you feel strong about it, let's do a separate KIP.


+1 (binding)


-Matthias

On 1/19/21 3:00 PM, John Roesler wrote:
> Hi all,
> 
> I've just caught up on the thread, and FWIW, I'm still +1.
> 
> Thanks,
> -John
> 
> On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote:
>> Read the above updates and the KIP's scope. Makes sense to me. +1 still
>> counts :)
>>
>> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <so...@confluent.io>
>> wrote:
>>
>>> That sounds good to me. Thanks for reviving this
>>>
>>> Sophie
>>>
>>> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <lt...@confluent.io> wrote:
>>>
>>>> Hey all,
>>>>
>>>> Bringing this back up for discussion.
>>>>
>>>> It seems like the next steps are to:
>>>> 1. rename the config "window.size.ms"
>>>> 2. ensure that users set window size EITHER through the config OR through
>>>> the constructor. On this note, it may make sense to remove the default
>>> for
>>>> the `window.size.ms` config, so that there won't be a fall back if the
>>>> window size isn't set in either spot. WDYT? This could also address the
>>>> issue of multiple window sizes within a streams app.
>>>>
>>>> I see what Sophie is saying about the `default.windowed.key.serde.inner`
>>>> config, but I do think deprecating and moving those configs would
>>> require a
>>>> larger discussion. I'm open to looping them into this KIP if we feel like
>>>> it's vital (or incredibly convenient with low cost to users), but my
>>>> initial reaction is to leave that out for now and work within the current
>>>> set-up for window size.
>>>>
>>>> Thanks for all the comments so far,
>>>> Leah
>>>>
>>>> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <
>>> sophie@confluent.io>
>>>> wrote:
>>>>
>>>>> There are two cases where you need to specify the window size --
>>> directly
>>>>> using a
>>>>> Consumer (eg the console consumer) or reading as an input topic within
>>>>> Streams.
>>>>> We need a config for the first case, since you can't pass a
>>> Deserializer
>>>>> object to the
>>>>> console consumer. In the Streams case, the reverse is true, and you
>>> have
>>>> to
>>>>> pass in
>>>>> an actual Serde object.
>>>>>
>>>>> Imo we should keep these two cases separate and not use the config for
>>>> the
>>>>> Streams
>>>>> case at all. But that's hard to enforce (we'd have to strip the config
>>>> out
>>>>> of the user's
>>>>> StreamsConfig if they tried to use it within Streams, for example) and
>>> it
>>>>> also puts us
>>>>> in an awkward position due to the  default.windowed.inner.serde.class
>>>>> configs. If
>>>>> they can specify the inner serde class through their Streams app
>>> config,
>>>>> they
>>>>> should be able to specify the window size through config as well.
>>>> Otherwise
>>>>> we
>>>>> either force a mix-and-match as Matthias described, or you just always
>>>> have
>>>>> to
>>>>> specify both the inner class and the window size in the constructor, at
>>>>> which
>>>>> point, why even have the default.windowed.inner.serde.class config at
>>>> all?
>>>>>
>>>>> ...
>>>>> that's not a rhetorical question, actually. Personally I do think we
>>>> should
>>>>> deprecate the default.windowed.serde.inner.class and replace it with
>>>>> separate
>>>>> windowed.serializer.inner.class/windowed.deserializer.inner.class
>>>> configs.
>>>>> This
>>>>> way it's immediately obvious that the configs are only for the
>>>>> Consumer/Producer,
>>>>> and you should construct your own TimeWindowedSerde with all the
>>>> necessary
>>>>> parameters for use in your Streams app.
>>>>>
>>>>> That might be too radical, and maybe the problem isn't worth the burden
>>>> of
>>>>> forcing users to change their code and replace the config with actual
>>>> Serde
>>>>> objects. But it should be an easy change to make, and if it isn't,
>>> that's
>>>>> probably a good sign that you're using the serde incorrectly somewhere.
>>>>>
>>>>> If we don't deprecate the default.windowed.serde.inner.class, then it's
>>>>> less clear
>>>>> to me how to proceed. The only really consistent thing to do seems to
>>> be
>>>> to
>>>>> name and position the new window size config as a default config and
>>>> allow
>>>>> it to be used similar to the default inner class configs. Which, as
>>>>> established
>>>>> throughout this discussion, seems very very wrong
>>>>>
>>>>> So yes, I think we should just stick with the original name
>>>> window.size.ms
>>>>> .
>>>>> Or
>>>>> better yet, call it windowed.deserializer.window.size.ms and name the
>>>>> default.windowed.serde.inner.class replacements
>>>>> windowed.deserializer.inner.class and windowed.serializer.inner.class
>>>>>
>>>>> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org>
>>> wrote:
>>>>>
>>>>>> From my understanding, the KIP aims for the case when a user does not
>>>>>> control the code, eg, when using the command line consumer (or
>>> similar
>>>>>> tools).
>>>>>>
>>>>>> If the user writes code, we should always encourage her to
>>> instantiate
>>>>>> the deserializer explicitly and not relying on reflection+config.
>>>>>>
>>>>>> I also don't think that the `default` prefix does make sense, as it
>>>>>> indicates that there might be a non-default. However, IMHO, we should
>>>>>> not allow "overwrite semantics" but rather throw an exception if the
>>>>>> config is set and a window size is provided via the constructor. We
>>>>>> should not allow to mix-and-match both and should stick to a strict
>>>>>> either-or pattern.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 9/8/20 11:52 AM, Guozhang Wang wrote:
>>>>>>> Hi Sophie,
>>>>>>>
>>>>>>> Seems I do have some mis-understanding of the KIP's motivation here
>>>> :)
>>>>>> Just
>>>>>>> for clarification my reasoning is that:
>>>>>>>
>>>>>>> 1) today Streams itself never uses a windowed deserializer itself
>>>> since
>>>>>> its
>>>>>>> built-in operators only need the serializer and users do not need
>>> to
>>>>>>> override it, plus standby / restore active tasks would just copy
>>> the
>>>>>> bytes
>>>>>>> directly. So this KIP's motivation is not for Stream's own code
>>>>> anyways.
>>>>>>>
>>>>>>> 2) It is only when user specified serde is missing the window size,
>>>>> which
>>>>>>> is either when a) one is trying to read a source topic as windowed
>>>>>> records
>>>>>>> in Streams, this is a big blocker for KIP-300, and when b) one is
>>>>> trying
>>>>>> to
>>>>>>> read a topic as windowed records in Consumer, we would have issues
>>> if
>>>>>> users
>>>>>>> fail to use the appropriate serde constructs.
>>>>>>>
>>>>>>> I thought the main motivation of this KIP is for 2.a), in which we
>>>>> would
>>>>>>> just encourage the users to use the right constructor with the
>>> window
>>>>>> size
>>>>>>> by deprecating the other constructs. But I'm not sure how this
>>> would
>>>>> help
>>>>>>> with 2.b) since the proposal is on adding to StreamsConfig. If it
>>> is
>>>>> the
>>>>>>> case, then I agree that probably we can just not add an extra
>>> config
>>>>> but
>>>>>>> just deprecating the constructs.
>>>>>>>
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
>>>>> sophie@confluent.io
>>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Guozhang & Leah,
>>>>>>>>
>>>>>>>> I want to push back a bit on the assumption that we would fall
>>> back
>>>> on
>>>>>> this
>>>>>>>> config
>>>>>>>> in the case of an unspecified window size in a Streams serde. I
>>>> don't
>>>>>> think
>>>>>>>> it should
>>>>>>>> be a default at all, either in name or in effect. To borrow the
>>>>>> rhetorical
>>>>>>>> question that
>>>>>>>> John raised earlier: what is the default window size of an
>>>>> application?
>>>>>>>>
>>>>>>>> Personally, I agree that that doesn't make much sense. Sure, if
>>> you
>>>>> only
>>>>>>>> have a single
>>>>>>>> windowed operation in your app then you could just specify the
>>>> window
>>>>>> size
>>>>>>>> by config,
>>>>>>>> but how is that any more ergonomic than specifying the window size
>>>> in
>>>>>> the
>>>>>>>> Serde's
>>>>>>>> constructor? If anything, it seems worse to put physical and
>>> mental
>>>>>>>> distance between
>>>>>>>> the specification and the actual usage of such parameters. What if
>>>> you
>>>>>> add
>>>>>>>> another
>>>>>>>> windowed operation later, with a different size, and forget to
>>>> specify
>>>>>> the
>>>>>>>> new size in
>>>>>>>> the new Serde? Or what if you never specify a default window size
>>>>>> config at
>>>>>>>> all and
>>>>>>>> accidentally end up using the default config value of
>>>> Long.MAX_VALUE?
>>>>>>>> Avoiding this
>>>>>>>> possibility was/is one of the main motivations of this KIP, and
>>> the
>>>>>> whole
>>>>>>>> point of
>>>>>>>> deprecating the TimeWindowedSerde(innerClass) constructor.
>>>>>>>>
>>>>>>>> I actually would have advocated to remove this config entirely,
>>> but
>>>> as
>>>>>> John
>>>>>>>> pointed
>>>>>>>> out, we still need it to configure things like the console
>>> consumer
>>>>>>>>
>>>>>>>> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lthomas@confluent.io
>>>>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Guozhang,
>>>>>>>>>
>>>>>>>>> Yes, the config would read them as a single window size. I think
>>>> this
>>>>>>>>> relates to John's comments about having variably sized windows,
>>>> which
>>>>>>>> this
>>>>>>>>> config doesn't handle. I like the name change and updated the
>>> wiki
>>>> to
>>>>>>>>> reflect that, and to clarify that the default value will still be
>>>>>>>>> Long.MAX_VALUE.
>>>>>>>>>
>>>>>>>>> Thanks for your feedback!
>>>>>>>>> Leah
>>>>>>>>>
>>>>>>>>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <
>>> wangguoz@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Leah,
>>>>>>>>>>
>>>>>>>>>> Thanks for initiating this. I just have one minor clarification
>>>>>>>> question
>>>>>>>>>> here: the config "window.size.ms" seems to be used as the
>>> default
>>>>>>>> window
>>>>>>>>>> size when reading from a topic that represents windowed records
>>>>> right?
>>>>>>>>> I.e.
>>>>>>>>>> if there are multiple topics that represent windowed records but
>>>>> their
>>>>>>>>>> window sizes are different, with this config we can only read
>>> them
>>>>>>>> with a
>>>>>>>>>> single window size? If yes, could we rename the config as "
>>>>>>>>>> default.window.size.ms" and make that clear in the description
>>> as
>>>>>>>> well?
>>>>>>>>>> Also we'd better also include its default value which I think
>>>> would
>>>>>>>> still
>>>>>>>>>> be MAX_VALUE for compatibility.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <
>>> lthomas@confluent.io>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey all,
>>>>>>>>>>>
>>>>>>>>>>> We should be good to wrap up voting now that the discussion has
>>>>> been
>>>>>>>>>>> resolved.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Leah
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <
>>> mjsax@apache.org
>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've just sent a new message to the DISCUSS thread. We
>>>>>>>>>>>>> forgot to include the Scala API in the proposal.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Thanks for the KIP! +1 (non-binding)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sophie
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
>>>>>>>> vvcephei@apache.org
>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks Leah,
>>>>>>>>>>>>>>> I’m +1 (binding)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
>>>>>>>>>>>>>>>> TimeWindowedDeserializer
>>>>>>>>>>>>>>>> and TimeWindowedSerde to handle window size.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Leah
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
> 
> 

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Posted by John Roesler <vv...@apache.org>.
Hi all,

I've just caught up on the thread, and FWIW, I'm still +1.

Thanks,
-John

On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote:
> Read the above updates and the KIP's scope. Makes sense to me. +1 still
> counts :)
> 
> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <so...@confluent.io>
> wrote:
> 
> > That sounds good to me. Thanks for reviving this
> > 
> > Sophie
> > 
> > On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <lt...@confluent.io> wrote:
> > 
> > > Hey all,
> > > 
> > > Bringing this back up for discussion.
> > > 
> > > It seems like the next steps are to:
> > > 1. rename the config "window.size.ms"
> > > 2. ensure that users set window size EITHER through the config OR through
> > > the constructor. On this note, it may make sense to remove the default
> > for
> > > the `window.size.ms` config, so that there won't be a fall back if the
> > > window size isn't set in either spot. WDYT? This could also address the
> > > issue of multiple window sizes within a streams app.
> > > 
> > > I see what Sophie is saying about the `default.windowed.key.serde.inner`
> > > config, but I do think deprecating and moving those configs would
> > require a
> > > larger discussion. I'm open to looping them into this KIP if we feel like
> > > it's vital (or incredibly convenient with low cost to users), but my
> > > initial reaction is to leave that out for now and work within the current
> > > set-up for window size.
> > > 
> > > Thanks for all the comments so far,
> > > Leah
> > > 
> > > On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <
> > sophie@confluent.io>
> > > wrote:
> > > 
> > > > There are two cases where you need to specify the window size --
> > directly
> > > > using a
> > > > Consumer (eg the console consumer) or reading as an input topic within
> > > > Streams.
> > > > We need a config for the first case, since you can't pass a
> > Deserializer
> > > > object to the
> > > > console consumer. In the Streams case, the reverse is true, and you
> > have
> > > to
> > > > pass in
> > > > an actual Serde object.
> > > > 
> > > > Imo we should keep these two cases separate and not use the config for
> > > the
> > > > Streams
> > > > case at all. But that's hard to enforce (we'd have to strip the config
> > > out
> > > > of the user's
> > > > StreamsConfig if they tried to use it within Streams, for example) and
> > it
> > > > also puts us
> > > > in an awkward position due to the  default.windowed.inner.serde.class
> > > > configs. If
> > > > they can specify the inner serde class through their Streams app
> > config,
> > > > they
> > > > should be able to specify the window size through config as well.
> > > Otherwise
> > > > we
> > > > either force a mix-and-match as Matthias described, or you just always
> > > have
> > > > to
> > > > specify both the inner class and the window size in the constructor, at
> > > > which
> > > > point, why even have the default.windowed.inner.serde.class config at
> > > all?
> > > > 
> > > > ...
> > > > that's not a rhetorical question, actually. Personally I do think we
> > > should
> > > > deprecate the default.windowed.serde.inner.class and replace it with
> > > > separate
> > > > windowed.serializer.inner.class/windowed.deserializer.inner.class
> > > configs.
> > > > This
> > > > way it's immediately obvious that the configs are only for the
> > > > Consumer/Producer,
> > > > and you should construct your own TimeWindowedSerde with all the
> > > necessary
> > > > parameters for use in your Streams app.
> > > > 
> > > > That might be too radical, and maybe the problem isn't worth the burden
> > > of
> > > > forcing users to change their code and replace the config with actual
> > > Serde
> > > > objects. But it should be an easy change to make, and if it isn't,
> > that's
> > > > probably a good sign that you're using the serde incorrectly somewhere.
> > > > 
> > > > If we don't deprecate the default.windowed.serde.inner.class, then it's
> > > > less clear
> > > > to me how to proceed. The only really consistent thing to do seems to
> > be
> > > to
> > > > name and position the new window size config as a default config and
> > > allow
> > > > it to be used similar to the default inner class configs. Which, as
> > > > established
> > > > throughout this discussion, seems very very wrong
> > > > 
> > > > So yes, I think we should just stick with the original name
> > > window.size.ms
> > > > .
> > > > Or
> > > > better yet, call it windowed.deserializer.window.size.ms and name the
> > > > default.windowed.serde.inner.class replacements
> > > > windowed.deserializer.inner.class and windowed.serializer.inner.class
> > > > 
> > > > On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org>
> > wrote:
> > > > 
> > > > > From my understanding, the KIP aims for the case when a user does not
> > > > > control the code, eg, when using the command line consumer (or
> > similar
> > > > > tools).
> > > > > 
> > > > > If the user writes code, we should always encourage her to
> > instantiate
> > > > > the deserializer explicitly and not relying on reflection+config.
> > > > > 
> > > > > I also don't think that the `default` prefix does make sense, as it
> > > > > indicates that there might be a non-default. However, IMHO, we should
> > > > > not allow "overwrite semantics" but rather throw an exception if the
> > > > > config is set and a window size is provided via the constructor. We
> > > > > should not allow to mix-and-match both and should stick to a strict
> > > > > either-or pattern.
> > > > > 
> > > > > 
> > > > > -Matthias
> > > > > 
> > > > > On 9/8/20 11:52 AM, Guozhang Wang wrote:
> > > > > > Hi Sophie,
> > > > > > 
> > > > > > Seems I do have some mis-understanding of the KIP's motivation here
> > > :)
> > > > > Just
> > > > > > for clarification my reasoning is that:
> > > > > > 
> > > > > > 1) today Streams itself never uses a windowed deserializer itself
> > > since
> > > > > its
> > > > > > built-in operators only need the serializer and users do not need
> > to
> > > > > > override it, plus standby / restore active tasks would just copy
> > the
> > > > > bytes
> > > > > > directly. So this KIP's motivation is not for Stream's own code
> > > > anyways.
> > > > > > 
> > > > > > 2) It is only when user specified serde is missing the window size,
> > > > which
> > > > > > is either when a) one is trying to read a source topic as windowed
> > > > > records
> > > > > > in Streams, this is a big blocker for KIP-300, and when b) one is
> > > > trying
> > > > > to
> > > > > > read a topic as windowed records in Consumer, we would have issues
> > if
> > > > > users
> > > > > > fail to use the appropriate serde constructs.
> > > > > > 
> > > > > > I thought the main motivation of this KIP is for 2.a), in which we
> > > > would
> > > > > > just encourage the users to use the right constructor with the
> > window
> > > > > size
> > > > > > by deprecating the other constructs. But I'm not sure how this
> > would
> > > > help
> > > > > > with 2.b) since the proposal is on adding to StreamsConfig. If it
> > is
> > > > the
> > > > > > case, then I agree that probably we can just not add an extra
> > config
> > > > but
> > > > > > just deprecating the constructs.
> > > > > > 
> > > > > > 
> > > > > > Guozhang
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
> > > > sophie@confluent.io
> > > > > > 
> > > > > > wrote:
> > > > > > 
> > > > > > > Hey Guozhang & Leah,
> > > > > > > 
> > > > > > > I want to push back a bit on the assumption that we would fall
> > back
> > > on
> > > > > this
> > > > > > > config
> > > > > > > in the case of an unspecified window size in a Streams serde. I
> > > don't
> > > > > think
> > > > > > > it should
> > > > > > > be a default at all, either in name or in effect. To borrow the
> > > > > rhetorical
> > > > > > > question that
> > > > > > > John raised earlier: what is the default window size of an
> > > > application?
> > > > > > > 
> > > > > > > Personally, I agree that that doesn't make much sense. Sure, if
> > you
> > > > only
> > > > > > > have a single
> > > > > > > windowed operation in your app then you could just specify the
> > > window
> > > > > size
> > > > > > > by config,
> > > > > > > but how is that any more ergonomic than specifying the window size
> > > in
> > > > > the
> > > > > > > Serde's
> > > > > > > constructor? If anything, it seems worse to put physical and
> > mental
> > > > > > > distance between
> > > > > > > the specification and the actual usage of such parameters. What if
> > > you
> > > > > add
> > > > > > > another
> > > > > > > windowed operation later, with a different size, and forget to
> > > specify
> > > > > the
> > > > > > > new size in
> > > > > > > the new Serde? Or what if you never specify a default window size
> > > > > config at
> > > > > > > all and
> > > > > > > accidentally end up using the default config value of
> > > Long.MAX_VALUE?
> > > > > > > Avoiding this
> > > > > > > possibility was/is one of the main motivations of this KIP, and
> > the
> > > > > whole
> > > > > > > point of
> > > > > > > deprecating the TimeWindowedSerde(innerClass) constructor.
> > > > > > > 
> > > > > > > I actually would have advocated to remove this config entirely,
> > but
> > > as
> > > > > John
> > > > > > > pointed
> > > > > > > out, we still need it to configure things like the console
> > consumer
> > > > > > > 
> > > > > > > On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lthomas@confluent.io
> > > 
> > > > > wrote:
> > > > > > > 
> > > > > > > > Hi Guozhang,
> > > > > > > > 
> > > > > > > > Yes, the config would read them as a single window size. I think
> > > this
> > > > > > > > relates to John's comments about having variably sized windows,
> > > which
> > > > > > > this
> > > > > > > > config doesn't handle. I like the name change and updated the
> > wiki
> > > to
> > > > > > > > reflect that, and to clarify that the default value will still be
> > > > > > > > Long.MAX_VALUE.
> > > > > > > > 
> > > > > > > > Thanks for your feedback!
> > > > > > > > Leah
> > > > > > > > 
> > > > > > > > On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <
> > wangguoz@gmail.com>
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Hello Leah,
> > > > > > > > > 
> > > > > > > > > Thanks for initiating this. I just have one minor clarification
> > > > > > > question
> > > > > > > > > here: the config "window.size.ms" seems to be used as the
> > default
> > > > > > > window
> > > > > > > > > size when reading from a topic that represents windowed records
> > > > right?
> > > > > > > > I.e.
> > > > > > > > > if there are multiple topics that represent windowed records but
> > > > their
> > > > > > > > > window sizes are different, with this config we can only read
> > them
> > > > > > > with a
> > > > > > > > > single window size? If yes, could we rename the config as "
> > > > > > > > > default.window.size.ms" and make that clear in the description
> > as
> > > > > > > well?
> > > > > > > > > Also we'd better also include its default value which I think
> > > would
> > > > > > > still
> > > > > > > > > be MAX_VALUE for compatibility.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Guozhang
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <
> > lthomas@confluent.io>
> > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > > Hey all,
> > > > > > > > > > 
> > > > > > > > > > We should be good to wrap up voting now that the discussion has
> > > > been
> > > > > > > > > > resolved.
> > > > > > > > > > 
> > > > > > > > > > Cheers,
> > > > > > > > > > Leah
> > > > > > > > > > 
> > > > > > > > > > On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <
> > mjsax@apache.org
> > > > 
> > > > > > > > wrote:
> > > > > > > > > > 
> > > > > > > > > > > +1 (binding)
> > > > > > > > > > > 
> > > > > > > > > > > On 8/26/20 8:02 AM, John Roesler wrote:
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > > 
> > > > > > > > > > > > I've just sent a new message to the DISCUSS thread. We
> > > > > > > > > > > > forgot to include the Scala API in the proposal.
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > -John
> > > > > > > > > > > > 
> > > > > > > > > > > > On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > Thanks for the KIP! +1 (non-binding)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Sophie
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> > > > > > > vvcephei@apache.org
> > > > > > > > > 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > Thanks Leah,
> > > > > > > > > > > > > > I’m +1 (binding)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > -John
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I'd like to kick-off the vote for KIP-659: Improve
> > > > > > > > > > > > > > > TimeWindowedDeserializer
> > > > > > > > > > > > > > > and TimeWindowedSerde to handle window size.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > 
> > > > 
> > > 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Leah
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > --
> > > > > > > > > -- Guozhang
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> 



Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Posted by Guozhang Wang <wa...@gmail.com>.
Read the above updates and the KIP's scope. Makes sense to me. +1 still
counts :)

On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <so...@confluent.io>
wrote:

> That sounds good to me. Thanks for reviving this
>
> Sophie
>
> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <lt...@confluent.io> wrote:
>
> > Hey all,
> >
> > Bringing this back up for discussion.
> >
> > It seems like the next steps are to:
> > 1. rename the config "window.size.ms"
> > 2. ensure that users set window size EITHER through the config OR through
> > the constructor. On this note, it may make sense to remove the default
> for
> > the `window.size.ms` config, so that there won't be a fall back if the
> > window size isn't set in either spot. WDYT? This could also address the
> > issue of multiple window sizes within a streams app.
> >
> > I see what Sophie is saying about the `default.windowed.key.serde.inner`
> > config, but I do think deprecating and moving those configs would
> require a
> > larger discussion. I'm open to looping them into this KIP if we feel like
> > it's vital (or incredibly convenient with low cost to users), but my
> > initial reaction is to leave that out for now and work within the current
> > set-up for window size.
> >
> > Thanks for all the comments so far,
> > Leah
> >
> > On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <
> sophie@confluent.io>
> > wrote:
> >
> > > There are two cases where you need to specify the window size --
> directly
> > > using a
> > > Consumer (eg the console consumer) or reading as an input topic within
> > > Streams.
> > > We need a config for the first case, since you can't pass a
> Deserializer
> > > object to the
> > > console consumer. In the Streams case, the reverse is true, and you
> have
> > to
> > > pass in
> > > an actual Serde object.
> > >
> > > Imo we should keep these two cases separate and not use the config for
> > the
> > > Streams
> > > case at all. But that's hard to enforce (we'd have to strip the config
> > out
> > > of the user's
> > > StreamsConfig if they tried to use it within Streams, for example) and
> it
> > > also puts us
> > > in an awkward position due to the  default.windowed.inner.serde.class
> > > configs. If
> > > they can specify the inner serde class through their Streams app
> config,
> > > they
> > > should be able to specify the window size through config as well.
> > Otherwise
> > > we
> > > either force a mix-and-match as Matthias described, or you just always
> > have
> > > to
> > > specify both the inner class and the window size in the constructor, at
> > > which
> > > point, why even have the default.windowed.inner.serde.class config at
> > all?
> > >
> > > ...
> > > that's not a rhetorical question, actually. Personally I do think we
> > should
> > > deprecate the default.windowed.serde.inner.class and replace it with
> > > separate
> > > windowed.serializer.inner.class/windowed.deserializer.inner.class
> > configs.
> > > This
> > > way it's immediately obvious that the configs are only for the
> > > Consumer/Producer,
> > > and you should construct your own TimeWindowedSerde with all the
> > necessary
> > > parameters for use in your Streams app.
> > >
> > > That might be too radical, and maybe the problem isn't worth the burden
> > of
> > > forcing users to change their code and replace the config with actual
> > Serde
> > > objects. But it should be an easy change to make, and if it isn't,
> that's
> > > probably a good sign that you're using the serde incorrectly somewhere.
> > >
> > > If we don't deprecate the default.windowed.serde.inner.class, then it's
> > > less clear
> > > to me how to proceed. The only really consistent thing to do seems to
> be
> > to
> > > name and position the new window size config as a default config and
> > allow
> > > it to be used similar to the default inner class configs. Which, as
> > > established
> > > throughout this discussion, seems very very wrong
> > >
> > > So yes, I think we should just stick with the original name
> > window.size.ms
> > > .
> > > Or
> > > better yet, call it windowed.deserializer.window.size.ms and name the
> > > default.windowed.serde.inner.class replacements
> > > windowed.deserializer.inner.class and windowed.serializer.inner.class
> > >
> > > On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org>
> wrote:
> > >
> > > > From my understanding, the KIP aims for the case when a user does not
> > > > control the code, eg, when using the command line consumer (or
> similar
> > > > tools).
> > > >
> > > > If the user writes code, we should always encourage her to
> instantiate
> > > > the deserializer explicitly and not relying on reflection+config.
> > > >
> > > > I also don't think that the `default` prefix does make sense, as it
> > > > indicates that there might be a non-default. However, IMHO, we should
> > > > not allow "overwrite semantics" but rather throw an exception if the
> > > > config is set and a window size is provided via the constructor. We
> > > > should not allow to mix-and-match both and should stick to a strict
> > > > either-or pattern.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 9/8/20 11:52 AM, Guozhang Wang wrote:
> > > > > Hi Sophie,
> > > > >
> > > > > Seems I do have some mis-understanding of the KIP's motivation here
> > :)
> > > > Just
> > > > > for clarification my reasoning is that:
> > > > >
> > > > > 1) today Streams itself never uses a windowed deserializer itself
> > since
> > > > its
> > > > > built-in operators only need the serializer and users do not need
> to
> > > > > override it, plus standby / restore active tasks would just copy
> the
> > > > bytes
> > > > > directly. So this KIP's motivation is not for Stream's own code
> > > anyways.
> > > > >
> > > > > 2) It is only when user specified serde is missing the window size,
> > > which
> > > > > is either when a) one is trying to read a source topic as windowed
> > > > records
> > > > > in Streams, this is a big blocker for KIP-300, and when b) one is
> > > trying
> > > > to
> > > > > read a topic as windowed records in Consumer, we would have issues
> if
> > > > users
> > > > > fail to use the appropriate serde constructs.
> > > > >
> > > > > I thought the main motivation of this KIP is for 2.a), in which we
> > > would
> > > > > just encourage the users to use the right constructor with the
> window
> > > > size
> > > > > by deprecating the other constructs. But I'm not sure how this
> would
> > > help
> > > > > with 2.b) since the proposal is on adding to StreamsConfig. If it
> is
> > > the
> > > > > case, then I agree that probably we can just not add an extra
> config
> > > but
> > > > > just deprecating the constructs.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
> > > sophie@confluent.io
> > > > >
> > > > > wrote:
> > > > >
> > > > >> Hey Guozhang & Leah,
> > > > >>
> > > > >> I want to push back a bit on the assumption that we would fall
> back
> > on
> > > > this
> > > > >> config
> > > > >> in the case of an unspecified window size in a Streams serde. I
> > don't
> > > > think
> > > > >> it should
> > > > >> be a default at all, either in name or in effect. To borrow the
> > > > rhetorical
> > > > >> question that
> > > > >> John raised earlier: what is the default window size of an
> > > application?
> > > > >>
> > > > >> Personally, I agree that that doesn't make much sense. Sure, if
> you
> > > only
> > > > >> have a single
> > > > >> windowed operation in your app then you could just specify the
> > window
> > > > size
> > > > >> by config,
> > > > >> but how is that any more ergonomic than specifying the window size
> > in
> > > > the
> > > > >> Serde's
> > > > >> constructor? If anything, it seems worse to put physical and
> mental
> > > > >> distance between
> > > > >> the specification and the actual usage of such parameters. What if
> > you
> > > > add
> > > > >> another
> > > > >> windowed operation later, with a different size, and forget to
> > specify
> > > > the
> > > > >> new size in
> > > > >> the new Serde? Or what if you never specify a default window size
> > > > config at
> > > > >> all and
> > > > >> accidentally end up using the default config value of
> > Long.MAX_VALUE?
> > > > >> Avoiding this
> > > > >> possibility was/is one of the main motivations of this KIP, and
> the
> > > > whole
> > > > >> point of
> > > > >> deprecating the TimeWindowedSerde(innerClass) constructor.
> > > > >>
> > > > >> I actually would have advocated to remove this config entirely,
> but
> > as
> > > > John
> > > > >> pointed
> > > > >> out, we still need it to configure things like the console
> consumer
> > > > >>
> > > > >> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lthomas@confluent.io
> >
> > > > wrote:
> > > > >>
> > > > >>> Hi Guozhang,
> > > > >>>
> > > > >>> Yes, the config would read them as a single window size. I think
> > this
> > > > >>> relates to John's comments about having variably sized windows,
> > which
> > > > >> this
> > > > >>> config doesn't handle. I like the name change and updated the
> wiki
> > to
> > > > >>> reflect that, and to clarify that the default value will still be
> > > > >>> Long.MAX_VALUE.
> > > > >>>
> > > > >>> Thanks for your feedback!
> > > > >>> Leah
> > > > >>>
> > > > >>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <
> wangguoz@gmail.com>
> > > > >> wrote:
> > > > >>>
> > > > >>>> Hello Leah,
> > > > >>>>
> > > > >>>> Thanks for initiating this. I just have one minor clarification
> > > > >> question
> > > > >>>> here: the config "window.size.ms" seems to be used as the
> default
> > > > >> window
> > > > >>>> size when reading from a topic that represents windowed records
> > > right?
> > > > >>> I.e.
> > > > >>>> if there are multiple topics that represent windowed records but
> > > their
> > > > >>>> window sizes are different, with this config we can only read
> them
> > > > >> with a
> > > > >>>> single window size? If yes, could we rename the config as "
> > > > >>>> default.window.size.ms" and make that clear in the description
> as
> > > > >> well?
> > > > >>>> Also we'd better also include its default value which I think
> > would
> > > > >> still
> > > > >>>> be MAX_VALUE for compatibility.
> > > > >>>>
> > > > >>>>
> > > > >>>> Guozhang
> > > > >>>>
> > > > >>>>
> > > > >>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <
> lthomas@confluent.io>
> > > > >> wrote:
> > > > >>>>
> > > > >>>>> Hey all,
> > > > >>>>>
> > > > >>>>> We should be good to wrap up voting now that the discussion has
> > > been
> > > > >>>>> resolved.
> > > > >>>>>
> > > > >>>>> Cheers,
> > > > >>>>> Leah
> > > > >>>>>
> > > > >>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <
> mjsax@apache.org
> > >
> > > > >>> wrote:
> > > > >>>>>
> > > > >>>>>> +1 (binding)
> > > > >>>>>>
> > > > >>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
> > > > >>>>>>> Hi all,
> > > > >>>>>>>
> > > > >>>>>>> I've just sent a new message to the DISCUSS thread. We
> > > > >>>>>>> forgot to include the Scala API in the proposal.
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>> -John
> > > > >>>>>>>
> > > > >>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> > > > >>>>>>> wrote:
> > > > >>>>>>>> Thanks for the KIP! +1 (non-binding)
> > > > >>>>>>>>
> > > > >>>>>>>> Sophie
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> > > > >> vvcephei@apache.org
> > > > >>>>
> > > > >>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Thanks Leah,
> > > > >>>>>>>>> I’m +1 (binding)
> > > > >>>>>>>>>
> > > > >>>>>>>>> -John
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> > > > >>>>>>>>>> Hi everyone,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
> > > > >>>>>>>>>> TimeWindowedDeserializer
> > > > >>>>>>>>>> and TimeWindowedSerde to handle window size.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>> Leah
> > > > >>>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> --
> > > > >>>> -- Guozhang
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
>


-- 
-- Guozhang

Re: [VOTE] KIP-659: Improve TimeWindowedDeserializer and TimeWindowedSerde to handle window size

Posted by Sophie Blee-Goldman <so...@confluent.io>.
That sounds good to me. Thanks for reviving this

Sophie

On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <lt...@confluent.io> wrote:

> Hey all,
>
> Bringing this back up for discussion.
>
> It seems like the next steps are to:
> 1. rename the config "window.size.ms"
> 2. ensure that users set window size EITHER through the config OR through
> the constructor. On this note, it may make sense to remove the default for
> the `window.size.ms` config, so that there won't be a fall back if the
> window size isn't set in either spot. WDYT? This could also address the
> issue of multiple window sizes within a streams app.
>
> I see what Sophie is saying about the `default.windowed.key.serde.inner`
> config, but I do think deprecating and moving those configs would require a
> larger discussion. I'm open to looping them into this KIP if we feel like
> it's vital (or incredibly convenient with low cost to users), but my
> initial reaction is to leave that out for now and work within the current
> set-up for window size.
>
> Thanks for all the comments so far,
> Leah
>
> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <so...@confluent.io>
> wrote:
>
> > There are two cases where you need to specify the window size -- directly
> > using a
> > Consumer (eg the console consumer) or reading as an input topic within
> > Streams.
> > We need a config for the first case, since you can't pass a Deserializer
> > object to the
> > console consumer. In the Streams case, the reverse is true, and you have
> to
> > pass in
> > an actual Serde object.
> >
> > Imo we should keep these two cases separate and not use the config for
> the
> > Streams
> > case at all. But that's hard to enforce (we'd have to strip the config
> out
> > of the user's
> > StreamsConfig if they tried to use it within Streams, for example) and it
> > also puts us
> > in an awkward position due to the  default.windowed.inner.serde.class
> > configs. If
> > they can specify the inner serde class through their Streams app config,
> > they
> > should be able to specify the window size through config as well.
> Otherwise
> > we
> > either force a mix-and-match as Matthias described, or you just always
> have
> > to
> > specify both the inner class and the window size in the constructor, at
> > which
> > point, why even have the default.windowed.inner.serde.class config at
> all?
> >
> > ...
> > that's not a rhetorical question, actually. Personally I do think we
> should
> > deprecate the default.windowed.serde.inner.class and replace it with
> > separate
> > windowed.serializer.inner.class/windowed.deserializer.inner.class
> configs.
> > This
> > way it's immediately obvious that the configs are only for the
> > Consumer/Producer,
> > and you should construct your own TimeWindowedSerde with all the
> necessary
> > parameters for use in your Streams app.
> >
> > That might be too radical, and maybe the problem isn't worth the burden
> of
> > forcing users to change their code and replace the config with actual
> Serde
> > objects. But it should be an easy change to make, and if it isn't, that's
> > probably a good sign that you're using the serde incorrectly somewhere.
> >
> > If we don't deprecate the default.windowed.serde.inner.class, then it's
> > less clear
> > to me how to proceed. The only really consistent thing to do seems to be
> to
> > name and position the new window size config as a default config and
> allow
> > it to be used similar to the default inner class configs. Which, as
> > established
> > throughout this discussion, seems very very wrong
> >
> > So yes, I think we should just stick with the original name
> window.size.ms
> > .
> > Or
> > better yet, call it windowed.deserializer.window.size.ms and name the
> > default.windowed.serde.inner.class replacements
> > windowed.deserializer.inner.class and windowed.serializer.inner.class
> >
> > On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> wrote:
> >
> > > From my understanding, the KIP aims for the case when a user does not
> > > control the code, eg, when using the command line consumer (or similar
> > > tools).
> > >
> > > If the user writes code, we should always encourage her to instantiate
> > > the deserializer explicitly and not relying on reflection+config.
> > >
> > > I also don't think that the `default` prefix does make sense, as it
> > > indicates that there might be a non-default. However, IMHO, we should
> > > not allow "overwrite semantics" but rather throw an exception if the
> > > config is set and a window size is provided via the constructor. We
> > > should not allow to mix-and-match both and should stick to a strict
> > > either-or pattern.
> > >
> > >
> > > -Matthias
> > >
> > > On 9/8/20 11:52 AM, Guozhang Wang wrote:
> > > > Hi Sophie,
> > > >
> > > > Seems I do have some mis-understanding of the KIP's motivation here
> :)
> > > Just
> > > > for clarification my reasoning is that:
> > > >
> > > > 1) today Streams itself never uses a windowed deserializer itself
> since
> > > its
> > > > built-in operators only need the serializer and users do not need to
> > > > override it, plus standby / restore active tasks would just copy the
> > > bytes
> > > > directly. So this KIP's motivation is not for Stream's own code
> > anyways.
> > > >
> > > > 2) It is only when user specified serde is missing the window size,
> > which
> > > > is either when a) one is trying to read a source topic as windowed
> > > records
> > > > in Streams, this is a big blocker for KIP-300, and when b) one is
> > trying
> > > to
> > > > read a topic as windowed records in Consumer, we would have issues if
> > > users
> > > > fail to use the appropriate serde constructs.
> > > >
> > > > I thought the main motivation of this KIP is for 2.a), in which we
> > would
> > > > just encourage the users to use the right constructor with the window
> > > size
> > > > by deprecating the other constructs. But I'm not sure how this would
> > help
> > > > with 2.b) since the proposal is on adding to StreamsConfig. If it is
> > the
> > > > case, then I agree that probably we can just not add an extra config
> > but
> > > > just deprecating the constructs.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
> > sophie@confluent.io
> > > >
> > > > wrote:
> > > >
> > > >> Hey Guozhang & Leah,
> > > >>
> > > >> I want to push back a bit on the assumption that we would fall back
> on
> > > this
> > > >> config
> > > >> in the case of an unspecified window size in a Streams serde. I
> don't
> > > think
> > > >> it should
> > > >> be a default at all, either in name or in effect. To borrow the
> > > rhetorical
> > > >> question that
> > > >> John raised earlier: what is the default window size of an
> > application?
> > > >>
> > > >> Personally, I agree that that doesn't make much sense. Sure, if you
> > only
> > > >> have a single
> > > >> windowed operation in your app then you could just specify the
> window
> > > size
> > > >> by config,
> > > >> but how is that any more ergonomic than specifying the window size
> in
> > > the
> > > >> Serde's
> > > >> constructor? If anything, it seems worse to put physical and mental
> > > >> distance between
> > > >> the specification and the actual usage of such parameters. What if
> you
> > > add
> > > >> another
> > > >> windowed operation later, with a different size, and forget to
> specify
> > > the
> > > >> new size in
> > > >> the new Serde? Or what if you never specify a default window size
> > > config at
> > > >> all and
> > > >> accidentally end up using the default config value of
> Long.MAX_VALUE?
> > > >> Avoiding this
> > > >> possibility was/is one of the main motivations of this KIP, and the
> > > whole
> > > >> point of
> > > >> deprecating the TimeWindowedSerde(innerClass) constructor.
> > > >>
> > > >> I actually would have advocated to remove this config entirely, but
> as
> > > John
> > > >> pointed
> > > >> out, we still need it to configure things like the console consumer
> > > >>
> > > >> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <lt...@confluent.io>
> > > wrote:
> > > >>
> > > >>> Hi Guozhang,
> > > >>>
> > > >>> Yes, the config would read them as a single window size. I think
> this
> > > >>> relates to John's comments about having variably sized windows,
> which
> > > >> this
> > > >>> config doesn't handle. I like the name change and updated the wiki
> to
> > > >>> reflect that, and to clarify that the default value will still be
> > > >>> Long.MAX_VALUE.
> > > >>>
> > > >>> Thanks for your feedback!
> > > >>> Leah
> > > >>>
> > > >>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <wa...@gmail.com>
> > > >> wrote:
> > > >>>
> > > >>>> Hello Leah,
> > > >>>>
> > > >>>> Thanks for initiating this. I just have one minor clarification
> > > >> question
> > > >>>> here: the config "window.size.ms" seems to be used as the default
> > > >> window
> > > >>>> size when reading from a topic that represents windowed records
> > right?
> > > >>> I.e.
> > > >>>> if there are multiple topics that represent windowed records but
> > their
> > > >>>> window sizes are different, with this config we can only read them
> > > >> with a
> > > >>>> single window size? If yes, could we rename the config as "
> > > >>>> default.window.size.ms" and make that clear in the description as
> > > >> well?
> > > >>>> Also we'd better also include its default value which I think
> would
> > > >> still
> > > >>>> be MAX_VALUE for compatibility.
> > > >>>>
> > > >>>>
> > > >>>> Guozhang
> > > >>>>
> > > >>>>
> > > >>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <lt...@confluent.io>
> > > >> wrote:
> > > >>>>
> > > >>>>> Hey all,
> > > >>>>>
> > > >>>>> We should be good to wrap up voting now that the discussion has
> > been
> > > >>>>> resolved.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Leah
> > > >>>>>
> > > >>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <mjsax@apache.org
> >
> > > >>> wrote:
> > > >>>>>
> > > >>>>>> +1 (binding)
> > > >>>>>>
> > > >>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> I've just sent a new message to the DISCUSS thread. We
> > > >>>>>>> forgot to include the Scala API in the proposal.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> -John
> > > >>>>>>>
> > > >>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> > > >>>>>>> wrote:
> > > >>>>>>>> Thanks for the KIP! +1 (non-binding)
> > > >>>>>>>>
> > > >>>>>>>> Sophie
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> > > >> vvcephei@apache.org
> > > >>>>
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Thanks Leah,
> > > >>>>>>>>> I’m +1 (binding)
> > > >>>>>>>>>
> > > >>>>>>>>> -John
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> > > >>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>
> > > >>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
> > > >>>>>>>>>> TimeWindowedDeserializer
> > > >>>>>>>>>> and TimeWindowedSerde to handle window size.
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Leah
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>> --
> > > >>>> -- Guozhang
> > > >>>>
> > > >>>
> > > >>
> > > >
> > > >
> > >
> > >
> >
>