You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by John Roesler <jo...@confluent.io> on 2018/06/20 17:45:16 UTC

[DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Hello All,

I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
Specifically, we're creating CachingWindowStore with the *number of
segments* instead of the *segment size*.

Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ

additionally, here's a draft PR for clarity:
https://github.com/apache/kafka/pull/5257

Please let me know what you think!

Thanks,
-John

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Hi Ted,

Ok, I also prefer it, and I find this reasoning compelling.

Thanks,
-John

On Thu, Jun 21, 2018 at 3:40 AM Ted Yu <yu...@gmail.com> wrote:

> Window is a common term used in various streaming processing systems whose
> unit is time unit.
>
> Segment doesn't seem to be as widely used in such context.
> I think using interval in the method name would clearly convey the meaning
> intuitively.
>
> Thanks
>
>
> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Hi Ted,
> >
> > Ah, when you made that comment to me before, I thought you meant as
> opposed
> > to "segments". Now it makes sense that you meant as opposed to
> > "segmentSize".
> >
> > I named it that way to match the peer method "windowSize", which is also
> a
> > quantity of milliseconds.
> >
> > I agree that "interval" is more intuitive, but I think I favor
> consistency
> > in this case. Does that seem reasonable?
> >
> > Thanks,
> > -John
> >
> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
> >
> > > Normally size is not measured in time unit, such as milliseconds.
> > > How about naming the new method segmentInterval ?
> > > Thanks
> > > -------- Original message --------From: John Roesler <
> john@confluent.io>
> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > WindowBytesStoreSupplier
> > > Hello All,
> > >
> > > I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
> > > Specifically, we're creating CachingWindowStore with the *number of
> > > segments* instead of the *segment size*.
> > >
> > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > >
> > > additionally, here's a draft PR for clarity:
> > > https://github.com/apache/kafka/pull/5257
> > >
> > > Please let me know what you think!
> > >
> > > Thanks,
> > > -John
> > >
> >
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Ted Yu <yu...@gmail.com>.
Window is a common term used in various streaming processing systems whose
unit is time unit.

Segment doesn't seem to be as widely used in such context.
I think using interval in the method name would clearly convey the meaning
intuitively.

Thanks


On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:

> Hi Ted,
>
> Ah, when you made that comment to me before, I thought you meant as opposed
> to "segments". Now it makes sense that you meant as opposed to
> "segmentSize".
>
> I named it that way to match the peer method "windowSize", which is also a
> quantity of milliseconds.
>
> I agree that "interval" is more intuitive, but I think I favor consistency
> in this case. Does that seem reasonable?
>
> Thanks,
> -John
>
> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
>
> > Normally size is not measured in time unit, such as milliseconds.
> > How about naming the new method segmentInterval ?
> > Thanks
> > -------- Original message --------From: John Roesler <jo...@confluent.io>
> > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > WindowBytesStoreSupplier
> > Hello All,
> >
> > I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
> > Specifically, we're creating CachingWindowStore with the *number of
> > segments* instead of the *segment size*.
> >
> > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >
> > additionally, here's a draft PR for clarity:
> > https://github.com/apache/kafka/pull/5257
> >
> > Please let me know what you think!
> >
> > Thanks,
> > -John
> >
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by "Matthias J. Sax" <ma...@confluent.io>.
NP. Thanks!

On 6/26/18 2:54 PM, John Roesler wrote:
> Sorry for the misunderstanding, Matthias.
> 
> I have created https://issues.apache.org/jira/browse/KAFKA-7106 and
> https://issues.apache.org/jira/browse/KAFKA-7107 to track these issues.
> 
> Thanks,
> -John
> 
> On Mon, Jun 25, 2018 at 10:06 PM Matthias J. Sax <ma...@confluent.io>
> wrote:
> 
>> KAFKA-7080 is for this KIP.
>>
>> I meant to create a JIRA to add `segmentInterval` to `Materialized` and
>> a JIRA to add `Materialized` to `KStream#join(KStream)`.
>>
>> Thx.
>>
>>
>> -Matthias
>>
>> On 6/25/18 2:46 PM, John Roesler wrote:
>>> Ah, it turns out I did create a ticket: it's KAFKA-7080:
>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>
>>> -John
>>>
>>> On Mon, Jun 25, 2018 at 4:44 PM John Roesler <jo...@confluent.io> wrote:
>>>
>>>> Matthias,
>>>>
>>>> That's a good idea. I'm not sure why I didn't...
>>>>
>>>> Thanks,
>>>> -John
>>>>
>>>> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <ma...@confluent.io>
>>>> wrote:
>>>>
>>>>> Ok.
>>>>>
>>>>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>>>>> related, but actually an own ticket (that is blocked by not having
>>>>> Materialized for stream-stream joins).
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
>>>>>> I agree that it makes sense to have segmentInterval as a parameter to
>> a
>>>>>> store, but I also agree with Guozhang's point about not moving as part
>>>>> of
>>>>>> this KIP.
>>>>>>
>>>>>> Thanks,
>>>>>> Bill
>>>>>>
>>>>>> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io>
>> wrote:
>>>>>>
>>>>>>> Thanks Matthias and Guozhang,
>>>>>>>
>>>>>>> About deprecating the "segments" field instead of making it private.
>>>>> Yes, I
>>>>>>> just took another look at the code, and that is correct. I'll update
>>>>> the
>>>>>>> KIP.
>>>>>>>
>>>>>>> I do agree that in the long run, it makes more sense as a parameter
>> to
>>>>> the
>>>>>>> store somehow than as a parameter to the window. I think this isn't a
>>>>> super
>>>>>>> high priority, though, because it's not exposed in the DSL (or it
>>>>> wasn't
>>>>>>> intended to be).
>>>>>>>
>>>>>>> I felt Guozhang's point is valid, and that we should probably revisit
>>>>> it
>>>>>>> later, possibly in the scope of
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-4730
>>>>>>>
>>>>>>> I'll wait an hour or so for more feedback before moving on to a vote.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> -John
>>>>>>>
>>>>>>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Re `segmentInterval` parameter in Windows: currently it is used in
>> two
>>>>>>>> places, the windowed stream aggregation, and the stream-stream
>> joins.
>>>>> For
>>>>>>>> the former, we can potentially move the parameter from windowedBy()
>> to
>>>>>>>> Materialized, but for the latter we currently do not expose a
>>>>>>> Materialized
>>>>>>>> object yet, only the Windows spec. So I think in this KIP we
>> probably
>>>>>>>> cannot move it immediately.
>>>>>>>>
>>>>>>>> But in future KIPs if we decide to expose the stream-stream join's
>>>>> store
>>>>>>> /
>>>>>>>> changelog / repartition topic names, we may well adding the
>>>>> Materialized
>>>>>>>> object into the operator, and we can then move the parameter to
>>>>>>>> Materialized by then.
>>>>>>>>
>>>>>>>>
>>>>>>>> Guozhang
>>>>>>>>
>>>>>>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>>>>> matthias@confluent.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
>>>>>>> API.
>>>>>>>>>
>>>>>>>>> Couple of comments:
>>>>>>>>>
>>>>>>>>>> Sadly there's no way to "deprecate" this
>>>>>>>>>> exposure
>>>>>>>>>
>>>>>>>>> I disagree. We can just mark the variable as deprecated and I
>>>>> advocate
>>>>>>>>> to do this. When the deprecation period passed, we can make it
>>>>> private
>>>>>>>>> (or actually remove it; cf. my next comment).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Parameter, `segmentInterval` is semantically not a "window"
>>>>>>>>> specification parameter but an implementation detail and thus a
>> store
>>>>>>>>> parameter. Would it be better to add it to `Materialized`?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
>>>>>>>>>> Thanks John.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for the feedback, Bill and Guozhang,
>>>>>>>>>>>
>>>>>>>>>>> I've updated the KIP accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <
>> wangguoz@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>>>>>>> the
>>>>>>>>>>> wiki:
>>>>>>>>>>>> the `In Windows, we will:` section code snippet is empty.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbejeck@gmail.com
>>>
>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the JavaDoc for the segmentInterval method, there's no
>> mention
>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>>> number of segments there can be at any one time.  While it's
>>>>>>> implied
>>>>>>>>>>> that
>>>>>>>>>>>>> the number of segments is potentially unbounded, would be
>> better
>>>>>>> to
>>>>>>>>>>>>> explicitly state that the previous limit on the number of
>>>>> segments
>>>>>>>> is
>>>>>>>>>>>> going
>>>>>>>>>>>>> to be removed as well?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a couple of nit comments.   The method name is still
>>>>>>>>> segmentSize
>>>>>>>>>>>> in
>>>>>>>>>>>>> the code block vs segmentInterval and the order of the
>> parameters
>>>>>>>> for
>>>>>>>>>>> the
>>>>>>>>>>>>> third persistentWindowStore don't match the order in the
>> JavaDoc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <
>> john@confluent.io>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've updated the KIP and draft PR accordingly.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <
>> john@confluent.io
>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Interesting... I did not initially consider it because I
>> didn't
>>>>>>>>>>> want
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
>>>>>>> unless
>>>>>>>>>>>>>>> developers have subclassed `Windows`, the number of segments
>>>>>>> would
>>>>>>>>>>>>> always
>>>>>>>>>>>>>>> be 3!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There's one caveat to this, which I think was a mistake. The
>>>>>>> field
>>>>>>>>>>>>>>> `segments` in Windows is public, which means that anyone can
>>>>>>>>>>> actually
>>>>>>>>>>>>> set
>>>>>>>>>>>>>>> it directly on any Window instance like:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
>>>>>>>>>>>>>>>         tw.segments = 12345;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
>>>>>>>> Windows
>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> says users can't directly set it. Sadly there's no way to
>>>>>>>>>>> "deprecate"
>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>> exposure, so I propose just to make it private.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With this new knowledge, I agree, I think we can switch to
>>>>>>>>>>>>>>> "segmentInterval" throughout the interface.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
>>>>>>> wangguoz@gmail.com
>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello John,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Should we consider making the change on
>>>>>>>>>>>> `Stores#persistentWindowStore`
>>>>>>>>>>>>>>>> parameters as well?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
>>>>>>> john@confluent.io
>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
>>>>>>> meant
>>>>>>>>>>>> as
>>>>>>>>>>>>>>>> opposed
>>>>>>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed
>>>>> to
>>>>>>>>>>>>>>>>> "segmentSize".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I named it that way to match the peer method "windowSize",
>>>>>>> which
>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> also a
>>>>>>>>>>>>>>>>> quantity of milliseconds.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I
>>>>> favor
>>>>>>>>>>>>>>>> consistency
>>>>>>>>>>>>>>>>> in this case. Does that seem reasonable?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <
>> yuzhihong@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Normally size is not measured in time unit, such as
>>>>>>>>>>>> milliseconds.
>>>>>>>>>>>>>>>>>> How about naming the new method segmentInterval ?
>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>> -------- Original message --------From: John Roesler <
>>>>>>>>>>>>>>>> john@confluent.io>
>>>>>>>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
>>>>>>> dev@kafka.apache.org
>>>>>>>>>>>>>>>> Subject:
>>>>>>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>>>>>>>>>>>>>>>>>> WindowBytesStoreSupplier
>>>>>>>>>>>>>>>>>> Hello All,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified
>> in
>>>>>>>>>>>>>>>> KAFKA-7080.
>>>>>>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>>>>>>>>>>> *number
>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> segments* instead of the *segment size*.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Here's the jira:
>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>>>>>>>>>>>>>>>> Here's the KIP:
>>>>> https://cwiki.apache.org/confluence/x/mQU0BQ
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> additionally, here's a draft PR for clarity:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please let me know what you think!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> -- Guozhang
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
> 


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Sorry for the misunderstanding, Matthias.

I have created https://issues.apache.org/jira/browse/KAFKA-7106 and
https://issues.apache.org/jira/browse/KAFKA-7107 to track these issues.

Thanks,
-John

On Mon, Jun 25, 2018 at 10:06 PM Matthias J. Sax <ma...@confluent.io>
wrote:

> KAFKA-7080 is for this KIP.
>
> I meant to create a JIRA to add `segmentInterval` to `Materialized` and
> a JIRA to add `Materialized` to `KStream#join(KStream)`.
>
> Thx.
>
>
> -Matthias
>
> On 6/25/18 2:46 PM, John Roesler wrote:
> > Ah, it turns out I did create a ticket: it's KAFKA-7080:
> > https://issues.apache.org/jira/browse/KAFKA-7080
> >
> > -John
> >
> > On Mon, Jun 25, 2018 at 4:44 PM John Roesler <jo...@confluent.io> wrote:
> >
> >> Matthias,
> >>
> >> That's a good idea. I'm not sure why I didn't...
> >>
> >> Thanks,
> >> -John
> >>
> >> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <ma...@confluent.io>
> >> wrote:
> >>
> >>> Ok.
> >>>
> >>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
> >>> related, but actually an own ticket (that is blocked by not having
> >>> Materialized for stream-stream joins).
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
> >>>> I agree that it makes sense to have segmentInterval as a parameter to
> a
> >>>> store, but I also agree with Guozhang's point about not moving as part
> >>> of
> >>>> this KIP.
> >>>>
> >>>> Thanks,
> >>>> Bill
> >>>>
> >>>> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io>
> wrote:
> >>>>
> >>>>> Thanks Matthias and Guozhang,
> >>>>>
> >>>>> About deprecating the "segments" field instead of making it private.
> >>> Yes, I
> >>>>> just took another look at the code, and that is correct. I'll update
> >>> the
> >>>>> KIP.
> >>>>>
> >>>>> I do agree that in the long run, it makes more sense as a parameter
> to
> >>> the
> >>>>> store somehow than as a parameter to the window. I think this isn't a
> >>> super
> >>>>> high priority, though, because it's not exposed in the DSL (or it
> >>> wasn't
> >>>>> intended to be).
> >>>>>
> >>>>> I felt Guozhang's point is valid, and that we should probably revisit
> >>> it
> >>>>> later, possibly in the scope of
> >>>>> https://issues.apache.org/jira/browse/KAFKA-4730
> >>>>>
> >>>>> I'll wait an hour or so for more feedback before moving on to a vote.
> >>>>>
> >>>>> Thanks again,
> >>>>> -John
> >>>>>
> >>>>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>>> Re `segmentInterval` parameter in Windows: currently it is used in
> two
> >>>>>> places, the windowed stream aggregation, and the stream-stream
> joins.
> >>> For
> >>>>>> the former, we can potentially move the parameter from windowedBy()
> to
> >>>>>> Materialized, but for the latter we currently do not expose a
> >>>>> Materialized
> >>>>>> object yet, only the Windows spec. So I think in this KIP we
> probably
> >>>>>> cannot move it immediately.
> >>>>>>
> >>>>>> But in future KIPs if we decide to expose the stream-stream join's
> >>> store
> >>>>> /
> >>>>>> changelog / repartition topic names, we may well adding the
> >>> Materialized
> >>>>>> object into the operator, and we can then move the parameter to
> >>>>>> Materialized by then.
> >>>>>>
> >>>>>>
> >>>>>> Guozhang
> >>>>>>
> >>>>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
> >>> matthias@confluent.io>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
> >>>>> API.
> >>>>>>>
> >>>>>>> Couple of comments:
> >>>>>>>
> >>>>>>>> Sadly there's no way to "deprecate" this
> >>>>>>>> exposure
> >>>>>>>
> >>>>>>> I disagree. We can just mark the variable as deprecated and I
> >>> advocate
> >>>>>>> to do this. When the deprecation period passed, we can make it
> >>> private
> >>>>>>> (or actually remove it; cf. my next comment).
> >>>>>>>
> >>>>>>>
> >>>>>>> Parameter, `segmentInterval` is semantically not a "window"
> >>>>>>> specification parameter but an implementation detail and thus a
> store
> >>>>>>> parameter. Would it be better to add it to `Materialized`?
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
> >>>>>>>> Thanks John.
> >>>>>>>>
> >>>>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the feedback, Bill and Guozhang,
> >>>>>>>>>
> >>>>>>>>> I've updated the KIP accordingly.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <
> wangguoz@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> >>>>> the
> >>>>>>>>> wiki:
> >>>>>>>>>> the `In Windows, we will:` section code snippet is empty.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbejeck@gmail.com
> >
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi John,
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
> >>>>>>>>>>>
> >>>>>>>>>>> In the JavaDoc for the segmentInterval method, there's no
> mention
> >>>>> of
> >>>>>>>>> the
> >>>>>>>>>>> number of segments there can be at any one time.  While it's
> >>>>> implied
> >>>>>>>>> that
> >>>>>>>>>>> the number of segments is potentially unbounded, would be
> better
> >>>>> to
> >>>>>>>>>>> explicitly state that the previous limit on the number of
> >>> segments
> >>>>>> is
> >>>>>>>>>> going
> >>>>>>>>>>> to be removed as well?
> >>>>>>>>>>>
> >>>>>>>>>>> I have a couple of nit comments.   The method name is still
> >>>>>>> segmentSize
> >>>>>>>>>> in
> >>>>>>>>>>> the code block vs segmentInterval and the order of the
> parameters
> >>>>>> for
> >>>>>>>>> the
> >>>>>>>>>>> third persistentWindowStore don't match the order in the
> JavaDoc.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Bill
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <
> john@confluent.io>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I've updated the KIP and draft PR accordingly.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <
> john@confluent.io
> >>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Interesting... I did not initially consider it because I
> didn't
> >>>>>>>>> want
> >>>>>>>>>> to
> >>>>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
> >>>>> unless
> >>>>>>>>>>>>> developers have subclassed `Windows`, the number of segments
> >>>>> would
> >>>>>>>>>>> always
> >>>>>>>>>>>>> be 3!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> There's one caveat to this, which I think was a mistake. The
> >>>>> field
> >>>>>>>>>>>>> `segments` in Windows is public, which means that anyone can
> >>>>>>>>> actually
> >>>>>>>>>>> set
> >>>>>>>>>>>>> it directly on any Window instance like:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
> >>>>>>>>>>>>>         tw.segments = 12345;
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
> >>>>>> Windows
> >>>>>>>>>>> that
> >>>>>>>>>>>>> says users can't directly set it. Sadly there's no way to
> >>>>>>>>> "deprecate"
> >>>>>>>>>>>> this
> >>>>>>>>>>>>> exposure, so I propose just to make it private.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> With this new knowledge, I agree, I think we can switch to
> >>>>>>>>>>>>> "segmentInterval" throughout the interface.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
> >>>>> wangguoz@gmail.com
> >>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello John,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for the KIP.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Should we consider making the change on
> >>>>>>>>>> `Stores#persistentWindowStore`
> >>>>>>>>>>>>>> parameters as well?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
> >>>>> john@confluent.io
> >>>>>>>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Ted,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
> >>>>> meant
> >>>>>>>>>> as
> >>>>>>>>>>>>>> opposed
> >>>>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed
> >>> to
> >>>>>>>>>>>>>>> "segmentSize".
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I named it that way to match the peer method "windowSize",
> >>>>> which
> >>>>>>>>>> is
> >>>>>>>>>>>>>> also a
> >>>>>>>>>>>>>>> quantity of milliseconds.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I
> >>> favor
> >>>>>>>>>>>>>> consistency
> >>>>>>>>>>>>>>> in this case. Does that seem reasonable?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <
> yuzhihong@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Normally size is not measured in time unit, such as
> >>>>>>>>>> milliseconds.
> >>>>>>>>>>>>>>>> How about naming the new method segmentInterval ?
> >>>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>>> -------- Original message --------From: John Roesler <
> >>>>>>>>>>>>>> john@confluent.io>
> >>>>>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
> >>>>> dev@kafka.apache.org
> >>>>>>>>>>>>>> Subject:
> >>>>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> >>>>>>>>>>>>>>>> WindowBytesStoreSupplier
> >>>>>>>>>>>>>>>> Hello All,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified
> in
> >>>>>>>>>>>>>> KAFKA-7080.
> >>>>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
> >>>>>>>>> *number
> >>>>>>>>>>> of
> >>>>>>>>>>>>>>>> segments* instead of the *segment size*.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Here's the jira:
> >>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
> >>>>>>>>>>>>>>>> Here's the KIP:
> >>> https://cwiki.apache.org/confluence/x/mQU0BQ
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> additionally, here's a draft PR for clarity:
> >>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Please let me know what you think!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> -- Guozhang
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> -- Guozhang
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >
>
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by "Matthias J. Sax" <ma...@confluent.io>.
KAFKA-7080 is for this KIP.

I meant to create a JIRA to add `segmentInterval` to `Materialized` and
a JIRA to add `Materialized` to `KStream#join(KStream)`.

Thx.


-Matthias

On 6/25/18 2:46 PM, John Roesler wrote:
> Ah, it turns out I did create a ticket: it's KAFKA-7080:
> https://issues.apache.org/jira/browse/KAFKA-7080
> 
> -John
> 
> On Mon, Jun 25, 2018 at 4:44 PM John Roesler <jo...@confluent.io> wrote:
> 
>> Matthias,
>>
>> That's a good idea. I'm not sure why I didn't...
>>
>> Thanks,
>> -John
>>
>> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <ma...@confluent.io>
>> wrote:
>>
>>> Ok.
>>>
>>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>>> related, but actually an own ticket (that is blocked by not having
>>> Materialized for stream-stream joins).
>>>
>>>
>>> -Matthias
>>>
>>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
>>>> I agree that it makes sense to have segmentInterval as a parameter to a
>>>> store, but I also agree with Guozhang's point about not moving as part
>>> of
>>>> this KIP.
>>>>
>>>> Thanks,
>>>> Bill
>>>>
>>>> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io> wrote:
>>>>
>>>>> Thanks Matthias and Guozhang,
>>>>>
>>>>> About deprecating the "segments" field instead of making it private.
>>> Yes, I
>>>>> just took another look at the code, and that is correct. I'll update
>>> the
>>>>> KIP.
>>>>>
>>>>> I do agree that in the long run, it makes more sense as a parameter to
>>> the
>>>>> store somehow than as a parameter to the window. I think this isn't a
>>> super
>>>>> high priority, though, because it's not exposed in the DSL (or it
>>> wasn't
>>>>> intended to be).
>>>>>
>>>>> I felt Guozhang's point is valid, and that we should probably revisit
>>> it
>>>>> later, possibly in the scope of
>>>>> https://issues.apache.org/jira/browse/KAFKA-4730
>>>>>
>>>>> I'll wait an hour or so for more feedback before moving on to a vote.
>>>>>
>>>>> Thanks again,
>>>>> -John
>>>>>
>>>>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com>
>>> wrote:
>>>>>
>>>>>> Re `segmentInterval` parameter in Windows: currently it is used in two
>>>>>> places, the windowed stream aggregation, and the stream-stream joins.
>>> For
>>>>>> the former, we can potentially move the parameter from windowedBy() to
>>>>>> Materialized, but for the latter we currently do not expose a
>>>>> Materialized
>>>>>> object yet, only the Windows spec. So I think in this KIP we probably
>>>>>> cannot move it immediately.
>>>>>>
>>>>>> But in future KIPs if we decide to expose the stream-stream join's
>>> store
>>>>> /
>>>>>> changelog / repartition topic names, we may well adding the
>>> Materialized
>>>>>> object into the operator, and we can then move the parameter to
>>>>>> Materialized by then.
>>>>>>
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>>> matthias@confluent.io>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
>>>>> API.
>>>>>>>
>>>>>>> Couple of comments:
>>>>>>>
>>>>>>>> Sadly there's no way to "deprecate" this
>>>>>>>> exposure
>>>>>>>
>>>>>>> I disagree. We can just mark the variable as deprecated and I
>>> advocate
>>>>>>> to do this. When the deprecation period passed, we can make it
>>> private
>>>>>>> (or actually remove it; cf. my next comment).
>>>>>>>
>>>>>>>
>>>>>>> Parameter, `segmentInterval` is semantically not a "window"
>>>>>>> specification parameter but an implementation detail and thus a store
>>>>>>> parameter. Would it be better to add it to `Materialized`?
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
>>>>>>>> Thanks John.
>>>>>>>>
>>>>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the feedback, Bill and Guozhang,
>>>>>>>>>
>>>>>>>>> I've updated the KIP accordingly.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>>>>> the
>>>>>>>>> wiki:
>>>>>>>>>> the `In Windows, we will:` section code snippet is empty.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi John,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
>>>>>>>>>>>
>>>>>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention
>>>>> of
>>>>>>>>> the
>>>>>>>>>>> number of segments there can be at any one time.  While it's
>>>>> implied
>>>>>>>>> that
>>>>>>>>>>> the number of segments is potentially unbounded, would be better
>>>>> to
>>>>>>>>>>> explicitly state that the previous limit on the number of
>>> segments
>>>>>> is
>>>>>>>>>> going
>>>>>>>>>>> to be removed as well?
>>>>>>>>>>>
>>>>>>>>>>> I have a couple of nit comments.   The method name is still
>>>>>>> segmentSize
>>>>>>>>>> in
>>>>>>>>>>> the code block vs segmentInterval and the order of the parameters
>>>>>> for
>>>>>>>>> the
>>>>>>>>>>> third persistentWindowStore don't match the order in the JavaDoc.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bill
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've updated the KIP and draft PR accordingly.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <john@confluent.io
>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Interesting... I did not initially consider it because I didn't
>>>>>>>>> want
>>>>>>>>>> to
>>>>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
>>>>> unless
>>>>>>>>>>>>> developers have subclassed `Windows`, the number of segments
>>>>> would
>>>>>>>>>>> always
>>>>>>>>>>>>> be 3!
>>>>>>>>>>>>>
>>>>>>>>>>>>> There's one caveat to this, which I think was a mistake. The
>>>>> field
>>>>>>>>>>>>> `segments` in Windows is public, which means that anyone can
>>>>>>>>> actually
>>>>>>>>>>> set
>>>>>>>>>>>>> it directly on any Window instance like:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
>>>>>>>>>>>>>         tw.segments = 12345;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
>>>>>> Windows
>>>>>>>>>>> that
>>>>>>>>>>>>> says users can't directly set it. Sadly there's no way to
>>>>>>>>> "deprecate"
>>>>>>>>>>>> this
>>>>>>>>>>>>> exposure, so I propose just to make it private.
>>>>>>>>>>>>>
>>>>>>>>>>>>> With this new knowledge, I agree, I think we can switch to
>>>>>>>>>>>>> "segmentInterval" throughout the interface.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
>>>>> wangguoz@gmail.com
>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello John,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we consider making the change on
>>>>>>>>>> `Stores#persistentWindowStore`
>>>>>>>>>>>>>> parameters as well?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
>>>>> john@confluent.io
>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
>>>>> meant
>>>>>>>>>> as
>>>>>>>>>>>>>> opposed
>>>>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed
>>> to
>>>>>>>>>>>>>>> "segmentSize".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I named it that way to match the peer method "windowSize",
>>>>> which
>>>>>>>>>> is
>>>>>>>>>>>>>> also a
>>>>>>>>>>>>>>> quantity of milliseconds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I
>>> favor
>>>>>>>>>>>>>> consistency
>>>>>>>>>>>>>>> in this case. Does that seem reasonable?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Normally size is not measured in time unit, such as
>>>>>>>>>> milliseconds.
>>>>>>>>>>>>>>>> How about naming the new method segmentInterval ?
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> -------- Original message --------From: John Roesler <
>>>>>>>>>>>>>> john@confluent.io>
>>>>>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
>>>>> dev@kafka.apache.org
>>>>>>>>>>>>>> Subject:
>>>>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>>>>>>>>>>>>>>>> WindowBytesStoreSupplier
>>>>>>>>>>>>>>>> Hello All,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
>>>>>>>>>>>>>> KAFKA-7080.
>>>>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>>>>>>>>> *number
>>>>>>>>>>> of
>>>>>>>>>>>>>>>> segments* instead of the *segment size*.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here's the jira:
>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>>>>>>>>>>>>>> Here's the KIP:
>>> https://cwiki.apache.org/confluence/x/mQU0BQ
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> additionally, here's a draft PR for clarity:
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know what you think!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -- Guozhang
>>>>>>
>>>>>
>>>>
>>>
>>>
> 


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Ah, it turns out I did create a ticket: it's KAFKA-7080:
https://issues.apache.org/jira/browse/KAFKA-7080

-John

On Mon, Jun 25, 2018 at 4:44 PM John Roesler <jo...@confluent.io> wrote:

> Matthias,
>
> That's a good idea. I'm not sure why I didn't...
>
> Thanks,
> -John
>
> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <ma...@confluent.io>
> wrote:
>
>> Ok.
>>
>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>> related, but actually an own ticket (that is blocked by not having
>> Materialized for stream-stream joins).
>>
>>
>> -Matthias
>>
>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
>> > I agree that it makes sense to have segmentInterval as a parameter to a
>> > store, but I also agree with Guozhang's point about not moving as part
>> of
>> > this KIP.
>> >
>> > Thanks,
>> > Bill
>> >
>> > On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io> wrote:
>> >
>> >> Thanks Matthias and Guozhang,
>> >>
>> >> About deprecating the "segments" field instead of making it private.
>> Yes, I
>> >> just took another look at the code, and that is correct. I'll update
>> the
>> >> KIP.
>> >>
>> >> I do agree that in the long run, it makes more sense as a parameter to
>> the
>> >> store somehow than as a parameter to the window. I think this isn't a
>> super
>> >> high priority, though, because it's not exposed in the DSL (or it
>> wasn't
>> >> intended to be).
>> >>
>> >> I felt Guozhang's point is valid, and that we should probably revisit
>> it
>> >> later, possibly in the scope of
>> >> https://issues.apache.org/jira/browse/KAFKA-4730
>> >>
>> >> I'll wait an hour or so for more feedback before moving on to a vote.
>> >>
>> >> Thanks again,
>> >> -John
>> >>
>> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com>
>> wrote:
>> >>
>> >>> Re `segmentInterval` parameter in Windows: currently it is used in two
>> >>> places, the windowed stream aggregation, and the stream-stream joins.
>> For
>> >>> the former, we can potentially move the parameter from windowedBy() to
>> >>> Materialized, but for the latter we currently do not expose a
>> >> Materialized
>> >>> object yet, only the Windows spec. So I think in this KIP we probably
>> >>> cannot move it immediately.
>> >>>
>> >>> But in future KIPs if we decide to expose the stream-stream join's
>> store
>> >> /
>> >>> changelog / repartition topic names, we may well adding the
>> Materialized
>> >>> object into the operator, and we can then move the parameter to
>> >>> Materialized by then.
>> >>>
>> >>>
>> >>> Guozhang
>> >>>
>> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>> matthias@confluent.io>
>> >>> wrote:
>> >>>
>> >>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
>> >> API.
>> >>>>
>> >>>> Couple of comments:
>> >>>>
>> >>>>> Sadly there's no way to "deprecate" this
>> >>>>> exposure
>> >>>>
>> >>>> I disagree. We can just mark the variable as deprecated and I
>> advocate
>> >>>> to do this. When the deprecation period passed, we can make it
>> private
>> >>>> (or actually remove it; cf. my next comment).
>> >>>>
>> >>>>
>> >>>> Parameter, `segmentInterval` is semantically not a "window"
>> >>>> specification parameter but an implementation detail and thus a store
>> >>>> parameter. Would it be better to add it to `Materialized`?
>> >>>>
>> >>>>
>> >>>> -Matthias
>> >>>>
>> >>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
>> >>>>> Thanks John.
>> >>>>>
>> >>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
>> >>> wrote:
>> >>>>>
>> >>>>>> Thanks for the feedback, Bill and Guozhang,
>> >>>>>>
>> >>>>>> I've updated the KIP accordingly.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> -John
>> >>>>>>
>> >>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
>> >>>> wrote:
>> >>>>>>
>> >>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>> >> the
>> >>>>>> wiki:
>> >>>>>>> the `In Windows, we will:` section code snippet is empty.
>> >>>>>>>
>> >>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
>> >>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi John,
>> >>>>>>>>
>> >>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
>> >>>>>>>>
>> >>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention
>> >> of
>> >>>>>> the
>> >>>>>>>> number of segments there can be at any one time.  While it's
>> >> implied
>> >>>>>> that
>> >>>>>>>> the number of segments is potentially unbounded, would be better
>> >> to
>> >>>>>>>> explicitly state that the previous limit on the number of
>> segments
>> >>> is
>> >>>>>>> going
>> >>>>>>>> to be removed as well?
>> >>>>>>>>
>> >>>>>>>> I have a couple of nit comments.   The method name is still
>> >>>> segmentSize
>> >>>>>>> in
>> >>>>>>>> the code block vs segmentInterval and the order of the parameters
>> >>> for
>> >>>>>> the
>> >>>>>>>> third persistentWindowStore don't match the order in the JavaDoc.
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Bill
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
>> >>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> I've updated the KIP and draft PR accordingly.
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <john@confluent.io
>> >
>> >>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Interesting... I did not initially consider it because I didn't
>> >>>>>> want
>> >>>>>>> to
>> >>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
>> >> unless
>> >>>>>>>>>> developers have subclassed `Windows`, the number of segments
>> >> would
>> >>>>>>>> always
>> >>>>>>>>>> be 3!
>> >>>>>>>>>>
>> >>>>>>>>>> There's one caveat to this, which I think was a mistake. The
>> >> field
>> >>>>>>>>>> `segments` in Windows is public, which means that anyone can
>> >>>>>> actually
>> >>>>>>>> set
>> >>>>>>>>>> it directly on any Window instance like:
>> >>>>>>>>>>
>> >>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
>> >>>>>>>>>>         tw.segments = 12345;
>> >>>>>>>>>>
>> >>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
>> >>> Windows
>> >>>>>>>> that
>> >>>>>>>>>> says users can't directly set it. Sadly there's no way to
>> >>>>>> "deprecate"
>> >>>>>>>>> this
>> >>>>>>>>>> exposure, so I propose just to make it private.
>> >>>>>>>>>>
>> >>>>>>>>>> With this new knowledge, I agree, I think we can switch to
>> >>>>>>>>>> "segmentInterval" throughout the interface.
>> >>>>>>>>>>
>> >>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
>> >> wangguoz@gmail.com
>> >>>>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hello John,
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks for the KIP.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Should we consider making the change on
>> >>>>>>> `Stores#persistentWindowStore`
>> >>>>>>>>>>> parameters as well?
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> Guozhang
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
>> >> john@confluent.io
>> >>>>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi Ted,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
>> >> meant
>> >>>>>>> as
>> >>>>>>>>>>> opposed
>> >>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed
>> to
>> >>>>>>>>>>>> "segmentSize".
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I named it that way to match the peer method "windowSize",
>> >> which
>> >>>>>>> is
>> >>>>>>>>>>> also a
>> >>>>>>>>>>>> quantity of milliseconds.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I
>> favor
>> >>>>>>>>>>> consistency
>> >>>>>>>>>>>> in this case. Does that seem reasonable?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>> -John
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Normally size is not measured in time unit, such as
>> >>>>>>> milliseconds.
>> >>>>>>>>>>>>> How about naming the new method segmentInterval ?
>> >>>>>>>>>>>>> Thanks
>> >>>>>>>>>>>>> -------- Original message --------From: John Roesler <
>> >>>>>>>>>>> john@confluent.io>
>> >>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
>> >> dev@kafka.apache.org
>> >>>>>>>>>>> Subject:
>> >>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>> >>>>>>>>>>>>> WindowBytesStoreSupplier
>> >>>>>>>>>>>>> Hello All,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
>> >>>>>>>>>>> KAFKA-7080.
>> >>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>> >>>>>> *number
>> >>>>>>>> of
>> >>>>>>>>>>>>> segments* instead of the *segment size*.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Here's the jira:
>> >>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
>> >>>>>>>>>>>>> Here's the KIP:
>> https://cwiki.apache.org/confluence/x/mQU0BQ
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> additionally, here's a draft PR for clarity:
>> >>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Please let me know what you think!
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>> -John
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>> -- Guozhang
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> -- Guozhang
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> -- Guozhang
>> >>>
>> >>
>> >
>>
>>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Matthias,

That's a good idea. I'm not sure why I didn't...

Thanks,
-John

On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <ma...@confluent.io>
wrote:

> Ok.
>
> @John: can you create a JIRA to track this? I think KAFKA-4730 is
> related, but actually an own ticket (that is blocked by not having
> Materialized for stream-stream joins).
>
>
> -Matthias
>
> On 6/25/18 2:10 PM, Bill Bejeck wrote:
> > I agree that it makes sense to have segmentInterval as a parameter to a
> > store, but I also agree with Guozhang's point about not moving as part of
> > this KIP.
> >
> > Thanks,
> > Bill
> >
> > On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io> wrote:
> >
> >> Thanks Matthias and Guozhang,
> >>
> >> About deprecating the "segments" field instead of making it private.
> Yes, I
> >> just took another look at the code, and that is correct. I'll update the
> >> KIP.
> >>
> >> I do agree that in the long run, it makes more sense as a parameter to
> the
> >> store somehow than as a parameter to the window. I think this isn't a
> super
> >> high priority, though, because it's not exposed in the DSL (or it wasn't
> >> intended to be).
> >>
> >> I felt Guozhang's point is valid, and that we should probably revisit it
> >> later, possibly in the scope of
> >> https://issues.apache.org/jira/browse/KAFKA-4730
> >>
> >> I'll wait an hour or so for more feedback before moving on to a vote.
> >>
> >> Thanks again,
> >> -John
> >>
> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> >>
> >>> Re `segmentInterval` parameter in Windows: currently it is used in two
> >>> places, the windowed stream aggregation, and the stream-stream joins.
> For
> >>> the former, we can potentially move the parameter from windowedBy() to
> >>> Materialized, but for the latter we currently do not expose a
> >> Materialized
> >>> object yet, only the Windows spec. So I think in this KIP we probably
> >>> cannot move it immediately.
> >>>
> >>> But in future KIPs if we decide to expose the stream-stream join's
> store
> >> /
> >>> changelog / repartition topic names, we may well adding the
> Materialized
> >>> object into the operator, and we can then move the parameter to
> >>> Materialized by then.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
> matthias@confluent.io>
> >>> wrote:
> >>>
> >>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
> >> API.
> >>>>
> >>>> Couple of comments:
> >>>>
> >>>>> Sadly there's no way to "deprecate" this
> >>>>> exposure
> >>>>
> >>>> I disagree. We can just mark the variable as deprecated and I advocate
> >>>> to do this. When the deprecation period passed, we can make it private
> >>>> (or actually remove it; cf. my next comment).
> >>>>
> >>>>
> >>>> Parameter, `segmentInterval` is semantically not a "window"
> >>>> specification parameter but an implementation detail and thus a store
> >>>> parameter. Would it be better to add it to `Materialized`?
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
> >>>>> Thanks John.
> >>>>>
> >>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
> >>> wrote:
> >>>>>
> >>>>>> Thanks for the feedback, Bill and Guozhang,
> >>>>>>
> >>>>>> I've updated the KIP accordingly.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> -John
> >>>>>>
> >>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
> >>>> wrote:
> >>>>>>
> >>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> >> the
> >>>>>> wiki:
> >>>>>>> the `In Windows, we will:` section code snippet is empty.
> >>>>>>>
> >>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> Hi John,
> >>>>>>>>
> >>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
> >>>>>>>>
> >>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention
> >> of
> >>>>>> the
> >>>>>>>> number of segments there can be at any one time.  While it's
> >> implied
> >>>>>> that
> >>>>>>>> the number of segments is potentially unbounded, would be better
> >> to
> >>>>>>>> explicitly state that the previous limit on the number of segments
> >>> is
> >>>>>>> going
> >>>>>>>> to be removed as well?
> >>>>>>>>
> >>>>>>>> I have a couple of nit comments.   The method name is still
> >>>> segmentSize
> >>>>>>> in
> >>>>>>>> the code block vs segmentInterval and the order of the parameters
> >>> for
> >>>>>> the
> >>>>>>>> third persistentWindowStore don't match the order in the JavaDoc.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Bill
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> I've updated the KIP and draft PR accordingly.
> >>>>>>>>>
> >>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Interesting... I did not initially consider it because I didn't
> >>>>>> want
> >>>>>>> to
> >>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
> >> unless
> >>>>>>>>>> developers have subclassed `Windows`, the number of segments
> >> would
> >>>>>>>> always
> >>>>>>>>>> be 3!
> >>>>>>>>>>
> >>>>>>>>>> There's one caveat to this, which I think was a mistake. The
> >> field
> >>>>>>>>>> `segments` in Windows is public, which means that anyone can
> >>>>>> actually
> >>>>>>>> set
> >>>>>>>>>> it directly on any Window instance like:
> >>>>>>>>>>
> >>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
> >>>>>>>>>>         tw.segments = 12345;
> >>>>>>>>>>
> >>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
> >>> Windows
> >>>>>>>> that
> >>>>>>>>>> says users can't directly set it. Sadly there's no way to
> >>>>>> "deprecate"
> >>>>>>>>> this
> >>>>>>>>>> exposure, so I propose just to make it private.
> >>>>>>>>>>
> >>>>>>>>>> With this new knowledge, I agree, I think we can switch to
> >>>>>>>>>> "segmentInterval" throughout the interface.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
> >> wangguoz@gmail.com
> >>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hello John,
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the KIP.
> >>>>>>>>>>>
> >>>>>>>>>>> Should we consider making the change on
> >>>>>>> `Stores#persistentWindowStore`
> >>>>>>>>>>> parameters as well?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Guozhang
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
> >> john@confluent.io
> >>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Ted,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
> >> meant
> >>>>>>> as
> >>>>>>>>>>> opposed
> >>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed to
> >>>>>>>>>>>> "segmentSize".
> >>>>>>>>>>>>
> >>>>>>>>>>>> I named it that way to match the peer method "windowSize",
> >> which
> >>>>>>> is
> >>>>>>>>>>> also a
> >>>>>>>>>>>> quantity of milliseconds.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I favor
> >>>>>>>>>>> consistency
> >>>>>>>>>>>> in this case. Does that seem reasonable?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> -John
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Normally size is not measured in time unit, such as
> >>>>>>> milliseconds.
> >>>>>>>>>>>>> How about naming the new method segmentInterval ?
> >>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>> -------- Original message --------From: John Roesler <
> >>>>>>>>>>> john@confluent.io>
> >>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
> >> dev@kafka.apache.org
> >>>>>>>>>>> Subject:
> >>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> >>>>>>>>>>>>> WindowBytesStoreSupplier
> >>>>>>>>>>>>> Hello All,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
> >>>>>>>>>>> KAFKA-7080.
> >>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
> >>>>>> *number
> >>>>>>>> of
> >>>>>>>>>>>>> segments* instead of the *segment size*.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Here's the jira:
> >>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
> >>>>>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> additionally, here's a draft PR for clarity:
> >>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Please let me know what you think!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> -John
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> -- Guozhang
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >
>
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Ok.

@John: can you create a JIRA to track this? I think KAFKA-4730 is
related, but actually an own ticket (that is blocked by not having
Materialized for stream-stream joins).


-Matthias

On 6/25/18 2:10 PM, Bill Bejeck wrote:
> I agree that it makes sense to have segmentInterval as a parameter to a
> store, but I also agree with Guozhang's point about not moving as part of
> this KIP.
> 
> Thanks,
> Bill
> 
> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io> wrote:
> 
>> Thanks Matthias and Guozhang,
>>
>> About deprecating the "segments" field instead of making it private. Yes, I
>> just took another look at the code, and that is correct. I'll update the
>> KIP.
>>
>> I do agree that in the long run, it makes more sense as a parameter to the
>> store somehow than as a parameter to the window. I think this isn't a super
>> high priority, though, because it's not exposed in the DSL (or it wasn't
>> intended to be).
>>
>> I felt Guozhang's point is valid, and that we should probably revisit it
>> later, possibly in the scope of
>> https://issues.apache.org/jira/browse/KAFKA-4730
>>
>> I'll wait an hour or so for more feedback before moving on to a vote.
>>
>> Thanks again,
>> -John
>>
>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com> wrote:
>>
>>> Re `segmentInterval` parameter in Windows: currently it is used in two
>>> places, the windowed stream aggregation, and the stream-stream joins. For
>>> the former, we can potentially move the parameter from windowedBy() to
>>> Materialized, but for the latter we currently do not expose a
>> Materialized
>>> object yet, only the Windows spec. So I think in this KIP we probably
>>> cannot move it immediately.
>>>
>>> But in future KIPs if we decide to expose the stream-stream join's store
>> /
>>> changelog / repartition topic names, we may well adding the Materialized
>>> object into the operator, and we can then move the parameter to
>>> Materialized by then.
>>>
>>>
>>> Guozhang
>>>
>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <ma...@confluent.io>
>>> wrote:
>>>
>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
>> API.
>>>>
>>>> Couple of comments:
>>>>
>>>>> Sadly there's no way to "deprecate" this
>>>>> exposure
>>>>
>>>> I disagree. We can just mark the variable as deprecated and I advocate
>>>> to do this. When the deprecation period passed, we can make it private
>>>> (or actually remove it; cf. my next comment).
>>>>
>>>>
>>>> Parameter, `segmentInterval` is semantically not a "window"
>>>> specification parameter but an implementation detail and thus a store
>>>> parameter. Would it be better to add it to `Materialized`?
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
>>>>> Thanks John.
>>>>>
>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
>>> wrote:
>>>>>
>>>>>> Thanks for the feedback, Bill and Guozhang,
>>>>>>
>>>>>> I've updated the KIP accordingly.
>>>>>>
>>>>>> Thanks,
>>>>>> -John
>>>>>>
>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
>>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>> the
>>>>>> wiki:
>>>>>>> the `In Windows, we will:` section code snippet is empty.
>>>>>>>
>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
>>>> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
>>>>>>>>
>>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention
>> of
>>>>>> the
>>>>>>>> number of segments there can be at any one time.  While it's
>> implied
>>>>>> that
>>>>>>>> the number of segments is potentially unbounded, would be better
>> to
>>>>>>>> explicitly state that the previous limit on the number of segments
>>> is
>>>>>>> going
>>>>>>>> to be removed as well?
>>>>>>>>
>>>>>>>> I have a couple of nit comments.   The method name is still
>>>> segmentSize
>>>>>>> in
>>>>>>>> the code block vs segmentInterval and the order of the parameters
>>> for
>>>>>> the
>>>>>>>> third persistentWindowStore don't match the order in the JavaDoc.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bill
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I've updated the KIP and draft PR accordingly.
>>>>>>>>>
>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Interesting... I did not initially consider it because I didn't
>>>>>> want
>>>>>>> to
>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
>> unless
>>>>>>>>>> developers have subclassed `Windows`, the number of segments
>> would
>>>>>>>> always
>>>>>>>>>> be 3!
>>>>>>>>>>
>>>>>>>>>> There's one caveat to this, which I think was a mistake. The
>> field
>>>>>>>>>> `segments` in Windows is public, which means that anyone can
>>>>>> actually
>>>>>>>> set
>>>>>>>>>> it directly on any Window instance like:
>>>>>>>>>>
>>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
>>>>>>>>>>         tw.segments = 12345;
>>>>>>>>>>
>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
>>> Windows
>>>>>>>> that
>>>>>>>>>> says users can't directly set it. Sadly there's no way to
>>>>>> "deprecate"
>>>>>>>>> this
>>>>>>>>>> exposure, so I propose just to make it private.
>>>>>>>>>>
>>>>>>>>>> With this new knowledge, I agree, I think we can switch to
>>>>>>>>>> "segmentInterval" throughout the interface.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
>> wangguoz@gmail.com
>>>>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello John,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>
>>>>>>>>>>> Should we consider making the change on
>>>>>>> `Stores#persistentWindowStore`
>>>>>>>>>>> parameters as well?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Guozhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
>> john@confluent.io
>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>>
>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
>> meant
>>>>>>> as
>>>>>>>>>>> opposed
>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed to
>>>>>>>>>>>> "segmentSize".
>>>>>>>>>>>>
>>>>>>>>>>>> I named it that way to match the peer method "windowSize",
>> which
>>>>>>> is
>>>>>>>>>>> also a
>>>>>>>>>>>> quantity of milliseconds.
>>>>>>>>>>>>
>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I favor
>>>>>>>>>>> consistency
>>>>>>>>>>>> in this case. Does that seem reasonable?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Normally size is not measured in time unit, such as
>>>>>>> milliseconds.
>>>>>>>>>>>>> How about naming the new method segmentInterval ?
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> -------- Original message --------From: John Roesler <
>>>>>>>>>>> john@confluent.io>
>>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
>> dev@kafka.apache.org
>>>>>>>>>>> Subject:
>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>>>>>>>>>>>>> WindowBytesStoreSupplier
>>>>>>>>>>>>> Hello All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
>>>>>>>>>>> KAFKA-7080.
>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>>>>>> *number
>>>>>>>> of
>>>>>>>>>>>>> segments* instead of the *segment size*.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the jira:
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>>>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>>>>>>>>>>>>>
>>>>>>>>>>>>> additionally, here's a draft PR for clarity:
>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know what you think!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -- Guozhang
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Bill Bejeck <bb...@gmail.com>.
I agree that it makes sense to have segmentInterval as a parameter to a
store, but I also agree with Guozhang's point about not moving as part of
this KIP.

Thanks,
Bill

On Mon, Jun 25, 2018 at 4:17 PM John Roesler <jo...@confluent.io> wrote:

> Thanks Matthias and Guozhang,
>
> About deprecating the "segments" field instead of making it private. Yes, I
> just took another look at the code, and that is correct. I'll update the
> KIP.
>
> I do agree that in the long run, it makes more sense as a parameter to the
> store somehow than as a parameter to the window. I think this isn't a super
> high priority, though, because it's not exposed in the DSL (or it wasn't
> intended to be).
>
> I felt Guozhang's point is valid, and that we should probably revisit it
> later, possibly in the scope of
> https://issues.apache.org/jira/browse/KAFKA-4730
>
> I'll wait an hour or so for more feedback before moving on to a vote.
>
> Thanks again,
> -John
>
> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > Re `segmentInterval` parameter in Windows: currently it is used in two
> > places, the windowed stream aggregation, and the stream-stream joins. For
> > the former, we can potentially move the parameter from windowedBy() to
> > Materialized, but for the latter we currently do not expose a
> Materialized
> > object yet, only the Windows spec. So I think in this KIP we probably
> > cannot move it immediately.
> >
> > But in future KIPs if we decide to expose the stream-stream join's store
> /
> > changelog / repartition topic names, we may well adding the Materialized
> > object into the operator, and we can then move the parameter to
> > Materialized by then.
> >
> >
> > Guozhang
> >
> > On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <ma...@confluent.io>
> > wrote:
> >
> > > Thanks for the KIP. Overall, I think it makes sense to clean up the
> API.
> > >
> > > Couple of comments:
> > >
> > > > Sadly there's no way to "deprecate" this
> > > > exposure
> > >
> > > I disagree. We can just mark the variable as deprecated and I advocate
> > > to do this. When the deprecation period passed, we can make it private
> > > (or actually remove it; cf. my next comment).
> > >
> > >
> > > Parameter, `segmentInterval` is semantically not a "window"
> > > specification parameter but an implementation detail and thus a store
> > > parameter. Would it be better to add it to `Materialized`?
> > >
> > >
> > > -Matthias
> > >
> > > On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > > > Thanks John.
> > > >
> > > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > >> Thanks for the feedback, Bill and Guozhang,
> > > >>
> > > >> I've updated the KIP accordingly.
> > > >>
> > > >> Thanks,
> > > >> -John
> > > >>
> > > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > >>
> > > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> the
> > > >> wiki:
> > > >>> the `In Windows, we will:` section code snippet is empty.
> > > >>>
> > > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
> > > wrote:
> > > >>>
> > > >>>> Hi John,
> > > >>>>
> > > >>>> Thanks for the KIP, and overall it's a +1 for me.
> > > >>>>
> > > >>>> In the JavaDoc for the segmentInterval method, there's no mention
> of
> > > >> the
> > > >>>> number of segments there can be at any one time.  While it's
> implied
> > > >> that
> > > >>>> the number of segments is potentially unbounded, would be better
> to
> > > >>>> explicitly state that the previous limit on the number of segments
> > is
> > > >>> going
> > > >>>> to be removed as well?
> > > >>>>
> > > >>>> I have a couple of nit comments.   The method name is still
> > > segmentSize
> > > >>> in
> > > >>>> the code block vs segmentInterval and the order of the parameters
> > for
> > > >> the
> > > >>>> third persistentWindowStore don't match the order in the JavaDoc.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Bill
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
> > > >> wrote:
> > > >>>>
> > > >>>>> I've updated the KIP and draft PR accordingly.
> > > >>>>>
> > > >>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> > > >>> wrote:
> > > >>>>>
> > > >>>>>> Interesting... I did not initially consider it because I didn't
> > > >> want
> > > >>> to
> > > >>>>>> have an impact on anyone's Streams apps, but now I see that
> unless
> > > >>>>>> developers have subclassed `Windows`, the number of segments
> would
> > > >>>> always
> > > >>>>>> be 3!
> > > >>>>>>
> > > >>>>>> There's one caveat to this, which I think was a mistake. The
> field
> > > >>>>>> `segments` in Windows is public, which means that anyone can
> > > >> actually
> > > >>>> set
> > > >>>>>> it directly on any Window instance like:
> > > >>>>>>
> > > >>>>>>         TimeWindows tw = TimeWindows.of(100);
> > > >>>>>>         tw.segments = 12345;
> > > >>>>>>
> > > >>>>>> Bypassing the bounds check and contradicting the javadoc in
> > Windows
> > > >>>> that
> > > >>>>>> says users can't directly set it. Sadly there's no way to
> > > >> "deprecate"
> > > >>>>> this
> > > >>>>>> exposure, so I propose just to make it private.
> > > >>>>>>
> > > >>>>>> With this new knowledge, I agree, I think we can switch to
> > > >>>>>> "segmentInterval" throughout the interface.
> > > >>>>>>
> > > >>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
> wangguoz@gmail.com
> > >
> > > >>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hello John,
> > > >>>>>>>
> > > >>>>>>> Thanks for the KIP.
> > > >>>>>>>
> > > >>>>>>> Should we consider making the change on
> > > >>> `Stores#persistentWindowStore`
> > > >>>>>>> parameters as well?
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Guozhang
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
> john@confluent.io
> > >
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Ted,
> > > >>>>>>>>
> > > >>>>>>>> Ah, when you made that comment to me before, I thought you
> meant
> > > >>> as
> > > >>>>>>> opposed
> > > >>>>>>>> to "segments". Now it makes sense that you meant as opposed to
> > > >>>>>>>> "segmentSize".
> > > >>>>>>>>
> > > >>>>>>>> I named it that way to match the peer method "windowSize",
> which
> > > >>> is
> > > >>>>>>> also a
> > > >>>>>>>> quantity of milliseconds.
> > > >>>>>>>>
> > > >>>>>>>> I agree that "interval" is more intuitive, but I think I favor
> > > >>>>>>> consistency
> > > >>>>>>>> in this case. Does that seem reasonable?
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> -John
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Normally size is not measured in time unit, such as
> > > >>> milliseconds.
> > > >>>>>>>>> How about naming the new method segmentInterval ?
> > > >>>>>>>>> Thanks
> > > >>>>>>>>> -------- Original message --------From: John Roesler <
> > > >>>>>>> john@confluent.io>
> > > >>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
> dev@kafka.apache.org
> > > >>>>>>> Subject:
> > > >>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > >>>>>>>>> WindowBytesStoreSupplier
> > > >>>>>>>>> Hello All,
> > > >>>>>>>>>
> > > >>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
> > > >>>>>>> KAFKA-7080.
> > > >>>>>>>>> Specifically, we're creating CachingWindowStore with the
> > > >> *number
> > > >>>> of
> > > >>>>>>>>> segments* instead of the *segment size*.
> > > >>>>>>>>>
> > > >>>>>>>>> Here's the jira:
> > > >>> https://issues.apache.org/jira/browse/KAFKA-7080
> > > >>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > > >>>>>>>>>
> > > >>>>>>>>> additionally, here's a draft PR for clarity:
> > > >>>>>>>>> https://github.com/apache/kafka/pull/5257
> > > >>>>>>>>>
> > > >>>>>>>>> Please let me know what you think!
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> -John
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>> -- Guozhang
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> -- Guozhang
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Thanks Matthias and Guozhang,

About deprecating the "segments" field instead of making it private. Yes, I
just took another look at the code, and that is correct. I'll update the
KIP.

I do agree that in the long run, it makes more sense as a parameter to the
store somehow than as a parameter to the window. I think this isn't a super
high priority, though, because it's not exposed in the DSL (or it wasn't
intended to be).

I felt Guozhang's point is valid, and that we should probably revisit it
later, possibly in the scope of
https://issues.apache.org/jira/browse/KAFKA-4730

I'll wait an hour or so for more feedback before moving on to a vote.

Thanks again,
-John

On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wa...@gmail.com> wrote:

> Re `segmentInterval` parameter in Windows: currently it is used in two
> places, the windowed stream aggregation, and the stream-stream joins. For
> the former, we can potentially move the parameter from windowedBy() to
> Materialized, but for the latter we currently do not expose a Materialized
> object yet, only the Windows spec. So I think in this KIP we probably
> cannot move it immediately.
>
> But in future KIPs if we decide to expose the stream-stream join's store /
> changelog / repartition topic names, we may well adding the Materialized
> object into the operator, and we can then move the parameter to
> Materialized by then.
>
>
> Guozhang
>
> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <ma...@confluent.io>
> wrote:
>
> > Thanks for the KIP. Overall, I think it makes sense to clean up the API.
> >
> > Couple of comments:
> >
> > > Sadly there's no way to "deprecate" this
> > > exposure
> >
> > I disagree. We can just mark the variable as deprecated and I advocate
> > to do this. When the deprecation period passed, we can make it private
> > (or actually remove it; cf. my next comment).
> >
> >
> > Parameter, `segmentInterval` is semantically not a "window"
> > specification parameter but an implementation detail and thus a store
> > parameter. Would it be better to add it to `Materialized`?
> >
> >
> > -Matthias
> >
> > On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > > Thanks John.
> > >
> > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io>
> wrote:
> > >
> > >> Thanks for the feedback, Bill and Guozhang,
> > >>
> > >> I've updated the KIP accordingly.
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >>
> > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> > >> wiki:
> > >>> the `In Windows, we will:` section code snippet is empty.
> > >>>
> > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi John,
> > >>>>
> > >>>> Thanks for the KIP, and overall it's a +1 for me.
> > >>>>
> > >>>> In the JavaDoc for the segmentInterval method, there's no mention of
> > >> the
> > >>>> number of segments there can be at any one time.  While it's implied
> > >> that
> > >>>> the number of segments is potentially unbounded, would be better to
> > >>>> explicitly state that the previous limit on the number of segments
> is
> > >>> going
> > >>>> to be removed as well?
> > >>>>
> > >>>> I have a couple of nit comments.   The method name is still
> > segmentSize
> > >>> in
> > >>>> the code block vs segmentInterval and the order of the parameters
> for
> > >> the
> > >>>> third persistentWindowStore don't match the order in the JavaDoc.
> > >>>>
> > >>>> Thanks,
> > >>>> Bill
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
> > >> wrote:
> > >>>>
> > >>>>> I've updated the KIP and draft PR accordingly.
> > >>>>>
> > >>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> > >>> wrote:
> > >>>>>
> > >>>>>> Interesting... I did not initially consider it because I didn't
> > >> want
> > >>> to
> > >>>>>> have an impact on anyone's Streams apps, but now I see that unless
> > >>>>>> developers have subclassed `Windows`, the number of segments would
> > >>>> always
> > >>>>>> be 3!
> > >>>>>>
> > >>>>>> There's one caveat to this, which I think was a mistake. The field
> > >>>>>> `segments` in Windows is public, which means that anyone can
> > >> actually
> > >>>> set
> > >>>>>> it directly on any Window instance like:
> > >>>>>>
> > >>>>>>         TimeWindows tw = TimeWindows.of(100);
> > >>>>>>         tw.segments = 12345;
> > >>>>>>
> > >>>>>> Bypassing the bounds check and contradicting the javadoc in
> Windows
> > >>>> that
> > >>>>>> says users can't directly set it. Sadly there's no way to
> > >> "deprecate"
> > >>>>> this
> > >>>>>> exposure, so I propose just to make it private.
> > >>>>>>
> > >>>>>> With this new knowledge, I agree, I think we can switch to
> > >>>>>> "segmentInterval" throughout the interface.
> > >>>>>>
> > >>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wangguoz@gmail.com
> >
> > >>>>> wrote:
> > >>>>>>
> > >>>>>>> Hello John,
> > >>>>>>>
> > >>>>>>> Thanks for the KIP.
> > >>>>>>>
> > >>>>>>> Should we consider making the change on
> > >>> `Stores#persistentWindowStore`
> > >>>>>>> parameters as well?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Guozhang
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <john@confluent.io
> >
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Ted,
> > >>>>>>>>
> > >>>>>>>> Ah, when you made that comment to me before, I thought you meant
> > >>> as
> > >>>>>>> opposed
> > >>>>>>>> to "segments". Now it makes sense that you meant as opposed to
> > >>>>>>>> "segmentSize".
> > >>>>>>>>
> > >>>>>>>> I named it that way to match the peer method "windowSize", which
> > >>> is
> > >>>>>>> also a
> > >>>>>>>> quantity of milliseconds.
> > >>>>>>>>
> > >>>>>>>> I agree that "interval" is more intuitive, but I think I favor
> > >>>>>>> consistency
> > >>>>>>>> in this case. Does that seem reasonable?
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> -John
> > >>>>>>>>
> > >>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> > >>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Normally size is not measured in time unit, such as
> > >>> milliseconds.
> > >>>>>>>>> How about naming the new method segmentInterval ?
> > >>>>>>>>> Thanks
> > >>>>>>>>> -------- Original message --------From: John Roesler <
> > >>>>>>> john@confluent.io>
> > >>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> > >>>>>>> Subject:
> > >>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> > >>>>>>>>> WindowBytesStoreSupplier
> > >>>>>>>>> Hello All,
> > >>>>>>>>>
> > >>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
> > >>>>>>> KAFKA-7080.
> > >>>>>>>>> Specifically, we're creating CachingWindowStore with the
> > >> *number
> > >>>> of
> > >>>>>>>>> segments* instead of the *segment size*.
> > >>>>>>>>>
> > >>>>>>>>> Here's the jira:
> > >>> https://issues.apache.org/jira/browse/KAFKA-7080
> > >>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > >>>>>>>>>
> > >>>>>>>>> additionally, here's a draft PR for clarity:
> > >>>>>>>>> https://github.com/apache/kafka/pull/5257
> > >>>>>>>>>
> > >>>>>>>>> Please let me know what you think!
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> -John
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> -- Guozhang
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >>>
> > >>
> > >
> > >
> > >
> >
> >
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Guozhang Wang <wa...@gmail.com>.
Re `segmentInterval` parameter in Windows: currently it is used in two
places, the windowed stream aggregation, and the stream-stream joins. For
the former, we can potentially move the parameter from windowedBy() to
Materialized, but for the latter we currently do not expose a Materialized
object yet, only the Windows spec. So I think in this KIP we probably
cannot move it immediately.

But in future KIPs if we decide to expose the stream-stream join's store /
changelog / repartition topic names, we may well adding the Materialized
object into the operator, and we can then move the parameter to
Materialized by then.


Guozhang

On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <ma...@confluent.io>
wrote:

> Thanks for the KIP. Overall, I think it makes sense to clean up the API.
>
> Couple of comments:
>
> > Sadly there's no way to "deprecate" this
> > exposure
>
> I disagree. We can just mark the variable as deprecated and I advocate
> to do this. When the deprecation period passed, we can make it private
> (or actually remove it; cf. my next comment).
>
>
> Parameter, `segmentInterval` is semantically not a "window"
> specification parameter but an implementation detail and thus a store
> parameter. Would it be better to add it to `Materialized`?
>
>
> -Matthias
>
> On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > Thanks John.
> >
> > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io> wrote:
> >
> >> Thanks for the feedback, Bill and Guozhang,
> >>
> >> I've updated the KIP accordingly.
> >>
> >> Thanks,
> >> -John
> >>
> >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> >>
> >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> >> wiki:
> >>> the `In Windows, we will:` section code snippet is empty.
> >>>
> >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com>
> wrote:
> >>>
> >>>> Hi John,
> >>>>
> >>>> Thanks for the KIP, and overall it's a +1 for me.
> >>>>
> >>>> In the JavaDoc for the segmentInterval method, there's no mention of
> >> the
> >>>> number of segments there can be at any one time.  While it's implied
> >> that
> >>>> the number of segments is potentially unbounded, would be better to
> >>>> explicitly state that the previous limit on the number of segments is
> >>> going
> >>>> to be removed as well?
> >>>>
> >>>> I have a couple of nit comments.   The method name is still
> segmentSize
> >>> in
> >>>> the code block vs segmentInterval and the order of the parameters for
> >> the
> >>>> third persistentWindowStore don't match the order in the JavaDoc.
> >>>>
> >>>> Thanks,
> >>>> Bill
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
> >> wrote:
> >>>>
> >>>>> I've updated the KIP and draft PR accordingly.
> >>>>>
> >>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> >>> wrote:
> >>>>>
> >>>>>> Interesting... I did not initially consider it because I didn't
> >> want
> >>> to
> >>>>>> have an impact on anyone's Streams apps, but now I see that unless
> >>>>>> developers have subclassed `Windows`, the number of segments would
> >>>> always
> >>>>>> be 3!
> >>>>>>
> >>>>>> There's one caveat to this, which I think was a mistake. The field
> >>>>>> `segments` in Windows is public, which means that anyone can
> >> actually
> >>>> set
> >>>>>> it directly on any Window instance like:
> >>>>>>
> >>>>>>         TimeWindows tw = TimeWindows.of(100);
> >>>>>>         tw.segments = 12345;
> >>>>>>
> >>>>>> Bypassing the bounds check and contradicting the javadoc in Windows
> >>>> that
> >>>>>> says users can't directly set it. Sadly there's no way to
> >> "deprecate"
> >>>>> this
> >>>>>> exposure, so I propose just to make it private.
> >>>>>>
> >>>>>> With this new knowledge, I agree, I think we can switch to
> >>>>>> "segmentInterval" throughout the interface.
> >>>>>>
> >>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hello John,
> >>>>>>>
> >>>>>>> Thanks for the KIP.
> >>>>>>>
> >>>>>>> Should we consider making the change on
> >>> `Stores#persistentWindowStore`
> >>>>>>> parameters as well?
> >>>>>>>
> >>>>>>>
> >>>>>>> Guozhang
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Ted,
> >>>>>>>>
> >>>>>>>> Ah, when you made that comment to me before, I thought you meant
> >>> as
> >>>>>>> opposed
> >>>>>>>> to "segments". Now it makes sense that you meant as opposed to
> >>>>>>>> "segmentSize".
> >>>>>>>>
> >>>>>>>> I named it that way to match the peer method "windowSize", which
> >>> is
> >>>>>>> also a
> >>>>>>>> quantity of milliseconds.
> >>>>>>>>
> >>>>>>>> I agree that "interval" is more intuitive, but I think I favor
> >>>>>>> consistency
> >>>>>>>> in this case. Does that seem reasonable?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> -John
> >>>>>>>>
> >>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>>> Normally size is not measured in time unit, such as
> >>> milliseconds.
> >>>>>>>>> How about naming the new method segmentInterval ?
> >>>>>>>>> Thanks
> >>>>>>>>> -------- Original message --------From: John Roesler <
> >>>>>>> john@confluent.io>
> >>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> >>>>>>> Subject:
> >>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> >>>>>>>>> WindowBytesStoreSupplier
> >>>>>>>>> Hello All,
> >>>>>>>>>
> >>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
> >>>>>>> KAFKA-7080.
> >>>>>>>>> Specifically, we're creating CachingWindowStore with the
> >> *number
> >>>> of
> >>>>>>>>> segments* instead of the *segment size*.
> >>>>>>>>>
> >>>>>>>>> Here's the jira:
> >>> https://issues.apache.org/jira/browse/KAFKA-7080
> >>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >>>>>>>>>
> >>>>>>>>> additionally, here's a draft PR for clarity:
> >>>>>>>>> https://github.com/apache/kafka/pull/5257
> >>>>>>>>>
> >>>>>>>>> Please let me know what you think!
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> -- Guozhang
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >
> >
> >
>
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Thanks for the KIP. Overall, I think it makes sense to clean up the API.

Couple of comments:

> Sadly there's no way to "deprecate" this
> exposure

I disagree. We can just mark the variable as deprecated and I advocate
to do this. When the deprecation period passed, we can make it private
(or actually remove it; cf. my next comment).


Parameter, `segmentInterval` is semantically not a "window"
specification parameter but an implementation detail and thus a store
parameter. Would it be better to add it to `Materialized`?


-Matthias

On 6/22/18 5:13 PM, Guozhang Wang wrote:
> Thanks John.
> 
> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io> wrote:
> 
>> Thanks for the feedback, Bill and Guozhang,
>>
>> I've updated the KIP accordingly.
>>
>> Thanks,
>> -John
>>
>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com> wrote:
>>
>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
>> wiki:
>>> the `In Windows, we will:` section code snippet is empty.
>>>
>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com> wrote:
>>>
>>>> Hi John,
>>>>
>>>> Thanks for the KIP, and overall it's a +1 for me.
>>>>
>>>> In the JavaDoc for the segmentInterval method, there's no mention of
>> the
>>>> number of segments there can be at any one time.  While it's implied
>> that
>>>> the number of segments is potentially unbounded, would be better to
>>>> explicitly state that the previous limit on the number of segments is
>>> going
>>>> to be removed as well?
>>>>
>>>> I have a couple of nit comments.   The method name is still segmentSize
>>> in
>>>> the code block vs segmentInterval and the order of the parameters for
>> the
>>>> third persistentWindowStore don't match the order in the JavaDoc.
>>>>
>>>> Thanks,
>>>> Bill
>>>>
>>>>
>>>>
>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
>> wrote:
>>>>
>>>>> I've updated the KIP and draft PR accordingly.
>>>>>
>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
>>> wrote:
>>>>>
>>>>>> Interesting... I did not initially consider it because I didn't
>> want
>>> to
>>>>>> have an impact on anyone's Streams apps, but now I see that unless
>>>>>> developers have subclassed `Windows`, the number of segments would
>>>> always
>>>>>> be 3!
>>>>>>
>>>>>> There's one caveat to this, which I think was a mistake. The field
>>>>>> `segments` in Windows is public, which means that anyone can
>> actually
>>>> set
>>>>>> it directly on any Window instance like:
>>>>>>
>>>>>>         TimeWindows tw = TimeWindows.of(100);
>>>>>>         tw.segments = 12345;
>>>>>>
>>>>>> Bypassing the bounds check and contradicting the javadoc in Windows
>>>> that
>>>>>> says users can't directly set it. Sadly there's no way to
>> "deprecate"
>>>>> this
>>>>>> exposure, so I propose just to make it private.
>>>>>>
>>>>>> With this new knowledge, I agree, I think we can switch to
>>>>>> "segmentInterval" throughout the interface.
>>>>>>
>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>> Hello John,
>>>>>>>
>>>>>>> Thanks for the KIP.
>>>>>>>
>>>>>>> Should we consider making the change on
>>> `Stores#persistentWindowStore`
>>>>>>> parameters as well?
>>>>>>>
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Ted,
>>>>>>>>
>>>>>>>> Ah, when you made that comment to me before, I thought you meant
>>> as
>>>>>>> opposed
>>>>>>>> to "segments". Now it makes sense that you meant as opposed to
>>>>>>>> "segmentSize".
>>>>>>>>
>>>>>>>> I named it that way to match the peer method "windowSize", which
>>> is
>>>>>>> also a
>>>>>>>> quantity of milliseconds.
>>>>>>>>
>>>>>>>> I agree that "interval" is more intuitive, but I think I favor
>>>>>>> consistency
>>>>>>>> in this case. Does that seem reasonable?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -John
>>>>>>>>
>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
>>> wrote:
>>>>>>>>
>>>>>>>>> Normally size is not measured in time unit, such as
>>> milliseconds.
>>>>>>>>> How about naming the new method segmentInterval ?
>>>>>>>>> Thanks
>>>>>>>>> -------- Original message --------From: John Roesler <
>>>>>>> john@confluent.io>
>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
>>>>>>> Subject:
>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>>>>>>>>> WindowBytesStoreSupplier
>>>>>>>>> Hello All,
>>>>>>>>>
>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
>>>>>>> KAFKA-7080.
>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>> *number
>>>> of
>>>>>>>>> segments* instead of the *segment size*.
>>>>>>>>>
>>>>>>>>> Here's the jira:
>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>>>>>>>>>
>>>>>>>>> additionally, here's a draft PR for clarity:
>>>>>>>>> https://github.com/apache/kafka/pull/5257
>>>>>>>>>
>>>>>>>>> Please let me know what you think!
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -- Guozhang
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 
> 
> 


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks John.

On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <jo...@confluent.io> wrote:

> Thanks for the feedback, Bill and Guozhang,
>
> I've updated the KIP accordingly.
>
> Thanks,
> -John
>
> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> wiki:
> > the `In Windows, we will:` section code snippet is empty.
> >
> > On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com> wrote:
> >
> > > Hi John,
> > >
> > > Thanks for the KIP, and overall it's a +1 for me.
> > >
> > > In the JavaDoc for the segmentInterval method, there's no mention of
> the
> > > number of segments there can be at any one time.  While it's implied
> that
> > > the number of segments is potentially unbounded, would be better to
> > > explicitly state that the previous limit on the number of segments is
> > going
> > > to be removed as well?
> > >
> > > I have a couple of nit comments.   The method name is still segmentSize
> > in
> > > the code block vs segmentInterval and the order of the parameters for
> the
> > > third persistentWindowStore don't match the order in the JavaDoc.
> > >
> > > Thanks,
> > > Bill
> > >
> > >
> > >
> > > On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io>
> wrote:
> > >
> > > > I've updated the KIP and draft PR accordingly.
> > > >
> > > > On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > > > Interesting... I did not initially consider it because I didn't
> want
> > to
> > > > > have an impact on anyone's Streams apps, but now I see that unless
> > > > > developers have subclassed `Windows`, the number of segments would
> > > always
> > > > > be 3!
> > > > >
> > > > > There's one caveat to this, which I think was a mistake. The field
> > > > > `segments` in Windows is public, which means that anyone can
> actually
> > > set
> > > > > it directly on any Window instance like:
> > > > >
> > > > >         TimeWindows tw = TimeWindows.of(100);
> > > > >         tw.segments = 12345;
> > > > >
> > > > > Bypassing the bounds check and contradicting the javadoc in Windows
> > > that
> > > > > says users can't directly set it. Sadly there's no way to
> "deprecate"
> > > > this
> > > > > exposure, so I propose just to make it private.
> > > > >
> > > > > With this new knowledge, I agree, I think we can switch to
> > > > > "segmentInterval" throughout the interface.
> > > > >
> > > > > On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
> > > > wrote:
> > > > >
> > > > >> Hello John,
> > > > >>
> > > > >> Thanks for the KIP.
> > > > >>
> > > > >> Should we consider making the change on
> > `Stores#persistentWindowStore`
> > > > >> parameters as well?
> > > > >>
> > > > >>
> > > > >> Guozhang
> > > > >>
> > > > >>
> > > > >> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
> > > > wrote:
> > > > >>
> > > > >> > Hi Ted,
> > > > >> >
> > > > >> > Ah, when you made that comment to me before, I thought you meant
> > as
> > > > >> opposed
> > > > >> > to "segments". Now it makes sense that you meant as opposed to
> > > > >> > "segmentSize".
> > > > >> >
> > > > >> > I named it that way to match the peer method "windowSize", which
> > is
> > > > >> also a
> > > > >> > quantity of milliseconds.
> > > > >> >
> > > > >> > I agree that "interval" is more intuitive, but I think I favor
> > > > >> consistency
> > > > >> > in this case. Does that seem reasonable?
> > > > >> >
> > > > >> > Thanks,
> > > > >> > -John
> > > > >> >
> > > > >> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> > wrote:
> > > > >> >
> > > > >> > > Normally size is not measured in time unit, such as
> > milliseconds.
> > > > >> > > How about naming the new method segmentInterval ?
> > > > >> > > Thanks
> > > > >> > > -------- Original message --------From: John Roesler <
> > > > >> john@confluent.io>
> > > > >> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> > > > >> Subject:
> > > > >> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > > >> > > WindowBytesStoreSupplier
> > > > >> > > Hello All,
> > > > >> > >
> > > > >> > > I'd like to propose KIP-319 to fix an issue I identified in
> > > > >> KAFKA-7080.
> > > > >> > > Specifically, we're creating CachingWindowStore with the
> *number
> > > of
> > > > >> > > segments* instead of the *segment size*.
> > > > >> > >
> > > > >> > > Here's the jira:
> > https://issues.apache.org/jira/browse/KAFKA-7080
> > > > >> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > > > >> > >
> > > > >> > > additionally, here's a draft PR for clarity:
> > > > >> > > https://github.com/apache/kafka/pull/5257
> > > > >> > >
> > > > >> > > Please let me know what you think!
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > -John
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> -- Guozhang
> > > > >>
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Thanks for the feedback, Bill and Guozhang,

I've updated the KIP accordingly.

Thanks,
-John

On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wa...@gmail.com> wrote:

> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the wiki:
> the `In Windows, we will:` section code snippet is empty.
>
> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com> wrote:
>
> > Hi John,
> >
> > Thanks for the KIP, and overall it's a +1 for me.
> >
> > In the JavaDoc for the segmentInterval method, there's no mention of the
> > number of segments there can be at any one time.  While it's implied that
> > the number of segments is potentially unbounded, would be better to
> > explicitly state that the previous limit on the number of segments is
> going
> > to be removed as well?
> >
> > I have a couple of nit comments.   The method name is still segmentSize
> in
> > the code block vs segmentInterval and the order of the parameters for the
> > third persistentWindowStore don't match the order in the JavaDoc.
> >
> > Thanks,
> > Bill
> >
> >
> >
> > On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > I've updated the KIP and draft PR accordingly.
> > >
> > > On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io>
> wrote:
> > >
> > > > Interesting... I did not initially consider it because I didn't want
> to
> > > > have an impact on anyone's Streams apps, but now I see that unless
> > > > developers have subclassed `Windows`, the number of segments would
> > always
> > > > be 3!
> > > >
> > > > There's one caveat to this, which I think was a mistake. The field
> > > > `segments` in Windows is public, which means that anyone can actually
> > set
> > > > it directly on any Window instance like:
> > > >
> > > >         TimeWindows tw = TimeWindows.of(100);
> > > >         tw.segments = 12345;
> > > >
> > > > Bypassing the bounds check and contradicting the javadoc in Windows
> > that
> > > > says users can't directly set it. Sadly there's no way to "deprecate"
> > > this
> > > > exposure, so I propose just to make it private.
> > > >
> > > > With this new knowledge, I agree, I think we can switch to
> > > > "segmentInterval" throughout the interface.
> > > >
> > > > On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > >
> > > >> Hello John,
> > > >>
> > > >> Thanks for the KIP.
> > > >>
> > > >> Should we consider making the change on
> `Stores#persistentWindowStore`
> > > >> parameters as well?
> > > >>
> > > >>
> > > >> Guozhang
> > > >>
> > > >>
> > > >> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
> > > wrote:
> > > >>
> > > >> > Hi Ted,
> > > >> >
> > > >> > Ah, when you made that comment to me before, I thought you meant
> as
> > > >> opposed
> > > >> > to "segments". Now it makes sense that you meant as opposed to
> > > >> > "segmentSize".
> > > >> >
> > > >> > I named it that way to match the peer method "windowSize", which
> is
> > > >> also a
> > > >> > quantity of milliseconds.
> > > >> >
> > > >> > I agree that "interval" is more intuitive, but I think I favor
> > > >> consistency
> > > >> > in this case. Does that seem reasonable?
> > > >> >
> > > >> > Thanks,
> > > >> > -John
> > > >> >
> > > >> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com>
> wrote:
> > > >> >
> > > >> > > Normally size is not measured in time unit, such as
> milliseconds.
> > > >> > > How about naming the new method segmentInterval ?
> > > >> > > Thanks
> > > >> > > -------- Original message --------From: John Roesler <
> > > >> john@confluent.io>
> > > >> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> > > >> Subject:
> > > >> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > >> > > WindowBytesStoreSupplier
> > > >> > > Hello All,
> > > >> > >
> > > >> > > I'd like to propose KIP-319 to fix an issue I identified in
> > > >> KAFKA-7080.
> > > >> > > Specifically, we're creating CachingWindowStore with the *number
> > of
> > > >> > > segments* instead of the *segment size*.
> > > >> > >
> > > >> > > Here's the jira:
> https://issues.apache.org/jira/browse/KAFKA-7080
> > > >> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > > >> > >
> > > >> > > additionally, here's a draft PR for clarity:
> > > >> > > https://github.com/apache/kafka/pull/5257
> > > >> > >
> > > >> > > Please let me know what you think!
> > > >> > >
> > > >> > > Thanks,
> > > >> > > -John
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -- Guozhang
> > > >>
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks for the KIP. I'm +1 on the proposal. One minor comment on the wiki:
the `In Windows, we will:` section code snippet is empty.

On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bb...@gmail.com> wrote:

> Hi John,
>
> Thanks for the KIP, and overall it's a +1 for me.
>
> In the JavaDoc for the segmentInterval method, there's no mention of the
> number of segments there can be at any one time.  While it's implied that
> the number of segments is potentially unbounded, would be better to
> explicitly state that the previous limit on the number of segments is going
> to be removed as well?
>
> I have a couple of nit comments.   The method name is still segmentSize in
> the code block vs segmentInterval and the order of the parameters for the
> third persistentWindowStore don't match the order in the JavaDoc.
>
> Thanks,
> Bill
>
>
>
> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io> wrote:
>
> > I've updated the KIP and draft PR accordingly.
> >
> > On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Interesting... I did not initially consider it because I didn't want to
> > > have an impact on anyone's Streams apps, but now I see that unless
> > > developers have subclassed `Windows`, the number of segments would
> always
> > > be 3!
> > >
> > > There's one caveat to this, which I think was a mistake. The field
> > > `segments` in Windows is public, which means that anyone can actually
> set
> > > it directly on any Window instance like:
> > >
> > >         TimeWindows tw = TimeWindows.of(100);
> > >         tw.segments = 12345;
> > >
> > > Bypassing the bounds check and contradicting the javadoc in Windows
> that
> > > says users can't directly set it. Sadly there's no way to "deprecate"
> > this
> > > exposure, so I propose just to make it private.
> > >
> > > With this new knowledge, I agree, I think we can switch to
> > > "segmentInterval" throughout the interface.
> > >
> > > On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > >> Hello John,
> > >>
> > >> Thanks for the KIP.
> > >>
> > >> Should we consider making the change on `Stores#persistentWindowStore`
> > >> parameters as well?
> > >>
> > >>
> > >> Guozhang
> > >>
> > >>
> > >> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
> > wrote:
> > >>
> > >> > Hi Ted,
> > >> >
> > >> > Ah, when you made that comment to me before, I thought you meant as
> > >> opposed
> > >> > to "segments". Now it makes sense that you meant as opposed to
> > >> > "segmentSize".
> > >> >
> > >> > I named it that way to match the peer method "windowSize", which is
> > >> also a
> > >> > quantity of milliseconds.
> > >> >
> > >> > I agree that "interval" is more intuitive, but I think I favor
> > >> consistency
> > >> > in this case. Does that seem reasonable?
> > >> >
> > >> > Thanks,
> > >> > -John
> > >> >
> > >> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
> > >> >
> > >> > > Normally size is not measured in time unit, such as milliseconds.
> > >> > > How about naming the new method segmentInterval ?
> > >> > > Thanks
> > >> > > -------- Original message --------From: John Roesler <
> > >> john@confluent.io>
> > >> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> > >> Subject:
> > >> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > >> > > WindowBytesStoreSupplier
> > >> > > Hello All,
> > >> > >
> > >> > > I'd like to propose KIP-319 to fix an issue I identified in
> > >> KAFKA-7080.
> > >> > > Specifically, we're creating CachingWindowStore with the *number
> of
> > >> > > segments* instead of the *segment size*.
> > >> > >
> > >> > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > >> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > >> > >
> > >> > > additionally, here's a draft PR for clarity:
> > >> > > https://github.com/apache/kafka/pull/5257
> > >> > >
> > >> > > Please let me know what you think!
> > >> > >
> > >> > > Thanks,
> > >> > > -John
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> > >
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Bill Bejeck <bb...@gmail.com>.
Hi John,

Thanks for the KIP, and overall it's a +1 for me.

In the JavaDoc for the segmentInterval method, there's no mention of the
number of segments there can be at any one time.  While it's implied that
the number of segments is potentially unbounded, would be better to
explicitly state that the previous limit on the number of segments is going
to be removed as well?

I have a couple of nit comments.   The method name is still segmentSize in
the code block vs segmentInterval and the order of the parameters for the
third persistentWindowStore don't match the order in the JavaDoc.

Thanks,
Bill



On Thu, Jun 21, 2018 at 3:32 PM John Roesler <jo...@confluent.io> wrote:

> I've updated the KIP and draft PR accordingly.
>
> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io> wrote:
>
> > Interesting... I did not initially consider it because I didn't want to
> > have an impact on anyone's Streams apps, but now I see that unless
> > developers have subclassed `Windows`, the number of segments would always
> > be 3!
> >
> > There's one caveat to this, which I think was a mistake. The field
> > `segments` in Windows is public, which means that anyone can actually set
> > it directly on any Window instance like:
> >
> >         TimeWindows tw = TimeWindows.of(100);
> >         tw.segments = 12345;
> >
> > Bypassing the bounds check and contradicting the javadoc in Windows that
> > says users can't directly set it. Sadly there's no way to "deprecate"
> this
> > exposure, so I propose just to make it private.
> >
> > With this new knowledge, I agree, I think we can switch to
> > "segmentInterval" throughout the interface.
> >
> > On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> >> Hello John,
> >>
> >> Thanks for the KIP.
> >>
> >> Should we consider making the change on `Stores#persistentWindowStore`
> >> parameters as well?
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io>
> wrote:
> >>
> >> > Hi Ted,
> >> >
> >> > Ah, when you made that comment to me before, I thought you meant as
> >> opposed
> >> > to "segments". Now it makes sense that you meant as opposed to
> >> > "segmentSize".
> >> >
> >> > I named it that way to match the peer method "windowSize", which is
> >> also a
> >> > quantity of milliseconds.
> >> >
> >> > I agree that "interval" is more intuitive, but I think I favor
> >> consistency
> >> > in this case. Does that seem reasonable?
> >> >
> >> > Thanks,
> >> > -John
> >> >
> >> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
> >> >
> >> > > Normally size is not measured in time unit, such as milliseconds.
> >> > > How about naming the new method segmentInterval ?
> >> > > Thanks
> >> > > -------- Original message --------From: John Roesler <
> >> john@confluent.io>
> >> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> >> Subject:
> >> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> >> > > WindowBytesStoreSupplier
> >> > > Hello All,
> >> > >
> >> > > I'd like to propose KIP-319 to fix an issue I identified in
> >> KAFKA-7080.
> >> > > Specifically, we're creating CachingWindowStore with the *number of
> >> > > segments* instead of the *segment size*.
> >> > >
> >> > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> >> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >> > >
> >> > > additionally, here's a draft PR for clarity:
> >> > > https://github.com/apache/kafka/pull/5257
> >> > >
> >> > > Please let me know what you think!
> >> > >
> >> > > Thanks,
> >> > > -John
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
I've updated the KIP and draft PR accordingly.

On Thu, Jun 21, 2018 at 2:03 PM John Roesler <jo...@confluent.io> wrote:

> Interesting... I did not initially consider it because I didn't want to
> have an impact on anyone's Streams apps, but now I see that unless
> developers have subclassed `Windows`, the number of segments would always
> be 3!
>
> There's one caveat to this, which I think was a mistake. The field
> `segments` in Windows is public, which means that anyone can actually set
> it directly on any Window instance like:
>
>         TimeWindows tw = TimeWindows.of(100);
>         tw.segments = 12345;
>
> Bypassing the bounds check and contradicting the javadoc in Windows that
> says users can't directly set it. Sadly there's no way to "deprecate" this
> exposure, so I propose just to make it private.
>
> With this new knowledge, I agree, I think we can switch to
> "segmentInterval" throughout the interface.
>
> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com> wrote:
>
>> Hello John,
>>
>> Thanks for the KIP.
>>
>> Should we consider making the change on `Stores#persistentWindowStore`
>> parameters as well?
>>
>>
>> Guozhang
>>
>>
>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:
>>
>> > Hi Ted,
>> >
>> > Ah, when you made that comment to me before, I thought you meant as
>> opposed
>> > to "segments". Now it makes sense that you meant as opposed to
>> > "segmentSize".
>> >
>> > I named it that way to match the peer method "windowSize", which is
>> also a
>> > quantity of milliseconds.
>> >
>> > I agree that "interval" is more intuitive, but I think I favor
>> consistency
>> > in this case. Does that seem reasonable?
>> >
>> > Thanks,
>> > -John
>> >
>> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
>> >
>> > > Normally size is not measured in time unit, such as milliseconds.
>> > > How about naming the new method segmentInterval ?
>> > > Thanks
>> > > -------- Original message --------From: John Roesler <
>> john@confluent.io>
>> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
>> Subject:
>> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
>> > > WindowBytesStoreSupplier
>> > > Hello All,
>> > >
>> > > I'd like to propose KIP-319 to fix an issue I identified in
>> KAFKA-7080.
>> > > Specifically, we're creating CachingWindowStore with the *number of
>> > > segments* instead of the *segment size*.
>> > >
>> > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
>> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>> > >
>> > > additionally, here's a draft PR for clarity:
>> > > https://github.com/apache/kafka/pull/5257
>> > >
>> > > Please let me know what you think!
>> > >
>> > > Thanks,
>> > > -John
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Ted Yu <yu...@gmail.com>.
bq. propose just to make it private

+1

On Thu, Jun 21, 2018 at 12:03 PM, John Roesler <jo...@confluent.io> wrote:

> Interesting... I did not initially consider it because I didn't want to
> have an impact on anyone's Streams apps, but now I see that unless
> developers have subclassed `Windows`, the number of segments would always
> be 3!
>
> There's one caveat to this, which I think was a mistake. The field
> `segments` in Windows is public, which means that anyone can actually set
> it directly on any Window instance like:
>
>         TimeWindows tw = TimeWindows.of(100);
>         tw.segments = 12345;
>
> Bypassing the bounds check and contradicting the javadoc in Windows that
> says users can't directly set it. Sadly there's no way to "deprecate" this
> exposure, so I propose just to make it private.
>
> With this new knowledge, I agree, I think we can switch to
> "segmentInterval" throughout the interface.
>
> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > Hello John,
> >
> > Thanks for the KIP.
> >
> > Should we consider making the change on `Stores#persistentWindowStore`
> > parameters as well?
> >
> >
> > Guozhang
> >
> >
> > On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Ted,
> > >
> > > Ah, when you made that comment to me before, I thought you meant as
> > opposed
> > > to "segments". Now it makes sense that you meant as opposed to
> > > "segmentSize".
> > >
> > > I named it that way to match the peer method "windowSize", which is
> also
> > a
> > > quantity of milliseconds.
> > >
> > > I agree that "interval" is more intuitive, but I think I favor
> > consistency
> > > in this case. Does that seem reasonable?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > Normally size is not measured in time unit, such as milliseconds.
> > > > How about naming the new method segmentInterval ?
> > > > Thanks
> > > > -------- Original message --------From: John Roesler <
> > john@confluent.io>
> > > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> Subject:
> > > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > > WindowBytesStoreSupplier
> > > > Hello All,
> > > >
> > > > I'd like to propose KIP-319 to fix an issue I identified in
> KAFKA-7080.
> > > > Specifically, we're creating CachingWindowStore with the *number of
> > > > segments* instead of the *segment size*.
> > > >
> > > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > > >
> > > > additionally, here's a draft PR for clarity:
> > > > https://github.com/apache/kafka/pull/5257
> > > >
> > > > Please let me know what you think!
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Interesting... I did not initially consider it because I didn't want to
have an impact on anyone's Streams apps, but now I see that unless
developers have subclassed `Windows`, the number of segments would always
be 3!

There's one caveat to this, which I think was a mistake. The field
`segments` in Windows is public, which means that anyone can actually set
it directly on any Window instance like:

        TimeWindows tw = TimeWindows.of(100);
        tw.segments = 12345;

Bypassing the bounds check and contradicting the javadoc in Windows that
says users can't directly set it. Sadly there's no way to "deprecate" this
exposure, so I propose just to make it private.

With this new knowledge, I agree, I think we can switch to
"segmentInterval" throughout the interface.

On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hello John,
>
> Thanks for the KIP.
>
> Should we consider making the change on `Stores#persistentWindowStore`
> parameters as well?
>
>
> Guozhang
>
>
> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Hi Ted,
> >
> > Ah, when you made that comment to me before, I thought you meant as
> opposed
> > to "segments". Now it makes sense that you meant as opposed to
> > "segmentSize".
> >
> > I named it that way to match the peer method "windowSize", which is also
> a
> > quantity of milliseconds.
> >
> > I agree that "interval" is more intuitive, but I think I favor
> consistency
> > in this case. Does that seem reasonable?
> >
> > Thanks,
> > -John
> >
> > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
> >
> > > Normally size is not measured in time unit, such as milliseconds.
> > > How about naming the new method segmentInterval ?
> > > Thanks
> > > -------- Original message --------From: John Roesler <
> john@confluent.io>
> > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > WindowBytesStoreSupplier
> > > Hello All,
> > >
> > > I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
> > > Specifically, we're creating CachingWindowStore with the *number of
> > > segments* instead of the *segment size*.
> > >
> > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > >
> > > additionally, here's a draft PR for clarity:
> > > https://github.com/apache/kafka/pull/5257
> > >
> > > Please let me know what you think!
> > >
> > > Thanks,
> > > -John
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Guozhang Wang <wa...@gmail.com>.
Hello John,

Thanks for the KIP.

Should we consider making the change on `Stores#persistentWindowStore`
parameters as well?


Guozhang


On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <jo...@confluent.io> wrote:

> Hi Ted,
>
> Ah, when you made that comment to me before, I thought you meant as opposed
> to "segments". Now it makes sense that you meant as opposed to
> "segmentSize".
>
> I named it that way to match the peer method "windowSize", which is also a
> quantity of milliseconds.
>
> I agree that "interval" is more intuitive, but I think I favor consistency
> in this case. Does that seem reasonable?
>
> Thanks,
> -John
>
> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:
>
> > Normally size is not measured in time unit, such as milliseconds.
> > How about naming the new method segmentInterval ?
> > Thanks
> > -------- Original message --------From: John Roesler <jo...@confluent.io>
> > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > WindowBytesStoreSupplier
> > Hello All,
> >
> > I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
> > Specifically, we're creating CachingWindowStore with the *number of
> > segments* instead of the *segment size*.
> >
> > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >
> > additionally, here's a draft PR for clarity:
> > https://github.com/apache/kafka/pull/5257
> >
> > Please let me know what you think!
> >
> > Thanks,
> > -John
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by John Roesler <jo...@confluent.io>.
Hi Ted,

Ah, when you made that comment to me before, I thought you meant as opposed
to "segments". Now it makes sense that you meant as opposed to
"segmentSize".

I named it that way to match the peer method "windowSize", which is also a
quantity of milliseconds.

I agree that "interval" is more intuitive, but I think I favor consistency
in this case. Does that seem reasonable?

Thanks,
-John

On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yu...@gmail.com> wrote:

> Normally size is not measured in time unit, such as milliseconds.
> How about naming the new method segmentInterval ?
> Thanks
> -------- Original message --------From: John Roesler <jo...@confluent.io>
> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> [DISCUSS] KIP-319: Replace segments with segmentSize in
> WindowBytesStoreSupplier
> Hello All,
>
> I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
> Specifically, we're creating CachingWindowStore with the *number of
> segments* instead of the *segment size*.
>
> Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>
> additionally, here's a draft PR for clarity:
> https://github.com/apache/kafka/pull/5257
>
> Please let me know what you think!
>
> Thanks,
> -John
>

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

Posted by Ted Yu <yu...@gmail.com>.
Normally size is not measured in time unit, such as milliseconds. 
How about naming the new method segmentInterval ?
Thanks
-------- Original message --------From: John Roesler <jo...@confluent.io> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org Subject: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier 
Hello All,

I'd like to propose KIP-319 to fix an issue I identified in KAFKA-7080.
Specifically, we're creating CachingWindowStore with the *number of
segments* instead of the *segment size*.

Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ

additionally, here's a draft PR for clarity:
https://github.com/apache/kafka/pull/5257

Please let me know what you think!

Thanks,
-John