Posted to dev@kafka.apache.org by John Roesler <jo...@confluent.io> on 2018/06/26 20:11:16 UTC

[DISCUSS] KIP-328: Ability to suppress updates for KTables

Hello devs and users,

Please take some time to consider this proposal for Kafka Streams:

KIP-328: Ability to suppress updates for KTables

link: https://cwiki.apache.org/confluence/x/sQU0BQ

The basic idea is to provide:
* more usable control over update rate (vs the current state store caches)
* the final-result-for-windowed-computations feature which several people
have requested

I look forward to your feedback!

Thanks,
-John

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hello again all,

I realized today that I neglected to include metrics in the proposal. I
have added them just now.

Thanks,
-John

On Tue, Jun 26, 2018 at 3:11 PM John Roesler <jo...@confluent.io> wrote:

> Hello devs and users,
>
> Please take some time to consider this proposal for Kafka Streams:
>
> KIP-328: Ability to suppress updates for KTables
>
> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>
> The basic idea is to provide:
> * more usable control over update rate (vs the current state store caches)
> * the final-result-for-windowed-computations feature which several people
> have requested
>
> I look forward to your feedback!
>
> Thanks,
> -John
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Guozhang,

Thanks for the clarification.

To answer your questions:
1. Yes, specifically Y < X makes sense and is by design.

The scenario is to support IQ queries over windows that are closed but not
evicted. For example, suppose we have a metrics application backed by
Streams. Let's say we do time windows of 1 minute with retention until 30
days, and we compute the final event after an additional 5 minutes.

When you load the app, it builds the graphs from the IQ stores and then
proceeds to live update the page using final updates + foreach. If we
didn't support Y < X, this use case wouldn't work at all.
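
A minimal sketch of this scenario, using only the numbers from the example
(1-minute windows, a final result 5 minutes after the window ends, 30 days of
retention). The class and method names are hypothetical, and measuring
retention from the window start is an assumption of the sketch rather than
something stated in this thread. It shows why Y < X is useful: a window can be
closed (final) long before it stops being queryable.

```java
import java.time.Duration;

class WindowLifecycleSketch {
    static final Duration WINDOW_SIZE = Duration.ofMinutes(1);
    static final Duration CLOSE_AFTER = Duration.ofMinutes(5); // Y in the discussion
    static final Duration RETENTION = Duration.ofDays(30);     // X in the discussion

    // Assumption: a window is closed once stream time reaches windowEnd + Y.
    static boolean isClosed(long windowStartMs, long streamTimeMs) {
        long windowEndMs = windowStartMs + WINDOW_SIZE.toMillis();
        return streamTimeMs >= windowEndMs + CLOSE_AFTER.toMillis();
    }

    // Assumption: a window stays queryable until stream time reaches windowStart + X.
    static boolean isQueryable(long windowStartMs, long streamTimeMs) {
        return streamTimeMs < windowStartMs + RETENTION.toMillis();
    }

    public static void main(String[] args) {
        long start = 0L;
        long tenMinutes = Duration.ofMinutes(10).toMillis();
        long thirtyOneDays = Duration.ofDays(31).toMillis();
        // Ten minutes in: the result is final, and the window can still be queried.
        System.out.println(isClosed(start, tenMinutes) + " " + isQueryable(start, tenMinutes));
        // Thirty-one days in: the window has been evicted.
        System.out.println(isClosed(start, thirtyOneDays) + " " + isQueryable(start, thirtyOneDays));
    }
}
```

Between minute 6 and day 30 the window is in the "closed but not evicted"
state that the dashboard relies on.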


As far as X < Y goes, I think that we should set Y = max(Y, X -
windowSize). This is semantically harmless, and gives the best
responsiveness, since that's the earliest time we know the window result is
"final".


Altogether, it seems like you think that people mostly set the window
retention time tightly to when they would want to close the window and emit
a final event, whereas I think that people would set retention time much
greater than the latest they expect to see an event when they are
using IQ. You have much more experience with this domain than I do, though,
so I feel I'll have to defer to your instincts.



2. I believe so. In my response to Matthias, I gave the example of making
Streams ignore out-of-order events with "suppressLateEvents(Duration.ZERO)".
This was actually something I wished for in the past. More generally, it
seems plausible to me that bounded lateness would be a useful invariant for
some kinds of custom processors or foreach blocks.
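
A minimal sketch of what bounded lateness could look like, assuming that
"late" means a record timestamp older than the highest timestamp seen so far
minus the allowed lateness. The class below is hypothetical and is not the
KIP's implementation. With an allowed lateness of zero it drops every
out-of-order record, which is the "suppressLateEvents(Duration.ZERO)" case.

```java
import java.util.ArrayList;
import java.util.List;

class LateEventFilterSketch {
    private final long allowedLatenessMs;
    private long streamTimeMs = Long.MIN_VALUE; // highest timestamp seen so far

    LateEventFilterSketch(long allowedLatenessMs) {
        this.allowedLatenessMs = allowedLatenessMs;
    }

    // Returns true if the record is accepted, false if it is dropped as late.
    boolean accept(long timestampMs) {
        boolean seenAny = streamTimeMs != Long.MIN_VALUE;
        if (seenAny && timestampMs < streamTimeMs - allowedLatenessMs) {
            return false;
        }
        streamTimeMs = Math.max(streamTimeMs, timestampMs);
        return true;
    }

    // Convenience helper: run a sequence of timestamps through a fresh filter.
    static List<Long> filter(long allowedLatenessMs, long... timestamps) {
        LateEventFilterSketch f = new LateEventFilterSketch(allowedLatenessMs);
        List<Long> accepted = new ArrayList<>();
        for (long ts : timestamps) {
            if (f.accept(ts)) {
                accepted.add(ts);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(filter(0, 1, 3, 2, 5, 4)); // prints [1, 3, 5]
        System.out.println(filter(1, 1, 3, 2, 5, 4)); // prints [1, 3, 2, 5, 4]
    }
}
```

A downstream processor or foreach block behind such a filter can rely on
timestamps never falling more than the configured bound behind stream time.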



I've also been thinking this weekend about the concerns you raised, and I
have some more thoughts. I'll send a separate reply to keep the messages
short.

Thanks for helping to hash this out,
-john


On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hi John,
>
> Regarding the metrics: yeah I think I'm with you that the dropped records
> due to window retention or emit suppression policies should be recorded
> differently, and using this KIP's proposed metric would be fine. If you
> also think we can use this KIP's proposed metrics to cover records skipped
> because of window retention, then we can include the changes in this
> KIP as well.
>
> Regarding the current proposal, I'm actually not too worried about the
> inconsistency between query semantics and downstream emit semantics. For
> queries, we will always return the current running results of the windows,
> be they partial or final results depending on the window retention time
> anyways, which has nothing to do with whether the emitted stream should be one
> final output per key or not. I also agree that having a unified operation
> is generally better, so users can focus on leveraging that one rather than
> learning about two sets of operations. The only question I had is, for final
> updates of window stores, if it is a bit awkward to understand the
> configuration combo. Thinking about this more, I think my root worry is the
> "suppressLateEvents" call for windowed tables, since from a user
> perspective: if my retention time is X which means "pay the cost to allow
> late records up to X to still be applied updating the tables", why would I
> ever want to suppressLateEvents by Y ( < X), to say "do not send the
> updates up to Y, which means the downstream operator or sink topic for this
> stream would actually see a truncated update stream while I've paid larger
> cost for that"; and of course, Y > X would not make sense either as you
> would not see any updates later than X anyways. So in all, my feeling is
> that it makes less sense for windowed table's "suppressLateEvents" with a
> parameter that is not equal to the window retention, and opening the door
> in the current proposal may confuse people with that.
>
> Again, the above is just a subjective opinion, and we can probably also bring up
> some scenarios where users do want to set X != Y. But personally I feel
> that even if the semantics for this scenario are intuitive for users to
> understand, does that really make sense, and should we really open the door
> for it? So I think the benefits of separating the final update into a separate
> API may outweigh the advantage of having one uniform definition. And
> for my alternative proposal, the rationale was from both my concern about
> "suppressLateEvents" for windowed store, and Matthias' question about
> "suppressLateEvents" for non-windowed stores, that if it is less meaningful
> for both, we can consider removing it completely and only do
> "IntermediateSuppression" in Suppress instead.
>
> So I'd summarize my thoughts in the following questions:
>
> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
> for windowed stores make sense in practice?
> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
> make sense in practice?
>
>
>
> Guozhang
>
>
> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
>
> > Thanks for the explanation, that does make sense. I have some questions on
> > operations, but I'll just wait for the PR and tests.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Bill,
> > >
> > > Thanks for the review!
> > >
> > > Your question is very much applicable to the KIP and not at all an
> > > implementation detail. Thanks for bringing it up.
> > >
> > > I'm proposing not to change the existing caches and configurations at all
> > > (for now).
> > >
> > > Imagine you have a topology like this:
> > > commit.interval.ms = 100
> > >
> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> > >
> > > The first ktable (ktable1) will respect the commit interval and buffer
> > > events for 100ms before logging, storing, or forwarding them (IIRC).
> > > Therefore, the second ktable (suppress) will only see the events at a rate
> > > of once per 100ms. It will apply its own buffering, and emit once per 200ms.
> > > This case is pretty trivial because the suppress time is a multiple of the
> > > commit interval.
> > >
> > > When it's not an integer multiple, you'll get behavior like in this marble
> > > diagram:
> > >
> > >
> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >
> > > [ KTable caching with commit interval = 2 ]
> > >
> > > <--------(k:2)---------(k:4)---------(k:6)->
> > >
> > >       [ suppress with emitAfter = 3 ]
> > >
> > > <---------------(k:2)----------------(k:6)->
> > >
> > >
> > > If this behavior isn't desired (for example, if you wanted to emit (k:3) at
> > > time 3), I'd recommend setting "cache.max.bytes.buffering" to 0 or
> > > modifying the topology to disable caching. Then the behavior is
> > > determined by the suppress operator alone.
> > >
> > > Does that seem right to you?
> > >
> > >
> > > Regarding the changelogs, because the suppression operator hangs onto
> > > events for a while, it will need its own changelog. The changelog
> > > should represent the current state of the buffer at all times. So when the
> > > suppress operator sees (k:2), for example, it will log (k:2). When it
> > > later gets to time 3, it's time to emit (k:2) downstream. Because k is no
> > > longer buffered, the suppress operator will log (k:null). Thus, when
> > > recovering, it can rebuild the buffer by reading its changelog.
> > >
> > > What do you think about this?
> > >
> > > Thanks,
> > > -John
> > >
> > >
> > >
> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
> > >
> > > > Hi John,  thanks for the KIP.
> > > >
> > > > Early on in the KIP, you mention the current approaches for controlling the
> > > > rate of downstream records from a KTable, cache size configuration and
> > > > commit time.
> > > >
> > > > Will these configuration parameters still be in effect for tables that
> > > > don't use suppression?  For tables taking advantage of suppression, will
> > > > these configurations have no impact?
> > > > This last question may be too implementation-specific, but if the requested
> > > > suppression time is longer than the specified commit time, will the latest
> > > > record in the suppression buffer get stored in a changelog?
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io> wrote:
> > > >
> > > > > Thanks for the feedback, Matthias,
> > > > >
> > > > > It seems like in straightforward relational processing cases, it would not
> > > > > make sense to bound the lateness of KTables. In general, it seems better to
> > > > > have "guard rails" in place that make it easier to write sensible programs
> > > > > than insensible ones.
> > > > >
> > > > > But I'm still going to argue in favor of keeping it for all KTables ;)
> > > > >
> > > > > 1. I believe it is simpler to understand the operator if it has one uniform
> > > > > definition, regardless of context. It's well defined and intuitive what
> > > > > will happen when you use late-event suppression on a KTable, so I think
> > > > > nothing surprising or dangerous will happen in that case. From my
> > > > > perspective, having two sets of allowed operations is actually an increase
> > > > > in cognitive complexity.
> > > > >
> > > > > 2. To me, it's not crazy to use the operator this way. For example, in lieu
> > > > > of full-featured timestamp semantics, I can implement MVCC behavior when
> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> > > > > there are other, non-obvious applications of suppressing late events on
> > > > > KTables.
> > > > >
> > > > > 3. Not to get too much into implementation details in a KIP discussion, but
> > > > > if we did want to make late-event suppression available only on windowed
> > > > > KTables, we have two enforcement options:
> > > > >   a. check when we build the topology - this would be simple to implement,
> > > > > but would be a runtime check. Hopefully, people write tests for their
> > > > > topology before deploying them, so the feedback loop isn't instantaneous,
> > > > > but it's not too long either.
> > > > >   b. add a new WindowedKTable type - this would be a compile time check,
> > > > > but would also be a substantial increase in both interface and code
> > > > > complexity.
> > > > >
> > > > > We should definitely strive to have guard rails protecting against
> > > > > surprising or dangerous behavior. Protecting against programs that we don't
> > > > > currently predict is a lesser benefit, and I think we can put up guard
> > > > > rails on a case-by-case basis for that. The increase in cognitive (and
> > > > > potentially code and interface) complexity makes me think we should skip
> > > > > this case.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <matthias@confluent.io> wrote:
> > > > >
> > > > > > Thanks for the KIP John.
> > > > > >
> > > > > > One initial comment about the last example "Bounded lateness": For a
> > > > > > non-windowed KTable bounding the lateness does not really make sense,
> > > > > > does it?
> > > > > >
> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > > > > > case? It seems to be better to only allow it for windowed-KTables.
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > >
> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > > I noticed this (lack of primary parameter) as well.
> > > > > > >
> > > > > > > What you gave as new example is semantically the same as what I suggested.
> > > > > > > So it is good by me.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <john@confluent.io> wrote:
> > > > > > >
> > > > > > > >> Thanks for taking a look, Ted,
> > > > > > > >>
> > > > > > > >> I agree this is a departure from the conventions of Streams DSL.
> > > > > > > >>
> > > > > > > >> Most of our config objects have one or two "required" parameters, which fit
> > > > > > > >> naturally with the static factory method approach. TimeWindows, for example,
> > > > > > > >> requires a size parameter, so we can naturally say TimeWindows.of(size).
> > > > > > > >>
> > > > > > > >> I think in the case of a suppression, there's really no "core" parameter,
> > > > > > > >> and "Suppression.of()" seems sillier than "new Suppression()". I think that
> > > > > > > >> Suppression.of(duration) would be ambiguous, since there are many durations
> > > > > > > >> that we can configure.
> > > > > > > >>
> > > > > > > >> However, thinking about it again, I suppose that I can give each
> > > > > > > >> configuration method a static version, which would let you replace "new
> > > > > > > >> Suppression()." with "Suppression." in all the examples. Basically, instead
> > > > > > > >> of "of()", we'd support any of the methods I listed.
> > > > > > >>
> > > > > > >> For example:
> > > > > > >>
> > > > > > >> windowCounts
> > > > > > >>     .suppress(
> > > > > > >>         Suppression
> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > > >>             .suppressIntermediateEvents(
> > > > > > >>
> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > > >>             )
> > > > > > >>     );
> > > > > > >>
> > > > > > >>
> > > > > > >> Does that seem better?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> -John
> > > > > > >>
> > > > > > >>
> > > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
> > > > > > >>
> > > > > > > >>> I started to read this KIP which contains a lot of material.
> > > > > > >>>
> > > > > > >>> One suggestion:
> > > > > > >>>
> > > > > > >>>     .suppress(
> > > > > > >>>         new Suppression()
> > > > > > >>>
> > > > > > >>>
> > > > > > > >>> Do you think it would be more consistent with the rest of Streams data
> > > > > > > >>> structures to support `of`?
> > > > > > >>>
> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Cheers
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <john@confluent.io> wrote:
> > > > > > >>>
> > > > > > >>>> Hello devs and users,
> > > > > > >>>>
> > > > > > > >>>> Please take some time to consider this proposal for Kafka Streams:
> > > > > > > >>>>
> > > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > > > > >>>>
> > > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > > > >>>>
> > > > > > > >>>> The basic idea is to provide:
> > > > > > > >>>> * more usable control over update rate (vs the current state store caches)
> > > > > > > >>>> * the final-result-for-windowed-computations feature which several people
> > > > > > > >>>> have requested
> > > > > > >>>>
> > > > > > >>>> I look forward to your feedback!
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> -John
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>
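
The marble diagram quoted above (KTable caching with commit interval = 2,
followed by suppress with emitAfter = 3) can be replayed with a small
simulation. This is only a sketch under stated assumptions: time advances in
whole ticks, the cache flushes before the suppress operator emits within the
same tick, and emitAfter is modeled as a fixed emit schedule. The class and
method names are hypothetical and are not part of Kafka Streams.

```java
import java.util.ArrayList;
import java.util.List;

class CacheThenSuppressSketch {
    // Replays one update (k:t) per tick t = 1..lastTick and returns the
    // updates that leave the suppress operator, as "t=<tick>:(k:<value>)".
    static List<String> run(int commitInterval, int emitAfter, int lastTick) {
        Integer cached = null;     // latest value held by the KTable cache
        Integer suppressed = null; // latest value held by the suppress buffer
        List<String> emitted = new ArrayList<>();
        for (int t = 1; t <= lastTick; t++) {
            cached = t; // upstream update (k:t) arrives
            if (t % commitInterval == 0) {
                // The cache flushes its latest value on commit.
                suppressed = cached;
                cached = null;
            }
            if (t % emitAfter == 0 && suppressed != null) {
                // The suppress operator emits whatever it has buffered.
                emitted.add("t=" + t + ":(k:" + suppressed + ")");
                suppressed = null;
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 3, 6)); // prints [t=3:(k:2), t=6:(k:6)]
        System.out.println(run(1, 3, 6)); // prints [t=3:(k:3), t=6:(k:6)]
    }
}
```

The first call reproduces the bottom line of the diagram. The second call
flushes on every tick, which approximates disabling the cache, and emits (k:3)
at time 3 as described in the quoted reply.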

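The changelog behavior described in the quoted reply to Bill (log (k:2) while
the value is buffered, then log (k:null) once it has been emitted) can be
sketched in the same spirit. The in-memory list below stands in for a
changelog topic, and all names are hypothetical. It illustrates the idea only
and does not show how the KIP implements it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SuppressChangelogSketch {
    // One changelog record; a null value is a tombstone ("key no longer buffered").
    static final class Entry {
        final String key;
        final Integer value;

        Entry(String key, Integer value) {
            this.key = key;
            this.value = value;
        }
    }

    final Map<String, Integer> buffer = new HashMap<>();
    final List<Entry> changelog = new ArrayList<>();

    // Buffer an update and log the new buffered value.
    void onUpdate(String key, int value) {
        buffer.put(key, value);
        changelog.add(new Entry(key, value));
    }

    // Emit the buffered value downstream and log a tombstone for the key.
    Integer emit(String key) {
        Integer value = buffer.remove(key);
        if (value != null) {
            changelog.add(new Entry(key, null));
        }
        return value;
    }

    // Rebuild the buffer contents by replaying the changelog from the start.
    static Map<String, Integer> restore(List<Entry> changelog) {
        Map<String, Integer> rebuilt = new HashMap<>();
        for (Entry e : changelog) {
            if (e.value == null) {
                rebuilt.remove(e.key);
            } else {
                rebuilt.put(e.key, e.value);
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        SuppressChangelogSketch s = new SuppressChangelogSketch();
        s.onUpdate("k", 2);
        System.out.println(restore(s.changelog)); // prints {k=2}
        s.emit("k");
        System.out.println(restore(s.changelog)); // prints {}
    }
}
```

Replaying the log at any point yields exactly what the buffer held at that
point, which is the recovery property the reply describes.
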
Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Guozhang,

Thanks for the clarification.

To answer your questions:
1. Yes, specifically Y < X makes sense and is by design.

The scenario is to support IQ queries over windows that are closed but not
evicted. For example, suppose we have a metrics application backed by
Streams. Let's say we do time windows of 1 minute with retention until 30
days, and we compute the final event after an additional 5 minutes.

When you load the app, it builds the graphs from the IQ stores and then
proceeds to live update the page using final updates + foreach. If we
didn't support Y < X, this use case wouldn't work at all.


As far as X < Y, goes, I think that we should set Y = max(Y, X -
windowSize). This is semantically harmless, and gives the best
responsiveness, since that's the earliest time we know the window result is
"final".


Altogether, it seems like you think that people mostly set the window
retention time tightly to when they would want to close the window and emit
a final event, whereas I think that people would set retention time much
greater than than the latest they expect to see an event when they are
using IQ. You have much more experience with this domain than I do, though,
so I feel I'll have to defer to your instincts.



2. I believe so. In my response to Matthias, I gave the example of making
Streams ignore out-of-order events with "suppressLateEvents(Duration.ZERO)".
This was actually something I wished for in the past. More generally, it
seems plausible to me that bounded lateness would be a useful invariant for
some kinds of custom processors or foreach blocks.



I've also been thinking this weekend about the concerns you raised, and I
have some more thoughts. I'll send a separate reply to keep the messages
short.

Thanks for helping to hash this out,
-john


On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hi John,
>
> Regarding the metrics: yeah I think I'm with you that the dropped records
> due to window retention or emit suppression policies should be recorded
> differently, and using this KIP's proposed metric would be fine. If you
> also think we can use this KIP's proposed metrics to cover the window
> retention cased skipping records, then we can include the changes in this
> KIP as well.
>
> Regarding the current proposal, I'm actually not too worried about the
> inconsistency between query semantics and downstream emit semantics. For
> queries, we will always return the current running results of the windows,
> being it partial or final results depending on the window retention time
> anyways, which has nothing to do whether the emitted stream should be one
> final output per key or not. I also agree that having a unified operation
> is generally better for users to focus on leveraging that one only than
> learning about two set of operations. The only question I had is, for final
> updates of window stores, if it is a bit awkward to understand the
> configuration combo. Thinking about this more, I think my root worry in the
> "suppressLateEvents" call for windowed tables, since from a user
> perspective: if my retention time is X which means "pay the cost to allow
> late records up to X to still be applied updating the tables", why would I
> ever want to suppressLateEvents by Y ( < X), to say "do not send the
> updates up to Y, which means the downstream operator or sink topic for this
> stream would actually see a truncated update stream while I've paid larger
> cost for that"; and of course, Y > X would not make sense either as you
> would not see any updates later than X anyways. So in all, my feeling is
> that it makes less sense for windowed table's "suppressLateEvents" with a
> parameter that is not equal to the window retention, and opening the door
> in the current proposal may confuse people with that.
>
> Again, above is just a subjective opinion and probably we can also bring up
> some scenarios that users does want to set X != Y.. but personally I feel
> that even if the semantics for this scenario if intuitive for user to
> understand, doe that really make sense and should we really open the door
> for it. So I think maybe separating the final update in a separate API's
> benefits may overwhelm the advantage of having one uniform definition. And
> for my alternative proposal, the rationale was from both my concern about
> "suppressLateEvents" for windowed store, and Matthias' question about
> "suppressLateEvents" for non-windowed stores, that if it is less meaningful
> for both, we can consider removing it completely and only do
> "IntermediateSuppression" in Suppress instead.
>
> So I'd summarize my thoughts in the following questions:
>
> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
> for windowed stores make sense in practice?
> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
> make sense in practice?
>
>
>
> Guozhang
>
>
> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
>
> > Thanks for the explanation, that does make sense.  I have some questions
> on
> > operations, but I'll just wait for the PR and tests.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Bill,
> > >
> > > Thanks for the review!
> > >
> > > Your question is very much applicable to the KIP and not at all an
> > > implementation detail. Thanks for bringing it up.
> > >
> > > I'm proposing not to change the existing caches and configurations at
> all
> > > (for now).
> > >
> > > Imagine you have a topology like this:
> > > commit.interval.ms = 100
> > >
> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> > >
> > > The first ktable (ktable1) will respect the commit interval and buffer
> > > events for 100ms before logging, storing, or forwarding them (IIRC).
> > > Therefore, the second ktable (suppress) will only see the events at a
> > rate
> > > of once per 100ms. It will apply its own buffering, and emit once per
> > 200ms
> > > This case is pretty trivial because the suppress time is a multiple of
> > the
> > > commit interval.
> > >
> > > When it's not an integer multiple, you'll get behavior like in this
> > marble
> > > diagram:
> > >
> > >
> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >
> > > [ KTable caching with commit interval = 2 ]
> > >
> > > <--------(k:2)---------(k:4)---------(k:6)->
> > >
> > >       [ suppress with emitAfter = 3 ]
> > >
> > > <---------------(k:2)----------------(k:6)->
> > >
> > >
> > > If this behavior isn't desired (for example, if you wanted to emit
> (k:3)
> > at
> > > time 3, I'd recommend setting the "cache.max.bytes.buffering" to 0 or
> > > modifying the topology to disable caching. Then, the behavior is more
> > > simply determined just by the suppress operator.
> > >
> > > Does that seem right to you?
> > >
> > >
> > > Regarding the changelogs, because the suppression operator hangs onto
> > > events for a while, it will need its own changelog. The changelog
> > > should represent the current state of the buffer at all times. So when
> > the
> > > suppress operator sees (k:2), for example, it will log (k:2). When it
> > > later gets to time 3, it's time to emit (k:2) downstream. Because k is
> no
> > > longer buffered, the suppress operator will log (k:null). Thus, when
> > > recovering,
> > > it can rebuild the buffer by reading its changelog.
> > >
> > > What do you think about this?
> > >
> > > Thanks,
> > > -John
> > >
> > >
> > >
> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
> > >
> > > > Hi John,  thanks for the KIP.
> > > >
> > > > Early on in the KIP, you mention the current approaches for
> controlling
> > > the
> > > > rate of downstream records from a KTable, cache size configuration
> and
> > > > commit time.
> > > >
> > > > Will these configuration parameters still be in effect for tables
> that
> > > > don't use suppression?  For tables taking advantage of suppression,
> > will
> > > > these configurations have no impact?
> > > > This last question may be to implementation specific but if the
> > requested
> > > > suppression time is longer than the specified commit time, will the
> > > latest
> > > > record in the suppression buffer get stored in a changelog?
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > > > Thanks for the feedback, Matthias,
> > > > >
> > > > > It seems like in straightforward relational processing cases, it
> > would
> > > > not
> > > > > make sense to bound the lateness of KTables. In general, it seems
> > > better
> > > > to
> > > > > have "guard rails" in place that make it easier to write sensible
> > > > programs
> > > > > than insensible ones.
> > > > >
> > > > > But I'm still going to argue in favor of keeping it for all KTables
> > ;)
> > > > >
> > > > > 1. I believe it is simpler to understand the operator if it has one
> > > > uniform
> > > > > definition, regardless of context. It's well defined and intuitive
> > what
> > > > > will happen when you use late-event suppression on a KTable, so I
> > think
> > > > > nothing surprising or dangerous will happen in that case. From my
> > > > > perspective, having two sets of allowed operations is actually an
> > > > increase
> > > > > in cognitive complexity.
> > > > >
> > > > > 2. To me, it's not crazy to use the operator this way. For example,
> > in
> > > > lieu
> > > > > of full-featured timestamp semantics, I can implement MVCC behavior
> > > when
> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect
> > > that
> > > > > there are other, non-obvious applications of suppressing late
> events
> > on
> > > > > KTables.
> > > > >
> > > > > 3. Not to get too much into implementation details in a KIP
> > discussion,
> > > > but
> > > > > if we did want to make late-event suppression available only on
> > > windowed
> > > > > KTables, we have two enforcement options:
> > > > >   a. check when we build the topology - this would be simple to
> > > > implement,
> > > > > but would be a runtime check. Hopefully, people write tests for
> their
> > > > > topology before deploying them, so the feedback loop isn't
> > > instantaneous,
> > > > > but it's not too long either.
> > > > >   b. add a new WindowedKTable type - this would be a compile time
> > > check,
> > > > > but would also be substantial increase of both interface and code
> > > > > complexity.
> > > > >
> > > > > We should definitely strive to have guard rails protecting against
> > > > > surprising or dangerous behavior. Protecting against programs that
> we
> > > > don't
> > > > > currently predict is a lesser benefit, and I think we can put up
> > guard
> > > > > rails on a case-by-case basis for that. It seems like the increase
> in
> > > > > cognitive (and potentially code and interface) complexity makes me
> > > think
> > > > we
> > > > > should skip this case.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > matthias@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the KIP John.
> > > > > >
> > > > > > One initial comments about the last example "Bounded lateness":
> > For a
> > > > > > non-windowed KTable bounding the lateness does not really make
> > sense,
> > > > > > does it?
> > > > > >
> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
> for
> > > this
> > > > > > case? It seems to be better to only allow it for
> windowed-KTables.
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > >
> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > > I noticed this (lack of primary parameter) as well.
> > > > > > >
> > > > > > > What you gave as new example is semantically the same as what I
> > > > > > suggested.
> > > > > > > So it is good by me.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> john@confluent.io
> > >
> > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for taking look, Ted,
> > > > > > >>
> > > > > > >> I agree this is a departure from the conventions of Streams
> DSL.
> > > > > > >>
> > > > > > >> Most of our config objects have one or two "required"
> > parameters,
> > > > > which
> > > > > > fit
> > > > > > >> naturally with the static factory method approach. TimeWindow,
> > for
> > > > > > example,
> > > > > > >> requires a size parameter, so we can naturally say
> > > > > TimeWindows.of(size).
> > > > > > >>
> > > > > > >> I think in the case of a suppression, there's really no "core"
> > > > > > parameter,
> > > > > > >> and "Suppression.of()" seems sillier than "new
> Suppression()". I
> > > > think
> > > > > > that
> > > > > > >> Suppression.of(duration) would be ambiguous, since there are
> > many
> > > > > > durations
> > > > > > >> that we can configure.
> > > > > > >>
> > > > > > >> However, thinking about it again, I suppose that I can give
> each
> > > > > > >> configuration method a static version, which would let you
> > replace
> > > > > "new
> > > > > > >> Suppression()." with "Suppression." in all the examples.
> > > Basically,
> > > > > > instead
> > > > > > >> of "of()", we'd support any of the methods I listed.
> > > > > > >>
> > > > > > >> For example:
> > > > > > >>
> > > > > > >> windowCounts
> > > > > > >>     .suppress(
> > > > > > >>         Suppression
> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > > >>             .suppressIntermediateEvents(
> > > > > > >>
> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > > >>             )
> > > > > > >>     );
> > > > > > >>
> > > > > > >>
> > > > > > >> Does that seem better?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> -John
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> > > > wrote:
> > > > > > >>
> > > > > > >>> I started to read this KIP which contains a lot of materials.
> > > > > > >>>
> > > > > > >>> One suggestion:
> > > > > > >>>
> > > > > > >>>     .suppress(
> > > > > > >>>         new Suppression()
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Do you think it would be more consistent with the rest of
> > Streams
> > > > > data
> > > > > > >>> structures by supporting `of` ?
> > > > > > >>>
> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Cheers
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <john@confluent.io> wrote:
> > > > > > >>>
> > > > > > >>>> Hello devs and users,
> > > > > > >>>>
> > > > > > > >>>> Please take some time to consider this proposal for Kafka Streams:
> > > > > > >>>>
> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > > > >>>>
> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > > >>>>
> > > > > > >>>> The basic idea is to provide:
> > > > > > > >>>> * more usable control over update rate (vs the current state store caches)
> > > > > > > >>>> * the final-result-for-windowed-computations feature which several people
> > > > > > > >>>> have requested
> > > > > > >>>>
> > > > > > >>>> I look forward to your feedback!
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> -John
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by "Matthias J. Sax" <ma...@confluent.io>.

Thanks for the update. I did a first pass over the updated KIP and think
it makes sense.

-Matthias

On 7/11/18 5:47 PM, John Roesler wrote:
> Hi all,
> 
> I have updated KIP-328 with all the feedback I've gotten so far. Please
> take another look and let me know what you think!
> 
> Thanks,
> -John
> 
> On Wed, Jul 11, 2018 at 12:28 AM Guozhang Wang <wa...@gmail.com> wrote:
> 
>> That is a good point.
>>
>> I cannot think of a better option than documentation and a warning, also
>> given that we'd probably better not reuse the function name `until` for
>> close time.
>>
>>
>> Guozhang
>>
>>
>> On Tue, Jul 10, 2018 at 3:31 PM, John Roesler <jo...@confluent.io> wrote:
>>
>>> I had some opportunity to reflect on the default for close time today...
>>>
>>> Note that the current "close time" is equal to the retention time, and
>>> therefore "close" today shares the default retention of 24h.
>>>
>>> It would definitely break any application that today specifies a retention
>>> time to set close shorter than that time. It's also likely to break apps if
>>> they *don't* set the retention time and rely on the 24h default. So it's
>>> unfortunate, but I think if "close" isn't set, we should use the retention
>>> time instead of a fixed default.
>>>
>>> When we ultimately remove the retention time parameter ("until"), we will
>>> have to set "close" to a default of 24h.
>>>
>>> Of course, this has a negative impact on the user of "final results", since
>>> they won't see any output at all for retentionTime/24h, and may find this
>>> confusing. What can we do about this except document it well? Maybe log a
>>> warning if we see that close wasn't explicitly set while using "final
>>> results"?
>>>
>>> Thanks,
>>> -John
>>>
>>> On Tue, Jul 10, 2018 at 10:46 AM John Roesler <jo...@confluent.io> wrote:
>>>
>>>> Hi Guozhang,
>>>>
>>>> That sounds good to me. I'll include that in the KIP.
>>>>
>>>> Thanks,
>>>> -John
>>>>
>>>> On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang <wa...@gmail.com>
>> wrote:
>>>>
>>>>> Let me clarify a bit on what I meant about moving `retentionPeriod` to
>>>>> WindowStoreBuilder:
>>>>>
>>>>> In another discussion we had around KIP-319 / 330, we noted that the
>>>>> "retention period" should not really be a window spec, but only a window
>>>>> store spec, as it only affects how long to retain each window to be
>>>>> queryable, along with the storage cost.
>>>>>
>>>>> More specifically, today the "maintainMs" returned from Windows is used in
>>>>> three places:
>>>>>
>>>>> 1) for windowed aggregations, it is passed directly into
>>>>> `Stores.persistentWindows()` as the retention period parameter. For this
>>>>> use case we should just let the WindowStoreBuilder specify this value
>>>>> itself.
>>>>>
>>>>> NOTE: It is also used in the KStreamWindowAggregate processor, to
>>>>> determine if a received record should be dropped due to its lateness. We
>>>>> may need to think of another way to get this value inside the processor.
>>>>>
>>>>> 2) for windowed stream-stream join, it is used as the join range parameter
>>>>> but only to check that "windowSizeMs <= retentionPeriodMs". We can do this
>>>>> check at the store builder level instead of at the processor level.
>>>>>
>>>>>
>>>>> If we can remove its usage in both 1) and 2), then we should be able to
>>>>> safely remove this from the `Windows` spec.
>>>>>
>>>>>
>>>>> Guozhang
>>>>>
>>>>>
>>>>> On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io> wrote:
>>>>>
>>>>>> Thanks for the reply, Guozhang,
>>>>>>
>>>>>> Good! I agree, that is also a good reason, and I actually made use of that
>>>>>> in my tests. I'll update the KIP.
>>>>>>
>>>>>> By the way, I chose "allowedLateness" as I was trying to pick a better name
>>>>>> than "close", but I think it's actually the wrong name. We don't want to
>>>>>> bound the lateness of events in general, only with respect to the end of
>>>>>> their window.
>>>>>>
>>>>>> If we have a window [0,10), with "allowedLateness" of 5, then if we get an
>>>>>> event with timestamp 3 at time 9, the name implies we'd reject it, which
>>>>>> seems silly. Really, we'd only want to start rejecting that event at stream
>>>>>> time 15.
>>>>>>
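The window [0,10) example above can be sketched in a few lines. This is only an illustration of the "close measured from the window end" rule as described in the thread; the class and the helper `isWindowClosed` are hypothetical names, not part of the Kafka Streams API or the KIP.

```java
public class WindowCloseSketch {

    // Hypothetical helper (an assumption for illustration): a window stops
    // accepting records once stream time reaches windowEnd + close.
    public static boolean isWindowClosed(long windowEnd, long close, long streamTime) {
        return streamTime >= windowEnd + close;
    }

    public static void main(String[] args) {
        long windowEnd = 10L; // the window [0,10) from the example
        long close = 5L;      // grace measured from the window end

        // An event with timestamp 3 arriving at stream time 9 falls in a window
        // that is still open, so it would be accepted.
        System.out.println("closed at stream time 9:  " + isWindowClosed(windowEnd, close, 9L));

        // The window only starts rejecting events at stream time 15.
        System.out.println("closed at stream time 15: " + isWindowClosed(windowEnd, close, 15L));
    }
}
```

Note that the event's own timestamp (3) never enters the check; only the end of its window and the current stream time do.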
>>>>>> What I meant was more like "allowedLatenessAfterWindowEnd", but that's too
>>>>>> verbose. I think that "close" + some documentation about what it means will
>>>>>> be better.
>>>>>>
>>>>>> 1: "Close" would be measured from the end of the window, so a reasonable
>>>>>> default would be "0". Recall that "close" really only needs to be specified
>>>>>> for final results, and a default of 0 would produce the most intuitive
>>>>>> results. If folks later discover that they are missing some late events,
>>>>>> they can adjust the parameter accordingly. IMHO, any other value would just
>>>>>> be a guess on our part.
>>>>>>
>>>>>> 2a:
>>>>>> I think you're saying to re-use "until" instead of adding "close" to the
>>>>>> window.
>>>>>>
>>>>>> The downside here would be that the semantic change could be more confusing
>>>>>> than deprecating "until" and introducing window "close" and a
>>>>>> "retentionTime" on the store builder. The deprecation is a good, controlled
>>>>>> way for us to make sure people are getting the semantics they think they're
>>>>>> getting, as well as giving us an opportunity to link people to the API they
>>>>>> should use instead.
>>>>>>
>>>>>> I didn't fully understand the second part, but it sounds like you're
>>>>>> suggesting to add a new "retentionTime" setter to Windows to bridge the gap
>>>>>> until we add it to the store builder? That seems kind of roundabout to me,
>>>>>> if that's what you meant. We could just immediately add it to the store
>>>>>> builders in the same PR.
>>>>>>
>>>>>> 2b: Sounds good to me!
>>>>>>
>>>>>> Thanks again,
>>>>>> -John
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com> wrote:
>>>>>>
>>>>>>> John,
>>>>>>>
>>>>>>> Thanks for your replies. As for the two options of the API, I think I'm
>>>>>>> slightly inclined to the first option as well. My motivation is a bit
>>>>>>> different, as I think the first one may be more flexible, for example:
>>>>>>>
>>>>>>> KTable<Windowed<..>> table = ... count();
>>>>>>>
>>>>>>> table.toStream().peek(..);   // want to peek at the changelog stream, do
>>>>>>>                              // not care about final results.
>>>>>>>
>>>>>>> table.suppress().toStream().to("topic");    // sending to a topic, want to
>>>>>>>                                             // only send the final results.
>>>>>>>
>>>>>>> --------------
>>>>>>>
>>>>>>> Besides that, I have a few more minor questions:
>>>>>>>
>>>>>>> 1. For "allowedLateness", what should be the default value? I.e., if the
>>>>>>> user does not specify "allowedLateness" in TimeWindows, what value should
>>>>>>> we set?
>>>>>>>
>>>>>>> 2. For API names, some personal suggestions here:
>>>>>>>
>>>>>>> 2.a) "allowedLateness"  -> "until" (semantics changed, and also value is
>>>>>>> defined as delta on top of window length), where "until" ->
>>>>>>> "retentionPeriod", and the latter will be moved from `Windows` to
>>>>>>> `WindowStoreBuilder` in the future.
>>>>>>>
>>>>>>> 2.b) "BufferConfig" -> "Buffered" ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io> wrote:
>>>>>>>
>>>>>>>> Hey Matthias and Guozhang,
>>>>>>>>
>>>>>>>> Sorry for the slow reply. I was mulling over your feedback and weighing
>>>>>>>> some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337 .
>>>>>>>>
>>>>>>>> Your thought about keeping suppression independent of business logic is a
>>>>>>>> very good one. I agree that it would make more sense to add some kind of
>>>>>>>> "window close" concept to the window definition.
>>>>>>>>
>>>>>>>> In fact, doing that immediately solves the inconsistency problem Guozhang
>>>>>>>> brought up. There's no need to add a "final results" or "emission" option
>>>>>>>> to the windowed aggregation.
>>>>>>>>
>>>>>>>> What do you think about an API more like this:
>>>>>>>>
>>>>>>>> final StreamsBuilder builder = new StreamsBuilder();
>>>>>>>>
>>>>>>>> builder
>>>>>>>>   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
>>>>>>>>   .groupBy(
>>>>>>>>     (String k1, String v1) -> k1,
>>>>>>>>     Serialized.with(STRING_SERDE, STRING_SERDE)
>>>>>>>>   )
>>>>>>>>   .windowedBy(TimeWindows
>>>>>>>>     .of(scaledTime(2L))
>>>>>>>>     .until(scaledTime(3L))
>>>>>>>>     .allowedLateness(scaledTime(1L))
>>>>>>>>   )
>>>>>>>>   .count(Materialized.as("counts"))
>>>>>>>>   .suppress(
>>>>>>>>     emitFinalResultsOnly(
>>>>>>>>       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>>>>>>>>     )
>>>>>>>>   )
>>>>>>>>   .toStream()
>>>>>>>>   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
>>>>>>>>
>>>>>>>> Note that:
>>>>>>>>  * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
>>>>>>>> by the type system at compile time), and it determines the time to wait by
>>>>>>>> looking at "allowedLateness" on the TimeWindows config.
>>>>>>>>  * querying "counts" will produce results (eventually) consistent with
>>>>>>>> what's observable in "output-suppressed".
>>>>>>>>  * in all cases, "suppress" has no effect on business logic, just on event
>>>>>>>> suppression.
>>>>>>>>
>>>>>>>> Is this API straightforward? Or do you still prefer the version that you
>>>>>>>> both proposed:
>>>>>>>>
>>>>>>>>   ...
>>>>>>>>   .windowedBy(TimeWindows
>>>>>>>>     .of(scaledTime(2L))
>>>>>>>>     .until(scaledTime(3L))
>>>>>>>>     .allowedLateness(scaledTime(1L))
>>>>>>>>   )
>>>>>>>>   .count(
>>>>>>>>     Materialized.as("counts"),
>>>>>>>>     emitFinalResultsOnly(
>>>>>>>>       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>>>>>>>>     )
>>>>>>>>   )
>>>>>>>>   ...
>>>>>>>>
>>>>>>>> To me, these two are practically identical, and I still vaguely prefer the
>>>>>>>> first one.
>>>>>>>>
>>>>>>>> The prototype has made it clearer to me that users of "final results for
>>>>>>>> windows" and users of "suppression for table events" both need to configure
>>>>>>>> the suppression buffer.
>>>>>>>>
>>>>>>>> This buffer configuration consists of:
>>>>>>>> 1. how many keys or bytes to keep in memory
>>>>>>>> 2. what to do if memory runs out (shut down, start using disk,
>>> ...)
>>>>>>>>
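The two buffer settings listed above (a bound on what is held in memory, and a policy for when that bound is hit) can be sketched as follows. This is a minimal illustration, not the KIP's implementation: the class, the `put` method, and the `EMIT_EARLY` strategy are assumptions made for the example; only the `SHUT_DOWN` name and the idea of a key bound come from the thread.

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedSuppressionBufferSketch {

    // SHUT_DOWN is named in the thread; EMIT_EARLY is a hypothetical alternative.
    public enum BufferFullStrategy { SHUT_DOWN, EMIT_EARLY }

    private final long maxKeys;
    private final BufferFullStrategy strategy;
    // Insertion-ordered, so the first entry is the oldest buffered key.
    private final Map<String, Long> buffer = new LinkedHashMap<>();

    public BoundedSuppressionBufferSketch(long maxKeys, BufferFullStrategy strategy) {
        if (maxKeys <= 0) {
            throw new IllegalArgumentException("maxKeys must be positive");
        }
        this.maxKeys = maxKeys;
        this.strategy = strategy;
    }

    // Buffers the latest value per key. Returns the entry that had to be
    // emitted early to make room, or null if nothing was evicted.
    public Map.Entry<String, Long> put(String key, long value) {
        boolean isNewKey = !buffer.containsKey(key);
        if (isNewKey && buffer.size() >= maxKeys) {
            if (strategy == BufferFullStrategy.SHUT_DOWN) {
                throw new IllegalStateException("suppression buffer is full");
            }
            // EMIT_EARLY: evict the oldest key and hand it back to the caller.
            Iterator<Map.Entry<String, Long>> it = buffer.entrySet().iterator();
            Map.Entry<String, Long> oldest = new AbstractMap.SimpleImmutableEntry<>(it.next());
            it.remove();
            buffer.put(key, value);
            return oldest;
        }
        buffer.put(key, value); // an update to an existing key just replaces its value
        return null;
    }

    public int size() {
        return buffer.size();
    }
}
```

Updates to a key that is already buffered never trigger the full-buffer policy, which is the property that lets suppression collapse many intermediate updates into one.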
>>>>>>>> So it's not as simple as setting a "final results" flag. We'll either have
>>>>>>>> an "Emit" config object on the windowed aggregators that takes the same
>>>>>>>> BufferConfig as the "Suppress" config on the suppression operator, or we
>>>>>>>> just use the suppression operator for both.
>>>>>>>>
>>>>>>>> Perhaps it would sweeten the deal a little to point out that we have 2
>>>>>>>> overloads already for each windowed aggregator (with and without
>>>>>>>> Materialized). Adding "Emitted" or something would mean that we'd add a new
>>>>>>>> overload for each one, taking us up to 4 overloads each for "count",
>>>>>>>> "aggregate" and "reduce". Using "suppress" means that we don't add any new
>>>>>>>> overloads.
>>>>>>>>
>>>>>>>> Thanks again for helping to hash this out,
>>>>>>>> -John
>>>>>>>>
>>>>>>>> On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wangguoz@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I think I agree with Matthias on having dedicated APIs for the windowed
>>>>>>>>> operation final output scenario, PLUS separating the window close, which
>>>>>>>>> the "final output" would rely on, from the window retention time itself
>>>>>>>>> (admittedly it would make this KIP effort larger, but if we believe we need
>>>>>>>>> to do this separation anyway we could just do it now).
>>>>>>>>>
>>>>>>>>> And then we can have the `KTable#suppress()` for intermediate-suppression
>>>>>>>>> only, not for late-record-suppression, until we've seen that it becomes a
>>>>>>>>> common feature request, because our current design still allows it to be
>>>>>>>>> extended for that purpose.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guozhang
>>>>>>>>>
>>>>>>>>> On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <matthias@confluent.io> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the discussion. I am just catching up.
>>>>>>>>>>
>>>>>>>>>> In general, I think we have different use cases, and non-windowed and
>>>>>>>>>> windowed are quite different. For the non-windowed case, suppress() has
>>>>>>>>>> no (useful) close or retention time, no final semantics, and also no
>>>>>>>>>> business logic impact.
>>>>>>>>>>
>>>>>>>>>> On the other hand, for windowed aggregations, close time and final
>>>>>>>>>> result do have a meaning. IMHO, `close()` is part of business logic
>>>>>>>>>> while retention time is not. Also, suppression of intermediate results is
>>>>>>>>>> not a business rule, and there might be use cases for which either only
>>>>>>>>>> "early intermediate" results (before window end time) are suppressed, or
>>>>>>>>>> all intermediates are suppressed (maybe also something in the middle, i.e.,
>>>>>>>>>> just reduce the load of intermediate updates). Thus, window-suppression
>>>>>>>>>> is much richer.
>>>>>>>>>>
>>>>>>>>>> IMHO, a generic `suppress()` operator that can be inserted into the data
>>>>>>>>>> flow at any point is useful. Maybe we should keep it as generic as
>>>>>>>>>> possible. However, it might be difficult to use with regard to
>>>>>>>>>> windowing, as the mental effort to use it is high.
>>>>>>>>>>
>>>>>>>>>> With regard to Guozhang's comment:
>>>>>>>>>>
>>>>>>>>>>> we will actually
>>>>>>>>>>> process data as old as 30 days as well, while most of the late updates
>>>>>>>>>>> beyond 5 minutes would be discarded anyways.
>>>>>>>>>>
>>>>>>>>>> If we use `suppress()` as a standalone operator, this is correct and
>>>>>>>>>> intended IMHO. To address the issue if the behavior is unwanted, I would
>>>>>>>>>> suggest to add a "suppress option" directly to the
>>>>>>>>>> `count()/reduce()/aggregate()` window operators, similar to
>>>>>>>>>> `Materialized`. This would be an "embedded suppress" and avoid the
>>>>>>>>>> issue. It would also address the issue about mental effort for the "single
>>>>>>>>>> final window result" use case.
>>>>>>>>>>
>>>>>>>>>> I also think that a shorter close-time than retention time is useful for
>>>>>>>>>> window aggregation. If we add close() to the window definition and
>>>>>>>>>> until() to `Materialized`, we can separate both correctly IMHO.
>>>>>>>>>>
>>>>>>>>>> About setting `close = min(close,retention)` I am not sure. We might
>>>>>>>>>> rather throw an exception than reduce the close time automatically.
>>>>>>>>>> Otherwise, I foresee many user questions about "I set close to X but it
>>>>>>>>>> does not get updated for some data that is with delay of X".
>>>>>>>>>>
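The "throw rather than silently clamp" preference above could look roughly like this. It is only a sketch of the validation idea; `validatedClose` and its parameter names are hypothetical and do not correspond to any existing Kafka Streams method.

```java
public class CloseTimeValidationSketch {

    // Hypothetical check: fail fast instead of applying
    // close = min(close, retention) behind the user's back.
    public static long validatedClose(long closeMs, long retentionMs) {
        if (closeMs < 0) {
            throw new IllegalArgumentException("close must not be negative");
        }
        if (closeMs > retentionMs) {
            throw new IllegalArgumentException(
                "close (" + closeMs + " ms) must not exceed retention (" + retentionMs + " ms)");
        }
        return closeMs;
    }

    public static void main(String[] args) {
        long retentionMs = 24L * 60 * 60 * 1000; // one day
        // A 5-minute close is within retention and is accepted unchanged.
        System.out.println(validatedClose(5L * 60 * 1000, retentionMs));
        // A close longer than retention is rejected explicitly.
        try {
            validatedClose(retentionMs + 1, retentionMs);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at configuration time surfaces the mismatch immediately, instead of as missing updates for late data at runtime.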
>>>>>>>>>> The tricky question might be to design the API in a backward compatible
>>>>>>>>>> way though.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -Matthias
>>>>>>>>>>
>>>>>>>>>> On 7/3/18 5:38 AM, John Roesler wrote:
>>>>>>>>>>> Hi Guozhang,
>>>>>>>>>>>
>>>>>>>>>>> I see. It seems like if we want to decouple 1) and 2), we need to alter the
>>>>>>>>>>> definition of the window. Do you think it would close the gap if we added a
>>>>>>>>>>> "window close" time to the window definition?
>>>>>>>>>>>
>>>>>>>>>>> Such as:
>>>>>>>>>>>
>>>>>>>>>>> builder.stream("input")
>>>>>>>>>>> .groupByKey()
>>>>>>>>>>> .windowedBy(
>>>>>>>>>>>   TimeWindows
>>>>>>>>>>>     .of(60_000)
>>>>>>>>>>>     .closeAfter(10 * 60 * 1000)
>>>>>>>>>>>     .until(30L * 24 * 60 * 60 * 1000)
>>>>>>>>>>> )
>>>>>>>>>>> .count()
>>>>>>>>>>> .suppress(Suppression.finalResultsOnly());
>>>>>>>>>>>
>>>>>>>>>>> Possibly called "finalResultsAtWindowClose" or something?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wangguoz@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey John,
>>>>>>>>>>>>
>>>>>>>>>>>> Obviously I'm too lazy on email replying diligence compared with you :)
>>>>>>>>>>>> Will try to reply to them separately:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
>>>>>>>>>>>>
>>>>>>>>>>>> I'm aware of this use case, but again, the concern is that, in this
>>>>>>>>>>>> setting, in order to let the window be queryable for 30 days, we will
>>>>>>>>>>>> actually process data as old as 30 days as well, while most of the late
>>>>>>>>>>>> updates beyond 5 minutes would be discarded anyways. Personally I think for
>>>>>>>>>>>> the final update scenario, the ideal situation users would want is "do not
>>>>>>>>>>>> process any data that is more than 5 minutes late, and of course send no
>>>>>>>>>>>> update records downstream later than 5 minutes either; but retain the
>>>>>>>>>>>> window to be queryable for 30 days". And by doing that the final window
>>>>>>>>>>>> snapshot would also be aligned with the update stream. In other
>>>>>>>>>>>> words, among these three periods:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) the retention length of the window / table.
>>>>>>>>>>>> 2) the late records acceptance for updating the window.
>>>>>>>>>>>> 3) the late records update to be sent downstream.
>>>>>>>>>>>>
>>>>>>>>>>>> Final update use cases would naturally want 2) = 3), while 1) may be
>>>>>>>>>>>> different and larger, while what we provide now is that 1) = 2), which
>>>>>>>>>>>> could be different and in practice larger than 3), hence not the most
>>>>>>>>>>>> intuitive for their needs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
>>>>>>>>>>>>
>>>>>>>>>>>> I also prefer option 2) over option 1) from a programming pov. But
>>>>>>>>>>>> I'm wondering whether option 2) would provide the above semantics, or
>>>>>>>>>>>> whether it is still coupling 1) with 2) as well?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <john@confluent.io> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> In fact, to push the idea further (which IIRC is what Matthias originally
>>>>>>>>>>>>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
>>>>>>>>>>>>> email, then we could also consider whether to eliminate
>>>>>>>>>>>>> "suppressLateEvents" entirely.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We could always add it later, but you've both expressed doubt that there
>>>>>>>>>>>>> are practical use cases for it outside of final-results.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <john@confluent.io> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi again, Guozhang ;) Here's the second part of my response...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems like your main concern is: "if I'm a user who wants final update
>>>>>>>>>>>>>> semantics, how complicated is it for me to get it?"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think we have to assume that people don't always have time to become
>>>>>>>>>>>>>> deeply familiar with all the nuances of a programming environment before
>>>>>>>>>>>>>> they use it. Especially if they're evaluating several frameworks for their
>>>>>>>>>>>>>> use case, it's very valuable to make it as obvious as possible how to
>>>>>>>>>>>>>> accomplish various computations with Streams.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To me the biggest question is whether with a fresh perspective, people
>>>>>>>>>>>>>> would say "oh, I get it, I have to bound my lateness and suppress
>>>>>>>>>>>>>> intermediate updates, and of course I'll get only the final result!", or if
>>>>>>>>>>>>>> it's more like "wtf? all I want is the final result, what are all these
>>>>>>>>>>>>>> parameters?".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was talking with Matthias a while back, and he had an idea that I think
>>>>>>>>>>>>>> can help, which is to essentially set up a final-result recipe in addition
>>>>>>>>>>>>>> to the raw parameters. I previously thought that it wouldn't be possible to
>>>>>>>>>>>>>> restrict its usage to Windowed KTables, but thinking about it again this
>>>>>>>>>>>>>> weekend, I have a couple of ideas:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ================
>>>>>>>>>>>>>> = 1. Static Wrapper =
>>>>>>>>>>>>>> ================
>>>>>>>>>>>>>> We can define an extra static function that "wraps" a KTable with
>>>>>>>>>>>>>> final-result semantics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
>>>>>>>>>>>>>>   final KTable<K, V> windowedKTable,
>>>>>>>>>>>>>>   final Duration maxAllowedLateness,
>>>>>>>>>>>>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
>>>>>>>>>>>>>>     return windowedKTable.suppress(
>>>>>>>>>>>>>>         Suppression.suppressLateEvents(maxAllowedLateness)
>>>>>>>>>>>>>>                    .suppressIntermediateEvents(
>>>>>>>>>>>>>>                      IntermediateSuppression
>>>>>>>>>>>>>>                        .emitAfter(maxAllowedLateness)
>>>>>>>>>>>>>>                        .bufferFullStrategy(bufferFullStrategy)
>>>>>>>>>>>>>>                    )
>>>>>>>>>>>>>>     );
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Because windowedKTable is a parameter, the static function can easily
>>>>>>>>>>>>>> impose an extra bound on the key type, that it extends Windowed. This would
>>>>>>>>>>>>>> make "final results only" only available on windowed ktables.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's how it would look to use:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>>>>>>>>>>>>>> final KTable<Windowed<Integer>, Long> finalCounts =
>>>>>>>>>>>>>>   finalResultsOnly(
>>>>>>>>>>>>>>     windowCounts,
>>>>>>>>>>>>>>     Duration.ofMinutes(10),
>>>>>>>>>>>>>>     Suppression.BufferFullStrategy.SHUT_DOWN
>>>>>>>>>>>>>>   );
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Trying to use it on a non-windowed KTable yields:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Error:(129, 35) java: method finalResultsOnly in class
>>>>>>>>>>>>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
>>>>>>>>>>>>>>> applied to given types;
>>>>>>>>>>>>>>>   required:
>>>>>>>>>>>>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>>>>>>>>>>>   found:
>>>>>>>>>>>>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>>>>>>>>>>>   reason: inference variable K has incompatible bounds
>>>>>>>>>>>>>>>     equality constraints: java.lang.String
>>>>>>>>>>>>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> =================================================
>>>>>>>>>>>>>> = 2. Add <K,V> parameters and recipe method to Suppression =
>>>>>>>>>>>>>> =================================================
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> By adding K,V parameters to Suppression, we can provide a similarly
>>>>>>>>>>>>>> bounded config method directly on the Suppression class:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public static <K extends Windowed, V> Suppression<K, V> finalResultsOnly(
>>>>>>>>>>>>>>   final Duration maxAllowedLateness,
>>>>>>>>>>>>>>   final BufferFullStrategy bufferFullStrategy) {
>>>>>>>>>>>>>>     return Suppression
>>>>>>>>>>>>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
>>>>>>>>>>>>>>         .suppressIntermediateEvents(IntermediateSuppression
>>>>>>>>>>>>>>             .emitAfter(maxAllowedLateness)
>>>>>>>>>>>>>>             .bufferFullStrategy(bufferFullStrategy)
>>>>>>>>>>>>>>         );
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then, here's how it would look to use it:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>>>>>>>>>>>>>> final KTable<Windowed<Integer>, Long> finalCounts =
>>>>>>>>>>>>>>   windowCounts.suppress(
>>>>>>>>>>>>>>     Suppression.finalResultsOnly(
>>>>>>>>>>>>>>       Duration.ofMinutes(10),
>>>>>>>>>>>>>>       Suppression.BufferFullStrategy.SHUT_DOWN
>>>>>>>>>>>>>>     )
>>>>>>>>>>>>>>   );
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Trying to use it on a non-windowed ktable yields:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Error:(127, 35) java: method finalResultsOnly in class
>>>>>>>>>>>>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
>>>>>>>>>>>>>>> given types;
>>>>>>>>>>>>>>>   required:
>>>>>>>>>>>>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>>>>>>>>>>>   found:
>>>>>>>>>>>>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>>>>>>>>>>>   reason: explicit type argument java.lang.String does not conform to
>>>>>>>>>>>>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ============
>>>>>>>>>>>>>> = Downsides =
>>>>>>>>>>>>>> ============
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Of course, there's a downside either way:
>>>>>>>>>>>>>> * for 1: this "wrapper" interaction would be the first in the DSL. Is it
>>>>>>>>>>>>>> too strange, and how discoverable would it be?
>>>>>>>>>>>>>> * for 2: adding those type parameters to Suppression will force all
>>>>>>>>>>>>>> callers to provide them in the event of a chained construction because Java
>>>>>>>>>>>>>> doesn't do RHS recursive type inference. This is already visible in other
>>>>>>>>>>>>>> parts of the Streams DSL. For example, often calls to Materialized builders
>>>>>>>>>>>>>> have to provide seemingly obvious type bounds.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ============
>>>>>>>>>>>>>> = Conclusion =
>>>>>>>>>>>>>> ============
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think option 2 is more "normal" and discoverable. It does have a
>>>>>>>>>>>>>> downside, but it's one that's pre-existing elsewhere in the DSL.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> WDYT? Would the addition of this "recipe" method to Suppression resolve
>>>>>>>>>>>>>> your concern?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wangguoz@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regarding the metrics: yeah I think I'm with you that the dropped records
>>>>>>>>>>>>>>> due to window retention or emit suppression policies should be recorded
>>>>>>>>>>>>>>> differently, and using this KIP's proposed metric would be fine. If you
>>>>>>>>>>>>>>> also think we can use this KIP's proposed metrics to cover the records
>>>>>>>>>>>>>>> skipped because of window retention, then we can include the changes in
>>>>>>>>>>>>>>> this KIP as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regarding the current proposal, I'm actually not too worried about the
>>>>>>>>>>>>>>> inconsistency between query semantics and downstream emit semantics. For
>>>>>>>>>>>>>>> queries, we will always return the current running results of the windows,
>>>>>>>>>>>>>>> be they partial or final results depending on the window retention time
>>>>>>>>>>>>>>> anyways, which has nothing to do with whether the emitted stream should be
>>>>>>>>>>>>>>> one final output per key or not. I also agree that having a unified
>>>>>>>>>>>>>>> operation is generally better, since users can focus on leveraging that one
>>>>>>>>>>>>>>> only rather than learning about two sets of operations. The only question I
>>>>>>>>>>>>>>> had is, for final updates of window stores, whether it is a bit awkward to
>>>>>>>>>>>>>>> understand the configuration combo. Thinking about this more, I think my
>>>>>>>>>>>>>>> root worry is the "suppressLateEvents" call for windowed tables, since from
>>>>>>>>>>>>>>> a user perspective: if my retention time is X, which means "pay the cost to
>>>>>>>>>>>>>>> allow late records up to X to still be applied updating the tables", why
>>>>>>>>>>>>>>> would I ever want to suppressLateEvents by Y ( < X), to say "do not send
>>>>>>>>>>>>>>> the updates up to Y, which means the downstream operator or sink topic for
>>>>>>>>>>>>>>> this stream would actually see a truncated update stream while I've paid
>>>>>>>>>>>>>>> larger cost for that"; and of course, Y > X would not make sense either as
>>>>>>>>>>>>>>> you would not see any updates later than X anyways. So in all, my feeling
>>>>>>>>>>>>>>> is that it makes less sense for windowed table's "suppressLateEvents" with
>>>>>>>>>>>>>>> a parameter that is not equal to the window retention, and opening the door
>>>>>>>>>>>>>>> in the current proposal may confuse people with that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Again, above is just a subjective opinion and probably
>>> we
>>>>> can
>>>>>>>> also
>>>>>>>>>>>> bring
>>>>>>>>>>>>>>> up
>>>>>>>>>>>>>>> some scenarios that users do want to set X != Y..
>> but
>>>>>>>> personally
>>>>>>>>> I
>>>>>>>>>>>>> feel
>>>>>>>>>>>>>>> that even if the semantics for this scenario is
>>> intuitive
>>>>> for
>>>>>>>> user
>>>>>>>>> to
>>>>>>>>>>>>>>> understand, does that really make sense and should we
>>>>> really
>>>>>>> open
>>>>>>>>> the
>>>>>>>>>>>>> door
>>>>>>>>>>>>>>> for it. So I think maybe separating the final update
>> in
>>> a
>>>>>>>> separate
>>>>>>>>>>>> API's
>>>>>>>>>>>>>>> benefits may overwhelm the advantage of having one
>>> uniform
>>>>>>>>>> definition.
>>>>>>>>>>>>> And
>>>>>>>>>>>>>>> for my alternative proposal, the rationale was from
>> both
>>>>> my
>>>>>>>> concern
>>>>>>>>>>>>> about
>>>>>>>>>>>>>>> "suppressLateEvents" for windowed store, and Matthias'
>>>>>> question
>>>>>>>>> about
>>>>>>>>>>>>>>> "suppressLateEvents" for non-windowed stores, that if
>> it
>>>>> is
>>>>>>> less
>>>>>>>>>>>>>>> meaningful
>>>>>>>>>>>>>>> for both, we can consider removing it completely and
>>> only
>>>>> do
>>>>>>>>>>>>>>> "IntermediateSuppression" in Suppress instead.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I'd summarize my thoughts in the following
>> questions:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
>>>>>>>>>>>>>>> retention time) for windowed stores make sense in practice?
>>>>>>>>>>>>>>> 2. Does "suppressLateEvents" with any parameter Y for
>>>>>>>>>>>>>>> non-windowed stores make sense in practice?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
>>>>>>> bbejeck@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the explanation, that does make sense.  I
>>> have
>>>>>> some
>>>>>>>>>>>>>>> questions on
>>>>>>>>>>>>>>>> operations, but I'll just wait for the PR and tests.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
>>>>>>> john@confluent.io
>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Bill,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for the review!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Your question is very much applicable to the KIP and
>>>>> not at
>>>>>>> all
>>>>>>>>> an
>>>>>>>>>>>>>>>>> implementation detail. Thanks for bringing it up.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm proposing not to change the existing caches and
>>>>>>>>> configurations
>>>>>>>>>>>>> at
>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> (for now).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Imagine you have a topology like this:
>>>>>>>>>>>>>>>>> commit.interval.ms = 100
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The first ktable (ktable1) will respect the commit interval and
>>>>>>>>>>>>>>>>> buffer events for 100ms before logging, storing, or forwarding
>>>>>>>>>>>>>>>>> them (IIRC). Therefore, the second ktable (suppress) will only
>>>>>>>>>>>>>>>>> see the events at a rate of once per 100ms. It will apply its
>>>>>>>>>>>>>>>>> own buffering, and emit once per 200ms. This case is pretty
>>>>>>>>>>>>>>>>> trivial because the suppress time is a multiple of the commit
>>>>>>>>>>>>>>>>> interval.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When it's not an integer multiple, you'll get behavior like in
>>>>>>>>>>>>>>>>> this marble diagram:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [ KTable caching with commit interval = 2 ]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>       [ suppress with emitAfter = 3 ]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> <---------------(k:2)----------------(k:6)->
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
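[Editorial illustration, not part of the original message.] The marble diagram above can be reproduced with a small tick-based model. This is only a sketch under simplifying assumptions: time advances in integer ticks, one update (k:t) arrives per tick, and each stage forwards its latest buffered value whenever the tick is a multiple of its interval. The class and method names are hypothetical and are not Kafka Streams APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: a cache flushing every commitInterval ticks feeds a suppress buffer flushing every emitAfter ticks. */
class MarbleSketch {

    /** Returns "t=<tick> k:<value>" for every record that leaves the suppress stage. */
    static List<String> run(int ticks, int commitInterval, int emitAfter) {
        List<String> emitted = new ArrayList<>();
        Integer cached = null;    // latest value held by the (hypothetical) KTable cache
        Integer buffered = null;  // latest value held by the (hypothetical) suppress buffer
        for (int t = 1; t <= ticks; t++) {
            cached = t;                               // upstream update (k:t) arrives
            if (t % commitInterval == 0) {            // cache flushes on commit
                buffered = cached;
                cached = null;
            }
            if (t % emitAfter == 0 && buffered != null) {  // suppress flushes
                emitted.add("t=" + t + " k:" + buffered);
                buffered = null;
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(6, 2, 3)); // caching with commit interval 2, emitAfter 3
        System.out.println(run(6, 1, 3)); // cache effectively disabled
    }
}
```

With a commit interval of 2 the model emits (k:2) at tick 3 and (k:6) at tick 6, as in the diagram. With a commit interval of 1 (standing in for a disabled cache) it emits (k:3) at tick 3 instead.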
>>>>>>>>>>>>>>>>> If this behavior isn't desired (for example, if you wanted to
>>>>>>>>>>>>>>>>> emit (k:3) at time 3), I'd recommend setting
>>>>>>>>>>>>>>>>> "cache.max.bytes.buffering" to 0 or modifying the topology to
>>>>>>>>>>>>>>>>> disable caching. Then, the behavior is more simply determined
>>>>>>>>>>>>>>>>> just by the suppress operator.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Does that seem right to you?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regarding the changelogs, because the suppression operator
>>>>>>>>>>>>>>>>> hangs onto events for a while, it will need its own changelog.
>>>>>>>>>>>>>>>>> The changelog should represent the current state of the buffer
>>>>>>>>>>>>>>>>> at all times. So when the suppress operator sees (k:2), for
>>>>>>>>>>>>>>>>> example, it will log (k:2). When it later gets to time 3, it's
>>>>>>>>>>>>>>>>> time to emit (k:2) downstream. Because k is no longer buffered,
>>>>>>>>>>>>>>>>> the suppress operator will log (k:null). Thus, when recovering,
>>>>>>>>>>>>>>>>> it can rebuild the buffer by reading its changelog.
>>>>>>>>>>>>>>>>>
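[Editorial illustration, not part of the original message.] The changelog scheme described above (log the value while a key is buffered, log a null tombstone once it is emitted, replay to recover) can be sketched in a few lines. This assumes an in-memory list standing in for the changelog topic; all names are hypothetical and none of this is the actual Kafka Streams implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy suppress buffer whose changelog always mirrors the buffer's current contents. */
class SuppressChangelogSketch {

    /** One changelog record; a null value is a tombstone meaning "no longer buffered". */
    static final class Entry {
        final String key;
        final Integer value;
        Entry(String key, Integer value) { this.key = key; this.value = value; }
    }

    final Map<String, Integer> buffer = new LinkedHashMap<>();
    final List<Entry> changelog = new ArrayList<>();

    /** Buffer an update and log it, e.g. (k:2). */
    void onUpdate(String key, int value) {
        buffer.put(key, value);
        changelog.add(new Entry(key, value));
    }

    /** Emit a key downstream and log a tombstone, e.g. (k:null). Returns null if the key was not buffered. */
    Integer emit(String key) {
        Integer value = buffer.remove(key);
        if (value != null) {
            changelog.add(new Entry(key, null));
        }
        return value;
    }

    /** Rebuild the buffer contents by replaying the changelog from the start. */
    static Map<String, Integer> restore(List<Entry> changelog) {
        Map<String, Integer> rebuilt = new LinkedHashMap<>();
        for (Entry e : changelog) {
            if (e.value == null) {
                rebuilt.remove(e.key);
            } else {
                rebuilt.put(e.key, e.value);
            }
        }
        return rebuilt;
    }
}
```

Replaying after onUpdate("k", 2) rebuilds a buffer holding k=2. Replaying after the subsequent emit("k") rebuilds an empty buffer, so an already-emitted record is not emitted again on recovery.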
>>>>>>>>>>>>>>>>> What do you think about this?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
>>>>>>> bbejeck@gmail.com
>>>>>>>>>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi John,  thanks for the KIP.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Early on in the KIP, you mention the current
>>> approaches
>>>>>> for
>>>>>>>>>>>>>>> controlling
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> rate of downstream records from a KTable, cache
>> size
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> commit time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Will these configuration parameters still be in
>>> effect
>>>>> for
>>>>>>>>>>>> tables
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> don't use suppression?  For tables taking advantage
>>> of
>>>>>>>>>>>>> suppression,
>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>> these configurations have no impact?
>>>>>>>>>>>>>>>>>> This last question may be too implementation
>> specific
>>>>> but
>>>>>> if
>>>>>>>> the
>>>>>>>>>>>>>>>> requested
>>>>>>>>>>>>>>>>>> suppression time is longer than the specified
>> commit
>>>>> time,
>>>>>>>> will
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> latest
>>>>>>>>>>>>>>>>>> record in the suppression buffer get stored in a
>>>>>> changelog?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
>>>>>>>> john@confluent.io
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for the feedback, Matthias,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It seems like in straightforward relational
>>> processing
>>>>>>> cases,
>>>>>>>>>>>> it
>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> make sense to bound the lateness of KTables. In
>>>>> general,
>>>>>> it
>>>>>>>>>>>>> seems
>>>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> have "guard rails" in place that make it easier to
>>>>> write
>>>>>>>>>>>>> sensible
>>>>>>>>>>>>>>>>>> programs
>>>>>>>>>>>>>>>>>>> than insensible ones.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> But I'm still going to argue in favor of keeping
>> it
>>>>> for
>>>>>> all
>>>>>>>>>>>>>>> KTables
>>>>>>>>>>>>>>>> ;)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1. I believe it is simpler to understand the
>>> operator
>>>>> if
>>>>>> it
>>>>>>>>>>>> has
>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>> uniform
>>>>>>>>>>>>>>>>>>> definition, regardless of context. It's well
>> defined
>>>>> and
>>>>>>>>>>>>> intuitive
>>>>>>>>>>>>>>>> what
>>>>>>>>>>>>>>>>>>> will happen when you use late-event suppression
>> on a
>>>>>>> KTable,
>>>>>>>>>>>> so
>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>> think
>>>>>>>>>>>>>>>>>>> nothing surprising or dangerous will happen in
>> that
>>>>> case.
>>>>>>>> From
>>>>>>>>>>>>> my
>>>>>>>>>>>>>>>>>>> perspective, having two sets of allowed operations
>>> is
>>>>>>>> actually
>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>> increase
>>>>>>>>>>>>>>>>>>> in cognitive complexity.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2. To me, it's not crazy to use the operator this
>>> way.
>>>>>> For
>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> lieu
>>>>>>>>>>>>>>>>>>> of full-featured timestamp semantics, I can
>>> implement
>>>>>> MVCC
>>>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>> building a KTable by
>>>>> "suppressLateEvents(Duration.ZERO)".
>>>>>> I
>>>>>>>>>>>>>>> suspect
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>> there are other, non-obvious applications of
>>>>> suppressing
>>>>>>> late
>>>>>>>>>>>>>>> events
>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>> KTables.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 3. Not to get too much into implementation details
>>> in
>>>>> a
>>>>>> KIP
>>>>>>>>>>>>>>>> discussion,
>>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>> if we did want to make late-event suppression
>>>>> available
>>>>>>> only
>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>> windowed
>>>>>>>>>>>>>>>>>>> KTables, we have two enforcement options:
>>>>>>>>>>>>>>>>>>>   a. check when we build the topology - this would
>>> be
>>>>>>> simple
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> implement,
>>>>>>>>>>>>>>>>>>> but would be a runtime check. Hopefully, people
>>> write
>>>>>> tests
>>>>>>>>>>>> for
>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>>>>>> topology before deploying them, so the feedback
>> loop
>>>>>> isn't
>>>>>>>>>>>>>>>>> instantaneous,
>>>>>>>>>>>>>>>>>>> but it's not too long either.
>>>>>>>>>>>>>>>>>>>   b. add a new WindowedKTable type - this would
>> be a
>>>>>>> compile
>>>>>>>>>>>>> time
>>>>>>>>>>>>>>>>> check,
>>>>>>>>>>>>>>>>>>> but would also be substantial increase of both
>>>>> interface
>>>>>>> and
>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We should definitely strive to have guard rails
>>>>>> protecting
>>>>>>>>>>>>> against
>>>>>>>>>>>>>>>>>>> surprising or dangerous behavior. Protecting
>> against
>>>>>>> programs
>>>>>>>>>>>>>>> that we
>>>>>>>>>>>>>>>>>> don't
>>>>>>>>>>>>>>>>>>> currently predict is a lesser benefit, and I think
>>> we
>>>>> can
>>>>>>> put
>>>>>>>>>>>> up
>>>>>>>>>>>>>>>> guard
>>>>>>>>>>>>>>>>>>> rails on a case-by-case basis for that. It seems
>>> like
>>>>> the
>>>>>>>>>>>>>>> increase in
>>>>>>>>>>>>>>>>>>> cognitive (and potentially code and interface)
>>>>> complexity
>>>>>>>>>>>> makes
>>>>>>>>>>>>> me
>>>>>>>>>>>>>>>>> think
>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>> should skip this case.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
>>>>>>>>>>>>>>>>> matthias@confluent.io>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for the KIP John.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> One initial comment about the last example
>>> "Bounded
>>>>>>>>>>>>> lateness":
>>>>>>>>>>>>>>>> For a
>>>>>>>>>>>>>>>>>>>> non-windowed KTable bounding the lateness does
>> not
>>>>>> really
>>>>>>>>>>>> make
>>>>>>>>>>>>>>>> sense,
>>>>>>>>>>>>>>>>>>>> does it?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thus, I am wondering if we should allow
>>>>>>>>>>>> `suppressLateEvents()`
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> case? It seems to be better to only allow it for
>>>>>>>>>>>>>>> windowed-KTables.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
>>>>>>>>>>>>>>>>>>>>> I noticed this (lack of primary parameter) as
>>> well.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What you gave as new example is semantically the
>>>>> same
>>>>>> as
>>>>>>>>>>>>> what
>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>>>> suggested.
>>>>>>>>>>>>>>>>>>>>> So it is good by me.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
>>>>>>>>>>>>>>> john@confluent.io
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for taking a look, Ted,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I agree this is a departure from the
>> conventions
>>> of
>>>>>>>>>>>> Streams
>>>>>>>>>>>>>>> DSL.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Most of our config objects have one or two
>>>>> "required"
>>>>>>>>>>>>>>>> parameters,
>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>> fit
>>>>>>>>>>>>>>>>>>>>>> naturally with the static factory method
>>> approach.
>>>>>>>>>>>>>>> TimeWindows,
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>>>>>>>> requires a size parameter, so we can naturally
>>> say
>>>>>>>>>>>>>>>>>>> TimeWindows.of(size).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think in the case of a suppression, there's
>>>>> really
>>>>>> no
>>>>>>>>>>>>>>> "core"
>>>>>>>>>>>>>>>>>>>> parameter,
>>>>>>>>>>>>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
>>>>>>>>>>>>>>> Suppression()". I
>>>>>>>>>>>>>>>>>> think
>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>> Suppression.of(duration) would be ambiguous,
>>> since
>>>>>> there
>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>> many
>>>>>>>>>>>>>>>>>>>> durations
>>>>>>>>>>>>>>>>>>>>>> that we can configure.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> However, thinking about it again, I suppose
>> that
>>> I
>>>>> can
>>>>>>>>>>>> give
>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>>> configuration method a static version, which
>>> would
>>>>> let
>>>>>>>>>>>> you
>>>>>>>>>>>>>>>> replace
>>>>>>>>>>>>>>>>>>> "new
>>>>>>>>>>>>>>>>>>>>>> Suppression()." with "Suppression." in all the
>>>>>> examples.
>>>>>>>>>>>>>>>>> Basically,
>>>>>>>>>>>>>>>>>>>> instead
>>>>>>>>>>>>>>>>>>>>>> of "of()", we'd support any of the methods I
>>>>> listed.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For example:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> windowCounts
>>>>>>>>>>>>>>>>>>>>>>     .suppress(
>>>>>>>>>>>>>>>>>>>>>>         Suppression
>>>>>>>>>>>>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
>>>>>>>>>>>>>>>>>>>>>>             .suppressIntermediateEvents(
>>>>>>>>>>>>>>>>>>>>>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>>>>>>>>>>>>>>>>>>>>>>             )
>>>>>>>>>>>>>>>>>>>>>>     );
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Does that seem better?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
>>>>>>>>>>>>> yuzhihong@gmail.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I started to read this KIP which contains a
>> lot
>>> of
>>>>>>>>>>>>>>> materials.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> One suggestion:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>     .suppress(
>>>>>>>>>>>>>>>>>>>>>>>         new Suppression()
>>>>>>>>>>>>>>>>>>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi all,

I have updated KIP-328 with all the feedback I've gotten so far. Please
take another look and let me know what you think!

Thanks,
-John

On Wed, Jul 11, 2018 at 12:28 AM Guozhang Wang <wa...@gmail.com> wrote:

> That is a good point..
>
> I cannot think of a better option than documentation and a warning, and
> also given that we'd probably better not reuse the function name `until`
> for close time.
>
>
> Guozhang
>
>
> On Tue, Jul 10, 2018 at 3:31 PM, John Roesler <jo...@confluent.io> wrote:
>
> > I had some opportunity to reflect on the default for close time today...
> >
> > Note that the current "close time" is equal to the retention time, and
> > therefore "close" today shares the default retention of 24h.
> >
> > It would definitely break any application that today specifies a
> > retention time to set close shorter than that time. It's also likely to
> > break apps if they *don't* set the retention time and rely on the 24h
> > default. So it's unfortunate, but I think if "close" isn't set, we should
> > use the retention time instead of a fixed default.
> >
> > When we ultimately remove the retention time parameter ("until"), we will
> > have to set "close" to a default of 24h.
> >
> > Of course, this has a negative impact on the user of "final results",
> > since they won't see any output at all for retentionTime/24h, and may find
> > this confusing. What can we do about this except document it well? Maybe
> > log a warning if we see that close wasn't explicitly set while using
> > "final results"?
> >
> > Thanks,
> > -John
> >
> > On Tue, Jul 10, 2018 at 10:46 AM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Guozhang,
> > >
> > > That sounds good to me. I'll include that in the KIP.
> > >
> > > Thanks,
> > > -John
> > >
> > > On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> > >
> > >> Let me clarify a bit on what I meant about moving `retentionPeriod` to
> > >> WindowStoreBuilder:
> > >>
> > > >> In another discussion we had around KIP-319 / 330, the point was that
> > > >> the "retention period" should not really be a window spec, but only a
> > > >> window store spec, as it only affects how long to retain each window to
> > > >> be queryable, along with the storage cost.
> > >>
> > > >> More specifically, today the "maintainMs" returned from Windows is used
> > > >> in three places:
> > > >>
> > > >> 1) for windowed aggregations, it is passed directly into
> > > >> `Stores.persistentWindowStore()` as the retention period parameter. For
> > > >> this use case we should just let the WindowStoreBuilder specify this
> > > >> value itself.
> > >>
> > > >> NOTE: It is also returned in the KStreamWindowAggregate processor, to
> > > >> determine if a received record should be dropped due to its lateness. We
> > > >> may need to think of another way to get this value inside the processor.
> > > >>
> > > >> 2) for windowed stream-stream join, it is used as the join range
> > > >> parameter but only to check that "windowSizeMs <= retentionPeriodMs". We
> > > >> can do this check at the store builder level instead of at the processor
> > > >> level.
> > >>
> > >>
> > > >> If we can remove its usage in both 1) and 2), then we should be able to
> > > >> safely remove this from the `Windows` spec.
> > >>
> > >>
> > >> Guozhang
> > >>
> > >>
> > >> On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io>
> wrote:
> > >>
> > >> > Thanks for the reply, Guozhang,
> > >> >
> > > >> > Good! I agree, that is also a good reason, and I actually made use of
> > > >> > that in my tests. I'll update the KIP.
> > > >> >
> > > >> > By the way, I chose "allowedLateness" as I was trying to pick a better
> > > >> > name than "close", but I think it's actually the wrong name. We don't
> > > >> > want to bound the lateness of events in general, only with respect to
> > > >> > the end of their window.
> > > >> >
> > > >> > If we have a window [0,10), with "allowedLateness" of 5, then if we get
> > > >> > an event with timestamp 3 at time 9, the name implies we'd reject it,
> > > >> > which seems silly. Really, we'd only want to start rejecting that event
> > > >> > at stream time 15.
> > >> >
> > > >> > What I meant was more like "allowedLatenessAfterWindowEnd", but that's
> > > >> > too verbose. I think that "close" + some documentation about what it
> > > >> > means will be better.
> > > >> >
> > > >> > 1: "Close" would be measured from the end of the window, so a
> > > >> > reasonable default would be "0". Recall that "close" really only needs
> > > >> > to be specified for final results, and a default of 0 would produce the
> > > >> > most intuitive results. If folks later discover that they are missing
> > > >> > some late events, they can adjust the parameter accordingly. IMHO, any
> > > >> > other value would just be a guess on our part.
> > > >> >
> > > >> > 2a:
> > > >> > I think you're saying to re-use "until" instead of adding "close" to
> > > >> > the window.
> > > >> >
> > > >> > The downside here would be that the semantic change could be more
> > > >> > confusing than deprecating "until" and introducing window "close" and a
> > > >> > "retentionTime" on the store builder. The deprecation is a good,
> > > >> > controlled way for us to make sure people are getting the semantics
> > > >> > they think they're getting, as well as giving us an opportunity to link
> > > >> > people to the API they should use instead.
> > > >> >
> > > >> > I didn't fully understand the second part, but it sounds like you're
> > > >> > suggesting to add a new "retentionTime" setter to Windows to bridge the
> > > >> > gap until we add it to the store builder? That seems kind of roundabout
> > > >> > to me, if that's what you meant. We could just immediately add it to
> > > >> > the store builders in the same PR.
> > >> >
> > >> > 2b: Sounds good to me!
> > >> >
> > >> > Thanks again,
> > >> > -John
> > >> >
> > >> >
> > >> > On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com>
> > >> wrote:
> > >> >
> > >> > > John,
> > >> > >
> > > >> > > Thanks for your replies. As for the two options of the API, I think
> > > >> > > I'm slightly inclined to the first option as well. My motivation is a
> > > >> > > bit different, as I think the first one may be more flexible, for
> > > >> > > example:
> > > >> > >
> > > >> > > KTable<Windowed<..>> table = ... count();
> > > >> > >
> > > >> > > // want to peek at the changelog stream, do not care about final results.
> > > >> > > table.toStream().peek(..);
> > > >> > >
> > > >> > > // sending to a topic, want to only send the final results.
> > > >> > > table.suppress().toStream().to("topic");
> > >> > >
> > >> > > --------------
> > >> > >
> > >> > > Besides that, I have a few more minor questions:
> > >> > >
> > > >> > > 1. For "allowedLateness", what should be the default value? I.e. if
> > > >> > > the user does not specify "allowedLateness" in TimeWindows, what value
> > > >> > > should we set?
> > > >> > >
> > > >> > > 2. For API names, some personal suggestions here:
> > > >> > >
> > > >> > > 2.a) "allowedLateness" -> "until" (semantics changed, and also the
> > > >> > > value is defined as a delta on top of the window length), where
> > > >> > > "until" -> "retentionPeriod", and the latter will be moved from
> > > >> > > `Windows` to `WindowStoreBuilder` in the future.
> > >> > >
> > >> > > 2.b) "BufferConfig" -> "Buffered" ?
> > >> > >
> > >> > >
> > >> > >
> > >> > > Guozhang
> > >> > >
> > >> > >
> > >> > > On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io>
> > >> wrote:
> > >> > >
> > >> > > > Hey Matthias and Guozhang,
> > >> > > >
> > > >> > > > Sorry for the slow reply. I was mulling over your feedback and
> > > >> > > > weighing some ideas in a sketchbook PR:
> > > >> > > > https://github.com/apache/kafka/pull/5337.
> > >> > > >
> > > >> > > > Your thought about keeping suppression independent of business logic
> > > >> > > > is a very good one. I agree that it would make more sense to add some
> > > >> > > > kind of "window close" concept to the window definition.
> > > >> > > >
> > > >> > > > In fact, doing that immediately solves the inconsistency problem
> > > >> > > > Guozhang brought up. There's no need to add a "final results" or
> > > >> > > > "emission" option to the windowed aggregation.
> > >> > > >
> > >> > > > What do you think about an API more like this:
> > >> > > >
> > >> > > > final StreamsBuilder builder = new StreamsBuilder();
> > >> > > >
> > >> > > > builder
> > >> > > >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
> > >> > > >   .groupBy(
> > >> > > >     (String k1, String v1) -> k1,
> > >> > > >     Serialized.with(STRING_SERDE, STRING_SERDE)
> > >> > > >   )
> > >> > > >   .windowedBy(TimeWindows
> > >> > > >     .of(scaledTime(2L))
> > >> > > >     .until(scaledTime(3L))
> > >> > > >     .allowedLateness(scaledTime(1L))
> > >> > > >   )
> > >> > > >   .count(Materialized.as("counts"))
> > >> > > >   .suppress(
> > >> > > >     emitFinalResultsOnly(
> > > >> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > > >> > > >     )
> > > >> > > >   )
> > > >> > > >   .toStream()
> > > >> > > >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
> > >> > > >
> > > >> > > > Note that:
> > > >> > > >  * "emitFinalResultsOnly" is available *only* on windowed tables
> > > >> > > >    (enforced by the type system at compile time), and it determines
> > > >> > > >    the time to wait by looking at "allowedLateness" on the TimeWindows
> > > >> > > >    config.
> > > >> > > >  * querying "counts" will produce results (eventually) consistent with
> > > >> > > >    what's observable in "output-suppressed".
> > > >> > > >  * in all cases, "suppress" has no effect on business logic, just on
> > > >> > > >    event suppression.
> > >> > > >
> > > >> > > > Is this API straightforward? Or do you still prefer the version that
> > > >> > > > you both proposed:
> > >> > > >
> > >> > > >   ...
> > >> > > >   .windowedBy(TimeWindows
> > >> > > >     .of(scaledTime(2L))
> > >> > > >     .until(scaledTime(3L))
> > >> > > >     .allowedLateness(scaledTime(1L))
> > >> > > >   )
> > >> > > >   .count(
> > >> > > >     Materialized.as("counts"),
> > >> > > >     emitFinalResultsOnly(
> > > >> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > >> > > >   )
> > >> > > >   ...
> > >> > > >
> > > >> > > > To me, these two are practically identical, and I still vaguely prefer
> > > >> > > > the first one.
> > > >> > > >
> > > >> > > > The prototype has made it clearer to me that users of "final results
> > > >> > > > for windows" and users of "suppression for table events" both need to
> > > >> > > > configure the suppression buffer.
> > >> > > >
> > >> > > > This buffer configuration consists of:
> > >> > > > 1. how many keys or bytes to keep in memory
> > >> > > > 2. what to do if memory runs out (shut down, start using disk,
> > ...)
> > >> > > >
> > > >> > > > So it's not as simple as setting a "final results" flag. We'll either
> > > >> > > > have an "Emit" config object on the windowed aggregators that takes
> > > >> > > > the same BufferConfig as the "Suppress" config on the suppression
> > > >> > > > operator, or we just use the suppression operator for both.
> > >> > > >
> > > >> > > > Perhaps it would sweeten the deal a little to point out that we have 2
> > > >> > > > overloads already for each windowed aggregator (with and without
> > > >> > > > Materialized). Adding "Emitted" or something would mean that we'd add
> > > >> > > > a new overload for each one, taking us up to 4 overloads each for
> > > >> > > > "count", "aggregate" and "reduce". Using "suppress" means that we
> > > >> > > > don't add any new overloads.
> > >> > > >
> > >> > > > Thanks again for helping to hash this out,
> > >> > > > -John
> > >> > > >
> > >> > > > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <
> wangguoz@gmail.com>
> > >> > wrote:
> > >> > > >
> > > >> > > > > I think I agree with Matthias on having dedicated APIs for the
> > > >> > > > > windowed operation final output scenario, PLUS separating the
> > > >> > > > > window close, which the "final output" would rely on, from the
> > > >> > > > > window retention time itself (admittedly it would make this KIP
> > > >> > > > > effort larger, but if we believe we need to do this separation
> > > >> > > > > anyway we could just do it now).
> > > >> > > > >
> > > >> > > > > And then we can have `KTable#suppress()` for
> > > >> > > > > intermediate-suppression only, not for late-record-suppression,
> > > >> > > > > until we've seen that it becomes a common feature request, because
> > > >> > > > > our current design can still be extended for that purpose.
> > >> > > > >
> > >> > > > >
> > >> > > > > Guozhang
> > >> > > > >
> > >> > > > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
> > >> > > matthias@confluent.io>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Thanks for the discussion. I am just catching up.
> > >> > > > > >
> > > >> > > > > > In general, I think we have different use cases, and
> > > >> > > > > > non-windowed and windowed are quite different. For the
> > > >> > > > > > non-windowed case, suppress() has no (useful) close or retention
> > > >> > > > > > time, no final semantics, and also no business logic impact.
> > > >> > > > > >
> > > >> > > > > > On the other hand, for windowed aggregations, close time and
> > > >> > > > > > final result do have a meaning. IMHO, `close()` is part of
> > > >> > > > > > business logic while retention time is not. Also, suppression of
> > > >> > > > > > intermediate results is not a business rule, and there might be
> > > >> > > > > > use cases for which either only "early intermediate" results
> > > >> > > > > > (before window end time) are suppressed, or all intermediates
> > > >> > > > > > are suppressed (maybe also something in the middle, i.e., just
> > > >> > > > > > reducing the load of intermediate updates). Thus,
> > > >> > > > > > window-suppression is much richer.
> > > >> > > > > >
> > > >> > > > > > IMHO, a generic `suppress()` operator that can be inserted into
> > > >> > > > > > the data flow at any point is useful. Maybe we should keep it as
> > > >> > > > > > generic as possible. However, it might be difficult to use with
> > > >> > > > > > regard to windowing, as the mental effort to use it is high.
> > >> > > > > >
> > >> > > > > > With regard to Guozhang's comment:
> > >> > > > > >
> > > >> > > > > > > we will actually
> > > >> > > > > > > process data as old as 30 days as well, while most of the late
> > > >> > > > > > > updates beyond 5 minutes would be discarded anyways.
> > >> > > > > >
> > >> > > > > > If we use `suppress()` as a standalone operator, this is
> > correct
> > >> > and
> > >> > > > > > intended IMHO. To address the issue if the behavior is
> > >> unwanted, I
> > >> > > > would
> > >> > > > > > suggest to add a "suppress option" directly to
> > >> > > > > > `count()/reduce()/aggregate()` window operator similar to
> > >> > > > > > `Materialized`. This would be an "embedded suppress" and
> avoid
> > >> the
> > >> > > > > > issue. It would also address the issue about mental effort
> for
> > >> > > "single
> > >> > > > > > final window result" use case.
> > >> > > > > >
> > >> > > > > > I also think that a shorter close-time than retention time
> is
> > >> > useful
> > >> > > > for
> > >> > > > > > window aggregation. If we add close() to the window
> definition
> > >> and
> > >> > > > > > until() to `Materialized`, we can separate both correctly
> > IMHO.
> > >> > > > > >
> > >> > > > > > About setting `close = min(close,retention)` I am not sure.
> We
> > >> > might
> > >> > > > > > rather throw an exception than reducing the close time
> > >> > automatically.
> > >> > > > > > Otherwise, I foresee many user questions about "I set close to X
> > but
> > >> it
> > >> > > does
> > >> > > > > > not get updated for some data that is with delay of X".
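
A minimal sketch of the two alternatives weighed above for a close time that exceeds the retention time: silently clamping it to `min(close, retention)` versus rejecting the configuration. The class and method names (`WindowTimes`, `clamped`, `strict`) are hypothetical and are not part of the Streams API or of the KIP.

```java
import java.time.Duration;

// Hypothetical holder for a window's close and retention times; not a Streams class.
public final class WindowTimes {
    final Duration close;
    final Duration retention;

    private WindowTimes(Duration close, Duration retention) {
        this.close = close;
        this.retention = retention;
    }

    // Option A: silently reduce close to min(close, retention).
    static WindowTimes clamped(Duration close, Duration retention) {
        Duration effective = close.compareTo(retention) > 0 ? retention : close;
        return new WindowTimes(effective, retention);
    }

    // Option B: fail fast so the user notices the inconsistent configuration.
    static WindowTimes strict(Duration close, Duration retention) {
        if (close.compareTo(retention) > 0) {
            throw new IllegalArgumentException(
                "close time " + close + " exceeds retention time " + retention);
        }
        return new WindowTimes(close, retention);
    }

    public static void main(String[] args) {
        Duration retention = Duration.ofMinutes(5);
        // Clamping hides the mismatch: the user asked for 10 minutes but gets 5.
        System.out.println(clamped(Duration.ofMinutes(10), retention).close);
        try {
            strict(Duration.ofMinutes(10), retention);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With clamping, the user only discovers the shortened close time when late data stops updating the window; with the strict variant the mismatch surfaces when the topology is built.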
> > >> > > > > >
> > >> > > > > > The tricky question might be to design the API in a backward
> > >> > > compatible
> > >> > > > > > way though.
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > -Matthias
> > >> > > > > >
> > >> > > > > > On 7/3/18 5:38 AM, John Roesler wrote:
> > >> > > > > > > Hi Guozhang,
> > >> > > > > > >
> > >> > > > > > > I see. It seems like if we want to decouple 1) and 2), we
> > >> need to
> > >> > > > alter
> > >> > > > > > the
> > >> > > > > > > definition of the window. Do you think it would close the
> > gap
> > >> if
> > >> > we
> > >> > > > > > added a
> > >> > > > > > > "window close" time to the window definition?
> > >> > > > > > >
> > >> > > > > > > Such as:
> > >> > > > > > >
> > >> > > > > > > builder.stream("input")
> > >> > > > > > > .groupByKey()
> > >> > > > > > > .windowedBy(
> > >> > > > > > >   TimeWindows
> > >> > > > > > >     .of(60_000)
> > >> > > > > > >     .closeAfter(10 * 60)
> > >> > > > > > >     .until(30L * 24 * 60 * 60 * 1000)
> > >> > > > > > > )
> > >> > > > > > > .count()
> > >> > > > > > > .suppress(Suppression.finalResultsOnly());
> > >> > > > > > >
> > >> > > > > > > Possibly called "finalResultsAtWindowClose" or something?
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > > -John
> > >> > > > > > >
> > >> > > > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <
> > >> wangguoz@gmail.com
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > >
> > >> > > > > > >> Hey John,
> > >> > > > > > >>
> > >> > > > > > >> Obviously I'm too lazy on email replying diligence
> compared
> > >> with
> > >> > > you
> > >> > > > > :)
> > >> > > > > > >> Will try to reply them separately:
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> -----------------------------------------------------------------------------
> > >> > > > > > >>
> > >> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > >> > > > > > >>
> > >> > > > > > >> I'm aware of this use case, but again, the concern is
> that,
> > >> in
> > >> > > this
> > >> > > > > > setting
> > >> > > > > > >> in order to let the window be queryable for 30 days, we
> > will
> > >> > > > actually
> > >> > > > > > >> process data as old as 30 days as well, while most of the
> > >> late
> > >> > > > updates
> > >> > > > > > >> beyond 5 minutes would be discarded anyways. Personally I
> > >> think
> > >> > > for
> > >> > > > > the
> > >> > > > > > >> final update scenario, the ideal situation users would
> want
> > >> is
> > >> > > that
> > >> > > > > "do
> > >> > > > > > not
> > >> > > > > > >> process any data that is later than 5 minutes, and of
> course
> > >> no
> > >> > > > update
> > >> > > > > > >> records to the downstream later than 5 minutes either;
> but
> > >> > retain
> > >> > > > the
> > >> > > > > > >> window to be queryable for 30 days". And by doing that
> the
> > >> final
> > >> > > > > window
> > >> > > > > > >> snapshot would also be aligned with the update stream as
> > >> well.
> > >> > In
> > >> > > > > other
> > >> > > > > > >> words, among these three periods:
> > >> > > > > > >>
> > >> > > > > > >> 1) the retention length of the window / table.
> > >> > > > > > >> 2) the late records acceptance for updating the window.
> > >> > > > > > >> 3) the late records update to be sent downstream.
> > >> > > > > > >>
> > >> > > > > > >> Final update use cases would naturally want 2) = 3),
> while
> > 1)
> > >> > may
> > >> > > be
> > >> > > > > > >> different and larger, while what we provide now is that
> 1)
> > =
> > >> 2),
> > >> > > > which
> > >> > > > > > >> could be different and in practice larger than 3), hence
> > not
> > >> the
> > >> > > > most
> > >> > > > > > >> intuitive for their needs.
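
To make the three periods above concrete, here is a small sketch that models them as plain durations and checks how a late record is treated. All names (`WindowPeriods`, `updatesWindow`, `sentDownstream`, and so on) are hypothetical and exist only for illustration; nothing here is Streams API.

```java
import java.time.Duration;

// Hypothetical model of the three periods; none of these names exist in Kafka Streams.
public final class WindowPeriods {
    final Duration retention;      // 1) how long the window stays queryable
    final Duration lateAcceptance; // 2) how late a record may be and still update the window
    final Duration lateEmission;   // 3) how late an update may be and still go downstream

    WindowPeriods(Duration retention, Duration lateAcceptance, Duration lateEmission) {
        this.retention = retention;
        this.lateAcceptance = lateAcceptance;
        this.lateEmission = lateEmission;
    }

    boolean queryable(Duration ageSinceWindowEnd) {
        return ageSinceWindowEnd.compareTo(retention) <= 0;
    }

    boolean updatesWindow(Duration lateness) {
        return lateness.compareTo(lateAcceptance) <= 0;
    }

    boolean sentDownstream(Duration lateness) {
        return updatesWindow(lateness) && lateness.compareTo(lateEmission) <= 0;
    }

    // True when the store can absorb updates that the downstream never sees.
    boolean storeCanDivergeFromStream() {
        return lateAcceptance.compareTo(lateEmission) > 0;
    }

    public static void main(String[] args) {
        Duration thirtyDays = Duration.ofDays(30);
        Duration fiveMinutes = Duration.ofMinutes(5);
        // The setting described above as current: 1) = 2), both larger than 3).
        WindowPeriods current = new WindowPeriods(thirtyDays, thirtyDays, fiveMinutes);
        // The setting described above as ideal for final updates: 2) = 3), 1) larger.
        WindowPeriods wanted = new WindowPeriods(thirtyDays, fiveMinutes, fiveMinutes);
        Duration oneHourLate = Duration.ofHours(1);
        System.out.println(current.updatesWindow(oneHourLate));
        System.out.println(current.sentDownstream(oneHourLate));
        System.out.println(wanted.updatesWindow(oneHourLate));
        System.out.println(wanted.storeCanDivergeFromStream());
    }
}
```

In the first configuration a record that is one hour late still changes the queryable window but is never emitted, which is the misalignment between the window snapshot and the update stream described above.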
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> -----------------------------------------------------------------------------
> > >> > > > > > >>
> > >> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > >> > > > > > >>
> > >> > > > > > >> I'd like option 2) over option 1) better as well from
> > >> > programming
> > >> > > > pov.
> > >> > > > > > But
> > >> > > > > > >> I'm wondering if option 2) would provide the above
> > semantics
> > >> or
> > >> > it
> > >> > > > is
> > >> > > > > > still
> > >> > > > > > >> coupling 1) with 2) as well ?
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> Guozhang
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <
> > >> john@confluent.io
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > >>
> > >> > > > > > >>> In fact, to push the idea further (which IIRC is what
> > >> Matthias
> > >> > > > > > originally
> > >> > > > > > >>> proposed), if we can accept
> "Suppression#finalResultsOnly"
> > >> in
> > >> > my
> > >> > > > last
> > >> > > > > > >>> email, then we could also consider whether to eliminate
> > >> > > > > > >>> "suppressLateEvents" entirely.
> > >> > > > > > >>>
> > >> > > > > > >>> We could always add it later, but you've both expressed
> > >> doubt
> > >> > > that
> > >> > > > > > there
> > >> > > > > > >>> are practical use cases for it outside of final-results.
> > >> > > > > > >>>
> > >> > > > > > >>> -John
> > >> > > > > > >>>
> > >> > > > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <
> > >> > john@confluent.io>
> > >> > > > > > wrote:
> > >> > > > > > >>>
> > >> > > > > > >>>> Hi again, Guozhang ;) Here's the second part of my
> > >> response...
> > >> > > > > > >>>>
> > >> > > > > > >>>> It seems like your main concern is: "if I'm a user who
> > >> wants
> > >> > > final
> > >> > > > > > >> update
> > >> > > > > > >>>> semantics, how complicated is it for me to get it?"
> > >> > > > > > >>>>
> > >> > > > > > >>>> I think we have to assume that people don't always have
> > >> time
> > >> > to
> > >> > > > > become
> > >> > > > > > >>>> deeply familiar with all the nuances of a programming
> > >> > > environment
> > >> > > > > > >> before
> > >> > > > > > >>>> they use it. Especially if they're evaluating several
> > >> > frameworks
> > >> > > > for
> > >> > > > > > >>> their
> > >> > > > > > >>>> use case, it's very valuable to make it as obvious as
> > >> possible
> > >> > > how
> > >> > > > > to
> > >> > > > > > >>>> accomplish various computations with Streams.
> > >> > > > > > >>>>
> > >> > > > > > >>>> To me the biggest question is whether with a fresh
> > >> > perspective,
> > >> > > > > people
> > >> > > > > > >>>> would say "oh, I get it, I have to bound my lateness
> and
> > >> > > suppress
> > >> > > > > > >>>> intermediate updates, and of course I'll get only the
> > final
> > >> > > > > result!",
> > >> > > > > > >> or
> > >> > > > > > >>> if
> > >> > > > > > >>>> it's more like "wtf? all I want is the final result,
> what
> > >> are
> > >> > > all
> > >> > > > > > these
> > >> > > > > > >>>> parameters?".
> > >> > > > > > >>>>
> > >> > > > > > >>>> I was talking with Matthias a while back, and he had an
> > >> idea
> > >> > > that
> > >> > > > I
> > >> > > > > > >> think
> > >> > > > > > >>>> can help, which is to essentially set up a final-result
> > >> recipe
> > >> > > in
> > >> > > > > > >>> addition
> > >> > > > > > >>>> to the raw parameters. I previously thought that it
> > >> wouldn't
> > >> > be
> > >> > > > > > >> possible
> > >> > > > > > >>> to
> > >> > > > > > >>>> restrict its usage to Windowed KTables, but thinking
> > about
> > >> it
> > >> > > > again
> > >> > > > > > >> this
> > >> > > > > > >>>> weekend, I have a couple of ideas:
> > >> > > > > > >>>>
> > >> > > > > > >>>> ================
> > >> > > > > > >>>> = 1. Static Wrapper =
> > >> > > > > > >>>> ================
> > >> > > > > > >>>> We can define an extra static function that "wraps" a
> > >> KTable
> > >> > > with
> > >> > > > > > >>>> final-result semantics.
> > >> > > > > > >>>>
> > >> > > > > > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> > >> > > > > > >>>>   final KTable<K, V> windowedKTable,
> > >> > > > > > >>>>   final Duration maxAllowedLateness,
> > >> > > > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > >> > > > > > >>>>     return windowedKTable.suppress(
> > >> > > > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > >> > > > > > >>>>                    .suppressIntermediateEvents(
> > >> > > > > > >>>>                      IntermediateSuppression
> > >> > > > > > >>>>                        .emitAfter(maxAllowedLateness)
> > >> > > > > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> > >> > > > > > >>>>                    )
> > >> > > > > > >>>>     );
> > >> > > > > > >>>> }
> > >> > > > > > >>>>
> > >> > > > > > >>>> Because windowedKTable is a parameter, the static
> > function
> > >> can
> > >> > > > > easily
> > >> > > > > > >>>> impose an extra bound on the key type, that it extends
> > >> > Windowed.
> > >> > > > > This
> > >> > > > > > >>> would
> > >> > > > > > >>>> make "final results only" only available on windowed
> > >> ktables.
> > >> > > > > > >>>>
> > >> > > > > > >>>> Here's how it would look to use:
> > >> > > > > > >>>>
> > >> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > >> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > >> > > > > > >>>>   finalResultsOnly(
> > >> > > > > > >>>>     windowCounts,
> > >> > > > > > >>>>     Duration.ofMinutes(10),
> > >> > > > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > >> > > > > > >>>>   );
> > >> > > > > > >>>>
> > >> > > > > > >>>> Trying to use it on a non-windowed KTable yields:
> > >> > > > > > >>>>
> > >> > > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > >> > > > > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
> > >> > > > > > >>>>> cannot be applied to given types;
> > >> > > > > > >>>>>   required:
> > >> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > >> > > > > > >>>>>   found:
> > >> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > >> > > > > > >>>>>   reason: inference variable K has incompatible bounds
> > >> > > > > > >>>>>     equality constraints: java.lang.String
> > >> > > > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>> =================================================
> > >> > > > > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > >> > > > > > >>>> =================================================
> > >> > > > > > >>>>
> > >> > > > > > >>>> By adding K,V parameters to Suppression, we can
> provide a
> > >> > > > similarly
> > >> > > > > > >>>> bounded config method directly on the Suppression
> class:
> > >> > > > > > >>>>
> > >> > > > > > >>>> public static <K extends Windowed, V> Suppression<K, V> finalResultsOnly(
> > >> > > > > > >>>>   final Duration maxAllowedLateness,
> > >> > > > > > >>>>   final BufferFullStrategy bufferFullStrategy) {
> > >> > > > > > >>>>     return Suppression
> > >> > > > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > >> > > > > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > >> > > > > > >>>>             .emitAfter(maxAllowedLateness)
> > >> > > > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > >> > > > > > >>>>         );
> > >> > > > > > >>>> }
> > >> > > > > > >>>>
> > >> > > > > > >>>> Then, here's how it would look to use it:
> > >> > > > > > >>>>
> > >> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > >> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > >> > > > > > >>>>   windowCounts.suppress(
> > >> > > > > > >>>>     Suppression.finalResultsOnly(
> > >> > > > > > >>>>       Duration.ofMinutes(10),
> > >> > > > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > >> > > > > > >>>>     )
> > >> > > > > > >>>>   );
> > >> > > > > > >>>>
> > >> > > > > > >>>> Trying to use it on a non-windowed ktable yields:
> > >> > > > > > >>>>
> > >> > > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > >> > > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> > >> > > > > > >>>>> given types;
> > >> > > > > > >>>>>   required:
> > >> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > >> > > > > > >>>>>   found:
> > >> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > >> > > > > > >>>>>   reason: explicit type argument java.lang.String does not conform to
> > >> > > > > > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>> ============
> > >> > > > > > >>>> = Downsides =
> > >> > > > > > >>>> ============
> > >> > > > > > >>>>
> > >> > > > > > >>>> Of course, there's a downside either way:
> > >> > > > > > >>>> * for 1:  this "wrapper" interaction would be the first
> > in
> > >> the
> > >> > > > DSL.
> > >> > > > > Is
> > >> > > > > > >> it
> > >> > > > > > >>>> too strange, and how discoverable would it be?
> > >> > > > > > >>>> * for 2: adding those type parameters to Suppression
> will
> > >> > force
> > >> > > > all
> > >> > > > > > >>>> callers to provide them in the event of a chained
> > >> construction
> > >> > > > > because
> > >> > > > > > >>> Java
> > >> > > > > > >>>> doesn't do RHS recursive type inference. This is
> already
> > >> > visible
> > >> > > > in
> > >> > > > > > >> other
> > >> > > > > > >>>> parts of the Streams DSL. For example, often calls to
> > >> > > Materialized
> > >> > > > > > >>> builders
> > >> > > > > > >>>> have to provide seemingly obvious type bounds.
> > >> > > > > > >>>>
> > >> > > > > > >>>> ============
> > >> > > > > > >>>> = Conclusion =
> > >> > > > > > >>>> ============
> > >> > > > > > >>>>
> > >> > > > > > >>>> I think option 2 is more "normal" and discoverable. It
> > does
> > >> > > have a
> > >> > > > > > >>>> downside, but it's one that's pre-existing elsewhere in
> > the
> > >> > DSL.
> > >> > > > > > >>>>
> > >> > > > > > >>>> WDYT? Would the addition of this "recipe" method to
> > >> > Suppression
> > >> > > > > > resolve
> > >> > > > > > >>>> your concern?
> > >> > > > > > >>>>
> > >> > > > > > >>>> Thanks again,
> > >> > > > > > >>>> -John
> > >> > > > > > >>>>
> > >> > > > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
> > >> > > wangguoz@gmail.com
> > >> > > > >
> > >> > > > > > >>> wrote:
> > >> > > > > > >>>>
> > >> > > > > > >>>>> Hi John,
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Regarding the metrics: yeah I think I'm with you that
> > the
> > >> > > dropped
> > >> > > > > > >>> records
> > >> > > > > > >>>>> due to window retention or emit suppression policies
> > >> should
> > >> > be
> > >> > > > > > >> recorded
> > >> > > > > > >>>>> differently, and using this KIP's proposed metric
> would
> > be
> > >> > > fine.
> > >> > > > If
> > >> > > > > > >> you
> > >> > > > > > >>>>> also think we can use this KIP's proposed metrics to
> > cover
> > >> > the
> > >> > > > > window
> > >> > > > > > >>>>> retention caused skipping records, then we can include
> > the
> > >> > > changes
> > >> > > > > in
> > >> > > > > > >>> this
> > >> > > > > > >>>>> KIP as well.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Regarding the current proposal, I'm actually not too
> > >> worried
> > >> > > > about
> > >> > > > > > the
> > >> > > > > > >>>>> inconsistency between query semantics and downstream
> > emit
> > >> > > > > semantics.
> > >> > > > > > >> For
> > >> > > > > > >>>>> queries, we will always return the current running
> > >> results of
> > >> > > the
> > >> > > > > > >>> windows,
> > >> > > > > > >>>>> be it partial or final results depending on the
> > window
> > >> > > > retention
> > >> > > > > > >> time
> > >> > > > > > >>>>> anyways, which has nothing to do whether the emitted
> > >> stream
> > >> > > > should
> > >> > > > > be
> > >> > > > > > >>> one
> > >> > > > > > >>>>> final output per key or not. I also agree that having
> a
> > >> > unified
> > >> > > > > > >>> operation
> > >> > > > > > >>>>> is generally better for users to focus on leveraging
> > that
> > >> one
> > >> > > > only
> > >> > > > > > >> than
> > >> > > > > > >>>>> learning about two sets of operations. The only
> question
> > I
> > >> had
> > >> > > is,
> > >> > > > > for
> > >> > > > > > >>>>> final
> > >> > > > > > >>>>> updates of window stores, if it is a bit awkward to
> > >> > understand
> > >> > > > the
> > >> > > > > > >>>>> configuration combo. Thinking about this more, I think
> > my
> > >> > root
> > >> > > > > worry
> > >> > > > > > >> is
> > >> > > > > > >>>>> the
> > >> > > > > > >>>>> "suppressLateEvents" call for windowed tables, since
> > from
> > >> a
> > >> > > user
> > >> > > > > > >>>>> perspective: if my retention time is X which means
> "pay
> > >> the
> > >> > > cost
> > >> > > > to
> > >> > > > > > >>> allow
> > >> > > > > > >>>>> late records up to X to still be applied updating the
> > >> > tables",
> > >> > > > why
> > >> > > > > > >>> would I
> > >> > > > > > >>>>> ever want to suppressLateEvents by Y ( < X), to say
> "do
> > >> not
> > >> > > send
> > >> > > > > the
> > >> > > > > > >>>>> updates up to Y, which means the downstream operator
> or
> > >> sink
> > >> > > > topic
> > >> > > > > > for
> > >> > > > > > >>>>> this
> > >> > > > > > >>>>> stream would actually see a truncated update stream
> > while
> > >> > I've
> > >> > > > paid
> > >> > > > > > >>> larger
> > >> > > > > > >>>>> cost for that"; and of course, Y > X would not make
> > sense
> > >> > > either
> > >> > > > as
> > >> > > > > > >> you
> > >> > > > > > >>>>> would not see any updates later than X anyways. So in
> > >> all, my
> > >> > > > > feeling
> > >> > > > > > >> is
> > >> > > > > > >>>>> that it makes less sense for windowed table's
> > >> > > > "suppressLateEvents"
> > >> > > > > > >> with
> > >> > > > > > >>> a
> > >> > > > > > >>>>> parameter that is not equal to the window retention,
> and
> > >> > > opening
> > >> > > > > the
> > >> > > > > > >>> door
> > >> > > > > > >>>>> in the current proposal may confuse people with that.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Again, above is just a subjective opinion and probably
> > we
> > >> can
> > >> > > > also
> > >> > > > > > >> bring
> > >> > > > > > >>>>> up
> > >> > > > > > >>>>> some scenarios that users do want to set X != Y..
> but
> > >> > > > personally
> > >> > > > > I
> > >> > > > > > >>> feel
> > >> > > > > > >>>>> that even if the semantics for this scenario is
> > intuitive
> > >> for
> > >> > > > user
> > >> > > > > to
> > >> > > > > > >>>>> understand, does that really make sense and should we
> > >> really
> > >> > > open
> > >> > > > > the
> > >> > > > > > >>> door
> > >> > > > > > >>>>> for it. So I think maybe separating the final update
> in
> > a
> > >> > > > separate
> > >> > > > > > >> API's
> > >> > > > > > >>>>> benefits may overwhelm the advantage of having one
> > uniform
> > >> > > > > > definition.
> > >> > > > > > >>> And
> > >> > > > > > >>>>> for my alternative proposal, the rationale was from
> both
> > >> my
> > >> > > > concern
> > >> > > > > > >>> about
> > >> > > > > > >>>>> "suppressLateEvents" for windowed store, and Matthias'
> > >> > question
> > >> > > > > about
> > >> > > > > > >>>>> "suppressLateEvents" for non-windowed stores, that if
> it
> > >> is
> > >> > > less
> > >> > > > > > >>>>> meaningful
> > >> > > > > > >>>>> for both, we can consider removing it completely and
> > only
> > >> do
> > >> > > > > > >>>>> "IntermediateSuppression" in Suppress instead.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> So I'd summarize my thoughts in the following
> questions:
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X
> > (window
> > >> > > > > retention
> > >> > > > > > >>> time)
> > >> > > > > > >>>>> for windowed stores make sense in practice?
> > >> > > > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
> > >> > > > non-windowed
> > >> > > > > > >>> stores
> > >> > > > > > >>>>> make sense in practice?
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Guozhang
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
> > >> > > bbejeck@gmail.com>
> > >> > > > > > >> wrote:
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>> Thanks for the explanation, that does make sense.  I
> > have
> > >> > some
> > >> > > > > > >>>>> questions on
> > >> > > > > > >>>>>> operations, but I'll just wait for the PR and tests.
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> Thanks,
> > >> > > > > > >>>>>> Bill
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
> > >> > > john@confluent.io
> > >> > > > >
> > >> > > > > > >>> wrote:
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>>> Hi Bill,
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Thanks for the review!
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Your question is very much applicable to the KIP and
> > >> not at
> > >> > > all
> > >> > > > > an
> > >> > > > > > >>>>>>> implementation detail. Thanks for bringing it up.
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> I'm proposing not to change the existing caches and
> > >> > > > > configurations
> > >> > > > > > >>> at
> > >> > > > > > >>>>> all
> > >> > > > > > >>>>>>> (for now).
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Imagine you have a topology like this:
> > >> > > > > > >>>>>>> commit.interval.ms = 100
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> The first ktable (ktable1) will respect the commit
> > >> interval
> > >> > > and
> > >> > > > > > >>> buffer
> > >> > > > > > >>>>>>> events for 100ms before logging, storing, or
> > forwarding
> > >> > them
> > >> > > > > > >> (IIRC).
> > >> > > > > > >>>>>>> Therefore, the second ktable (suppress) will only
> see
> > >> the
> > >> > > > events
> > >> > > > > > >> at
> > >> > > > > > >>> a
> > >> > > > > > >>>>>> rate
> > >> > > > > > >>>>>>> of once per 100ms. It will apply its own buffering,
> > and
> > >> > emit
> > >> > > > once
> > >> > > > > > >>> per
> > >> > > > > > >>>>>> 200ms.
> > >> > > > > > >>>>>>> This case is pretty trivial because the suppress
> time
> > >> is a
> > >> > > > > > >> multiple
> > >> > > > > > >>> of
> > >> > > > > > >>>>>> the
> > >> > > > > > >>>>>>> commit interval.
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> When it's not an integer multiple, you'll get
> behavior
> > >> like
> > >> > > in
> > >> > > > > > >> this
> > >> > > > > > >>>>>> marble
> > >> > > > > > >>>>>>> diagram:
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> [ KTable caching with commit interval = 2 ]
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>       [ suppress with emitAfter = 3 ]
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> <---------------(k:2)----------------(k:6)->
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> If this behavior isn't desired (for example, if you
> > >> wanted
> > >> > to
> > >> > > > > emit
> > >> > > > > > >>>>> (k:3)
> > >> > > > > > >>>>>> at
> > >> > > > > > >>>>>>> time 3), I'd recommend setting the
> > >> > "cache.max.bytes.buffering"
> > >> > > > to
> > >> > > > > 0
> > >> > > > > > >>> or
> > >> > > > > > >>>>>>> modifying the topology to disable caching. Then, the
> > >> > behavior
> > >> > > > is
> > >> > > > > > >>> more
> > >> > > > > > >>>>>>> simply determined just by the suppress operator.
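
The marble diagram above can be reproduced with a toy simulation. This is only a sketch under simplifying assumptions: both the cache and the suppress buffer are modelled as "keep the latest value, release it at fixed multiples of an interval", which is how the diagram is drawn and not necessarily how Streams schedules flushes. The names `MarbleSimulation`, `PeriodicBuffer`, and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation of the marble diagram; it is not Kafka Streams code.
public final class MarbleSimulation {

    // Keeps only the latest value and releases it every `interval` ticks.
    static final class PeriodicBuffer {
        private final int interval;
        private Integer latest; // null means nothing is buffered

        PeriodicBuffer(int interval) {
            this.interval = interval;
        }

        void accept(Integer value) {
            if (value != null) {
                latest = value;
            }
        }

        // Returns the buffered value if `time` is a flush boundary, else null.
        Integer flushAt(int time) {
            if (time % interval != 0 || latest == null) {
                return null;
            }
            Integer out = latest;
            latest = null;
            return out;
        }
    }

    // Feeds (k:1)..(k:n) at times 1..n through a cache and then a suppress buffer.
    static List<String> run(int n, int commitInterval, int emitAfter) {
        PeriodicBuffer cache = new PeriodicBuffer(commitInterval);
        PeriodicBuffer suppress = new PeriodicBuffer(emitAfter);
        List<String> emitted = new ArrayList<>();
        for (int t = 1; t <= n; t++) {
            cache.accept(t);                   // the update (k:t) arrives at time t
            suppress.accept(cache.flushAt(t)); // the cache forwards on its own boundary
            Integer out = suppress.flushAt(t); // suppress emits on its own boundary
            if (out != null) {
                emitted.add("t=" + t + " (k:" + out + ")");
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(6, 2, 3)); // cache interval 2, emitAfter 3
        System.out.println(run(6, 1, 3)); // cache forwards every update
    }
}
```

In this model, `run(6, 2, 3)` emits (k:2) at time 3 and (k:6) at time 6, as in the diagram, while forwarding every update from the cache (`run(6, 1, 3)`) lets the suppress step emit (k:3) at time 3.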
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Does that seem right to you?
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Regarding the changelogs, because the suppression
> > >> operator
> > >> > > > hangs
> > >> > > > > > >>> onto
> > >> > > > > > >>>>>>> events for a while, it will need its own changelog.
> > The
> > >> > > > changelog
> > >> > > > > > >>>>>>> should represent the current state of the buffer at
> > all
> > >> > > times.
> > >> > > > So
> > >> > > > > > >>> when
> > >> > > > > > >>>>>> the
> > >> > > > > > >>>>>>> suppress operator sees (k:2), for example, it will
> log
> > >> > (k:2).
> > >> > > > > When
> > >> > > > > > >>> it
> > >> > > > > > >>>>>>> later gets to time 3, it's time to emit (k:2)
> > >> downstream.
> > >> > > > Because
> > >> > > > > > >> k
> > >> > > > > > >>>>> is no
> > >> > > > > > >>>>>>> longer buffered, the suppress operator will log
> > >> (k:null).
> > >> > > Thus,
> > >> > > > > > >> when
> > >> > > > > > >>>>>>> recovering,
> > >> > > > > > >>>>>>> it can rebuild the buffer by reading its changelog.
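
The changelog behaviour described above can be sketched as follows: buffering a value logs it, emitting it logs a tombstone (a null value), and replaying the log rebuilds the buffer. This is a toy model with an in-memory list standing in for the changelog topic; `BufferChangelogSketch` and its methods are hypothetical names, not the proposed implementation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a suppression buffer backed by a changelog; not Kafka Streams code.
public final class BufferChangelogSketch {
    final Map<String, Integer> buffer = new HashMap<>();
    // Each entry is (key, value); a null value is a tombstone.
    final List<Map.Entry<String, Integer>> changelog = new ArrayList<>();

    // Buffer an update and log the new buffered value.
    void put(String key, int value) {
        buffer.put(key, value);
        changelog.add(new SimpleEntry<String, Integer>(key, value));
    }

    // Emit the buffered value downstream and log a tombstone for the key.
    Integer emit(String key) {
        Integer value = buffer.remove(key);
        if (value != null) {
            changelog.add(new SimpleEntry<String, Integer>(key, null));
        }
        return value;
    }

    // Rebuild the buffer by replaying a changelog from the beginning.
    static Map<String, Integer> restore(List<Map.Entry<String, Integer>> log) {
        Map<String, Integer> restored = new HashMap<>();
        for (Map.Entry<String, Integer> record : log) {
            if (record.getValue() == null) {
                restored.remove(record.getKey());
            } else {
                restored.put(record.getKey(), record.getValue());
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        BufferChangelogSketch suppress = new BufferChangelogSketch();
        suppress.put("k", 2);  // logs (k:2)
        suppress.emit("k");    // emits 2 downstream, logs (k:null)
        suppress.put("k", 4);  // logs (k:4)
        System.out.println(restore(suppress.changelog));
    }
}
```

Replaying the log at any point yields exactly the values that were buffered but not yet emitted, which is what a restarted instance would need in order to continue.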
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> What do you think about this?
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Thanks,
> > >> > > > > > >>>>>>> -John
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
> > >> > > bbejeck@gmail.com
> > >> > > > >
> > >> > > > > > >>>>> wrote:
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> Hi John,  thanks for the KIP.
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> Early on in the KIP, you mention the current
> > approaches
> > >> > for
> > >> > > > > > >>>>> controlling
> > >> > > > > > >>>>>>> the
> > >> > > > > > >>>>>>>> rate of downstream records from a KTable, cache
> size
> > >> > > > > > >> configuration
> > >> > > > > > >>>>> and
> > >> > > > > > >>>>>>>> commit time.
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> Will these configuration parameters still be in
> > effect
> > >> for
> > >> > > > > > >> tables
> > >> > > > > > >>>>> that
> > >> > > > > > >>>>>>>> don't use suppression?  For tables taking advantage
> > of
> > >> > > > > > >>> suppression,
> > >> > > > > > >>>>>> will
> > >> > > > > > >>>>>>>> these configurations have no impact?
> > >> > > > > > >>>>>>>> This last question may be too implementation
> specific
> > >> but
> > >> > if
> > >> > > > the
> > >> > > > > > >>>>>> requested
> > >> > > > > > >>>>>>>> suppression time is longer than the specified
> commit
> > >> time,
> > >> > > > will
> > >> > > > > > >>> the
> > >> > > > > > >>>>>>> latest
> > >> > > > > > >>>>>>>> record in the suppression buffer get stored in a
> > >> > changelog?
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> Thanks,
> > >> > > > > > >>>>>>>> Bill
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> > >> > > > john@confluent.io
> > >> > > > > > >>>
> > >> > > > > > >>>>>> wrote:
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>>> Thanks for the feedback, Matthias,
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> It seems like in straightforward relational
> > processing
> > >> > > cases,
> > >> > > > > > >> it
> > >> > > > > > >>>>>> would
> > >> > > > > > >>>>>>>> not
> > >> > > > > > >>>>>>>>> make sense to bound the lateness of KTables. In
> > >> general,
> > >> > it
> > >> > > > > > >>> seems
> > >> > > > > > >>>>>>> better
> > >> > > > > > >>>>>>>> to
> > >> > > > > > >>>>>>>>> have "guard rails" in place that make it easier to
> > >> write
> > >> > > > > > >>> sensible
> > >> > > > > > >>>>>>>> programs
> > >> > > > > > >>>>>>>>> than insensible ones.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> But I'm still going to argue in favor of keeping
> it
> > >> for
> > >> > all
> > >> > > > > > >>>>> KTables
> > >> > > > > > >>>>>> ;)
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> 1. I believe it is simpler to understand the operator if it has one uniform
> > >> > > > > > >>>>>>>>> definition, regardless of context. It's well defined and intuitive what
> > >> > > > > > >>>>>>>>> will happen when you use late-event suppression on a KTable, so I think
> > >> > > > > > >>>>>>>>> nothing surprising or dangerous will happen in that case. From my
> > >> > > > > > >>>>>>>>> perspective, having two sets of allowed operations is actually an increase
> > >> > > > > > >>>>>>>>> in cognitive complexity.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For example, in lieu
> > >> > > > > > >>>>>>>>> of full-featured timestamp semantics, I can implement MVCC behavior when
> > >> > > > > > >>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> > >> > > > > > >>>>>>>>> there are other, non-obvious applications of suppressing late events on
> > >> > > > > > >>>>>>>>> KTables.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> 3. Not to get too much into implementation details in a KIP discussion, but
> > >> > > > > > >>>>>>>>> if we did want to make late-event suppression available only on windowed
> > >> > > > > > >>>>>>>>> KTables, we have two enforcement options:
> > >> > > > > > >>>>>>>>>   a. check when we build the topology - this would be simple to implement,
> > >> > > > > > >>>>>>>>> but would be a runtime check. Hopefully, people write tests for their
> > >> > > > > > >>>>>>>>> topology before deploying them, so the feedback loop isn't instantaneous,
> > >> > > > > > >>>>>>>>> but it's not too long either.
> > >> > > > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile time check,
> > >> > > > > > >>>>>>>>> but would also be substantial increase of both interface and code
> > >> > > > > > >>>>>>>>> complexity.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> We should definitely strive to have guard rails protecting against
> > >> > > > > > >>>>>>>>> surprising or dangerous behavior. Protecting against programs that we don't
> > >> > > > > > >>>>>>>>> currently predict is a lesser benefit, and I think we can put up guard
> > >> > > > > > >>>>>>>>> rails on a case-by-case basis for that. It seems like the increase in
> > >> > > > > > >>>>>>>>> cognitive (and potentially code and interface) complexity makes me think we
> > >> > > > > > >>>>>>>>> should skip this case.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> What do you think?
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> Thanks,
> > >> > > > > > >>>>>>>>> -John
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > >> > > > > > >>>>>>> matthias@confluent.io>
> > >> > > > > > >>>>>>>>> wrote:
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>> Thanks for the KIP John.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> One initial comment about the last example "Bounded lateness": For a
> > >> > > > > > >>>>>>>>>> non-windowed KTable bounding the lateness does not really make sense,
> > >> > > > > > >>>>>>>>>> does it?
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > >> > > > > > >>>>>>>>>> case? It seems to be better to only allow it for windowed-KTables.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> -Matthias
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > >> > > > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as
> > well.
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> What you gave as new example is semantically the
> > >> same
> > >> > as
> > >> > > > > > >>> what
> > >> > > > > > >>>>> I
> > >> > > > > > >>>>>>>>>> suggested.
> > >> > > > > > >>>>>>>>>>> So it is good by me.
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> Thanks
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > >> > > > > > >>>>> john@confluent.io
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>> wrote:
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Thanks for taking look, Ted,
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> I agree this is a departure from the conventions of Streams DSL.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Most of our config objects have one or two "required" parameters, which fit
> > >> > > > > > >>>>>>>>>>>> naturally with the static factory method approach. TimeWindows, for example,
> > >> > > > > > >>>>>>>>>>>> requires a size parameter, so we can naturally say TimeWindows.of(size).
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> I think in the case of a suppression, there's really no "core" parameter,
> > >> > > > > > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new Suppression()". I think that
> > >> > > > > > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there are many durations
> > >> > > > > > >>>>>>>>>>>> that we can configure.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> However, thinking about it again, I suppose that I can give each
> > >> > > > > > >>>>>>>>>>>> configuration method a static version, which would let you replace "new
> > >> > > > > > >>>>>>>>>>>> Suppression()." with "Suppression." in all the examples. Basically, instead
> > >> > > > > > >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> For example:
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> windowCounts
> > >> > > > > > >>>>>>>>>>>>     .suppress(
> > >> > > > > > >>>>>>>>>>>>         Suppression
> > >> > > > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> > >> > > > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > >> > > > > > >>>>>>>>>>>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > >> > > > > > >>>>>>>>>>>>             )
> > >> > > > > > >>>>>>>>>>>>     );
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Does that seem better?
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Thanks,
> > >> > > > > > >>>>>>>>>>>> -John
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > >> > > > > > >>> yuzhihong@gmail.com
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>>>> wrote:
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>> I started to read this KIP which contains a
> lot
> > of
> > >> > > > > > >>>>> materials.
> > >> > > > > > >>>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>> One suggestion:
> > >> > > > > > >>>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>>     .suppress(
> > >> > > > > > >>>>>>>>>>>>>         new Suppression()
> > >> > > > > > >>>>>>>>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

That is a good point.

I cannot think of a better option than documentation plus a warning,
especially since we should probably not reuse the function name `until`
for the close time.
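
As a rough sketch of the fallback discussed in the quoted message below: use
the retention time when "close" isn't set, and log a warning if final results
were requested. The names `CloseTimeDefaults` and `effectiveClose` are
hypothetical and only for illustration; they are not part of Kafka Streams.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical helper, NOT a Kafka Streams class: it only illustrates the
// "fall back to retention and warn" rule discussed in this thread.
final class CloseTimeDefaults {
    private static final Logger LOG =
        Logger.getLogger(CloseTimeDefaults.class.getName());

    private CloseTimeDefaults() { }

    // Returns the explicitly configured close time if present; otherwise
    // falls back to the retention time. When final results were requested
    // without an explicit close time, a warning is logged, since no output
    // would appear until the retention time has elapsed.
    static Duration effectiveClose(final Optional<Duration> explicitClose,
                                   final Duration retention,
                                   final boolean finalResultsRequested) {
        if (explicitClose.isPresent()) {
            return explicitClose.get();
        }
        if (finalResultsRequested) {
            LOG.warning("Window close time was not set; falling back to the "
                + "retention time (" + retention + "). Final results will "
                + "not be emitted before then.");
        }
        return retention;
    }

    public static void main(final String[] args) {
        final Duration retention = Duration.ofHours(24);
        // Close not set: falls back to retention and logs a warning.
        System.out.println(effectiveClose(Optional.empty(), retention, true));
        // Close set explicitly: used as-is, no warning.
        System.out.println(
            effectiveClose(Optional.of(Duration.ofMinutes(5)), retention, true));
    }
}
```

The warning is only emitted in the one case where the fallback is likely to
surprise the user, i.e. "final results" combined with an unset close time.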


Guozhang
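
For reference, the window-close arithmetic from the [0,10) example quoted
further down in this thread can be checked with a small sketch. It assumes
tumbling windows and non-negative timestamps; `WindowCloseCheck` and its
methods are hypothetical names used only for illustration, not an existing API.

```java
// Hypothetical illustration, NOT a Kafka Streams class. "close" is measured
// from the END of the window, not from the event timestamp.
final class WindowCloseCheck {
    private WindowCloseCheck() { }

    // Start of the tumbling window containing the timestamp
    // (assumes timestamp >= 0 and size > 0).
    static long windowStart(final long timestamp, final long size) {
        return timestamp - (timestamp % size);
    }

    // A window is closed once stream time reaches windowEnd + close.
    static boolean isWindowClosed(final long windowEndExclusive,
                                  final long close,
                                  final long streamTime) {
        return streamTime >= windowEndExclusive + close;
    }

    // An event is accepted as long as its window has not closed yet.
    static boolean accepts(final long eventTimestamp,
                           final long size,
                           final long close,
                           final long streamTime) {
        final long windowEnd = windowStart(eventTimestamp, size) + size;
        return !isWindowClosed(windowEnd, close, streamTime);
    }

    public static void main(final String[] args) {
        // Window [0,10), close = 5, event with timestamp 3.
        System.out.println(accepts(3L, 10L, 5L, 9L));   // stream time 9
        System.out.println(accepts(3L, 10L, 5L, 15L));  // stream time 15
    }
}
```

With these definitions the event with timestamp 3 is still accepted at stream
time 9 and is rejected from stream time 15 onward, which matches the behavior
described for "close" in the quoted discussion.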


On Tue, Jul 10, 2018 at 3:31 PM, John Roesler <jo...@confluent.io> wrote:

> I had some opportunity to reflect on the default for close time today...
>
> Note that the current "close time" is equal to the retention time, and
> therefore "close" today shares the default retention of 24h.
>
> It would definitely break any application that today specifies a retention
> time to set close shorter than that time. It's also likely to break apps if
> they *don't* set the retention time and rely on the 24h default. So it's
> unfortunate, but I think if "close" isn't set, we should use the retention
> time instead of a fixed default.
>
> When we ultimately remove the retention time parameter ("until"), we will
> have to set "close" to a default of 24h.
>
> Of course, this has a negative impact on the user of "final results", since
> they won't see any output at all for retentionTime/24h, and may find this
> confusing. What can we do about this except document it well? Maybe log a
> warning if we see that close wasn't explicitly set while using "final
> results"?
>
> Thanks,
> -John
>
> On Tue, Jul 10, 2018 at 10:46 AM John Roesler <jo...@confluent.io> wrote:
>
> > Hi Guozhang,
> >
> > That sounds good to me. I'll include that in the KIP.
> >
> > Thanks,
> > -John
> >
> > On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> >> Let me clarify a bit what I meant about moving `retentionPeriod` to
> >> WindowStoreBuilder:
> >>
> >> In another discussion we had around KIP-319 / 330, we concluded that the
> >> "retention period" should not really be a window spec, but only a window
> >> store spec, as it only affects how long to retain each window to be
> >> queryable, along with the storage cost.
> >>
> >> More specifically, today the "maintainMs" returned from Windows is used in
> >> three places:
> >>
> >> 1) for windowed aggregations, it is passed directly into
> >> `Stores.persistentWindowStore()` as the retention period parameter. For
> >> this use case we should just let the WindowStoreBuilder specify this value
> >> itself.
> >>
> >> NOTE: It is also used in the KStreamWindowAggregate processor, to
> >> determine if a received record should be dropped due to its lateness. We
> >> may need to think of another way to get this value inside the processor.
> >>
> >> 2) for windowed stream-stream joins, it is used as the join range parameter
> >> but only to check that "windowSizeMs <= retentionPeriodMs". We can do this
> >> check at the store builder level instead of at the processor level.
> >>
> >>
> >> If we can remove its usage in both 1) and 2), then we should be able to
> >> safely remove this from the `Windows` spec.
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io> wrote:
> >>
> >> > Thanks for the reply, Guozhang,
> >> >
> >> > Good! I agree, that is also a good reason, and I actually made use of
> >> > that in my tests. I'll update the KIP.
> >> >
> >> > By the way, I chose "allowedLateness" as I was trying to pick a better
> >> > name than "close", but I think it's actually the wrong name. We don't
> >> > want to bound the lateness of events in general, only with respect to the
> >> > end of their window.
> >> >
> >> > If we have a window [0,10), with "allowedLateness" of 5, then if we get
> >> > an event with timestamp 3 at time 9, the name implies we'd reject it,
> >> > which seems silly. Really, we'd only want to start rejecting that event
> >> > at stream time 15.
> >> >
> >> > What I meant was more like "allowedLatenessAfterWindowEnd", but that's
> >> > too verbose. I think that "close" + some documentation about what it
> >> > means will be better.
> >> >
> >> > 1: "Close" would be measured from the end of the window, so a reasonable
> >> > default would be "0". Recall that "close" really only needs to be
> >> > specified for final results, and a default of 0 would produce the most
> >> > intuitive results. If folks later discover that they are missing some
> >> > late events, they can adjust the parameter accordingly. IMHO, any other
> >> > value would just be a guess on our part.
> >> >
> >> > 2a:
> >> > I think you're saying to re-use "until" instead of adding "close" to the
> >> > window.
> >> >
> >> > The downside here would be that the semantic change could be more
> >> > confusing than deprecating "until" and introducing window "close" and a
> >> > "retentionTime" on the store builder. The deprecation is a good,
> >> > controlled way for us to make sure people are getting the semantics they
> >> > think they're getting, as well as giving us an opportunity to link people
> >> > to the API they should use instead.
> >> >
> >> > I didn't fully understand the second part, but it sounds like you're
> >> > suggesting to add a new "retentionTime" setter to Windows to bridge the
> >> > gap until we add it to the store builder? That seems kind of roundabout
> >> > to me, if that's what you meant. We could just immediately add it to the
> >> > store builders in the same PR.
> >> >
> >> > 2b: Sounds good to me!
> >> >
> >> > Thanks again,
> >> > -John
> >> >
> >> >
> >> > On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com>
> >> wrote:
> >> >
> >> > > John,
> >> > >
> >> > > Thanks for your replies. As for the two options of the API, I think
> >> > > I'm slightly inclined to the first option as well. My motivation is a
> >> > > bit different, as I think the first one may be more flexible, for
> >> > > example:
> >> > >
> >> > > KTable<Windowed<..>> table = ... count();
> >> > >
> >> > > table.toStream().peek(..);   // want to peek at the changelog stream,
> >> > >                              // do not care about final results.
> >> > >
> >> > > table.suppress().toStream().to("topic");    // sending to a topic,
> >> > >                                             // want to only send the final results.
> >> > >
> >> > > --------------
> >> > >
> >> > > Besides that, I have a few more minor questions:
> >> > >
> >> > > 1. For "allowedLateness", what should be the default value? I.e., if
> >> > > users do not specify "allowedLateness" in TimeWindows, what value should
> >> > > we set?
> >> > >
> >> > > 2. For API names, some personal suggestions here:
> >> > >
> >> > > 2.a) "allowedLateness"  -> "until" (semantics changed, and the value is
> >> > > also defined as a delta on top of the window length), where "until" ->
> >> > > "retentionPeriod", and the latter will be moved from `Windows` to
> >> > > `WindowStoreBuilder` in the future.
> >> > >
> >> > > 2.b) "BufferConfig" -> "Buffered" ?
> >> > >
> >> > >
> >> > >
> >> > > Guozhang
> >> > >
> >> > >
> >> > > On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io>
> >> wrote:
> >> > >
> >> > > > Hey Matthias and Guozhang,
> >> > > >
> >> > > > Sorry for the slow reply. I was mulling over your feedback and
> >> > > > weighing some ideas in a sketchbook PR:
> >> > > > https://github.com/apache/kafka/pull/5337 .
> >> > > >
> >> > > > Your thought about keeping suppression independent of business logic
> >> > > > is a very good one. I agree that it would make more sense to add some
> >> > > > kind of "window close" concept to the window definition.
> >> > > >
> >> > > > In fact, doing that immediately solves the inconsistency problem
> >> > > > Guozhang brought up. There's no need to add a "final results" or
> >> > > > "emission" option to the windowed aggregation.
> >> > > >
> >> > > > What do you think about an API more like this:
> >> > > >
> >> > > > final StreamsBuilder builder = new StreamsBuilder();
> >> > > >
> >> > > > builder
> >> > > >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
> >> > > >   .groupBy(
> >> > > >     (String k1, String v1) -> k1,
> >> > > >     Serialized.with(STRING_SERDE, STRING_SERDE)
> >> > > >   )
> >> > > >   .windowedBy(TimeWindows
> >> > > >     .of(scaledTime(2L))
> >> > > >     .until(scaledTime(3L))
> >> > > >     .allowedLateness(scaledTime(1L))
> >> > > >   )
> >> > > >   .count(Materialized.as("counts"))
> >> > > >   .suppress(
> >> > > >     emitFinalResultsOnly(
> >> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> >> > > >     )
> >> > > >   )
> >> > > >   .toStream()
> >> > > >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
> >> > > >
> >> > > > Note that:
> >> > > >  * "emitFinalResultsOnly" is available *only* on windowed tables
> >> > > > (enforced by the type system at compile time), and it determines the
> >> > > > time to wait by looking at "allowedLateness" on the TimeWindows config.
> >> > > >  * querying "counts" will produce results (eventually) consistent with
> >> > > > what's observable in "output-suppressed".
> >> > > >  * in all cases, "suppress" has no effect on business logic, just on
> >> > > > event suppression.
> >> > > >
> >> > > > Is this API straightforward? Or do you still prefer the version that
> >> > > > you both proposed:
> >> > > >
> >> > > >   ...
> >> > > >   .windowedBy(TimeWindows
> >> > > >     .of(scaledTime(2L))
> >> > > >     .until(scaledTime(3L))
> >> > > >     .allowedLateness(scaledTime(1L))
> >> > > >   )
> >> > > >   .count(
> >> > > >     Materialized.as("counts"),
> >> > > >     emitFinalResultsOnly(
> >> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> >> > > >     )
> >> > > >   )
> >> > > >   ...
> >> > > >
> >> > > > To me, these two are practically identical, and I still vaguely
> >> > > > prefer the first one.
> >> > > >
> >> > > > The prototype has made clearer to me that users of "final results for
> >> > > > windows" and users of "suppression for table events" both need to
> >> > > > configure the suppression buffer.
> >> > > >
> >> > > > This buffer configuration consists of:
> >> > > > 1. how many keys or bytes to keep in memory
> >> > > > 2. what to do if memory runs out (shut down, start using disk, ...)
> >> > > >
> >> > > > So it's not as simple as setting a "final results" flag. We'll either
> >> > > > have an "Emit" config object on the windowed aggregators that takes the
> >> > > > same BufferConfig as the "Suppress" config on the suppression operator,
> >> > > > or we just use the suppression operator for both.
> >> > > >
> >> > > > Perhaps it would sweeten the deal a little to point out that we have 2
> >> > > > overloads already for each windowed aggregator (with and without
> >> > > > Materialized). Adding "Emitted" or something would mean that we'd add a
> >> > > > new overload for each one, taking us up to 4 overloads each for "count",
> >> > > > "aggregate" and "reduce". Using "suppress" means that we don't add any
> >> > > > new overloads.
> >> > > >
> >> > > > Thanks again for helping to hash this out,
> >> > > > -John
> >> > > >
> >> > > > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com>
> >> > wrote:
> >> > > >
> >> > > > > I think I agree with Matthias on having dedicated APIs for the
> >> > > > > windowed operation final output scenario, PLUS separating the window
> >> > > > > close, which the "final output" would rely on, from the window
> >> > > > > retention time itself (admittedly it would make this KIP effort
> >> > > > > larger, but if we believe we need to do this separation anyways we
> >> > > > > could just do it now).
> >> > > > >
> >> > > > > And then we can have the `KTable#suppress()` for
> >> > > > > intermediate-suppression only, not for late-record-suppression, until
> >> > > > > we've seen that it becomes a common feature request, because our
> >> > > > > current design still allows it to be extended for that purpose.
> >> > > > >
> >> > > > >
> >> > > > > Guozhang
> >> > > > >
> >> > > > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
> >> > > matthias@confluent.io>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Thanks for the discussion. I am just catching up.
> >> > > > > >
> >> > > > > > In general, I think we have different use cases, and non-windowed and
> >> > > > > > windowed are quite different. For the non-windowed case, suppress() has
> >> > > > > > no (useful) close or retention time, no final semantics, and also no
> >> > > > > > business logic impact.
> >> > > > > >
> >> > > > > > On the other hand, for windowed aggregations, close time and final
> >> > > > > > result do have a meaning. IMHO, `close()` is part of business logic
> >> > > > > > while retention time is not. Also, suppression of intermediate results is
> >> > > > > > not a business rule, and there might be use cases for which either only
> >> > > > > > "early intermediate" results (before window end time) are suppressed, or
> >> > > > > > all intermediates are suppressed (maybe also something in the middle,
> >> > > > > > i.e., just reducing the load of intermediate updates). Thus,
> >> > > > > > window-suppression is much richer.
> >> > > > > >
> >> > > > > > IMHO, a generic `suppress()` operator that can be inserted into the data
> >> > > > > > flow at any point is useful. Maybe we should keep it as generic as
> >> > > > > > possible. However, it might be difficult to use with regard to
> >> > > > > > windowing, as the mental effort to use it is high.
> >> > > > > >
> >> > > > > > With regard to Guozhang's comment:
> >> > > > > >
> >> > > > > > > we will actually
> >> > > > > > > process data as old as 30 days as well, while most of the late updates
> >> > > > > > > beyond 5 minutes would be discarded anyways.
> >> > > > > >
> >> > > > > > If we use `suppress()` as a standalone operator, this is correct and
> >> > > > > > intended IMHO. To address the issue if the behavior is unwanted, I would
> >> > > > > > suggest adding a "suppress option" directly to the
> >> > > > > > `count()/reduce()/aggregate()` window operators, similar to
> >> > > > > > `Materialized`. This would be an "embedded suppress" and avoid the
> >> > > > > > issue. It would also address the issue about mental effort for the
> >> > > > > > "single final window result" use case.
> >> > > > > >
> >> > > > > > I also think that a shorter close-time than retention time is useful for
> >> > > > > > window aggregation. If we add close() to the window definition and
> >> > > > > > until() to `Materialized`, we can separate both correctly IMHO.
> >> > > > > >
> >> > > > > > About setting `close = min(close,retention)` I am not sure. We might
> >> > > > > > rather throw an exception than reduce the close time automatically.
> >> > > > > > Otherwise, I see many user questions about "I set close to X but it does
> >> > > > > > not get updated for some data that is with delay of X".
> >> > > > > >
> >> > > > > > The tricky question might be to design the API in a backward compatible
> >> > > > > > way though.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > -Matthias
> >> > > > > >
> >> > > > > > On 7/3/18 5:38 AM, John Roesler wrote:
> >> > > > > > > Hi Guozhang,
> >> > > > > > >
> >> > > > > > > I see. It seems like if we want to decouple 1) and 2), we
> >> need to
> >> > > > alter
> >> > > > > > the
> >> > > > > > > definition of the window. Do you think it would close the
> gap
> >> if
> >> > we
> >> > > > > > added a
> >> > > > > > > "window close" time to the window definition?
> >> > > > > > >
> >> > > > > > > Such as:
> >> > > > > > >
> >> > > > > > > builder.stream("input")
> >> > > > > > > .groupByKey()
> >> > > > > > > .windowedBy(
> >> > > > > > >   TimeWindows
> >> > > > > > >     .of(60_000)
> >> > > > > > >     .closeAfter(10L * 60 * 1000)
> >> > > > > > >     .until(30L * 24 * 60 * 60 * 1000)
> >> > > > > > > )
> >> > > > > > > .count()
> >> > > > > > > .suppress(Suppression.finalResultsOnly());
> >> > > > > > >
> >> > > > > > > Possibly called "finalResultsAtWindowClose" or something?
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > -John
> >> > > > > > >
> >> > > > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <
> >> wangguoz@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > >> Hey John,
> >> > > > > > >>
> >> > > > > > >> Obviously I'm too lazy on email replying diligence compared
> >> with
> >> > > you
> >> > > > > :)
> >> > > > > > >> Will try to reply them separately:
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> ------------------------------
> ------------------------------
> >> > > > > > -----------------
> >> > > > > > >>
> >> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> >> > > > > > >>
> >> > > > > > >> I'm aware of this use case, but again, the concern is that, in
> >> > > > > > >> this setting, in order to let the window be queryable for 30 days,
> >> > > > > > >> we will actually process data as old as 30 days as well, while most
> >> > > > > > >> of the late updates beyond 5 minutes would be discarded anyways.
> >> > > > > > >> Personally I think for the final update scenario, the ideal
> >> > > > > > >> situation users would want is "do not process any data that is
> >> > > > > > >> more than 5 minutes late, and of course send no update records
> >> > > > > > >> downstream later than 5 minutes either; but retain the window to be
> >> > > > > > >> queryable for 30 days". And by doing that the final window snapshot
> >> > > > > > >> would also be aligned with the update stream. In other words,
> >> > > > > > >> among these three periods:
> >> > > > > > >>
> >> > > > > > >> 1) the retention length of the window / table.
> >> > > > > > >> 2) the late records acceptance for updating the window.
> >> > > > > > >> 3) the late records update to be sent downstream.
> >> > > > > > >>
> >> > > > > > >> Final update use cases would naturally want 2) = 3), while 1) may
> >> > > > > > >> be different and larger, while what we provide now is that 1) = 2),
> >> > > > > > >> which could be different and in practice larger than 3), hence not
> >> > > > > > >> the most intuitive for their needs.
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> ------------------------------
> ------------------------------
> >> > > > > > -----------------
> >> > > > > > >>
> >> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> >> > > > > > >>
> >> > > > > > >> I like option 2) better than option 1) as well from a programming
> >> > > > > > >> pov. But I'm wondering if option 2) would provide the above
> >> > > > > > >> semantics, or whether it is still coupling 1) with 2) as well?
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> Guozhang
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <
> >> john@confluent.io
> >> > >
> >> > > > > wrote:
> >> > > > > > >>
> >> > > > > > >>> In fact, to push the idea further (which IIRC is what
> >> Matthias
> >> > > > > > originally
> >> > > > > > >>> proposed), if we can accept "Suppression#finalResultsOnly"
> >> in
> >> > my
> >> > > > last
> >> > > > > > >>> email, then we could also consider whether to eliminate
> >> > > > > > >>> "suppressLateEvents" entirely.
> >> > > > > > >>>
> >> > > > > > >>> We could always add it later, but you've both expressed
> >> doubt
> >> > > that
> >> > > > > > there
> >> > > > > > >>> are practical use cases for it outside of final-results.
> >> > > > > > >>>
> >> > > > > > >>> -John
> >> > > > > > >>>
> >> > > > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <
> >> > john@confluent.io>
> >> > > > > > wrote:
> >> > > > > > >>>
> >> > > > > > >>>> Hi again, Guozhang ;) Here's the second part of my
> >> response...
> >> > > > > > >>>>
> >> > > > > > >>>> It seems like your main concern is: "if I'm a user who
> >> wants
> >> > > final
> >> > > > > > >> update
> >> > > > > > >>>> semantics, how complicated is it for me to get it?"
> >> > > > > > >>>>
> >> > > > > > >>>> I think we have to assume that people don't always have
> >> time
> >> > to
> >> > > > > become
> >> > > > > > >>>> deeply familiar with all the nuances of a programming
> >> > > environment
> >> > > > > > >> before
> >> > > > > > >>>> they use it. Especially if they're evaluating several
> >> > frameworks
> >> > > > for
> >> > > > > > >>> their
> >> > > > > > >>>> use case, it's very valuable to make it as obvious as
> >> possible
> >> > > how
> >> > > > > to
> >> > > > > > >>>> accomplish various computations with Streams.
> >> > > > > > >>>>
> >> > > > > > >>>> To me the biggest question is whether with a fresh
> >> > perspective,
> >> > > > > people
> >> > > > > > >>>> would say "oh, I get it, I have to bound my lateness and
> >> > > suppress
> >> > > > > > >>>> intermediate updates, and of course I'll get only the
> final
> >> > > > > result!",
> >> > > > > > >> or
> >> > > > > > >>> if
> >> > > > > > >>>> it's more like "wtf? all I want is the final result, what
> >> are
> >> > > all
> >> > > > > > these
> >> > > > > > >>>> parameters?".
> >> > > > > > >>>>
> >> > > > > > >>>> I was talking with Matthias a while back, and he had an
> >> idea
> >> > > that
> >> > > > I
> >> > > > > > >> think
> >> > > > > > >>>> can help, which is to essentially set up a final-result
> >> recipe
> >> > > in
> >> > > > > > >>> addition
> >> > > > > > >>>> to the raw parameters. I previously thought that it
> >> wouldn't
> >> > be
> >> > > > > > >> possible
> >> > > > > > >>> to
> >> > > > > > >>>> restrict its usage to Windowed KTables, but thinking
> about
> >> it
> >> > > > again
> >> > > > > > >> this
> >> > > > > > >>>> weekend, I have a couple of ideas:
> >> > > > > > >>>>
> >> > > > > > >>>> ================
> >> > > > > > >>>> = 1. Static Wrapper =
> >> > > > > > >>>> ================
> >> > > > > > >>>> We can define an extra static function that "wraps" a
> >> KTable
> >> > > with
> >> > > > > > >>>> final-result semantics.
> >> > > > > > >>>>
> >> > > > > > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> >> > > > > > >>>>   final KTable<K, V> windowedKTable,
> >> > > > > > >>>>   final Duration maxAllowedLateness,
> >> > > > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> >> > > > > > >>>>     return windowedKTable.suppress(
> >> > > > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> >> > > > > > >>>>                    .suppressIntermediateEvents(
> >> > > > > > >>>>                      IntermediateSuppression
> >> > > > > > >>>>                        .emitAfter(maxAllowedLateness)
> >> > > > > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> >> > > > > > >>>>                    )
> >> > > > > > >>>>     );
> >> > > > > > >>>> }
> >> > > > > > >>>>
> >> > > > > > >>>> Because windowedKTable is a parameter, the static function can easily
> >> > > > > > >>>> impose an extra bound on the key type, that it extends Windowed. This
> >> > > > > > >>>> would make "final results only" only available on windowed ktables.
> >> > > > > > >>>>
> >> > > > > > >>>> Here's how it would look to use:
> >> > > > > > >>>>
> >> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> >> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> >> > > > > > >>>>   finalResultsOnly(
> >> > > > > > >>>>     windowCounts,
> >> > > > > > >>>>     Duration.ofMinutes(10),
> >> > > > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> >> > > > > > >>>>   );
> >> > > > > > >>>>
> >> > > > > > >>>> Trying to use it on a non-windowed KTable yields:
> >> > > > > > >>>>
> >> > > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> >> > > > > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
> >> > > > > > >>>>> cannot be applied to given types;
> >> > > > > > >>>>>   required:
> >> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >> > > > > > >>>>>   found:
> >> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >> > > > > > >>>>>   reason: inference variable K has incompatible bounds
> >> > > > > > >>>>>     equality constraints: java.lang.String
> >> > > > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> =================================================
> >> > > > > > >>>> = 2. Add <K,V> parameters and recipe method to
> Suppression
> >> =
> >> > > > > > >>>> =================================================
> >> > > > > > >>>>
> >> > > > > > >>>> By adding K,V parameters to Suppression, we can provide a
> >> > > > similarly
> >> > > > > > >>>> bounded config method directly on the Suppression class:
> >> > > > > > >>>>
> >> > > > > > >>>> public static <K extends Windowed, V> Suppression<K, V>
> >> > > > > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> >> > > > > > >>>> BufferFullStrategy bufferFullStrategy) {
> >> > > > > > >>>>     return Suppression
> >> > > > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> >> > > > > > >>>>         .suppressIntermediateEvents(
> IntermediateSuppression
> >> > > > > > >>>>             .emitAfter(maxAllowedLateness)
> >> > > > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> >> > > > > > >>>>         );
> >> > > > > > >>>> }
> >> > > > > > >>>>
> >> > > > > > >>>> Then, here's how it would look to use it:
> >> > > > > > >>>>
> >> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> >> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> >> > > > > > >>>>   windowCounts.suppress(
> >> > > > > > >>>>     Suppression.finalResultsOnly(
> >> > > > > > >>>>       Duration.ofMinutes(10),
> >> > > > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> >> > > > > > >>>>     )
> >> > > > > > >>>>   );
> >> > > > > > >>>>
> >> > > > > > >>>> Trying to use it on a non-windowed ktable yields:
> >> > > > > > >>>>
> >> > > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> >> > > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V>
> cannot
> >> be
> >> > > > applied
> >> > > > > > to
> >> > > > > > >>>>> given types;
> >> > > > > > >>>>>   required:
> >> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> >> > > > > > >>> Suppression.BufferFullStrategy
> >> > > > > > >>>>>   found:
> >> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> >> > > > > > >>> Suppression.BufferFullStrategy
> >> > > > > > >>>>>   reason: explicit type argument java.lang.String does
> not
> >> > > > conform
> >> > > > > to
> >> > > > > > >>>>> declared bound(s)
> >> org.apache.kafka.streams.kstream.Windowed
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> ============
> >> > > > > > >>>> = Downsides =
> >> > > > > > >>>> ============
> >> > > > > > >>>>
> >> > > > > > >>>> Of course, there's a downside either way:
> >> > > > > > >>>> * for 1:  this "wrapper" interaction would be the first
> in
> >> the
> >> > > > DSL.
> >> > > > > Is
> >> > > > > > >> it
> >> > > > > > >>>> too strange, and how discoverable would it be?
> >> > > > > > >>>> * for 2: adding those type parameters to Suppression will
> >> > force
> >> > > > all
> >> > > > > > >>>> callers to provide them in the event of a chained
> >> construction
> >> > > > > because
> >> > > > > > >>> Java
> >> > > > > > >>>> doesn't do RHS recursive type inference. This is already
> >> > visible
> >> > > > in
> >> > > > > > >> other
> >> > > > > > >>>> parts of the Streams DSL. For example, often calls to
> >> > > Materialized
> >> > > > > > >>> builders
> >> > > > > > >>>> have to provide seemingly obvious type bounds.
> >> > > > > > >>>>
> >> > > > > > >>>> ============
> >> > > > > > >>>> = Conclusion =
> >> > > > > > >>>> ============
> >> > > > > > >>>>
> >> > > > > > >>>> I think option 2 is more "normal" and discoverable. It
> does
> >> > > have a
> >> > > > > > >>>> downside, but it's one that's pre-existing elsewhere in
> the
> >> > DSL.
> >> > > > > > >>>>
> >> > > > > > >>>> WDYT? Would the addition of this "recipe" method to
> >> > Suppression
> >> > > > > > resolve
> >> > > > > > >>>> your concern?
> >> > > > > > >>>>
> >> > > > > > >>>> Thanks again,
> >> > > > > > >>>> -John
> >> > > > > > >>>>
> >> > > > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
> >> > > wangguoz@gmail.com
> >> > > > >
> >> > > > > > >>> wrote:
> >> > > > > > >>>>
> >> > > > > > >>>>> Hi John,
> >> > > > > > >>>>>
> >> > > > > > >>>>> Regarding the metrics: yeah I think I'm with you that
> the
> >> > > dropped
> >> > > > > > >>> records
> >> > > > > > >>>>> due to window retention or emit suppression policies
> >> should
> >> > be
> >> > > > > > >> recorded
> >> > > > > > >>>>> differently, and using this KIP's proposed metric would
> be
> >> > > fine.
> >> > > > If
> >> > > > > > >> you
> >> > > > > > >>>>> also think we can use this KIP's proposed metrics to
> cover
> >> > the
> >> > > > > window
> >> > > > > > >>>>> retention caused skipping records, then we can include
> the
> >> > > changes
> >> > > > > in
> >> > > > > > >>> this
> >> > > > > > >>>>> KIP as well.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Regarding the current proposal, I'm actually not too
> >> worried
> >> > > > about
> >> > > > > > the
> >> > > > > > >>>>> inconsistency between query semantics and downstream
> emit
> >> > > > > semantics.
> >> > > > > > >> For
> >> > > > > > >>>>> queries, we will always return the current running
> >> results of
> >> > > the
> >> > > > > > >>> windows,
> >> > > > > > >>>>> being it partial or final results depending on the
> window
> >> > > > retention
> >> > > > > > >> time
> >> > > > > > >>>>> anyways, which has nothing to do whether the emitted
> >> stream
> >> > > > should
> >> > > > > be
> >> > > > > > >>> one
> >> > > > > > >>>>> final output per key or not. I also agree that having a
> >> > unified
> >> > > > > > >>> operation
> >> > > > > > >>>>> is generally better for users to focus on leveraging
> that
> >> one
> >> > > > only
> >> > > > > > >> than
> >> > > > > > >>>>> learning about two sets of operations. The only question
> I
> >> had
> >> > > is,
> >> > > > > for
> >> > > > > > >>>>> final
> >> > > > > > >>>>> updates of window stores, if it is a bit awkward to
> >> > understand
> >> > > > the
> >> > > > > > >>>>> configuration combo. Thinking about this more, I think
> my
> >> > root
> >> > > > > worry
> >> > > > > > >> in
> >> > > > > > >>>>> the
> >> > > > > > >>>>> "suppressLateEvents" call for windowed tables, since
> from
> >> a
> >> > > user
> >> > > > > > >>>>> perspective: if my retention time is X which means "pay
> >> the
> >> > > cost
> >> > > > to
> >> > > > > > >>> allow
> >> > > > > > >>>>> late records up to X to still be applied updating the
> >> > tables",
> >> > > > why
> >> > > > > > >>> would I
> >> > > > > > >>>>> ever want to suppressLateEvents by Y ( < X), to say "do
> >> not
> >> > > send
> >> > > > > the
> >> > > > > > >>>>> updates up to Y, which means the downstream operator or
> >> sink
> >> > > > topic
> >> > > > > > for
> >> > > > > > >>>>> this
> >> > > > > > >>>>> stream would actually see a truncated update stream
> while
> >> > I've
> >> > > > paid
> >> > > > > > >>> larger
> >> > > > > > >>>>> cost for that"; and of course, Y > X would not make
> sense
> >> > > either
> >> > > > as
> >> > > > > > >> you
> >> > > > > > >>>>> would not see any updates later than X anyways. So in
> >> all, my
> >> > > > > feeling
> >> > > > > > >> is
> >> > > > > > >>>>> that it makes less sense for windowed table's
> >> > > > "suppressLateEvents"
> >> > > > > > >> with
> >> > > > > > >>> a
> >> > > > > > >>>>> parameter that is not equal to the window retention, and
> >> > > opening
> >> > > > > the
> >> > > > > > >>> door
> >> > > > > > >>>>> in the current proposal may confuse people with that.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Again, above is just a subjective opinion and probably
> we
> >> can
> >> > > > also
> >> > > > > > >> bring
> >> > > > > > >>>>> up
> >> > > > > > >>>>> some scenarios that users do want to set X != Y... but
> >> > > > personally
> >> > > > > I
> >> > > > > > >>> feel
> >> > > > > > >>>>> that even if the semantics for this scenario are
> intuitive
> >> for
> >> > > > user
> >> > > > > to
> >> > > > > > >>>>> understand, does that really make sense and should we
> >> really
> >> > > open
> >> > > > > the
> >> > > > > > >>> door
> >> > > > > > >>>>> for it. So I think maybe separating the final update in
> a
> >> > > > separate
> >> > > > > > >> API's
> >> > > > > > >>>>> benefits may outweigh the advantage of having one
> uniform
> >> > > > > > definition.
> >> > > > > > >>> And
> >> > > > > > >>>>> for my alternative proposal, the rationale was from both
> >> my
> >> > > > concern
> >> > > > > > >>> about
> >> > > > > > >>>>> "suppressLateEvents" for windowed store, and Matthias'
> >> > question
> >> > > > > about
> >> > > > > > >>>>> "suppressLateEvents" for non-windowed stores, that if it
> >> is
> >> > > less
> >> > > > > > >>>>> meaningful
> >> > > > > > >>>>> for both, we can consider removing it completely and
> only
> >> do
> >> > > > > > >>>>> "IntermediateSuppression" in Suppress instead.
> >> > > > > > >>>>>
> >> > > > > > >>>>> So I'd summarize my thoughts in the following questions:
> >> > > > > > >>>>>
> >> > > > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X
> (window
> >> > > > > retention
> >> > > > > > >>> time)
> >> > > > > > >>>>> for windowed stores make sense in practice?
> >> > > > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
> >> > > > non-windowed
> >> > > > > > >>> stores
> >> > > > > > >>>>> make sense in practice?
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>> Guozhang
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
> >> > > bbejeck@gmail.com>
> >> > > > > > >> wrote:
> >> > > > > > >>>>>
> >> > > > > > >>>>>> Thanks for the explanation, that does make sense.  I
> have
> >> > some
> >> > > > > > >>>>> questions on
> >> > > > > > >>>>>> operations, but I'll just wait for the PR and tests.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> Thanks,
> >> > > > > > >>>>>> Bill
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
> >> > > john@confluent.io
> >> > > > >
> >> > > > > > >>> wrote:
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>> Hi Bill,
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Thanks for the review!
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Your question is very much applicable to the KIP and
> >> not at
> >> > > all
> >> > > > > an
> >> > > > > > >>>>>>> implementation detail. Thanks for bringing it up.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> I'm proposing not to change the existing caches and
> >> > > > > configurations
> >> > > > > > >>> at
> >> > > > > > >>>>> all
> >> > > > > > >>>>>>> (for now).
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Imagine you have a topology like this:
> >> > > > > > >>>>>>> commit.interval.ms = 100
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> The first ktable (ktable1) will respect the commit
> >> interval
> >> > > and
> >> > > > > > >>> buffer
> >> > > > > > >>>>>>> events for 100ms before logging, storing, or
> forwarding
> >> > them
> >> > > > > > >> (IIRC).
> >> > > > > > >>>>>>> Therefore, the second ktable (suppress) will only see
> >> the
> >> > > > events
> >> > > > > > >> at
> >> > > > > > >>> a
> >> > > > > > >>>>>> rate
> >> > > > > > >>>>>>> of once per 100ms. It will apply its own buffering,
> and
> >> > emit
> >> > > > once
> >> > > > > > >>> per
> >> > > > > > >>>>>> 200ms.
> >> > > > > > >>>>>>> This case is pretty trivial because the suppress time
> >> is a
> >> > > > > > >> multiple
> >> > > > > > >>> of
> >> > > > > > >>>>>> the
> >> > > > > > >>>>>>> commit interval.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> When it's not an integer multiple, you'll get behavior
> >> like
> >> > > in
> >> > > > > > >> this
> >> > > > > > >>>>>> marble
> >> > > > > > >>>>>>> diagram:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> [ KTable caching with commit interval = 2 ]
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>       [ suppress with emitAfter = 3 ]
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> <---------------(k:2)----------------(k:6)->
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> If this behavior isn't desired (for example, if you
> >> wanted
> >> > to
> >> > > > > emit
> >> > > > > > >>>>> (k:3)
> >> > > > > > >>>>>> at
> >> > > > > > >>>>>>> time 3), I'd recommend setting the
> >> > "cache.max.bytes.buffering"
> >> > > > to
> >> > > > > 0
> >> > > > > > >>> or
> >> > > > > > >>>>>>> modifying the topology to disable caching. Then, the
> >> > behavior
> >> > > > is
> >> > > > > > >>> more
> >> > > > > > >>>>>>> simply determined just by the suppress operator.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Does that seem right to you?
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Regarding the changelogs, because the suppression
> >> operator
> >> > > > hangs
> >> > > > > > >>> onto
> >> > > > > > >>>>>>> events for a while, it will need its own changelog.
> The
> >> > > > changelog
> >> > > > > > >>>>>>> should represent the current state of the buffer at
> all
> >> > > times.
> >> > > > So
> >> > > > > > >>> when
> >> > > > > > >>>>>> the
> >> > > > > > >>>>>>> suppress operator sees (k:2), for example, it will log
> >> > (k:2).
> >> > > > > When
> >> > > > > > >>> it
> >> > > > > > >>>>>>> later gets to time 3, it's time to emit (k:2)
> >> downstream.
> >> > > > Because
> >> > > > > > >> k
> >> > > > > > >>>>> is no
> >> > > > > > >>>>>>> longer buffered, the suppress operator will log
> >> (k:null).
> >> > > Thus,
> >> > > > > > >> when
> >> > > > > > >>>>>>> recovering,
> >> > > > > > >>>>>>> it can rebuild the buffer by reading its changelog.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> What do you think about this?
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Thanks,
> >> > > > > > >>>>>>> -John
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
> >> > > bbejeck@gmail.com
> >> > > > >
> >> > > > > > >>>>> wrote:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> Hi John,  thanks for the KIP.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> Early on in the KIP, you mention the current
> approaches
> >> > for
> >> > > > > > >>>>> controlling
> >> > > > > > >>>>>>> the
> >> > > > > > >>>>>>>> rate of downstream records from a KTable, cache size
> >> > > > > > >> configuration
> >> > > > > > >>>>> and
> >> > > > > > >>>>>>>> commit time.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> Will these configuration parameters still be in
> effect
> >> for
> >> > > > > > >> tables
> >> > > > > > >>>>> that
> >> > > > > > >>>>>>>> don't use suppression?  For tables taking advantage
> of
> >> > > > > > >>> suppression,
> >> > > > > > >>>>>> will
> >> > > > > > >>>>>>>> these configurations have no impact?
> >> > > > > > >>>>>>>> This last question may be too implementation specific
> >> but
> >> > if
> >> > > > the
> >> > > > > > >>>>>> requested
> >> > > > > > >>>>>>>> suppression time is longer than the specified commit
> >> time,
> >> > > > will
> >> > > > > > >>> the
> >> > > > > > >>>>>>> latest
> >> > > > > > >>>>>>>> record in the suppression buffer get stored in a
> >> > changelog?
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> Thanks,
> >> > > > > > >>>>>>>> Bill
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> >> > > > john@confluent.io
> >> > > > > > >>>
> >> > > > > > >>>>>> wrote:
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>> Thanks for the feedback, Matthias,
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> It seems like in straightforward relational
> processing
> >> > > cases,
> >> > > > > > >> it
> >> > > > > > >>>>>> would
> >> > > > > > >>>>>>>> not
> >> > > > > > >>>>>>>>> make sense to bound the lateness of KTables. In
> >> general,
> >> > it
> >> > > > > > >>> seems
> >> > > > > > >>>>>>> better
> >> > > > > > >>>>>>>> to
> >> > > > > > >>>>>>>>> have "guard rails" in place that make it easier to
> >> write
> >> > > > > > >>> sensible
> >> > > > > > >>>>>>>> programs
> >> > > > > > >>>>>>>>> than insensible ones.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> But I'm still going to argue in favor of keeping it
> >> for
> >> > all
> >> > > > > > >>>>> KTables
> >> > > > > > >>>>>> ;)
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> 1. I believe it is simpler to understand the
> operator
> >> if
> >> > it
> >> > > > > > >> has
> >> > > > > > >>>>> one
> >> > > > > > >>>>>>>> uniform
> >> > > > > > >>>>>>>>> definition, regardless of context. It's well defined
> >> and
> >> > > > > > >>> intuitive
> >> > > > > > >>>>>> what
> >> > > > > > >>>>>>>>> will happen when you use late-event suppression on a
> >> > > KTable,
> >> > > > > > >> so
> >> > > > > > >>> I
> >> > > > > > >>>>>> think
> >> > > > > > >>>>>>>>> nothing surprising or dangerous will happen in that
> >> case.
> >> > > > From
> >> > > > > > >>> my
> >> > > > > > >>>>>>>>> perspective, having two sets of allowed operations
> is
> >> > > > actually
> >> > > > > > >>> an
> >> > > > > > >>>>>>>> increase
> >> > > > > > >>>>>>>>> in cognitive complexity.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this
> way.
> >> > For
> >> > > > > > >>>>> example,
> >> > > > > > >>>>>> in
> >> > > > > > >>>>>>>> lieu
> >> > > > > > >>>>>>>>> of full-featured timestamp semantics, I can
> implement
> >> > MVCC
> >> > > > > > >>>>> behavior
> >> > > > > > >>>>>>> when
> >> > > > > > >>>>>>>>> building a KTable by
> >> "suppressLateEvents(Duration.ZERO)".
> >> > I
> >> > > > > > >>>>> suspect
> >> > > > > > >>>>>>> that
> >> > > > > > >>>>>>>>> there are other, non-obvious applications of
> >> suppressing
> >> > > late
> >> > > > > > >>>>> events
> >> > > > > > >>>>>> on
> >> > > > > > >>>>>>>>> KTables.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> 3. Not to get too much into implementation details
> in
> >> a
> >> > KIP
> >> > > > > > >>>>>> discussion,
> >> > > > > > >>>>>>>> but
> >> > > > > > >>>>>>>>> if we did want to make late-event suppression
> >> available
> >> > > only
> >> > > > > > >> on
> >> > > > > > >>>>>>> windowed
> >> > > > > > >>>>>>>>> KTables, we have two enforcement options:
> >> > > > > > >>>>>>>>>   a. check when we build the topology - this would
> be
> >> > > simple
> >> > > > > > >> to
> >> > > > > > >>>>>>>> implement,
> >> > > > > > >>>>>>>>> but would be a runtime check. Hopefully, people
> write
> >> > tests
> >> > > > > > >> for
> >> > > > > > >>>>> their
> >> > > > > > >>>>>>>>> topology before deploying them, so the feedback loop
> >> > isn't
> >> > > > > > >>>>>>> instantaneous,
> >> > > > > > >>>>>>>>> but it's not too long either.
> >> > > > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a
> >> > > compile
> >> > > > > > >>> time
> >> > > > > > >>>>>>> check,
> >> > > > > > >>>>>>>>> but would also be a substantial increase of both
> >> interface
> >> > > and
> >> > > > > > >>> code
> >> > > > > > >>>>>>>>> complexity.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> We should definitely strive to have guard rails
> >> > protecting
> >> > > > > > >>> against
> >> > > > > > >>>>>>>>> surprising or dangerous behavior. Protecting against
> >> > > programs
> >> > > > > > >>>>> that we
> >> > > > > > >>>>>>>> don't
> >> > > > > > >>>>>>>>> currently predict is a lesser benefit, and I think
> we
> >> can
> >> > > put
> >> > > > > > >> up
> >> > > > > > >>>>>> guard
> >> > > > > > >>>>>>>>> rails on a case-by-case basis for that. It seems
> like
> >> the
> >> > > > > > >>>>> increase in
> >> > > > > > >>>>>>>>> cognitive (and potentially code and interface)
> >> complexity
> >> > > > > > >> makes
> >> > > > > > >>> me
> >> > > > > > >>>>>>> think
> >> > > > > > >>>>>>>> we
> >> > > > > > >>>>>>>>> should skip this case.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> What do you think?
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> Thanks,
> >> > > > > > >>>>>>>>> -John
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> >> > > > > > >>>>>>> matthias@confluent.io>
> >> > > > > > >>>>>>>>> wrote:
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>> Thanks for the KIP John.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> One initial comment about the last example
> "Bounded
> >> > > > > > >>> lateness":
> >> > > > > > >>>>>> For a
> >> > > > > > >>>>>>>>>> non-windowed KTable bounding the lateness does not
> >> > really
> >> > > > > > >> make
> >> > > > > > >>>>>> sense,
> >> > > > > > >>>>>>>>>> does it?
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> Thus, I am wondering if we should allow
> >> > > > > > >> `suppressLateEvents()`
> >> > > > > > >>>>> for
> >> > > > > > >>>>>>> this
> >> > > > > > >>>>>>>>>> case? It seems to be better to only allow it for
> >> > > > > > >>>>> windowed-KTables.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> -Matthias
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> >> > > > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as
> well.
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> What you gave as new example is semantically the
> >> same
> >> > as
> >> > > > > > >>> what
> >> > > > > > >>>>> I
> >> > > > > > >>>>>>>>>> suggested.
> >> > > > > > >>>>>>>>>>> So it is good by me.
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> Thanks
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> >> > > > > > >>>>> john@confluent.io
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>> wrote:
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Thanks for taking a look, Ted,
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> I agree this is a departure from the conventions
> of
> >> > > > > > >> Streams
> >> > > > > > >>>>> DSL.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Most of our config objects have one or two
> >> "required"
> >> > > > > > >>>>>> parameters,
> >> > > > > > >>>>>>>>> which
> >> > > > > > >>>>>>>>>> fit
> >> > > > > > >>>>>>>>>>>> naturally with the static factory method
> approach.
> >> > > > > > >>>>> TimeWindow,
> >> > > > > > >>>>>> for
> >> > > > > > >>>>>>>>>> example,
> >> > > > > > >>>>>>>>>>>> requires a size parameter, so we can naturally
> say
> >> > > > > > >>>>>>>>> TimeWindows.of(size).
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> I think in the case of a suppression, there's
> >> really
> >> > no
> >> > > > > > >>>>> "core"
> >> > > > > > >>>>>>>>>> parameter,
> >> > > > > > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> >> > > > > > >>>>> Suppression()". I
> >> > > > > > >>>>>>>> think
> >> > > > > > >>>>>>>>>> that
> >> > > > > > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous,
> since
> >> > there
> >> > > > > > >>> are
> >> > > > > > >>>>>> many
> >> > > > > > >>>>>>>>>> durations
> >> > > > > > >>>>>>>>>>>> that we can configure.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> However, thinking about it again, I suppose that
> I
> >> can
> >> > > > > > >> give
> >> > > > > > >>>>> each
> >> > > > > > >>>>>>>>>>>> configuration method a static version, which
> would
> >> let
> >> > > > > > >> you
> >> > > > > > >>>>>> replace
> >> > > > > > >>>>>>>>> "new
> >> > > > > > >>>>>>>>>>>> Suppression()." with "Suppression." in all the
> >> > examples.
> >> > > > > > >>>>>>> Basically,
> >> > > > > > >>>>>>>>>> instead
> >> > > > > > >>>>>>>>>>>> of "of()", we'd support any of the methods I
> >> listed.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> For example:
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> windowCounts
> >> > > > > > >>>>>>>>>>>>     .suppress(
> >> > > > > > >>>>>>>>>>>>         Suppression
> >> > > > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.
> >> > ofMinutes(10))
> >> > > > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> IntermediateSuppression.emitAfter(Duration.ofMinutes(
> >> > 10))
> >> > > > > > >>>>>>>>>>>>             )
> >> > > > > > >>>>>>>>>>>>     );
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Does that seem better?
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Thanks,
> >> > > > > > >>>>>>>>>>>> -John
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> >> > > > > > >>> yuzhihong@gmail.com
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>>> wrote:
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> I started to read this KIP which contains a lot
> of
> >> > > > > > >>>>> materials.
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> One suggestion:
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>     .suppress(
> >> > > > > > >>>>>>>>>>>>>         new Suppression()
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> Do you think it would be more consistent with
> the
> >> > rest
> >> > > > > > >> of
> >> > > > > > >>>>>> Streams
> >> > > > > > >>>>>>>>> data
> >> > > > > > >>>>>>>>>>>>> structures by supporting `of` ?
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> Cheers
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> >> > > > > > >>>>>> john@confluent.io
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>>>> wrote:
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> Hello devs and users,
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> Please take some time to consider this proposal
> >> for
> >> > > > > > >> Kafka
> >> > > > > > >>>>>>> Streams:
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for
> KTables
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> link:
> >> https://cwiki.apache.org/confluence/x/sQU0BQ
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> The basic idea is to provide:
> >> > > > > > >>>>>>>>>>>>>> * more usable control over update rate (vs the
> >> > current
> >> > > > > > >>>>> state
> >> > > > > > >>>>>>> store
> >> > > > > > >>>>>>>>>>>>> caches)
> >> > > > > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations
> >> > feature
> >> > > > > > >>> which
> >> > > > > > >>>>>>> several
> >> > > > > > >>>>>>>>>>>> people
> >> > > > > > >>>>>>>>>>>>>> have requested
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> I look forward to your feedback!
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>> Thanks,
> >> > > > > > >>>>>>>>>>>>>> -John
> >> > > > > > >>>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>> --
> >> > > > > > >>>>> -- Guozhang
> >> > > > > > >>>>>
> >> > > > > > >>>>
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> --
> >> > > > > > >> -- Guozhang
> >> > > > > > >>
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > -- Guozhang
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > -- Guozhang
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

I had some opportunity to reflect on the default for close time today...

Note that the current "close time" is equal to the retention time, and
therefore "close" today shares the default retention of 24h.

Setting "close" shorter than the retention time would definitely break any
application that specifies a retention time today. It's also likely to
break apps that *don't* set the retention time and rely on the 24h default.
So it's unfortunate, but I think if "close" isn't set, we should use the
retention time instead of a fixed default.

When we ultimately remove the retention time parameter ("until"), we will
have to set "close" to a default of 24h.

Of course, this has a negative impact on the user of "final results", since
they won't see any output at all until the retention time (24h by default)
has passed, and may find this confusing. What can we do about this except
document it well? Maybe log a warning if we see that close wasn't
explicitly set while using "final results"?
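
To make the fallback concrete, here is a rough sketch of the rule described
above. It is only an illustration, not proposed API: the class name
CloseTimeDefaults and the methods effectiveClose and shouldWarn are
hypothetical, and the 24h constant simply mirrors the current default
retention mentioned in this message.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper for illustration only; none of these names exist in
// Kafka Streams.
final class CloseTimeDefaults {

    // Assumption: mirrors the 24h default retention discussed in this thread.
    static final Duration DEFAULT_RETENTION = Duration.ofHours(24);

    // An explicitly set "close" wins. Otherwise fall back to the retention
    // time, which itself falls back to the 24h default when unset.
    static Duration effectiveClose(final Optional<Duration> close,
                                   final Optional<Duration> retention) {
        if (close.isPresent()) {
            return close.get();
        }
        return retention.orElse(DEFAULT_RETENTION);
    }

    // Warn only when "final results" is in use and "close" was left unset,
    // since output is then withheld for the whole retention time.
    static boolean shouldWarn(final Optional<Duration> close,
                              final boolean finalResultsOnly) {
        return finalResultsOnly && !close.isPresent();
    }

    public static void main(final String[] args) {
        // Neither value set: close falls back to the default retention.
        System.out.println(effectiveClose(Optional.empty(), Optional.empty()));
        // Only retention set: close follows the retention time.
        System.out.println(
            effectiveClose(Optional.empty(), Optional.of(Duration.ofHours(1))));
        // Final results without an explicit close: worth a warning.
        System.out.println(shouldWarn(Optional.empty(), true));
    }
}
```

With this rule, existing apps keep today's behavior whether or not they set
a retention time, and only "final results" users who never set close would
see the warning.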

Thanks,
-John

On Tue, Jul 10, 2018 at 10:46 AM John Roesler <jo...@confluent.io> wrote:

> Hi Guozhang,
>
> That sounds good to me. I'll include that in the KIP.
>
> Thanks,
> -John
>
> On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang <wa...@gmail.com> wrote:
>
>> Let me clarify a bit on what I meant about moving `retentionPeriod` to
>> WindowStoreBuilder:
>>
>> In another discussion around KIP-319 / 330, we noted that the "retention
>> period" should not really be a window spec, but only a window store spec,
>> as it only affects how long to retain each window to be queryable along
>> with the storage cost.
>>
>> More specifically, today the "maintainMs" returned from Windows is used in
>> three places:
>>
>> 1) for windowed aggregations, it is passed directly into
>> `Stores.persistentWindows()` as the retention period parameter. For this
>> use case we should just let the WindowStoreBuilder specify this value
>> itself.
>>
>> NOTE: It is also returned in the KStreamWindowAggregate processor, to
>> determine if a received record should be dropped due to its lateness. We
>> may need to think of another way to get this value inside the processor
>>
>> 2) for windowed stream-stream join, it is used as the join range parameter
>> but only to check that "windowSizeMs <= retentionPeriodMs". We can do this
>> check at the store builder level instead of at the processor level.
>>
>>
>> If we can remove its usage in both 1) and 2), then we should be able to
>> safely remove this from the `Windows` spec.
>>
>>
>> Guozhang
>>
>>
>> On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io> wrote:
>>
>> > Thanks for the reply, Guozhang,
>> >
>> > Good! I agree, that is also a good reason, and I actually made use of
>> that
>> > in my tests. I'll update the KIP.
>> >
>> > By the way, I chose "allowedLateness" as I was trying to pick a better
>> name
>> > than "close", but I think it's actually the wrong name. We don't want to
>> > bound the lateness of events in general, only with respect to the end of
>> > their window.
>> >
>> > If we have a window [0,10), with "allowedLateness" of 5, then if we get an
>> > event with timestamp 3 at time 9, the name implies we'd reject it, which
>> > seems silly. Really, we'd only want to start rejecting that event at stream
>> > time 15.
>> >
>> > What I meant was more like "allowedLatenessAfterWindowEnd", but that's too
>> > verbose. I think that "close" + some documentation about what it means will
>> > be better.
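
The [0,10) example above can be written down as a small check. This is only a sketch of the semantics described in the thread, not the Kafka Streams implementation; the class and method names are hypothetical, and times are unitless as in the example.

```java
// Hypothetical sketch of "close" measured from the end of the window.
final class WindowCloseSketch {

    /**
     * Returns true while a record for the window ending at windowEnd (exclusive)
     * is still accepted. The window stops accepting records once stream time
     * reaches windowEnd + close.
     */
    static boolean isAccepted(long windowEnd, long close, long streamTime) {
        return streamTime < windowEnd + close;
    }

    public static void main(String[] args) {
        // Window [0,10) with close = 5; the event's own timestamp (3) only
        // decides which window it belongs to, not whether it is accepted.
        System.out.println(isAccepted(10L, 5L, 9L));   // true: stream time 9 is before 15
        System.out.println(isAccepted(10L, 5L, 15L));  // false: rejection starts at 15
    }
}
```

With a close of 0, the same check rejects records for the window as soon as stream time reaches the window end.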
>> >
>> > 1: "Close" would be measured from the end of the window, so a reasonable
>> > default would be "0". Recall that "close" really only needs to be
>> specified
>> > for final results, and a default of 0 would produce the most intuitive
>> > results. If folks later discover that they are missing some late events,
>> > they can adjust the parameter accordingly. IMHO, any other value would
>> just
>> > be a guess on our part.
>> >
>> > 2a:
>> > I think you're saying to re-use "until" instead of adding "close" to the
>> > window.
>> >
>> > The downside here would be that the semantic change could be more
>> confusing
>> > than deprecating "until" and introducing window "close" and a
>> > "retentionTime" on the store builder. The deprecation is a good,
>> controlled
>> > way for us to make sure people are getting the semantics they think
>> they're
>> > getting, as well as giving us an opportunity to link people to the API
>> they
>> > should use instead.
>> >
>> > I didn't fully understand the second part, but it sounds like you're
>> > suggesting to add a new "retentionTime" setter to Windows to bridge the
>> gap
>> > until we add it to the store builder? That seems kind of roundabout to
>> me,
>> > if that's what you meant. We could just immediately add it to the store
>> > builders in the same PR.
>> >
>> > 2b: Sounds good to me!
>> >
>> > Thanks again,
>> > -John
>> >
>> >
>> > On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com>
>> wrote:
>> >
>> > > John,
>> > >
>> > > Thanks for your replies. As for the two options of the API, I think I'm
>> > > slightly inclined to the first option as well. My motivation is a bit
>> > > different, as I think the first one may be more flexible, for example:
>> > >
>> > > KTable<Windowed<..>> table = ... count();
>> > >
>> > > // want to peek at the changelog stream, do not care about final results.
>> > > table.toStream().peek(..);
>> > >
>> > > // sending to a topic, want to only send the final results.
>> > > table.suppress().toStream().to("topic");
>> > >
>> > > --------------
>> > >
>> > > Besides that, I have a few more minor questions:
>> > >
>> > > 1. For "allowedLateness", what should be the default value? I.e. if users
>> > > do not specify "allowedLateness" in TimeWindows, what value should we set?
>> > >
>> > > 2. For API names, some personal suggestions here:
>> > >
>> > > 2.a) "allowedLateness" -> "until" (semantics changed, and the value is
>> > > also defined as a delta on top of the window length), where "until" ->
>> > > "retentionPeriod", and the latter will be moved from `Windows` to
>> > > `WindowStoreBuilder` in the future.
>> > >
>> > > 2.b) "BufferConfig" -> "Buffered" ?
>> > >
>> > >
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io>
>> wrote:
>> > >
>> > > > Hey Matthias and Guozhang,
>> > > >
>> > > > Sorry for the slow reply. I was mulling over your feedback and weighing
>> > > > some ideas in a sketchbook PR:
>> > > > https://github.com/apache/kafka/pull/5337
>> > > >
>> > > > Your thought about keeping suppression independent of business logic
>> > is a
>> > > > very good one. I agree that it would make more sense to add some
>> kind
>> > of
>> > > > "window close" concept to the window definition.
>> > > >
>> > > > In fact, doing that immediately solves the inconsistency problem
>> > Guozhang
>> > > > brought up. There's no need to add a "final results" or "emission"
>> > option
>> > > > to the windowed aggregation.
>> > > >
>> > > > What do you think about an API more like this:
>> > > >
>> > > > final StreamsBuilder builder = new StreamsBuilder();
>> > > >
>> > > > builder
>> > > >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
>> > > >   .groupBy(
>> > > >     (String k1, String v1) -> k1,
>> > > >     Serialized.with(STRING_SERDE, STRING_SERDE)
>> > > >   )
>> > > >   .windowedBy(TimeWindows
>> > > >     .of(scaledTime(2L))
>> > > >     .until(scaledTime(3L))
>> > > >     .allowedLateness(scaledTime(1L))
>> > > >   )
>> > > >   .count(Materialized.as("counts"))
>> > > >   .suppress(
>> > > >     emitFinalResultsOnly(
>> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>> > > >     )
>> > > >   )
>> > > >   .toStream()
>> > > >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
>> > > >
>> > > > Note that:
>> > > >  * "emitFinalResultsOnly" is available *only* on windowed tables
>> > > >    (enforced by the type system at compile time), and it determines
>> > > >    the time to wait by looking at "allowedLateness" on the TimeWindows
>> > > >    config.
>> > > >  * querying "counts" will produce results (eventually) consistent with
>> > > >    what's observable in "output-suppressed".
>> > > >  * in all cases, "suppress" has no effect on business logic, just on
>> > > >    event suppression.
>> > > >
>> > > > Is this API straightforward? Or do you still prefer the version that
>> > > > you both proposed:
>> > > >
>> > > >   ...
>> > > >   .windowedBy(TimeWindows
>> > > >     .of(scaledTime(2L))
>> > > >     .until(scaledTime(3L))
>> > > >     .allowedLateness(scaledTime(1L))
>> > > >   )
>> > > >   .count(
>> > > >     Materialized.as("counts"),
>> > > >     emitFinalResultsOnly(
>> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>> > > >     )
>> > > >   )
>> > > >   ...
>> > > >
>> > > > To me, these two are practically identical, and I still vaguely
>> prefer
>> > > the
>> > > > first one.
>> > > >
>> > > > The prototype has made clearer to me that users of "final results
>> for
>> > > > windows" and users of "suppression for table events" both need to
>> > > configure
>> > > > the suppression buffer.
>> > > >
>> > > > This buffer configuration consists of:
>> > > > 1. how many keys or bytes to keep in memory
>> > > > 2. what to do if memory runs out (shut down, start using disk, ...)
>> > > >
>> > > > So it's not as simple as setting a "final results" flag. We'll either
>> > > > have an "Emit" config object on the windowed aggregators that takes the
>> > > > same BufferConfig as the "Suppress" config on the suppression operator,
>> > > > or we just use the suppression operator for both.
>> > > >
>> > > > Perhaps it would sweeten the deal a little to point out that we
>> have 2
>> > > > overloads already for each windowed aggregator (with and without
>> > > > Materialized). Adding "Emitted" or something would mean that we'd
>> add a
>> > > new
>> > > > overload for each one, taking us up to 4 overloads each for "count",
>> > > > "aggregate" and "reduce". Using "suppress" means that we don't add
>> any
>> > > new
>> > > > overloads.
>> > > >
>> > > > Thanks again for helping to hash this out,
>> > > > -John
>> > > >
>> > > > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com>
>> > wrote:
>> > > >
>> > > > > I think I agree with Matthias for having dedicated APIs for
>> windowed
>> > > > > operation final output scenario, PLUS separating the window close
>> > which
>> > > > the
>> > > > > "final output" would rely on, from the window retention time
>> itself
>> > > > > (admittedly it would make this KIP effort larger, but if we
>> believe
>> > we
>> > > > need
>> > > > > to do this separation anyways we could just do it now).
>> > > > >
>> > > > > And then we can have `KTable#suppress()` for intermediate-suppression
>> > > > > only, not for late-record-suppression, until we've seen that become a
>> > > > > common feature request, because our current design can still be
>> > > > > extended for that purpose.
>> > > > >
>> > > > >
>> > > > > Guozhang
>> > > > >
>> > > > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
>> > > matthias@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks for the discussion. I am just catching up.
>> > > > > >
>> > > > > > In general, I think we have different use cases, and non-windowed
>> > > > > > and windowed are quite different. For the non-windowed case,
>> > > > > > suppress() has no (useful) close or retention time, no final
>> > > > > > semantics, and also no business logic impact.
>> > > > > >
>> > > > > > On the other hand, for windowed aggregations, close time and
>> final
>> > > > > > result do have a meaning. IMHO, `close()` is part of business
>> logic
>> > > > > > while retention time is not. Also, suppression of intermediate
>> > result
>> > > > is
>> > > > > > not a business rule and there might be use case for which either
>> > > "early
>> > > > > > intermediate" (before window end time) are suppressed only, or
>> all
>> > > > > > intermediates are suppressed (maybe also something in the
>> middle,
>> > ie,
>> > > > > > just reduce the load of intermediate updates). Thus,
>> > > window-suppression
>> > > > > > is much richer.
>> > > > > >
>> > > > > > IMHO, a generic `suppress()` operator that can be inserted into the
>> > > > > > data flow at any point is useful. Maybe we should keep it as generic
>> > > > > > as possible. However, it might be difficult to use with regard to
>> > > > > > windowing, as the mental effort to use it is high.
>> > > > > >
>> > > > > > With regard to Guozhang's comment:
>> > > > > >
>> > > > > > > we will actually
>> > > > > > > process data as old as 30 days as well, while most of the late
>> > > > updates
>> > > > > > > beyond 5 minutes would be discarded anyways.
>> > > > > >
>> > > > > > If we use `suppress()` as a standalone operator, this is correct
>> > and
>> > > > > > intended IMHO. To address the issue if the behavior is
>> unwanted, I
>> > > > would
>> > > > > > suggest to add a "suppress option" directly to
>> > > > > > `count()/reduce()/aggregate()` window operator similar to
>> > > > > > `Materialized`. This would be an "embedded suppress" and avoid
>> the
>> > > > > > issue. It would also address the issue about mental effort for
>> > > "single
>> > > > > > final window result" use case.
>> > > > > >
>> > > > > > I also think that a shorter close-time than retention time is
>> > useful
>> > > > for
>> > > > > > window aggregation. If we add close() to the window definition
>> and
>> > > > > > until() to `Materialized`, we can separate both correctly IMHO.
>> > > > > >
>> > > > > > About setting `close = min(close,retention)` I am not sure. We might
>> > > > > > rather throw an exception than reduce the close time automatically.
>> > > > > > Otherwise, I foresee many user questions like "I set close to X but
>> > > > > > the result does not get updated for some data that arrives with a
>> > > > > > delay of X".
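
The "throw rather than silently reduce" alternative could look roughly like the following. This is a sketch under assumptions: the class and method names are hypothetical, and close and retention are treated as directly comparable durations, as the `min(close,retention)` expression above implies.

```java
import java.time.Duration;

// Hypothetical validation helper; not part of the Kafka Streams API.
final class CloseRetentionCheck {

    /**
     * Returns the configured close time unchanged, or fails fast if it exceeds
     * the retention time, instead of quietly clamping it to the retention.
     */
    static Duration validateClose(Duration close, Duration retention) {
        if (close.compareTo(retention) > 0) {
            throw new IllegalArgumentException(
                "close (" + close + ") must not be larger than retention (" + retention + ")");
        }
        return close;
    }

    public static void main(String[] args) {
        // 5 minutes of close with 30 days of retention is accepted as-is.
        System.out.println(validateClose(Duration.ofMinutes(5), Duration.ofDays(30)));
    }
}
```

Failing at configuration time makes the mismatch visible to the user immediately, which is the point of the objection to automatic clamping.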
>> > > > > >
>> > > > > > The tricky question might be to design the API in a backward
>> > > compatible
>> > > > > > way though.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > -Matthias
>> > > > > >
>> > > > > > On 7/3/18 5:38 AM, John Roesler wrote:
>> > > > > > > Hi Guozhang,
>> > > > > > >
>> > > > > > > I see. It seems like if we want to decouple 1) and 2), we
>> need to
>> > > > alter
>> > > > > > the
>> > > > > > > definition of the window. Do you think it would close the gap
>> if
>> > we
>> > > > > > added a
>> > > > > > > "window close" time to the window definition?
>> > > > > > >
>> > > > > > > Such as:
>> > > > > > >
>> > > > > > > builder.stream("input")
>> > > > > > > .groupByKey()
>> > > > > > > .windowedBy(
>> > > > > > >   TimeWindows
>> > > > > > >     .of(60_000)
>> > > > > > >     .closeAfter(10 * 60 * 1000)
>> > > > > > >     .until(30L * 24 * 60 * 60 * 1000)
>> > > > > > > )
>> > > > > > > .count()
>> > > > > > > .suppress(Suppression.finalResultsOnly());
>> > > > > > >
>> > > > > > > Possibly called "finalResultsAtWindowClose" or something?
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > -John
>> > > > > > >
>> > > > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <
>> wangguoz@gmail.com
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > >> Hey John,
>> > > > > > >>
>> > > > > > >> Obviously I'm too lazy on email replying diligence compared
>> with
>> > > you
>> > > > > :)
>> > > > > > >> Will try to reply them separately:
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> ------------------------------------------------------------
>> > > > > > -----------------
>> > > > > > >>
>> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
>> > > > > > >>
>> > > > > > >> I'm aware of this use case, but again, the concern is that in this
>> > > > > > >> setting, in order to let the window be queryable for 30 days, we
>> > > > > > >> will actually process data as old as 30 days as well, while most of
>> > > > > > >> the late updates beyond 5 minutes would be discarded anyways.
>> > > > > > >> Personally I think for the final update scenario, the ideal
>> > > > > > >> situation users would want is "do not process any data that is more
>> > > > > > >> than 5 minutes late, and of course send no update records
>> > > > > > >> downstream later than 5 minutes either; but retain the window to be
>> > > > > > >> queryable for 30 days". By doing that, the final window snapshot
>> > > > > > >> would also be aligned with the update stream. In other words, among
>> > > > > > >> these three periods:
>> > > > > > >>
>> > > > > > >> 1) the retention length of the window / table.
>> > > > > > >> 2) the late records acceptance for updating the window.
>> > > > > > >> 3) the late records update to be sent downstream.
>> > > > > > >>
>> > > > > > >> Final update use cases would naturally want 2) = 3), while 1) may
>> > > > > > >> be different and larger. What we provide now is 1) = 2), which
>> > > > > > >> could be different from, and in practice larger than, 3), hence
>> > > > > > >> not the most intuitive for their needs.
>> > > > > > >>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> ------------------------------------------------------------
>> > > > > > -----------------
>> > > > > > >>
>> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
>> > > > > > >>
>> > > > > > >> I also like option 2) better than option 1) from a programming
>> > > > > > >> pov. But I'm wondering whether option 2) would provide the above
>> > > > > > >> semantics, or whether it still couples 1) with 2) as well?
>> > > > > > >>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> Guozhang
>> > > > > > >>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <
>> john@confluent.io
>> > >
>> > > > > wrote:
>> > > > > > >>
>> > > > > > >>> In fact, to push the idea further (which IIRC is what
>> Matthias
>> > > > > > originally
>> > > > > > >>> proposed), if we can accept "Suppression#finalResultsOnly"
>> in
>> > my
>> > > > last
>> > > > > > >>> email, then we could also consider whether to eliminate
>> > > > > > >>> "suppressLateEvents" entirely.
>> > > > > > >>>
>> > > > > > >>> We could always add it later, but you've both expressed
>> doubt
>> > > that
>> > > > > > there
>> > > > > > >>> are practical use cases for it outside of final-results.
>> > > > > > >>>
>> > > > > > >>> -John
>> > > > > > >>>
>> > > > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <
>> > john@confluent.io>
>> > > > > > wrote:
>> > > > > > >>>
>> > > > > > >>>> Hi again, Guozhang ;) Here's the second part of my
>> response...
>> > > > > > >>>>
>> > > > > > >>>> It seems like your main concern is: "if I'm a user who
>> wants
>> > > final
>> > > > > > >> update
>> > > > > > >>>> semantics, how complicated is it for me to get it?"
>> > > > > > >>>>
>> > > > > > >>>> I think we have to assume that people don't always have
>> time
>> > to
>> > > > > become
>> > > > > > >>>> deeply familiar with all the nuances of a programming
>> > > environment
>> > > > > > >> before
>> > > > > > >>>> they use it. Especially if they're evaluating several
>> > frameworks
>> > > > for
>> > > > > > >>> their
>> > > > > > >>>> use case, it's very valuable to make it as obvious as
>> possible
>> > > how
>> > > > > to
>> > > > > > >>>> accomplish various computations with Streams.
>> > > > > > >>>>
>> > > > > > >>>> To me the biggest question is whether with a fresh
>> > perspective,
>> > > > > people
>> > > > > > >>>> would say "oh, I get it, I have to bound my lateness and
>> > > suppress
>> > > > > > >>>> intermediate updates, and of course I'll get only the final
>> > > > > result!",
>> > > > > > >> or
>> > > > > > >>> if
>> > > > > > >>>> it's more like "wtf? all I want is the final result, what
>> are
>> > > all
>> > > > > > these
>> > > > > > >>>> parameters?".
>> > > > > > >>>>
>> > > > > > >>>> I was talking with Matthias a while back, and he had an
>> idea
>> > > that
>> > > > I
>> > > > > > >> think
>> > > > > > >>>> can help, which is to essentially set up a final-result
>> recipe
>> > > in
>> > > > > > >>> addition
>> > > > > > >>>> to the raw parameters. I previously thought that it
>> wouldn't
>> > be
>> > > > > > >> possible
>> > > > > > >>> to
>> > > > > > >>>> restrict its usage to Windowed KTables, but thinking about
>> it
>> > > > again
>> > > > > > >> this
>> > > > > > >>>> weekend, I have a couple of ideas:
>> > > > > > >>>>
>> > > > > > >>>> ================
>> > > > > > >>>> = 1. Static Wrapper =
>> > > > > > >>>> ================
>> > > > > > >>>> We can define an extra static function that "wraps" a
>> KTable
>> > > with
>> > > > > > >>>> final-result semantics.
>> > > > > > >>>>
>> > > > > > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
>> > > > > > >>>>   final KTable<K, V> windowedKTable,
>> > > > > > >>>>   final Duration maxAllowedLateness,
>> > > > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
>> > > > > > >>>>     return windowedKTable.suppress(
>> > > > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
>> > > > > > >>>>                    .suppressIntermediateEvents(
>> > > > > > >>>>                      IntermediateSuppression
>> > > > > > >>>>                        .emitAfter(maxAllowedLateness)
>> > > > > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
>> > > > > > >>>>                    )
>> > > > > > >>>>     );
>> > > > > > >>>> }
>> > > > > > >>>>
>> > > > > > >>>> Because windowedKTable is a parameter, the static function can
>> > > > > > >>>> easily impose an extra bound on the key type: that it extends
>> > > > > > >>>> Windowed. This would make "final results only" available only on
>> > > > > > >>>> windowed KTables.
>> > > > > > >>>>
>> > > > > > >>>> Here's how it would look to use:
>> > > > > > >>>>
>> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
>> > > > > > >>>>   finalResultsOnly(
>> > > > > > >>>>     windowCounts,
>> > > > > > >>>>     Duration.ofMinutes(10),
>> > > > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
>> > > > > > >>>>   );
>> > > > > > >>>>
>> > > > > > >>>> Trying to use it on a non-windowed KTable yields:
>> > > > > > >>>>
>> > > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
>> > > > > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
>> > > > > > >>>>> cannot be applied to given types;
>> > > > > > >>>>>   required:
>> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>> > > > > > >>>>>   found:
>> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>> > > > > > >>>>>   reason: inference variable K has incompatible bounds
>> > > > > > >>>>>     equality constraints: java.lang.String
>> > > > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>> =================================================
>> > > > > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression
>> =
>> > > > > > >>>> =================================================
>> > > > > > >>>>
>> > > > > > >>>> By adding K,V parameters to Suppression, we can provide a
>> > > > similarly
>> > > > > > >>>> bounded config method directly on the Suppression class:
>> > > > > > >>>>
>> > > > > > >>>> public static <K extends Windowed, V> Suppression<K, V>
>> > > > > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
>> > > > > > >>>> BufferFullStrategy bufferFullStrategy) {
>> > > > > > >>>>     return Suppression
>> > > > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
>> > > > > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
>> > > > > > >>>>             .emitAfter(maxAllowedLateness)
>> > > > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
>> > > > > > >>>>         );
>> > > > > > >>>> }
>> > > > > > >>>>
>> > > > > > >>>> Then, here's how it would look to use it:
>> > > > > > >>>>
>> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
>> > > > > > >>>>   windowCounts.suppress(
>> > > > > > >>>>     Suppression.finalResultsOnly(
>> > > > > > >>>>       Duration.ofMinutes(10),
>> > > > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
>> > > > > > >>>>     )
>> > > > > > >>>>   );
>> > > > > > >>>>
>> > > > > > >>>> Trying to use it on a non-windowed ktable yields:
>> > > > > > >>>>
>> > > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
>> > > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be
>> > > > > > >>>>> applied to given types;
>> > > > > > >>>>>   required:
>> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>> > > > > > >>>>>   found:
>> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>> > > > > > >>>>>   reason: explicit type argument java.lang.String does not
>> > > > > > >>>>> conform to declared bound(s)
>> > > > > > >>>>> org.apache.kafka.streams.kstream.Windowed
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>> ============
>> > > > > > >>>> = Downsides =
>> > > > > > >>>> ============
>> > > > > > >>>>
>> > > > > > >>>> Of course, there's a downside either way:
>> > > > > > >>>> * for 1: this "wrapper" interaction would be the first in the DSL.
>> > > > > > >>>>   Is it too strange, and how discoverable would it be?
>> > > > > > >>>> * for 2: adding those type parameters to Suppression will force
>> > > > > > >>>>   all callers to provide them in the event of a chained
>> > > > > > >>>>   construction because Java doesn't do RHS recursive type
>> > > > > > >>>>   inference. This is already visible in other parts of the Streams
>> > > > > > >>>>   DSL. For example, often calls to Materialized builders have to
>> > > > > > >>>>   provide seemingly obvious type bounds.
>> > > > > > >>>>
>> > > > > > >>>> ============
>> > > > > > >>>> = Conclusion =
>> > > > > > >>>> ============
>> > > > > > >>>>
>> > > > > > >>>> I think option 2 is more "normal" and discoverable. It does
>> > > have a
>> > > > > > >>>> downside, but it's one that's pre-existing elsewhere in the
>> > DSL.
>> > > > > > >>>>
>> > > > > > >>>> WDYT? Would the addition of this "recipe" method to
>> > Suppression
>> > > > > > resolve
>> > > > > > >>>> your concern?
>> > > > > > >>>>
>> > > > > > >>>> Thanks again,
>> > > > > > >>>> -John
>> > > > > > >>>>
>> > > > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
>> > > wangguoz@gmail.com
>> > > > >
>> > > > > > >>> wrote:
>> > > > > > >>>>
>> > > > > > >>>>> Hi John,
>> > > > > > >>>>>
>> > > > > > >>>>> Regarding the metrics: yeah I think I'm with you that the records
>> > > > > > >>>>> dropped due to window retention or emit suppression policies
>> > > > > > >>>>> should be recorded differently, and using this KIP's proposed
>> > > > > > >>>>> metric would be fine. If you also think we can use this KIP's
>> > > > > > >>>>> proposed metrics to cover the records skipped because of window
>> > > > > > >>>>> retention, then we can include the changes in this KIP as well.
>> > > > > > >>>>>
>> > > > > > >>>>> Regarding the current proposal, I'm actually not too worried
>> > > > > > >>>>> about the inconsistency between query semantics and downstream
>> > > > > > >>>>> emit semantics. For queries, we will always return the current
>> > > > > > >>>>> running results of the windows, be they partial or final results
>> > > > > > >>>>> depending on the window retention time anyways, which has nothing
>> > > > > > >>>>> to do with whether the emitted stream should be one final output
>> > > > > > >>>>> per key or not. I also agree that having a unified operation is
>> > > > > > >>>>> generally better, so that users can focus on leveraging that one
>> > > > > > >>>>> operation rather than learning about two sets of operations. The
>> > > > > > >>>>> only question I had is whether, for final updates of window
>> > > > > > >>>>> stores, the configuration combo is a bit awkward to understand.
>> > > > > > >>>>> Thinking about this more, I think my root worry is the
>> > > > > > >>>>> "suppressLateEvents" call for windowed tables. From a user
>> > > > > > >>>>> perspective: if my retention time is X, which means "pay the cost
>> > > > > > >>>>> to allow late records up to X to still be applied to update the
>> > > > > > >>>>> tables", why would I ever want to suppressLateEvents by Y ( < X),
>> > > > > > >>>>> to say "do not send the updates up to Y", which means the
>> > > > > > >>>>> downstream operator or sink topic for this stream would actually
>> > > > > > >>>>> see a truncated update stream while I've paid a larger cost for
>> > > > > > >>>>> that? And of course, Y > X would not make sense either, as you
>> > > > > > >>>>> would not see any updates later than X anyways. So in all, my
>> > > > > > >>>>> feeling is that "suppressLateEvents" on a windowed table makes
>> > > > > > >>>>> less sense with a parameter that is not equal to the window
>> > > > > > >>>>> retention, and opening that door in the current proposal may
>> > > > > > >>>>> confuse people.
>> > > > > > >>>>>
>> > > > > > >>>>> Again, the above is just a subjective opinion, and we can
>> > > > > > >>>>> probably also come up with some scenarios where users do want to
>> > > > > > >>>>> set X != Y. But personally I feel that even if the semantics for
>> > > > > > >>>>> this scenario are intuitive for users to understand, does it
>> > > > > > >>>>> really make sense, and should we really open the door for it? So
>> > > > > > >>>>> I think the benefits of separating the final update into a
>> > > > > > >>>>> separate API may outweigh the advantage of having one uniform
>> > > > > > >>>>> definition. As for my alternative proposal, the rationale came
>> > > > > > >>>>> from both my concern about "suppressLateEvents" for windowed
>> > > > > > >>>>> stores and Matthias' question about "suppressLateEvents" for
>> > > > > > >>>>> non-windowed stores: if it is less meaningful for both, we can
>> > > > > > >>>>> consider removing it completely and only do
>> > > > > > >>>>> "IntermediateSuppression" in Suppress instead.
>> > > > > > >>>>>
>> > > > > > >>>>> So I'd summarize my thoughts in the following questions:
>> > > > > > >>>>>
>> > > > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
>> > > > > > >>>>>    retention time) for windowed stores make sense in practice?
>> > > > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
>> > > > > > >>>>>    non-windowed stores make sense in practice?
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> Guozhang
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
>> > > bbejeck@gmail.com>
>> > > > > > >> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>>> Thanks for the explanation, that does make sense.  I have
>> > some
>> > > > > > >>>>> questions on
>> > > > > > >>>>>> operations, but I'll just wait for the PR and tests.
>> > > > > > >>>>>>
>> > > > > > >>>>>> Thanks,
>> > > > > > >>>>>> Bill
>> > > > > > >>>>>>
>> > > > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
>> > > john@confluent.io
>> > > > >
>> > > > > > >>> wrote:
>> > > > > > >>>>>>
>> > > > > > >>>>>>> Hi Bill,
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Thanks for the review!
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Your question is very much applicable to the KIP and
>> not at
>> > > all
>> > > > > an
>> > > > > > >>>>>>> implementation detail. Thanks for bringing it up.
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> I'm proposing not to change the existing caches and
>> > > > > configurations
>> > > > > > >>> at
>> > > > > > >>>>> all
>> > > > > > >>>>>>> (for now).
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Imagine you have a topology like this:
>> > > > > > >>>>>>> commit.interval.ms = 100
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> The first ktable (ktable1) will respect the commit interval and
>> > > > > > >>>>>>> buffer events for 100ms before logging, storing, or forwarding
>> > > > > > >>>>>>> them (IIRC). Therefore, the second ktable (suppress) will only
>> > > > > > >>>>>>> see the events at a rate of once per 100ms. It will apply its
>> > > > > > >>>>>>> own buffering, and emit once per 200ms. This case is pretty
>> > > > > > >>>>>>> trivial because the suppress time is a multiple of the commit
>> > > > > > >>>>>>> interval.
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> When it's not an integer multiple, you'll get behavior like in
>> > > > > > >>>>>>> this marble diagram:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> [ KTable caching with commit interval = 2 ]
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>       [ suppress with emitAfter = 3 ]
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> <---------------(k:2)----------------(k:6)->
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> If this behavior isn't desired (for example, if you wanted to
>> > > > > > >>>>>>> emit (k:3) at time 3), I'd recommend setting
>> > > > > > >>>>>>> "cache.max.bytes.buffering" to 0 or modifying the topology to
>> > > > > > >>>>>>> disable caching. Then, the behavior is determined by the
>> > > > > > >>>>>>> suppress operator alone.
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Does that seem right to you?
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Regarding the changelogs, because the suppression operator hangs
>> > > > > > >>>>>>> onto events for a while, it will need its own changelog. The
>> > > > > > >>>>>>> changelog should represent the current state of the buffer at all
>> > > > > > >>>>>>> times. So when the suppress operator sees (k:2), for example, it
>> > > > > > >>>>>>> will log (k:2). When it later gets to time 3, it's time to emit
>> > > > > > >>>>>>> (k:2) downstream. Because k is no longer buffered, the suppress
>> > > > > > >>>>>>> operator will log (k:null). Thus, when recovering, it can rebuild
>> > > > > > >>>>>>> the buffer by reading its changelog.
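The changelog scheme described above can be sketched roughly as follows. This is only a minimal illustration, assuming a changelog of key/value records in which a null value acts as a tombstone; the class and method names are hypothetical and do not come from the Kafka Streams code base.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of replaying a suppression-buffer changelog.
// None of these names come from the Kafka Streams code base.
class SuppressBufferChangelogSketch {

    // One changelog record; a null value is a tombstone ("key is no longer buffered").
    static final class Entry {
        final String key;
        final Integer value;

        Entry(String key, Integer value) {
            this.key = key;
            this.value = value;
        }
    }

    // Rebuilds the buffer by applying the changelog records in order.
    static Map<String, Integer> restore(List<Entry> changelog) {
        Map<String, Integer> buffer = new LinkedHashMap<>();
        for (Entry entry : changelog) {
            if (entry.value == null) {
                buffer.remove(entry.key);
            } else {
                buffer.put(entry.key, entry.value);
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        List<Entry> changelog = new ArrayList<>();
        changelog.add(new Entry("k", 2));    // suppress sees (k:2) and buffers it
        changelog.add(new Entry("k", null)); // time 3: (k:2) is emitted, so log (k:null)
        changelog.add(new Entry("k", 4));    // suppress sees (k:4) and buffers it

        // After a restart, only the still-buffered (k:4) is restored.
        System.out.println(restore(changelog)); // {k=4}
    }
}
```

Replaying only the first two records would leave the buffer empty, which matches the point that an already-emitted key must not reappear after recovery.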
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> What do you think about this?
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Thanks,
>> > > > > > >>>>>>> -John
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
>> > > bbejeck@gmail.com
>> > > > >
>> > > > > > >>>>> wrote:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> Hi John,  thanks for the KIP.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> Early on in the KIP, you mention the current approaches for
>> > > > > > >>>>>>>> controlling the rate of downstream records from a KTable, cache
>> > > > > > >>>>>>>> size configuration and commit time.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> Will these configuration parameters still be in effect for
>> > > > > > >>>>>>>> tables that don't use suppression?  For tables taking advantage
>> > > > > > >>>>>>>> of suppression, will these configurations have no impact?
>> > > > > > >>>>>>>> This last question may be too implementation specific, but if
>> > > > > > >>>>>>>> the requested suppression time is longer than the specified
>> > > > > > >>>>>>>> commit time, will the latest record in the suppression buffer
>> > > > > > >>>>>>>> get stored in a changelog?
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> Thanks,
>> > > > > > >>>>>>>> Bill
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
>> > > > john@confluent.io
>> > > > > > >>>
>> > > > > > >>>>>> wrote:
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>> Thanks for the feedback, Matthias,
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> It seems like in straightforward relational processing cases,
>> > > > > > >>>>>>>>> it would not make sense to bound the lateness of KTables. In
>> > > > > > >>>>>>>>> general, it seems better to have "guard rails" in place that
>> > > > > > >>>>>>>>> make it easier to write sensible programs than insensible ones.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> But I'm still going to argue in favor of keeping it for all
>> > > > > > >>>>>>>>> KTables ;)
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> 1. I believe it is simpler to understand the operator if it has
>> > > > > > >>>>>>>>> one uniform definition, regardless of context. It's well defined
>> > > > > > >>>>>>>>> and intuitive what will happen when you use late-event
>> > > > > > >>>>>>>>> suppression on a KTable, so I think nothing surprising or
>> > > > > > >>>>>>>>> dangerous will happen in that case. From my perspective, having
>> > > > > > >>>>>>>>> two sets of allowed operations is actually an increase in
>> > > > > > >>>>>>>>> cognitive complexity.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
>> > > > > > >>>>>>>>> example, in lieu of full-featured timestamp semantics, I can
>> > > > > > >>>>>>>>> implement MVCC behavior when building a KTable by
>> > > > > > >>>>>>>>> "suppressLateEvents(Duration.ZERO)". I suspect that there are
>> > > > > > >>>>>>>>> other, non-obvious applications of suppressing late events on
>> > > > > > >>>>>>>>> KTables.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> 3. Not to get too much into implementation details in a KIP
>> > > > > > >>>>>>>>> discussion, but if we did want to make late-event suppression
>> > > > > > >>>>>>>>> available only on windowed KTables, we have two enforcement
>> > > > > > >>>>>>>>> options:
>> > > > > > >>>>>>>>>   a. check when we build the topology - this would be simple to
>> > > > > > >>>>>>>>> implement, but would be a runtime check. Hopefully, people write
>> > > > > > >>>>>>>>> tests for their topology before deploying them, so the feedback
>> > > > > > >>>>>>>>> loop isn't instantaneous, but it's not too long either.
>> > > > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
>> > > > > > >>>>>>>>> time check, but would also be a substantial increase of both
>> > > > > > >>>>>>>>> interface and code complexity.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> We should definitely strive to have guard rails protecting
>> > > > > > >>>>>>>>> against surprising or dangerous behavior. Protecting against
>> > > > > > >>>>>>>>> programs that we don't currently predict is a lesser benefit,
>> > > > > > >>>>>>>>> and I think we can put up guard rails on a case-by-case basis
>> > > > > > >>>>>>>>> for that. The increase in cognitive (and potentially code and
>> > > > > > >>>>>>>>> interface) complexity makes me think we should skip this case.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> What do you think?
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> Thanks,
>> > > > > > >>>>>>>>> -John
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
>> > > > > > >>>>>>> matthias@confluent.io>
>> > > > > > >>>>>>>>> wrote:
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>> Thanks for the KIP John.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> One initial comment about the last example "Bounded lateness":
>> > > > > > >>>>>>>>>> For a non-windowed KTable bounding the lateness does not really
>> > > > > > >>>>>>>>>> make sense, does it?
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> Thus, I am wondering if we should allow `suppressLateEvents()`
>> > > > > > >>>>>>>>>> for this case? It seems to be better to only allow it for
>> > > > > > >>>>>>>>>> windowed-KTables.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> -Matthias
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
>> > > > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> What you gave as the new example is semantically the same as
>> > > > > > >>>>>>>>>>> what I suggested.
>> > > > > > >>>>>>>>>>> So it is good by me.
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> Thanks
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
>> > > > > > >>>>> john@confluent.io
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>> wrote:
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Thanks for taking a look, Ted,
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> I agree this is a departure from the conventions of the
>> > > > > > >>>>>>>>>>>> Streams DSL.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Most of our config objects have one or two "required"
>> > > > > > >>>>>>>>>>>> parameters, which fit naturally with the static factory
>> > > > > > >>>>>>>>>>>> method approach. TimeWindows, for example, requires a size
>> > > > > > >>>>>>>>>>>> parameter, so we can naturally say TimeWindows.of(size).
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> I think in the case of a suppression, there's really no
>> > > > > > >>>>>>>>>>>> "core" parameter, and "Suppression.of()" seems sillier than
>> > > > > > >>>>>>>>>>>> "new Suppression()". I think that Suppression.of(duration)
>> > > > > > >>>>>>>>>>>> would be ambiguous, since there are many durations that we
>> > > > > > >>>>>>>>>>>> can configure.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> However, thinking about it again, I suppose that I can give
>> > > > > > >>>>>>>>>>>> each configuration method a static version, which would let
>> > > > > > >>>>>>>>>>>> you replace "new Suppression()." with "Suppression." in all
>> > > > > > >>>>>>>>>>>> the examples. Basically, instead of "of()", we'd support any
>> > > > > > >>>>>>>>>>>> of the methods I listed.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> For example:
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> windowCounts
>> > > > > > >>>>>>>>>>>>     .suppress(
>> > > > > > >>>>>>>>>>>>         Suppression
>> > > > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
>> > > > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
>> > > > > > >>>>>>>>>>>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>> > > > > > >>>>>>>>>>>>             )
>> > > > > > >>>>>>>>>>>>     );
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Does that seem better?
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Thanks,
>> > > > > > >>>>>>>>>>>> -John
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
>> > > > > > >>> yuzhihong@gmail.com
>> > > > > > >>>>>>
>> > > > > > >>>>>>>> wrote:
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> I started to read this KIP, which contains a lot of material.
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> One suggestion:
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>     .suppress(
>> > > > > > >>>>>>>>>>>>>         new Suppression()
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> Do you think it would be more consistent with the rest of
>> > > > > > >>>>>>>>>>>>> the Streams data structures to support `of`?
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> Cheers
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
>> > > > > > >>>>>> john@confluent.io
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>>>> wrote:
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> Hello devs and users,
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> Please take some time to consider this proposal
>> for
>> > > > > > >> Kafka
>> > > > > > >>>>>>> Streams:
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> link:
>> https://cwiki.apache.org/confluence/x/sQU0BQ
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> The basic idea is to provide:
>> > > > > > >>>>>>>>>>>>>> * more usable control over update rate (vs the
>> > current
>> > > > > > >>>>> state
>> > > > > > >>>>>>> store
>> > > > > > >>>>>>>>>>>>> caches)
>> > > > > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations
>> > feature
>> > > > > > >>> which
>> > > > > > >>>>>>> several
>> > > > > > >>>>>>>>>>>> people
>> > > > > > >>>>>>>>>>>>>> have requested
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> I look forward to your feedback!
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>> Thanks,
>> > > > > > >>>>>>>>>>>>>> -John
>> > > > > > >>>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> --
>> > > > > > >>>>> -- Guozhang
>> > > > > > >>>>>
>> > > > > > >>>>
>> > > > > > >>>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> --
>> > > > > > >> -- Guozhang
>> > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > -- Guozhang
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Guozhang,

That sounds good to me. I'll include that in the KIP.

Thanks,
-John

On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang <wa...@gmail.com> wrote:

> Let me clarify a bit on what I meant about moving `retentionPeriod` to
> WindowStoreBuilder:
>
> In another discussion we had around KIP-319 / 330, we noted that the
> "retention period" should not really be a window spec, but only a window
> store spec, as it only affects how long to retain each window to be
> queryable, along with the storage cost.
>
> More specifically, today the "maintainMs" returned from Windows is used in
> three places:
>
> 1) for windowed aggregations, it is passed directly into
> `Stores.persistentWindowStore()` as the retention period parameter. For this
> use case we should just let the WindowStoreBuilder specify this value
> itself.
>
> NOTE: It is also used in the KStreamWindowAggregate processor, to
> determine if a received record should be dropped due to its lateness. We
> may need to think of another way to get this value inside the processor.
>
> 2) for windowed stream-stream join, it is used as the join range parameter
> but only to check that "windowSizeMs <= retentionPeriodMs". We can do this
> check at the store builder level instead of at the processor level.
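As a rough illustration of doing this check on the store side, here is a minimal sketch. It assumes a hypothetical spec object that holds both values; the class and field names are made up for the example and are not an existing Kafka Streams type.

```java
// Hypothetical store-side spec; the class and field names are illustrative
// and do not correspond to an existing Kafka Streams type.
class WindowStoreSpecSketch {
    final long windowSizeMs;
    final long retentionPeriodMs;

    WindowStoreSpecSketch(long windowSizeMs, long retentionPeriodMs) {
        // The check quoted above: windowSizeMs <= retentionPeriodMs.
        if (windowSizeMs > retentionPeriodMs) {
            throw new IllegalArgumentException(
                "retentionPeriodMs (" + retentionPeriodMs
                    + ") must be at least windowSizeMs (" + windowSizeMs + ")");
        }
        this.windowSizeMs = windowSizeMs;
        this.retentionPeriodMs = retentionPeriodMs;
    }

    public static void main(String[] args) {
        // One-minute windows retained for 30 days: accepted.
        new WindowStoreSpecSketch(60_000L, 30L * 24 * 60 * 60 * 1000);

        // Retention shorter than the window size: rejected when the spec is built.
        try {
            new WindowStoreSpecSketch(60_000L, 1_000L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at construction time means a misconfigured store is caught when the topology is built, before any processor runs.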
>
>
> If we can remove its usage in both 1) and 2), then we should be able to
> safely remove this from the `Windows` spec.
>
>
> Guozhang
>
>
> On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Thanks for the reply, Guozhang,
> >
> > Good! I agree, that is also a good reason, and I actually made use of that
> > in my tests. I'll update the KIP.
> >
> > By the way, I chose "allowedLateness" as I was trying to pick a better name
> > than "close", but I think it's actually the wrong name. We don't want to
> > bound the lateness of events in general, only with respect to the end of
> > their window.
> >
> > If we have a window [0,10), with "allowedLateness" of 5, then if we get an
> > event with timestamp 3 at time 9, the name implies we'd reject it, which
> > seems silly. Really, we'd only want to start rejecting that event at stream
> > time 15.
> >
> > What I meant was more like "allowedLatenessAfterWindowEnd", but that's too
> > verbose. I think that "close" + some documentation about what it means will
> > be better.
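To make the arithmetic in the [0,10) example concrete, here is a minimal sketch of "close" measured from the window end. The helper name is hypothetical and the time units are abstract, as in the example above; this is one reading of the proposal, not the Streams implementation.

```java
// Hypothetical helper illustrating "close" measured from the window end;
// it is not part of the Kafka Streams API.
class WindowCloseSketch {

    // A window covers [start, end). With a grace of `close` measured from the
    // window end, records for that window are accepted while
    // streamTime < end + close, regardless of the record's own timestamp.
    static boolean acceptsRecords(long windowEnd, long close, long streamTime) {
        return streamTime < windowEnd + close;
    }

    public static void main(String[] args) {
        long windowEnd = 10L; // window [0,10)
        long close = 5L;

        System.out.println(acceptsRecords(windowEnd, close, 9L));  // true
        System.out.println(acceptsRecords(windowEnd, close, 14L)); // true
        System.out.println(acceptsRecords(windowEnd, close, 15L)); // false
    }
}
```

With a default close of 0, the same check reduces to rejecting records once stream time reaches the window end.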
> >
> > 1: "Close" would be measured from the end of the window, so a reasonable
> > default would be "0". Recall that "close" really only needs to be specified
> > for final results, and a default of 0 would produce the most intuitive
> > results. If folks later discover that they are missing some late events,
> > they can adjust the parameter accordingly. IMHO, any other value would just
> > be a guess on our part.
> >
> > 2a:
> > I think you're saying to re-use "until" instead of adding "close" to the
> > window.
> >
> > The downside here would be that the semantic change could be more confusing
> > than deprecating "until" and introducing window "close" and a
> > "retentionTime" on the store builder. The deprecation is a good, controlled
> > way for us to make sure people are getting the semantics they think they're
> > getting, as well as giving us an opportunity to link people to the API they
> > should use instead.
> >
> > I didn't fully understand the second part, but it sounds like you're
> > suggesting to add a new "retentionTime" setter to Windows to bridge the gap
> > until we add it to the store builder? That seems kind of roundabout to me,
> > if that's what you meant. We could just immediately add it to the store
> > builders in the same PR.
> >
> > 2b: Sounds good to me!
> >
> > Thanks again,
> > -John
> >
> >
> > On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> > > John,
> > >
> > > Thanks for your replies. As for the two options of the API, I think I'm
> > > slightly inclined to the first option as well. My motivation is a bit
> > > different, as I think the first one may be more flexible, for example:
> > >
> > > KTable<Windowed<..>> table = ... count();
> > >
> > > table.toStream().peek(..);   // want to peek at the changelog stream, do not care about final results.
> > >
> > > table.suppress().toStream().to("topic");    // sending to a topic, want to only send the final results.
> > >
> > > --------------
> > >
> > > Besides that, I have a few more minor questions:
> > >
> > > 1. For "allowedLateness", what should be the default value? I.e. if the
> > > user does not specify "allowedLateness" in TimeWindows, what value should
> > > we set?
> > >
> > > 2. For API names, some personal suggestions here:
> > >
> > > 2.a) "allowedLateness"  -> "until" (semantics changed, and also value is
> > > defined as delta on top of window length), where "until" ->
> > > "retentionPeriod", and the latter will be moved from `Windows` to
> > > `WindowStoreBuilder` in the future.
> > >
> > > 2.b) "BufferConfig" -> "Buffered" ?
> > >
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io>
> wrote:
> > >
> > > > Hey Matthias and Guozhang,
> > > >
> > > > Sorry for the slow reply. I was mulling over your feedback and weighing
> > > > some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337.
> > > >
> > > > Your thought about keeping suppression independent of business logic is a
> > > > very good one. I agree that it would make more sense to add some kind of
> > > > "window close" concept to the window definition.
> > > >
> > > > In fact, doing that immediately solves the inconsistency problem Guozhang
> > > > brought up. There's no need to add a "final results" or "emission" option
> > > > to the windowed aggregation.
> > > >
> > > > What do you think about an API more like this:
> > > >
> > > > final StreamsBuilder builder = new StreamsBuilder();
> > > >
> > > > builder
> > > >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
> > > >   .groupBy(
> > > >     (String k1, String v1) -> k1,
> > > >     Serialized.with(STRING_SERDE, STRING_SERDE)
> > > >   )
> > > >   .windowedBy(TimeWindows
> > > >     .of(scaledTime(2L))
> > > >     .until(scaledTime(3L))
> > > >     .allowedLateness(scaledTime(1L))
> > > >   )
> > > >   .count(Materialized.as("counts"))
> > > >   .suppress(
> > > >     emitFinalResultsOnly(
> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > > >     )
> > > >   )
> > > >   .toStream()
> > > >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
> > > >
> > > > Note that:
> > > >  * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
> > > >    by the type system at compile time), and it determines the time to wait
> > > >    by looking at "allowedLateness" on the TimeWindows config.
> > > >  * querying "counts" will produce results (eventually) consistent with
> > > >    what's observable in "output-suppressed".
> > > >  * in all cases, "suppress" has no effect on business logic, just on event
> > > >    suppression.
> > > >
> > > > Is this API straightforward? Or do you still prefer the version that you
> > > > both proposed:
> > > >
> > > >   ...
> > > >   .windowedBy(TimeWindows
> > > >     .of(scaledTime(2L))
> > > >     .until(scaledTime(3L))
> > > >     .allowedLateness(scaledTime(1L))
> > > >   )
> > > >   .count(
> > > >     Materialized.as("counts"),
> > > >     emitFinalResultsOnly(
> > > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > > >     )
> > > >   )
> > > >   ...
> > > >
> > > > To me, these two are practically identical, and I still vaguely prefer the
> > > > first one.
> > > >
> > > > The prototype has made it clearer to me that users of "final results for
> > > > windows" and users of "suppression for table events" both need to configure
> > > > the suppression buffer.
> > > >
> > > > This buffer configuration consists of:
> > > > 1. how many keys or bytes to keep in memory
> > > > 2. what to do if memory runs out (shut down, start using disk, ...)
> > > >
> > > > So it's not as simple as setting a "final results" flag. We'll either have
> > > > an "Emit" config object on the windowed aggregators that takes the same
> > > > BufferConfig as the "Suppress" config on the suppression operator, or we
> > > > just use the suppression operator for both.
> > > >
> > > > Perhaps it would sweeten the deal a little to point out that we have 2
> > > > overloads already for each windowed aggregator (with and without
> > > > Materialized). Adding "Emitted" or something would mean that we'd add a new
> > > > overload for each one, taking us up to 4 overloads each for "count",
> > > > "aggregate" and "reduce". Using "suppress" means that we don't add any new
> > > > overloads.
> > > >
> > > > Thanks again for helping to hash this out,
> > > > -John
> > > >
> > > > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > > >
> > > > > > I think I agree with Matthias on having dedicated APIs for the windowed
> > > > > > operation final output scenario, PLUS separating the window close, which
> > > > > > the "final output" would rely on, from the window retention time itself
> > > > > > (admittedly it would make this KIP effort larger, but if we believe we
> > > > > > need to do this separation anyways we could just do it now).
> > > > >
> > > > > > And then we can have the `KTable#suppress()` for intermediate-suppression
> > > > > > only, not for late-record-suppression, until we've seen that it becomes a
> > > > > > common feature request, because our current design can still be extended
> > > > > > for that purpose.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
> > > matthias@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the discussion. I am just catching up.
> > > > > >
> > > > > > > In general, I think we have different use cases and non-windowed and
> > > > > > > windowed are quite different. For the non-windowed case, suppress() has
> > > > > > > no (useful) close or retention time, no final semantics, and also no
> > > > > > > business logic impact.
> > > > > >
> > > > > > > On the other hand, for windowed aggregations, close time and final
> > > > > > > result do have a meaning. IMHO, `close()` is part of business logic
> > > > > > > while retention time is not. Also, suppression of intermediate results
> > > > > > > is not a business rule and there might be use cases for which either
> > > > > > > only "early intermediate" results (before window end time) are
> > > > > > > suppressed, or all intermediates are suppressed (maybe also something in
> > > > > > > the middle, i.e., just reducing the load of intermediate updates). Thus,
> > > > > > > window-suppression is much richer.
> > > > > >
> > > > > > > IMHO, a generic `suppress()` operator that can be inserted into the
> > > > > > > data flow at any point is useful. Maybe we should keep it as generic as
> > > > > > > possible. However, it might be difficult to use with regard to
> > > > > > > windowing, as the mental effort to use it is high.
> > > > > >
> > > > > > With regard to Guozhang's comment:
> > > > > >
> > > > > > > > we will actually
> > > > > > > > process data as old as 30 days as well, while most of the late updates
> > > > > > > > beyond 5 minutes would be discarded anyways.
> > > > > >
> > > > > > If we use `suppress()` as a standalone operator, this is correct
> > and
> > > > > > intended IMHO. To address the issue if the behavior is unwanted,
> I
> > > > would
> > > > > > suggest to add a "suppress option" directly to
> > > > > > `count()/reduce()/aggregate()` window operator similar to
> > > > > > `Materialized`. This would be an "embedded suppress" and avoid
> the
> > > > > > issue. It would also address the issue about mental effort for
> > > "single
> > > > > > final window result" use case.
> > > > > >
> > > > > > > I also think that a shorter close-time than retention time is useful
> > > > > > > for window aggregation. If we add close() to the window definition and
> > > > > > > until() to `Materialized`, we can separate both correctly IMHO.
> > > > > >
> > > > > > > About setting `close = min(close,retention)` I am not sure. We might
> > > > > > > rather throw an exception than reduce the close time automatically.
> > > > > > > Otherwise, I see many user questions about "I set close to X but it
> > > > > > > does not get updated for some data that is with delay of X".
> > > > > >
> > > > > > > The tricky question might be to design the API in a backward compatible
> > > > > > > way though.
> > > > > >
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > > On 7/3/18 5:38 AM, John Roesler wrote:
> > > > > > > Hi Guozhang,
> > > > > > >
> > > > > > > > I see. It seems like if we want to decouple 1) and 2), we need to alter
> > > > > > > > the definition of the window. Do you think it would close the gap if we
> > > > > > > > added a "window close" time to the window definition?
> > > > > > >
> > > > > > > Such as:
> > > > > > >
> > > > > > > builder.stream("input")
> > > > > > > .groupByKey()
> > > > > > > .windowedBy(
> > > > > > >   TimeWindows
> > > > > > >     .of(60_000)
> > > > > > >     .closeAfter(10 * 60)
> > > > > > >     .until(30L * 24 * 60 * 60 * 1000)
> > > > > > > )
> > > > > > > .count()
> > > > > > > .suppress(Suppression.finalResultsOnly());
> > > > > > >
> > > > > > > Possibly called "finalResultsAtWindowClose" or something?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > -John
> > > > > > >
> > > > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <
> wangguoz@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > >> Hey John,
> > > > > > >>
> > > > > > > >> Obviously I'm too lazy on email replying diligence compared with you :)
> > > > > > > >> Will try to reply to them separately:
> > > > > > >>
> > > > > > >>
> > > > > > >> ------------------------------------------------------------
> > > > > > -----------------
> > > > > > >>
> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > > > > > >>
> > > > > > > >> I'm aware of this use case, but again, the concern is that, in this
> > > > > > > >> setting, in order to let the window be queryable for 30 days, we will
> > > > > > > >> actually process data as old as 30 days as well, while most of the late
> > > > > > > >> updates beyond 5 minutes would be discarded anyways. Personally I think
> > > > > > > >> for the final update scenario, the ideal situation users would want is
> > > > > > > >> "do not process any data that is more than 5 minutes late, and of course
> > > > > > > >> send no update records downstream later than 5 minutes either; but
> > > > > > > >> retain the window to be queryable for 30 days". And by doing that the
> > > > > > > >> final window snapshot would also be aligned with the update stream. In
> > > > > > > >> other words, among these three periods:
> > > > > > >>
> > > > > > >> 1) the retention length of the window / table.
> > > > > > >> 2) the late records acceptance for updating the window.
> > > > > > >> 3) the late records update to be sent downstream.
> > > > > > >>
> > > > > > > >> Final update use cases would naturally want 2) = 3), while 1) may be
> > > > > > > >> different and larger, while what we provide now is that 1) = 2), which
> > > > > > > >> could be different and in practice larger than 3), hence not the most
> > > > > > > >> intuitive for their needs.
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> ------------------------------------------------------------
> > > > > > -----------------
> > > > > > >>
> > > > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > > > > > >>
> > > > > > > >> I also prefer option 2) over option 1) from a programming pov. But I'm
> > > > > > > >> wondering whether option 2) would provide the above semantics, or
> > > > > > > >> whether it still couples 1) with 2)?
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> Guozhang
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <
> john@confluent.io
> > >
> > > > > wrote:
> > > > > > >>
> > > > > > >>> In fact, to push the idea further (which IIRC is what
> Matthias
> > > > > > originally
> > > > > > >>> proposed), if we can accept "Suppression#finalResultsOnly" in
> > my
> > > > last
> > > > > > >>> email, then we could also consider whether to eliminate
> > > > > > >>> "suppressLateEvents" entirely.
> > > > > > >>>
> > > > > > >>> We could always add it later, but you've both expressed doubt
> > > that
> > > > > > there
> > > > > > >>> are practical use cases for it outside of final-results.
> > > > > > >>>
> > > > > > >>> -John
> > > > > > >>>
> > > > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <
> > john@confluent.io>
> > > > > > wrote:
> > > > > > >>>
> > > > > > >>>> Hi again, Guozhang ;) Here's the second part of my
> response...
> > > > > > >>>>
> > > > > > >>>> It seems like your main concern is: "if I'm a user who wants
> > > final
> > > > > > >> update
> > > > > > >>>> semantics, how complicated is it for me to get it?"
> > > > > > >>>>
> > > > > > >>>> I think we have to assume that people don't always have time
> > to
> > > > > become
> > > > > > >>>> deeply familiar with all the nuances of a programming
> > > environment
> > > > > > >> before
> > > > > > >>>> they use it. Especially if they're evaluating several
> > frameworks
> > > > for
> > > > > > >>> their
> > > > > > >>>> use case, it's very valuable to make it as obvious as
> possible
> > > how
> > > > > to
> > > > > > >>>> accomplish various computations with Streams.
> > > > > > >>>>
> > > > > > >>>> To me the biggest question is whether with a fresh
> > perspective,
> > > > > people
> > > > > > >>>> would say "oh, I get it, I have to bound my lateness and
> > > suppress
> > > > > > >>>> intermediate updates, and of course I'll get only the final
> > > > > result!",
> > > > > > >> or
> > > > > > >>> if
> > > > > > >>>> it's more like "wtf? all I want is the final result, what
> are
> > > all
> > > > > > these
> > > > > > >>>> parameters?".
> > > > > > >>>>
> > > > > > >>>> I was talking with Matthias a while back, and he had an idea
> > > that
> > > > I
> > > > > > >> think
> > > > > > >>>> can help, which is to essentially set up a final-result
> recipe
> > > in
> > > > > > >>> addition
> > > > > > >>>> to the raw parameters. I previously thought that it wouldn't
> > be
> > > > > > >> possible
> > > > > > >>> to
> > > > > > >>>> restrict its usage to Windowed KTables, but thinking about
> it
> > > > again
> > > > > > >> this
> > > > > > >>>> weekend, I have a couple of ideas:
> > > > > > >>>>
> > > > > > >>>> ================
> > > > > > >>>> = 1. Static Wrapper =
> > > > > > >>>> ================
> > > > > > >>>> We can define an extra static function that "wraps" a KTable
> > > with
> > > > > > >>>> final-result semantics.
> > > > > > >>>>
> > > > > > >>>> public static <K extends Windowed, V> KTable<K, V>
> > > > finalResultsOnly(
> > > > > > >>>>   final KTable<K, V> windowedKTable,
> > > > > > >>>>   final Duration maxAllowedLateness,
> > > > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > > > > > >>>>     return windowedKTable.suppress(
> > > > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > > > > > >>>>                    .suppressIntermediateEvents(
> > > > > > >>>>                      IntermediateSuppression
> > > > > > >>>>                        .emitAfter(maxAllowedLateness)
> > > > > > >>>>                        .bufferFullStrategy(
> > bufferFullStrategy)
> > > > > > >>>>                    )
> > > > > > >>>>     );
> > > > > > >>>> }
> > > > > > >>>>
> > > > > > >>>> Because windowedKTable is a parameter, the static function
> can
> > > > > easily
> > > > > > >>>> impose an extra bound on the key type, that it extends
> > Windowed.
> > > > > This
> > > > > > >>> would
> > > > > > >>>> make "final results only" only available on windowed
> ktables.
> > > > > > >>>>
> > > > > > >>>> Here's how it would look to use:
> > > > > > >>>>
> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > > > >>>>   finalResultsOnly(
> > > > > > >>>>     windowCounts,
> > > > > > >>>>     Duration.ofMinutes(10),
> > > > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > > > > > >>>>   );
> > > > > > >>>>
> > > > > > >>>> Trying to use it on a non-windowed KTable yields:
> > > > > > >>>>
> > > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > > > > > >>>>> org.apache.kafka.streams.kstream.internals.
> > KTableAggregateTest
> > > > > > cannot
> > > > > > >>> be
> > > > > > >>>>> applied to given types;
> > > > > > >>>>>   required:
> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
> > > > > > >>> Duration,org.apache.kafka.streams.kstream.Suppression.
> > > > > > BufferFullStrategy
> > > > > > >>>>>   found:
> > > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.
> > > > > > >>> String,java.lang.String>,java.time.Duration,org.apache.
> > > > > > >>> kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > > >>>>>   reason: inference variable K has incompatible bounds
> > > > > > >>>>>     equality constraints: java.lang.String
> > > > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> =================================================
> > > > > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > > > > > >>>> =================================================
> > > > > > >>>>
> > > > > > >>>> By adding K,V parameters to Suppression, we can provide a
> > > > similarly
> > > > > > >>>> bounded config method directly on the Suppression class:
> > > > > > >>>>
> > > > > > >>>> public static <K extends Windowed, V> Suppression<K, V>
> > > > > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> > > > > > >>>> BufferFullStrategy bufferFullStrategy) {
> > > > > > >>>>     return Suppression
> > > > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > > > > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > > > > > >>>>             .emitAfter(maxAllowedLateness)
> > > > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > > > > > >>>>         );
> > > > > > >>>> }
> > > > > > >>>>
> > > > > > >>>> Then, here's how it would look to use it:
> > > > > > >>>>
> > > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > > > >>>>   windowCounts.suppress(
> > > > > > >>>>     Suppression.finalResultsOnly(
> > > > > > > >>>>       Duration.ofMinutes(10),
> > > > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > > > > > >>>>     )
> > > > > > >>>>   );
> > > > > > >>>>
> > > > > > >>>> Trying to use it on a non-windowed ktable yields:
> > > > > > >>>>
> > > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be
> > > > applied
> > > > > > to
> > > > > > >>>>> given types;
> > > > > > >>>>>   required:
> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > > > > > >>> Suppression.BufferFullStrategy
> > > > > > >>>>>   found:
> > > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > > > > > >>> Suppression.BufferFullStrategy
> > > > > > >>>>>   reason: explicit type argument java.lang.String does not
> > > > conform
> > > > > to
> > > > > > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> ============
> > > > > > >>>> = Downsides =
> > > > > > >>>> ============
> > > > > > >>>>
> > > > > > >>>> Of course, there's a downside either way:
> > > > > > >>>> * for 1:  this "wrapper" interaction would be the first in
> the
> > > > DSL.
> > > > > Is
> > > > > > >> it
> > > > > > >>>> too strange, and how discoverable would it be?
> > > > > > >>>> * for 2: adding those type parameters to Suppression will
> > force
> > > > all
> > > > > > >>>> callers to provide them in the event of a chained
> construction
> > > > > because
> > > > > > >>> Java
> > > > > > >>>> doesn't do RHS recursive type inference. This is already
> > visible
> > > > in
> > > > > > >> other
> > > > > > >>>> parts of the Streams DSL. For example, often calls to
> > > Materialized
> > > > > > >>> builders
> > > > > > >>>> have to provide seemingly obvious type bounds.
> > > > > > >>>>
> > > > > > >>>> ============
> > > > > > >>>> = Conclusion =
> > > > > > >>>> ============
> > > > > > >>>>
> > > > > > >>>> I think option 2 is more "normal" and discoverable. It does
> > > have a
> > > > > > >>>> downside, but it's one that's pre-existing elsewhere in the
> > DSL.
> > > > > > >>>>
> > > > > > >>>> WDYT? Would the addition of this "recipe" method to
> > Suppression
> > > > > > resolve
> > > > > > >>>> your concern?
> > > > > > >>>>
> > > > > > >>>> Thanks again,
> > > > > > >>>> -John
> > > > > > >>>>
> > > > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
> > > wangguoz@gmail.com
> > > > >
> > > > > > >>> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hi John,
> > > > > > >>>>>
> > > > > > >>>>> Regarding the metrics: yeah I think I'm with you that the
> > > dropped
> > > > > > >>> records
> > > > > > >>>>> due to window retention or emit suppression policies should
> > be
> > > > > > >> recorded
> > > > > > >>>>> differently, and using this KIP's proposed metric would be
> > > fine.
> > > > If
> > > > > > >> you
> > > > > > >>>>> also think we can use this KIP's proposed metrics to cover
> > the
> > > > > window
> > > > > > > >>>>> retention caused skipping records, then we can include the
> > > changes
> > > > > in
> > > > > > >>> this
> > > > > > >>>>> KIP as well.
> > > > > > >>>>>
> > > > > > >>>>> Regarding the current proposal, I'm actually not too
> worried
> > > > about
> > > > > > the
> > > > > > >>>>> inconsistency between query semantics and downstream emit
> > > > > semantics.
> > > > > > >> For
> > > > > > >>>>> queries, we will always return the current running results
> of
> > > the
> > > > > > >>> windows,
> > > > > > > >>>>> be they partial or final results depending on the window
> > > > retention
> > > > > > >> time
> > > > > > > >>>>> anyways, which has nothing to do with whether the emitted stream
> > > > should
> > > > > be
> > > > > > >>> one
> > > > > > >>>>> final output per key or not. I also agree that having a
> > unified
> > > > > > >>> operation
> > > > > > >>>>> is generally better for users to focus on leveraging that
> one
> > > > only
> > > > > > >> than
> > > > > > > >>>>> learning about two sets of operations. The only question I
> had
> > > is,
> > > > > for
> > > > > > >>>>> final
> > > > > > >>>>> updates of window stores, if it is a bit awkward to
> > understand
> > > > the
> > > > > > >>>>> configuration combo. Thinking about this more, I think my
> > root
> > > > > worry
> > > > > > >> in
> > > > > > >>>>> the
> > > > > > >>>>> "suppressLateEvents" call for windowed tables, since from a
> > > user
> > > > > > >>>>> perspective: if my retention time is X which means "pay the
> > > cost
> > > > to
> > > > > > >>> allow
> > > > > > >>>>> late records up to X to still be applied updating the
> > tables",
> > > > why
> > > > > > >>> would I
> > > > > > >>>>> ever want to suppressLateEvents by Y ( < X), to say "do not
> > > send
> > > > > the
> > > > > > >>>>> updates up to Y, which means the downstream operator or
> sink
> > > > topic
> > > > > > for
> > > > > > >>>>> this
> > > > > > >>>>> stream would actually see a truncated update stream while
> > I've
> > > > paid
> > > > > > >>> larger
> > > > > > >>>>> cost for that"; and of course, Y > X would not make sense
> > > either
> > > > as
> > > > > > >> you
> > > > > > >>>>> would not see any updates later than X anyways. So in all,
> my
> > > > > feeling
> > > > > > >> is
> > > > > > >>>>> that it makes less sense for windowed table's
> > > > "suppressLateEvents"
> > > > > > >> with
> > > > > > >>> a
> > > > > > >>>>> parameter that is not equal to the window retention, and
> > > opening
> > > > > the
> > > > > > >>> door
> > > > > > >>>>> in the current proposal may confuse people with that.
> > > > > > >>>>>
> > > > > > >>>>> Again, above is just a subjective opinion and probably we
> can
> > > > also
> > > > > > >> bring
> > > > > > >>>>> up
> > > > > > > >>>>> some scenarios where users do want to set X != Y, but
> > > > personally
> > > > > I
> > > > > > >>> feel
> > > > > > > >>>>> that even if the semantics for this scenario is intuitive
> for
> > > > user
> > > > > to
> > > > > > > >>>>> understand, does that really make sense and should we really
> > > open
> > > > > the
> > > > > > >>> door
> > > > > > >>>>> for it. So I think maybe separating the final update in a
> > > > separate
> > > > > > >> API's
> > > > > > >>>>> benefits may overwhelm the advantage of having one uniform
> > > > > > definition.
> > > > > > >>> And
> > > > > > >>>>> for my alternative proposal, the rationale was from both my
> > > > concern
> > > > > > >>> about
> > > > > > >>>>> "suppressLateEvents" for windowed store, and Matthias'
> > question
> > > > > about
> > > > > > >>>>> "suppressLateEvents" for non-windowed stores, that if it is
> > > less
> > > > > > >>>>> meaningful
> > > > > > >>>>> for both, we can consider removing it completely and only
> do
> > > > > > >>>>> "IntermediateSuppression" in Suppress instead.
> > > > > > >>>>>
> > > > > > >>>>> So I'd summarize my thoughts in the following questions:
> > > > > > >>>>>
> > > > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
> > > > > retention
> > > > > > >>> time)
> > > > > > >>>>> for windowed stores make sense in practice?
> > > > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
> > > > non-windowed
> > > > > > >>> stores
> > > > > > >>>>> make sense in practice?
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> Guozhang
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
> > > bbejeck@gmail.com>
> > > > > > >> wrote:
> > > > > > >>>>>
> > > > > > >>>>>> Thanks for the explanation, that does make sense.  I have
> > some
> > > > > > >>>>> questions on
> > > > > > >>>>>> operations, but I'll just wait for the PR and tests.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>> Bill
> > > > > > >>>>>>
> > > > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
> > > john@confluent.io
> > > > >
> > > > > > >>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>> Hi Bill,
> > > > > > >>>>>>>
> > > > > > >>>>>>> Thanks for the review!
> > > > > > >>>>>>>
> > > > > > >>>>>>> Your question is very much applicable to the KIP and not
> at
> > > all
> > > > > an
> > > > > > >>>>>>> implementation detail. Thanks for bringing it up.
> > > > > > >>>>>>>
> > > > > > >>>>>>> I'm proposing not to change the existing caches and
> > > > > configurations
> > > > > > >>> at
> > > > > > >>>>> all
> > > > > > >>>>>>> (for now).
> > > > > > >>>>>>>
> > > > > > >>>>>>> Imagine you have a topology like this:
> > > > > > >>>>>>> commit.interval.ms = 100
> > > > > > >>>>>>>
> > > > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > > > > > >>>>>>>
> > > > > > >>>>>>> The first ktable (ktable1) will respect the commit
> interval
> > > and
> > > > > > >>> buffer
> > > > > > >>>>>>> events for 100ms before logging, storing, or forwarding
> > them
> > > > > > >> (IIRC).
> > > > > > >>>>>>> Therefore, the second ktable (suppress) will only see the
> > > > events
> > > > > > >> at
> > > > > > >>> a
> > > > > > >>>>>> rate
> > > > > > >>>>>>> of once per 100ms. It will apply its own buffering, and
> > emit
> > > > once
> > > > > > >>> per
> > > > > > > >>>>>> 200ms.
> > > > > > >>>>>>> This case is pretty trivial because the suppress time is
> a
> > > > > > >> multiple
> > > > > > >>> of
> > > > > > >>>>>> the
> > > > > > >>>>>>> commit interval.
> > > > > > >>>>>>>
> > > > > > >>>>>>> When it's not an integer multiple, you'll get behavior
> like
> > > in
> > > > > > >> this
> > > > > > >>>>>> marble
> > > > > > >>>>>>> diagram:
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > > > > > >>>>>>>
> > > > > > >>>>>>> [ KTable caching with commit interval = 2 ]
> > > > > > >>>>>>>
> > > > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > > > > > >>>>>>>
> > > > > > >>>>>>>       [ suppress with emitAfter = 3 ]
> > > > > > >>>>>>>
> > > > > > >>>>>>> <---------------(k:2)----------------(k:6)->
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> If this behavior isn't desired (for example, if you
> wanted
> > to
> > > > > emit
> > > > > > >>>>> (k:3)
> > > > > > >>>>>> at
> > > > > > > >>>>>>> time 3), I'd recommend setting the
> > "cache.max.bytes.buffering"
> > > > to
> > > > > 0
> > > > > > >>> or
> > > > > > >>>>>>> modifying the topology to disable caching. Then, the
> > behavior
> > > > is
> > > > > > >>> more
> > > > > > >>>>>>> simply determined just by the suppress operator.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Does that seem right to you?
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> Regarding the changelogs, because the suppression
> operator
> > > > hangs
> > > > > > >>> onto
> > > > > > >>>>>>> events for a while, it will need its own changelog. The
> > > > changelog
> > > > > > >>>>>>> should represent the current state of the buffer at all
> > > times.
> > > > So
> > > > > > >>> when
> > > > > > >>>>>> the
> > > > > > >>>>>>> suppress operator sees (k:2), for example, it will log
> > (k:2).
> > > > > When
> > > > > > >>> it
> > > > > > >>>>>>> later gets to time 3, it's time to emit (k:2) downstream.
> > > > Because
> > > > > > >> k
> > > > > > >>>>> is no
> > > > > > >>>>>>> longer buffered, the suppress operator will log (k:null).
> > > Thus,
> > > > > > >> when
> > > > > > >>>>>>> recovering,
> > > > > > >>>>>>> it can rebuild the buffer by reading its changelog.
> > > > > > >>>>>>>
> > > > > > >>>>>>> What do you think about this?
> > > > > > >>>>>>>
> > > > > > >>>>>>> Thanks,
> > > > > > >>>>>>> -John
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
> > > bbejeck@gmail.com
> > > > >
> > > > > > >>>>> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>> Hi John,  thanks for the KIP.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Early on in the KIP, you mention the current approaches
> > for
> > > > > > >>>>> controlling
> > > > > > >>>>>>> the
> > > > > > >>>>>>>> rate of downstream records from a KTable, cache size
> > > > > > >> configuration
> > > > > > >>>>> and
> > > > > > >>>>>>>> commit time.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Will these configuration parameters still be in effect
> for
> > > > > > >> tables
> > > > > > >>>>> that
> > > > > > >>>>>>>> don't use suppression?  For tables taking advantage of
> > > > > > >>> suppression,
> > > > > > >>>>>> will
> > > > > > >>>>>>>> these configurations have no impact?
> > > > > > > >>>>>>>> This last question may be too implementation-specific, but
> > if
> > > > the
> > > > > > >>>>>> requested
> > > > > > >>>>>>>> suppression time is longer than the specified commit
> time,
> > > > will
> > > > > > >>> the
> > > > > > >>>>>>> latest
> > > > > > >>>>>>>> record in the suppression buffer get stored in a
> > changelog?
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Thanks,
> > > > > > >>>>>>>> Bill
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> > > > john@confluent.io
> > > > > > >>>
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>> Thanks for the feedback, Matthias,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> It seems like in straightforward relational processing
> > > cases,
> > > > > > >> it
> > > > > > >>>>>> would
> > > > > > >>>>>>>> not
> > > > > > >>>>>>>>> make sense to bound the lateness of KTables. In
> general,
> > it
> > > > > > >>> seems
> > > > > > >>>>>>> better
> > > > > > >>>>>>>> to
> > > > > > >>>>>>>>> have "guard rails" in place that make it easier to
> write
> > > > > > >>> sensible
> > > > > > >>>>>>>> programs
> > > > > > >>>>>>>>> than insensible ones.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> But I'm still going to argue in favor of keeping it for
> > all
> > > > > > >>>>> KTables
> > > > > > >>>>>> ;)
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> 1. I believe it is simpler to understand the operator
> if
> > it
> > > > > > >> has
> > > > > > >>>>> one
> > > > > > >>>>>>>> uniform
> > > > > > >>>>>>>>> definition, regardless of context. It's well defined
> and
> > > > > > >>> intuitive
> > > > > > >>>>>> what
> > > > > > >>>>>>>>> will happen when you use late-event suppression on a
> > > KTable,
> > > > > > >> so
> > > > > > >>> I
> > > > > > >>>>>> think
> > > > > > >>>>>>>>> nothing surprising or dangerous will happen in that
> case.
> > > > From
> > > > > > >>> my
> > > > > > >>>>>>>>> perspective, having two sets of allowed operations is
> > > > actually
> > > > > > >>> an
> > > > > > >>>>>>>> increase
> > > > > > >>>>>>>>> in cognitive complexity.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way.
> > For
> > > > > > >>>>> example,
> > > > > > >>>>>> in
> > > > > > >>>>>>>> lieu
> > > > > > >>>>>>>>> of full-featured timestamp semantics, I can implement
> > MVCC
> > > > > > >>>>> behavior
> > > > > > >>>>>>> when
> > > > > > >>>>>>>>> building a KTable by
> "suppressLateEvents(Duration.ZERO)".
> > I
> > > > > > >>>>> suspect
> > > > > > >>>>>>> that
> > > > > > >>>>>>>>> there are other, non-obvious applications of
> suppressing
> > > late
> > > > > > >>>>> events
> > > > > > >>>>>> on
> > > > > > >>>>>>>>> KTables.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> 3. Not to get too much into implementation details in a
> > KIP
> > > > > > >>>>>> discussion,
> > > > > > >>>>>>>> but
> > > > > > >>>>>>>>> if we did want to make late-event suppression available
> > > only
> > > > > > >> on
> > > > > > >>>>>>> windowed
> > > > > > >>>>>>>>> KTables, we have two enforcement options:
> > > > > > >>>>>>>>>   a. check when we build the topology - this would be
> > > simple
> > > > > > >> to
> > > > > > >>>>>>>> implement,
> > > > > > >>>>>>>>> but would be a runtime check. Hopefully, people write
> > tests
> > > > > > >> for
> > > > > > >>>>> their
> > > > > > >>>>>>>>> topology before deploying them, so the feedback loop
> > isn't
> > > > > > >>>>>>> instantaneous,
> > > > > > >>>>>>>>> but it's not too long either.
> > > > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a
> > > compile
> > > > > > >>> time
> > > > > > >>>>>>> check,
> > > > > > > >>>>>>>>> but would also be a substantial increase in both
> interface
> > > and
> > > > > > >>> code
> > > > > > >>>>>>>>> complexity.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> We should definitely strive to have guard rails
> > protecting
> > > > > > >>> against
> > > > > > >>>>>>>>> surprising or dangerous behavior. Protecting against
> > > programs
> > > > > > >>>>> that we
> > > > > > >>>>>>>> don't
> > > > > > >>>>>>>>> currently predict is a lesser benefit, and I think we
> can
> > > put
> > > > > > >> up
> > > > > > >>>>>> guard
> > > > > > >>>>>>>>> rails on a case-by-case basis for that. It seems like
> the
> > > > > > >>>>> increase in
> > > > > > >>>>>>>>> cognitive (and potentially code and interface)
> complexity
> > > > > > >> makes
> > > > > > >>> me
> > > > > > >>>>>>> think
> > > > > > >>>>>>>> we
> > > > > > >>>>>>>>> should skip this case.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> What do you think?
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Thanks,
> > > > > > >>>>>>>>> -John
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > > > > >>>>>>> matthias@confluent.io>
> > > > > > >>>>>>>>> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>> Thanks for the KIP John.
> > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> One initial comment about the last example "Bounded
> > > > > > >>> lateness":
> > > > > > >>>>>> For a
> > > > > > >>>>>>>>>> non-windowed KTable bounding the lateness does not
> > really
> > > > > > >> make
> > > > > > >>>>>> sense,
> > > > > > >>>>>>>>>> does it?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Thus, I am wondering if we should allow
> > > > > > >> `suppressLateEvents()`
> > > > > > >>>>> for
> > > > > > >>>>>>> this
> > > > > > >>>>>>>>>> case? It seems to be better to only allow it for
> > > > > > >>>>> windowed-KTables.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> -Matthias
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> What you gave as new example is semantically the same
> > as
> > > > > > >>> what
> > > > > > >>>>> I
> > > > > > >>>>>>>>>> suggested.
> > > > > > >>>>>>>>>>> So it is good by me.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Thanks
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > > > > > >>>>> john@confluent.io
> > > > > > >>>>>>>
> > > > > > >>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Thanks for taking a look, Ted,
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> I agree this is a departure from the conventions of
> > > > > > >> Streams
> > > > > > >>>>> DSL.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Most of our config objects have one or two
> "required"
> > > > > > >>>>>> parameters,
> > > > > > >>>>>>>>> which
> > > > > > >>>>>>>>>> fit
> > > > > > >>>>>>>>>>>> naturally with the static factory method approach.
> > > > > > > >>>>> TimeWindows,
> > > > > > >>>>>> for
> > > > > > >>>>>>>>>> example,
> > > > > > >>>>>>>>>>>> requires a size parameter, so we can naturally say
> > > > > > >>>>>>>>> TimeWindows.of(size).
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> I think in the case of a suppression, there's really
> > no
> > > > > > >>>>> "core"
> > > > > > >>>>>>>>>> parameter,
> > > > > > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> > > > > > >>>>> Suppression()". I
> > > > > > >>>>>>>> think
> > > > > > >>>>>>>>>> that
> > > > > > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since
> > there
> > > > > > >>> are
> > > > > > >>>>>> many
> > > > > > >>>>>>>>>> durations
> > > > > > >>>>>>>>>>>> that we can configure.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> However, thinking about it again, I suppose that I
> can
> > > > > > >> give
> > > > > > >>>>> each
> > > > > > >>>>>>>>>>>> configuration method a static version, which would
> let
> > > > > > >> you
> > > > > > >>>>>> replace
> > > > > > >>>>>>>>> "new
> > > > > > >>>>>>>>>>>> Suppression()." with "Suppression." in all the
> > examples.
> > > > > > >>>>>>> Basically,
> > > > > > >>>>>>>>>> instead
> > > > > > >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> For example:
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> windowCounts
> > > > > > >>>>>>>>>>>>     .suppress(
> > > > > > >>>>>>>>>>>>         Suppression
> > > > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.
> > ofMinutes(10))
> > > > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>  IntermediateSuppression.emitAfter(Duration.ofMinutes(
> > 10))
> > > > > > >>>>>>>>>>>>             )
> > > > > > >>>>>>>>>>>>     );
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Does that seem better?
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Thanks,
> > > > > > >>>>>>>>>>>> -John
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > > > > > >>> yuzhihong@gmail.com
> > > > > > >>>>>>
> > > > > > >>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> I started to read this KIP which contains a lot of
> > > > > > >>>>> materials.
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> One suggestion:
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>     .suppress(
> > > > > > >>>>>>>>>>>>>         new Suppression()
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Do you think it would be more consistent with the
> > rest
> > > > > > >> of
> > > > > > >>>>>> Streams
> > > > > > >>>>>>>>> data
> > > > > > >>>>>>>>>>>>> structures by supporting `of` ?
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Cheers
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > > > > > >>>>>> john@confluent.io
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Hello devs and users,
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Please take some time to consider this proposal
> for
> > > > > > >> Kafka
> > > > > > >>>>>>> Streams:
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> link:
> https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> The basic idea is to provide:
> > > > > > >>>>>>>>>>>>>> * more usable control over update rate (vs the
> > current
> > > > > > >>>>> state
> > > > > > >>>>>>> store
> > > > > > >>>>>>>>>>>>> caches)
> > > > > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations
> > feature
> > > > > > >>> which
> > > > > > >>>>>>> several
> > > > > > >>>>>>>>>>>> people
> > > > > > >>>>>>>>>>>>>> have requested
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> I look forward to your feedback!
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Thanks,
> > > > > > >>>>>>>>>>>>>> -John
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> --
> > > > > > >>>>> -- Guozhang
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> --
> > > > > > >> -- Guozhang
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

Let me clarify a bit on what I meant about moving `retentionPeriod` to
WindowStoreBuilder:

In another discussion we had around KIP-319 / 330, the idea was that the
"retention period" should not really be a window spec, but only a window
store spec, as it only affects how long each window is retained to be
queryable, along with the storage cost.

More specifically, today the "maintainMs" returned from Windows is used in
three places:

1) for windowed aggregations, it is passed directly into
`Stores.persistentWindowStore()` as the retention period parameter. For this
use case we should just let the WindowStoreBuilder specify this value
itself.

NOTE: It is also used in the KStreamWindowAggregate processor (this is the
third place), to determine whether a received record should be dropped due
to its lateness. We may need to think of another way to get this value
inside the processor.

2) for windowed stream-stream join, it is used as the join range parameter
but only to check that "windowSizeMs <= retentionPeriodMs". We can do this
check at the store builder level instead of at the processor level.


If we can remove its usage in both 1) and 2), then we should be able to
safely remove this from the `Windows` spec.
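
To make the idea concrete, here is a minimal sketch in plain Java of the
check from 2) living with the store spec instead of the processor. Note that
`WindowStoreSpecSketch` and its methods are hypothetical stand-ins (not
Kafka Streams classes), and the expiry rule in `isExpired` is only an
assumption for illustration:

```java
import java.time.Duration;

// Hypothetical stand-in for a window store spec; not a Kafka Streams class.
final class WindowStoreSpecSketch {
    private final Duration windowSize;
    private final Duration retentionPeriod;

    WindowStoreSpecSketch(final Duration windowSize, final Duration retentionPeriod) {
        // The "windowSizeMs <= retentionPeriodMs" check, done once where the store is specified.
        if (windowSize.compareTo(retentionPeriod) > 0) {
            throw new IllegalArgumentException(
                "retention period " + retentionPeriod + " is shorter than window size " + windowSize);
        }
        this.windowSize = windowSize;
        this.retentionPeriod = retentionPeriod;
    }

    Duration windowSize() {
        return windowSize;
    }

    Duration retentionPeriod() {
        return retentionPeriod;
    }

    // Assumed rule: a window is expired once its start is more than one
    // retention period behind the current stream time.
    boolean isExpired(final long windowStartMs, final long streamTimeMs) {
        return windowStartMs < streamTimeMs - retentionPeriod.toMillis();
    }

    public static void main(final String[] args) {
        final WindowStoreSpecSketch spec =
            new WindowStoreSpecSketch(Duration.ofMinutes(1), Duration.ofDays(30));
        System.out.println("expired after 31 days: "
            + spec.isExpired(0L, Duration.ofDays(31).toMillis()));
        try {
            new WindowStoreSpecSketch(Duration.ofDays(2), Duration.ofDays(1));
        } catch (final IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With something of this shape, the processor would ask the store spec for the
retention value rather than reading it from `Windows`.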


Guozhang


On Mon, Jul 9, 2018 at 3:53 PM, John Roesler <jo...@confluent.io> wrote:

> Thanks for the reply, Guozhang,
>
> Good! I agree, that is also a good reason, and I actually made use of that
> in my tests. I'll update the KIP.
>
> By the way, I chose "allowedLateness" as I was trying to pick a better name
> than "close", but I think it's actually the wrong name. We don't want to
> bound the lateness of events in general, only with respect to the end of
> their window.
>
> If we have a window [0,10), with "allowedLateness" of 5, then if we get an
> event with timestamp 3 at time 9, the name implies we'd reject it, which
> seems silly. Really, we'd only want to start rejecting that event at stream
> time 15.
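
The window [0,10) example above can be sketched as follows. This is only an
illustration of the intended "close" semantics, assuming that the grace
period is measured from the window end; `WindowCloseSketch` and
`acceptsUpdates` are hypothetical names, not part of the proposed API:

```java
// Hypothetical illustration of "close" measured from the window end.
final class WindowCloseSketch {

    // A window [start, end) keeps accepting events until stream time reaches end + close.
    static boolean acceptsUpdates(final long windowEndMs, final long closeMs, final long streamTimeMs) {
        return streamTimeMs < windowEndMs + closeMs;
    }

    public static void main(final String[] args) {
        // Window [0,10) with close = 5: an event with timestamp 3 arriving at stream time 9 is accepted...
        System.out.println(acceptsUpdates(10L, 5L, 9L));   // prints "true"
        // ...and the same event would only be rejected once stream time reaches 15.
        System.out.println(acceptsUpdates(10L, 5L, 15L));  // prints "false"
    }
}
```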
>
> What I meant was more like "allowedLatenessAfterWindowEnd", but that's too
> verbose. I think that "close" + some documentation about what it means will
> be better.
>
> 1: "Close" would be measured from the end of the window, so a reasonable
> default would be "0". Recall that "close" really only needs to be specified
> for final results, and a default of 0 would produce the most intuitive
> results. If folks later discover that they are missing some late events,
> they can adjust the parameter accordingly. IMHO, any other value would just
> be a guess on our part.
>
> 2a:
> I think you're saying to re-use "until" instead of adding "close" to the
> window.
>
> The downside here would be that the semantic change could be more confusing
> than deprecating "until" and introducing window "close" and a
> "retentionTime" on the store builder. The deprecation is a good, controlled
> way for us to make sure people are getting the semantics they think they're
> getting, as well as giving us an opportunity to link people to the API they
> should use instead.
>
> I didn't fully understand the second part, but it sounds like you're
> suggesting to add a new "retentionTime" setter to Windows to bridge the gap
> until we add it to the store builder? That seems kind of roundabout to me,
> if that's what you meant. We could just immediately add it to the store
> builders in the same PR.
>
> 2b: Sounds good to me!
>
> Thanks again,
> -John
>
>
> On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > John,
> >
> > Thanks for your replies. As for the two options of the API, I think I'm
> > slightly inclined to the first option as well. My motivation is a bit
> > different, as I think of the first one maybe more flexible, for example:
> >
> > KTable<Windowed<..>> table = ... count();
> >
> > table.toStream().peek(..);   // want to peek at the changelog stream, do
> > not care about final results.
> >
> > table.suppress().toStream().to("topic");    // sending to a topic, want to
> > only send the final results.
> >
> > --------------
> >
> > Besides that, I have a few more minor questions:
> >
> > 1. For "allowedLateness", what should be the default value? I.e. if users do
> > not specify "allowedLateness" in TimeWindows, what value should we set?
> >
> > 2. For API names, some personal suggestions here:
> >
> > 2.a) "allowedLateness"  -> "until" (semantics changed, and also value is
> > defined as delta on top of window length), where "until" ->
> > "retentionPeriod", and the latter will be removed from `Windows` to `
> > WindowStoreBuilder` in the future.
> >
> > 2.b) "BufferConfig" -> "Buffered" ?
> >
> >
> >
> > Guozhang
> >
> >
> > On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io> wrote:
> >
> > > Hey Matthias and Guozhang,
> > >
> > > Sorry for the slow reply. I was mulling about your feedback and weighing
> > > some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337 .
> > >
> > > Your thought about keeping suppression independent of business logic is a
> > > very good one. I agree that it would make more sense to add some kind of
> > > "window close" concept to the window definition.
> > >
> > > In fact, doing that immediately solves the inconsistency problem Guozhang
> > > brought up. There's no need to add a "final results" or "emission" option
> > > to the windowed aggregation.
> > >
> > > What do you think about an API more like this:
> > >
> > > final StreamsBuilder builder = new StreamsBuilder();
> > >
> > > builder
> > >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
> > >   .groupBy(
> > >     (String k1, String v1) -> k1,
> > >     Serialized.with(STRING_SERDE, STRING_SERDE)
> > >   )
> > >   .windowedBy(TimeWindows
> > >     .of(scaledTime(2L))
> > >     .until(scaledTime(3L))
> > >     .allowedLateness(scaledTime(1L))
> > >   )
> > >   .count(Materialized.as("counts"))
> > >   .suppress(
> > >     emitFinalResultsOnly(
> > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > >     )
> > >   )
> > >   .toStream()
> > >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
> > >
> > > Note that:
> > >  * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
> > > by the type system at compile time), and it determines the time to wait by
> > > looking at "allowedLateness" on the TimeWindows config.
> > >  * querying "counts" will produce results (eventually) consistent with
> > > what's observable in "output-suppressed".
> > >  * in all cases, "suppress" has no effect on business logic, just on event
> > > suppression.
> > >
> > > Is this API straightforward? Or do you still prefer the version that you
> > > both proposed:
> > >
> > >   ...
> > >   .windowedBy(TimeWindows
> > >     .of(scaledTime(2L))
> > >     .until(scaledTime(3L))
> > >     .allowedLateness(scaledTime(1L))
> > >   )
> > >   .count(
> > >     Materialized.as("counts"),
> > >     emitFinalResultsOnly(
> > >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> > >     )
> > >   )
> > >   ...
> > >
> > > To me, these two are practically identical, and I still vaguely prefer the
> > > first one.
> > >
> > > The prototype has made clearer to me that users of "final results for
> > > windows" and users of "suppression for table events" both need to configure
> > > the suppression buffer.
> > >
> > > This buffer configuration consists of:
> > > 1. how many keys or bytes to keep in memory
> > > 2. what to do if memory runs out (shut down, start using disk, ...)
> > >
> > > So it's not as simple as setting a "final results" flag. We'll either have
> > > an "Emit" config object on the windowed aggregators that takes the same
> > > BufferConfig as the "Suppress" config on the suppression operator, or we
> > > just use the suppression operator for both.
> > >
> > > Perhaps it would sweeten the deal a little to point out that we have 2
> > > overloads already for each windowed aggregator (with and without
> > > Materialized). Adding "Emitted" or something would mean that we'd add a new
> > > overload for each one, taking us up to 4 overloads each for "count",
> > > "aggregate" and "reduce". Using "suppress" means that we don't add any new
> > > overloads.
> > >
> > > Thanks again for helping to hash this out,
> > > -John
> > >
> > > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> > >
> > > > I think I agree with Matthias on having dedicated APIs for the windowed
> > > > operation final output scenario, PLUS separating the window close, which
> > > > the "final output" would rely on, from the window retention time itself
> > > > (admittedly it would make this KIP effort larger, but if we believe we
> > > > need to do this separation anyways we could just do it now).
> > > >
> > > > And then we can have the `KTable#suppress()` for intermediate-suppression
> > > > only, not for late-record-suppression, until we've seen that it becomes a
> > > > common feature request, because our current design can still be
> > > > extended for that purpose.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
> > matthias@confluent.io>
> > > > wrote:
> > > >
> > > > > Thanks for the discussion. I am just catching up.
> > > > >
> > > > > In general, I think we have different use cases, and non-windowed and
> > > > > windowed are quite different. For the non-windowed case, suppress() has
> > > > > no (useful) close or retention time, no final semantics, and also no
> > > > > business logic impact.
> > > > >
> > > > > On the other hand, for windowed aggregations, close time and final
> > > > > result do have a meaning. IMHO, `close()` is part of business logic
> > > > > while retention time is not. Also, suppression of intermediate results is
> > > > > not a business rule, and there might be use cases for which either only
> > > > > "early intermediate" results (before window end time) are suppressed, or
> > > > > all intermediates are suppressed (maybe also something in the middle, ie,
> > > > > just reduce the load of intermediate updates). Thus, window-suppression
> > > > > is much richer.
> > > > >
> > > > > IMHO, a generic `suppress()` operator that can be inserted into the data
> > > > > flow at any point is useful. Maybe we should keep it as generic as
> > > > > possible. However, it might be difficult to use with regard to
> > > > > windowing, as the mental effort to use it is high.
> > > > >
> > > > > With regard to Guozhang's comment:
> > > > >
> > > > > > we will actually
> > > > > > process data as old as 30 days as well, while most of the late
> > > updates
> > > > > > beyond 5 minutes would be discarded anyways.
> > > > >
> > > > > If we use `suppress()` as a standalone operator, this is correct and
> > > > > intended IMHO. To address the issue if the behavior is unwanted, I would
> > > > > suggest to add a "suppress option" directly to the
> > > > > `count()/reduce()/aggregate()` window operators, similar to
> > > > > `Materialized`. This would be an "embedded suppress" and avoid the
> > > > > issue. It would also address the issue about mental effort for the
> > > > > "single final window result" use case.
> > > > >
> > > > > I also think that a shorter close-time than retention time is useful for
> > > > > window aggregation. If we add close() to the window definition and
> > > > > until() to `Materialized`, we can separate both correctly IMHO.
> > > > >
> > > > > About setting `close = min(close,retention)` I am not sure. We might
> > > > > rather throw an exception than reduce the close time automatically.
> > > > > Otherwise, I foresee many user questions like "I set close to X but it
> > > > > does not get updated for some data that arrives with a delay of X".
> > > > >
> > > > > The tricky question might be how to design the API in a backward
> > > > > compatible way though.
> > > > >
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 7/3/18 5:38 AM, John Roesler wrote:
> > > > > > Hi Guozhang,
> > > > > >
> > > > > > I see. It seems like if we want to decouple 1) and 2), we need to alter
> > > > > > the definition of the window. Do you think it would close the gap if we
> > > > > > added a "window close" time to the window definition?
> > > > > >
> > > > > > Such as:
> > > > > >
> > > > > > builder.stream("input")
> > > > > > .groupByKey()
> > > > > > .windowedBy(
> > > > > >   TimeWindows
> > > > > >     .of(60_000)
> > > > > >     .closeAfter(10 * 60 * 1000)
> > > > > >     .until(30L * 24 * 60 * 60 * 1000)
> > > > > > )
> > > > > > .count()
> > > > > > .suppress(Suppression.finalResultsOnly());
> > > > > >
> > > > > > Possibly called "finalResultsAtWindowClose" or something?
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wangguoz@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > >> Hey John,
> > > > > >>
> > > > > >> Obviously I'm too lazy on email replying diligence compared with
> > you
> > > > :)
> > > > > >> Will try to reply them separately:
> > > > > >>
> > > > > >>
> > > > > >> -----------------------------------------------------------------------------
> > > > > >>
> > > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > > > > >>
> > > > > >> I'm aware of this use case, but again, the concern is that, in this
> > > > > >> setting, in order to let the window be queryable for 30 days, we will
> > > > > >> actually process data as old as 30 days as well, while most of the late
> > > > > >> updates beyond 5 minutes would be discarded anyways. Personally I think
> > > > > >> for the final update scenario, the ideal situation users would want is
> > > > > >> "do not process any data that is more than 5 minutes late, and of
> > > > > >> course send no update records downstream later than 5 minutes either;
> > > > > >> but retain the window to be queryable for 30 days". By doing that, the
> > > > > >> final window snapshot would also be aligned with the update stream. In
> > > > > >> other words, among these three periods:
> > > > > >>
> > > > > >> 1) the retention length of the window / table.
> > > > > >> 2) the late records acceptance for updating the window.
> > > > > >> 3) the late records update to be sent downstream.
> > > > > >>
> > > > > >> Final update use cases would naturally want 2) = 3), while 1) may be
> > > > > >> different and larger, while what we provide now is that 1) = 2), which
> > > > > >> could be different and in practice larger than 3), hence not the most
> > > > > >> intuitive for their needs.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> -----------------------------------------------------------------------------
> > > > > >>
> > > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > > > > >>
> > > > > >> I also like option 2) better than option 1) from a programming pov. But
> > > > > >> I'm wondering whether option 2) would provide the above semantics, or
> > > > > >> whether it is still coupling 1) with 2) as well?
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> Guozhang
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <john@confluent.io
> >
> > > > wrote:
> > > > > >>
> > > > > >>> In fact, to push the idea further (which IIRC is what Matthias originally
> > > > > >>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
> > > > > >>> email, then we could also consider whether to eliminate
> > > > > >>> "suppressLateEvents" entirely.
> > > > > >>>
> > > > > >>> We could always add it later, but you've both expressed doubt that there
> > > > > >>> are practical use cases for it outside of final-results.
> > > > > >>>
> > > > > >>> -John
> > > > > >>>
> > > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <
> john@confluent.io>
> > > > > wrote:
> > > > > >>>
> > > > > >>>> Hi again, Guozhang ;) Here's the second part of my response...
> > > > > >>>>
> > > > > >>>> It seems like your main concern is: "if I'm a user who wants final
> > > > > >>>> update semantics, how complicated is it for me to get it?"
> > > > > >>>>
> > > > > >>>> I think we have to assume that people don't always have time to become
> > > > > >>>> deeply familiar with all the nuances of a programming environment
> > > > > >>>> before they use it. Especially if they're evaluating several frameworks
> > > > > >>>> for their use case, it's very valuable to make it as obvious as possible
> > > > > >>>> how to accomplish various computations with Streams.
> > > > > >>>>
> > > > > >>>> To me the biggest question is whether with a fresh perspective, people
> > > > > >>>> would say "oh, I get it, I have to bound my lateness and suppress
> > > > > >>>> intermediate updates, and of course I'll get only the final result!",
> > > > > >>>> or if it's more like "wtf? all I want is the final result, what are all
> > > > > >>>> these parameters?".
> > > > > >>>>
> > > > > >>>> I was talking with Matthias a while back, and he had an idea that I
> > > > > >>>> think can help, which is to essentially set up a final-result recipe in
> > > > > >>>> addition to the raw parameters. I previously thought that it wouldn't be
> > > > > >>>> possible to restrict its usage to Windowed KTables, but thinking about
> > > > > >>>> it again this weekend, I have a couple of ideas:
> > > > > >>>>
> > > > > >>>> ================
> > > > > >>>> = 1. Static Wrapper =
> > > > > >>>> ================
> > > > > >>>> We can define an extra static function that "wraps" a KTable
> > with
> > > > > >>>> final-result semantics.
> > > > > >>>>
> > > > > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> > > > > >>>>   final KTable<K, V> windowedKTable,
> > > > > >>>>   final Duration maxAllowedLateness,
> > > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > > > > >>>>     return windowedKTable.suppress(
> > > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > > > > >>>>                    .suppressIntermediateEvents(
> > > > > >>>>                      IntermediateSuppression
> > > > > >>>>                        .emitAfter(maxAllowedLateness)
> > > > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> > > > > >>>>                    )
> > > > > >>>>     );
> > > > > >>>> }
> > > > > >>>>
> > > > > >>>> Because windowedKTable is a parameter, the static function can easily
> > > > > >>>> impose an extra bound on the key type, that it extends Windowed. This
> > > > > >>>> would make "final results only" only available on windowed ktables.
> > > > > >>>>
> > > > > >>>> Here's how it would look to use:
> > > > > >>>>
> > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > > >>>>   finalResultsOnly(
> > > > > >>>>     windowCounts,
> > > > > >>>>     Duration.ofMinutes(10),
> > > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > > > > >>>>   );
> > > > > >>>>
> > > > > >>>> Trying to use it on a non-windowed KTable yields:
> > > > > >>>>
> > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > > > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
> > > > > >>>>> applied to given types;
> > > > > >>>>>   required:
> > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   found:
> > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   reason: inference variable K has incompatible bounds
> > > > > >>>>>     equality constraints: java.lang.String
> > > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> =================================================
> > > > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > > > > >>>> =================================================
> > > > > >>>>
> > > > > >>>> By adding K,V parameters to Suppression, we can provide a
> > > similarly
> > > > > >>>> bounded config method directly on the Suppression class:
> > > > > >>>>
> > > > > >>>> public static <K extends Windowed, V> Suppression<K, V>
> > > > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> > > > > >>>> BufferFullStrategy bufferFullStrategy) {
> > > > > >>>>     return Suppression
> > > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > > > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > > > > >>>>             .emitAfter(maxAllowedLateness)
> > > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > > > > >>>>         );
> > > > > >>>> }
> > > > > >>>>
> > > > > >>>> Then, here's how it would look to use it:
> > > > > >>>>
> > > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > > >>>>   windowCounts.suppress(
> > > > > >>>>     Suppression.finalResultsOnly(
> > > > > >>>>       Duration.ofMinutes(10),
> > > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > > > > >>>>     )
> > > > > >>>>   );
> > > > > >>>>
> > > > > >>>> Trying to use it on a non-windowed ktable yields:
> > > > > >>>>
> > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> > > > > >>>>> given types;
> > > > > >>>>>   required:
> > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   found:
> > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   reason: explicit type argument java.lang.String does not conform to
> > > > > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> ============
> > > > > >>>> = Downsides =
> > > > > >>>> ============
> > > > > >>>>
> > > > > >>>> Of course, there's a downside either way:
> > > > > >>>> * for 1: this "wrapper" interaction would be the first in the DSL. Is
> > > > > >>>> it too strange, and how discoverable would it be?
> > > > > >>>> * for 2: adding those type parameters to Suppression will force all
> > > > > >>>> callers to provide them in the event of a chained construction because
> > > > > >>>> Java doesn't do RHS recursive type inference. This is already visible in
> > > > > >>>> other parts of the Streams DSL. For example, often calls to Materialized
> > > > > >>>> builders have to provide seemingly obvious type bounds.
> > > > > >>>>
> > > > > >>>> ============
> > > > > >>>> = Conclusion =
> > > > > >>>> ============
> > > > > >>>>
> > > > > >>>> I think option 2 is more "normal" and discoverable. It does have a
> > > > > >>>> downside, but it's one that's pre-existing elsewhere in the DSL.
> > > > > >>>>
> > > > > >>>> WDYT? Would the addition of this "recipe" method to Suppression resolve
> > > > > >>>> your concern?
> > > > > >>>>
> > > > > >>>> Thanks again,
> > > > > >>>> -John
> > > > > >>>>
> > > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
> > wangguoz@gmail.com
> > > >
> > > > > >>> wrote:
> > > > > >>>>
> > > > > >>>>> Hi John,
> > > > > >>>>>
> > > > > >>>>> Regarding the metrics: yeah I think I'm with you that the dropped
> > > > > >>>>> records due to window retention or emit suppression policies should be
> > > > > >>>>> recorded differently, and using this KIP's proposed metric would be
> > > > > >>>>> fine. If you also think we can use this KIP's proposed metrics to cover
> > > > > >>>>> the records skipped because of window retention, then we can include
> > > > > >>>>> the changes in this KIP as well.
> > > > > >>>>>
> > > > > >>>>> Regarding the current proposal, I'm actually not too worried about the
> > > > > >>>>> inconsistency between query semantics and downstream emit semantics.
> > > > > >>>>> For queries, we will always return the current running results of the
> > > > > >>>>> windows, be they partial or final results depending on the window
> > > > > >>>>> retention time anyways, which has nothing to do with whether the
> > > > > >>>>> emitted stream should be one final output per key or not. I also agree
> > > > > >>>>> that having a unified operation is generally better, since users can
> > > > > >>>>> focus on leveraging that one only rather than learning about two sets
> > > > > >>>>> of operations. The only question I had is, for final updates of window
> > > > > >>>>> stores, whether it is a bit awkward to understand the configuration
> > > > > >>>>> combo. Thinking about this more, I think my root worry is the
> > > > > >>>>> "suppressLateEvents" call for windowed tables, since from a user
> > > > > >>>>> perspective: if my retention time is X, which means "pay the cost to
> > > > > >>>>> allow late records up to X to still be applied updating the tables",
> > > > > >>>>> why would I ever want to suppressLateEvents by Y ( < X), to say "do not
> > > > > >>>>> send the updates up to Y", which means the downstream operator or sink
> > > > > >>>>> topic for this stream would actually see a truncated update stream
> > > > > >>>>> while I've paid a larger cost for that; and of course, Y > X would not
> > > > > >>>>> make sense either, as you would not see any updates later than X
> > > > > >>>>> anyways. So in all, my feeling is that it makes less sense to call a
> > > > > >>>>> windowed table's "suppressLateEvents" with a parameter that is not
> > > > > >>>>> equal to the window retention, and opening the door in the current
> > > > > >>>>> proposal may confuse people with that.
> > > > > >>>>>
> > > > > >>>>> Again, the above is just a subjective opinion, and probably we can also
> > > > > >>>>> bring up some scenarios where users do want to set X != Y. But
> > > > > >>>>> personally I wonder, even if the semantics for this scenario are
> > > > > >>>>> intuitive for users to understand, does that really make sense, and
> > > > > >>>>> should we really open the door for it? So I think the benefits of
> > > > > >>>>> separating the final update into a separate API may outweigh the
> > > > > >>>>> advantage of having one uniform definition. And for my alternative
> > > > > >>>>> proposal, the rationale came from both my concern about
> > > > > >>>>> "suppressLateEvents" for windowed stores, and Matthias' question about
> > > > > >>>>> "suppressLateEvents" for non-windowed stores: if it is less meaningful
> > > > > >>>>> for both, we can consider removing it completely and only do
> > > > > >>>>> "IntermediateSuppression" in Suppress instead.
> > > > > >>>>>
> > > > > >>>>> So I'd summarize my thoughts in the following questions:
> > > > > >>>>>
> > > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window retention
> > > > > >>>>> time) for windowed stores make sense in practice?
> > > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for non-windowed
> > > > > >>>>> stores make sense in practice?
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> Guozhang
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
> > bbejeck@gmail.com>
> > > > > >> wrote:
> > > > > >>>>>
> > > > > >>>>>> Thanks for the explanation, that does make sense.  I have some
> > > > > >>>>>> questions on operations, but I'll just wait for the PR and tests.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>> Bill
> > > > > >>>>>>
> > > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
> > john@confluent.io
> > > >
> > > > > >>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Hi Bill,
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks for the review!
> > > > > >>>>>>>
> > > > > >>>>>>> Your question is very much applicable to the KIP and not at all an
> > > > > >>>>>>> implementation detail. Thanks for bringing it up.
> > > > > >>>>>>>
> > > > > >>>>>>> I'm proposing not to change the existing caches and configurations at
> > > > > >>>>>>> all (for now).
> > > > > >>>>>>>
> > > > > >>>>>>> Imagine you have a topology like this:
> > > > > >>>>>>> commit.interval.ms = 100
> > > > > >>>>>>>
> > > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > > > > >>>>>>>
> > > > > >>>>>>> The first ktable (ktable1) will respect the commit interval and buffer
> > > > > >>>>>>> events for 100ms before logging, storing, or forwarding them (IIRC).
> > > > > >>>>>>> Therefore, the second ktable (suppress) will only see the events at a
> > > > > >>>>>>> rate of once per 100ms. It will apply its own buffering, and emit once
> > > > > >>>>>>> per 200ms. This case is pretty trivial because the suppress time is a
> > > > > >>>>>>> multiple of the commit interval.
> > > > > >>>>>>>
> > > > > >>>>>>> When it's not an integer multiple, you'll get behavior like in this
> > > > > >>>>>>> marble diagram:
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > > > > >>>>>>>
> > > > > >>>>>>> [ KTable caching with commit interval = 2 ]
> > > > > >>>>>>>
> > > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > > > > >>>>>>>
> > > > > >>>>>>>       [ suppress with emitAfter = 3 ]
> > > > > >>>>>>>
> > > > > >>>>>>> <---------------(k:2)----------------(k:6)->
> > > > > >>>>>>>
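
A toy simulation that reproduces the marble diagram above, as a sketch
only. It assumes a simplified model that is not the real implementation:
each stage keeps just the latest value and flushes it at every multiple of
its interval. `MarbleSketch` and `simulate` are hypothetical names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a caching KTable followed by a suppress stage (assumed semantics, see above).
final class MarbleSketch {

    // Returns emit time -> value observed downstream of the suppress stage.
    static Map<Integer, Integer> simulate(final int lastTime, final int commitInterval, final int emitAfter) {
        final Map<Integer, Integer> downstream = new LinkedHashMap<>();
        Integer cached = null;    // latest value held by the KTable cache
        Integer buffered = null;  // latest value held by the suppress buffer
        for (int t = 1; t <= lastTime; t++) {
            cached = t;  // upstream update (k:t) arrives at time t
            if (t % commitInterval == 0) {
                buffered = cached;  // cache flush forwards only the latest value
                cached = null;
            }
            if (t % emitAfter == 0 && buffered != null) {
                downstream.put(t, buffered);  // suppress emits the latest buffered value
                buffered = null;
            }
        }
        return downstream;
    }

    public static void main(final String[] args) {
        // commit interval = 2, emitAfter = 3, as in the diagram
        System.out.println(simulate(6, 2, 3));  // prints "{3=2, 6=6}"
    }
}
```

The output matches the last line of the diagram: (k:2) is emitted at time 3
and (k:6) at time 6, while (k:4) is overwritten in the buffer before it is
ever emitted.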
> > > > > >>>>>>>
> > > > > >>>>>>> If this behavior isn't desired (for example, if you wanted to emit (k:3)
> > > > > >>>>>>> at time 3), I'd recommend setting "cache.max.bytes.buffering" to 0 or
> > > > > >>>>>>> modifying the topology to disable caching. Then, the behavior is more
> > > > > >>>>>>> simply determined just by the suppress operator.
> > > > > >>>>>>>
> > > > > >>>>>>> Does that seem right to you?
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> Regarding the changelogs, because the suppression operator hangs onto
> > > > > >>>>>>> events for a while, it will need its own changelog. The changelog
> > > > > >>>>>>> should represent the current state of the buffer at all times. So when
> > > > > >>>>>>> the suppress operator sees (k:2), for example, it will log (k:2). When
> > > > > >>>>>>> it later gets to time 3, it's time to emit (k:2) downstream. Because k
> > > > > >>>>>>> is no longer buffered, the suppress operator will log (k:null). Thus,
> > > > > >>>>>>> when recovering, it can rebuild the buffer by reading its changelog.
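
The changelog behavior described above can be sketched like this. It is an
in-memory illustration only: `SuppressBufferSketch` is a hypothetical class,
the "changelog" is a plain list rather than a Kafka topic, and a null value
plays the role of the (k:null) record:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a suppression buffer with its own changelog.
final class SuppressBufferSketch {
    private final Map<String, Long> buffer = new HashMap<>();
    private final List<Map.Entry<String, Long>> changelog = new ArrayList<>();

    // Buffering an update logs (k:v), so the changelog mirrors the buffer.
    void put(final String key, final Long value) {
        buffer.put(key, value);
        changelog.add(new SimpleEntry<String, Long>(key, value));
    }

    // Emitting removes the key from the buffer and logs (k:null).
    Long emit(final String key) {
        final Long value = buffer.remove(key);
        if (value != null) {
            changelog.add(new SimpleEntry<String, Long>(key, null));
        }
        return value;
    }

    List<Map.Entry<String, Long>> changelog() {
        return changelog;
    }

    // Recovery: replay the changelog in order; a null value deletes the key.
    static Map<String, Long> restore(final List<Map.Entry<String, Long>> changelog) {
        final Map<String, Long> restored = new HashMap<>();
        for (final Map.Entry<String, Long> record : changelog) {
            if (record.getValue() == null) {
                restored.remove(record.getKey());
            } else {
                restored.put(record.getKey(), record.getValue());
            }
        }
        return restored;
    }

    public static void main(final String[] args) {
        final SuppressBufferSketch suppress = new SuppressBufferSketch();
        suppress.put("k", 2L);
        System.out.println(restore(suppress.changelog()));  // prints "{k=2}"
        suppress.emit("k");
        System.out.println(restore(suppress.changelog()));  // prints "{}"
    }
}
```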
> > > > > >>>>>>>
> > > > > >>>>>>> What do you think about this?
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks,
> > > > > >>>>>>> -John
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
> > bbejeck@gmail.com
> > > >
> > > > > >>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi John,  thanks for the KIP.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Early on in the KIP, you mention the current approaches for
> > > > > >>>>>>>> controlling the rate of downstream records from a KTable: cache
> > > > > >>>>>>>> size configuration and commit time.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Will these configuration parameters still be in effect for
> > > > > >>>>>>>> tables that don't use suppression?  For tables taking advantage
> > > > > >>>>>>>> of suppression, will these configurations have no impact?
> > > > > >>>>>>>> This last question may be too implementation-specific, but if
> > > > > >>>>>>>> the requested suppression time is longer than the specified
> > > > > >>>>>>>> commit time, will the latest record in the suppression buffer
> > > > > >>>>>>>> get stored in a changelog?
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks,
> > > > > >>>>>>>> Bill
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> > > john@confluent.io
> > > > > >>>
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Thanks for the feedback, Matthias,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> It seems like in straightforward relational processing cases,
> > > > > >>>>>>>>> it would not make sense to bound the lateness of KTables. In
> > > > > >>>>>>>>> general, it seems better to have "guard rails" in place that
> > > > > >>>>>>>>> make it easier to write sensible programs than insensible ones.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> But I'm still going to argue in favor of keeping it for all
> > > > > >>>>>>>>> KTables ;)
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> 1. I believe it is simpler to understand the operator if it has
> > > > > >>>>>>>>> one uniform definition, regardless of context. It's well defined
> > > > > >>>>>>>>> and intuitive what will happen when you use late-event
> > > > > >>>>>>>>> suppression on a KTable, so I think nothing surprising or
> > > > > >>>>>>>>> dangerous will happen in that case. From my perspective, having
> > > > > >>>>>>>>> two sets of allowed operations is actually an increase in
> > > > > >>>>>>>>> cognitive complexity.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
> > > > > >>>>>>>>> example, in lieu of full-featured timestamp semantics, I can
> > > > > >>>>>>>>> implement MVCC behavior when building a KTable by
> > > > > >>>>>>>>> "suppressLateEvents(Duration.ZERO)". I suspect that there are
> > > > > >>>>>>>>> other, non-obvious applications of suppressing late events on
> > > > > >>>>>>>>> KTables.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> 3. Not to get too much into implementation details in a KIP
> > > > > >>>>>>>>> discussion, but if we did want to make late-event suppression
> > > > > >>>>>>>>> available only on windowed KTables, we have two enforcement
> > > > > >>>>>>>>> options:
> > > > > >>>>>>>>>   a. check when we build the topology - this would be simple to
> > > > > >>>>>>>>> implement, but would be a runtime check. Hopefully, people write
> > > > > >>>>>>>>> tests for their topology before deploying them, so the feedback
> > > > > >>>>>>>>> loop isn't instantaneous, but it's not too long either.
> > > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
> > > > > >>>>>>>>> time check, but would also be a substantial increase in both
> > > > > >>>>>>>>> interface and code complexity.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> We should definitely strive to have guard rails protecting
> > > > > >>>>>>>>> against surprising or dangerous behavior. Protecting against
> > > > > >>>>>>>>> programs that we don't currently predict is a lesser benefit,
> > > > > >>>>>>>>> and I think we can put up guard rails on a case-by-case basis
> > > > > >>>>>>>>> for that. The increase in cognitive (and potentially code and
> > > > > >>>>>>>>> interface) complexity makes me think we should skip this case.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> What do you think?
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Thanks,
> > > > > >>>>>>>>> -John
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > > > >>>>>>> matthias@confluent.io>
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> Thanks for the KIP John.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> One initial comment about the last example "Bounded
> > > > > >>>>>>>>>> lateness": For a non-windowed KTable, bounding the lateness
> > > > > >>>>>>>>>> does not really make sense, does it?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Thus, I am wondering if we should allow `suppressLateEvents()`
> > > > > >>>>>>>>>> for this case? It seems to be better to only allow it for
> > > > > >>>>>>>>>> windowed-KTables.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> -Matthias
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> What you gave as a new example is semantically the same as
> > > > > >>>>>>>>>>> what I suggested.
> > > > > >>>>>>>>>>> So it is good by me.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Thanks
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > > > > >>>>> john@confluent.io
> > > > > >>>>>>>
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks for taking a look, Ted,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I agree this is a departure from the conventions of Streams
> > > > > >>>>>>>>>>>> DSL.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Most of our config objects have one or two "required"
> > > > > >>>>>>>>>>>> parameters, which fit naturally with the static factory
> > > > > >>>>>>>>>>>> method approach. TimeWindows, for example, requires a size
> > > > > >>>>>>>>>>>> parameter, so we can naturally say TimeWindows.of(size).
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I think in the case of a suppression, there's really no
> > > > > >>>>>>>>>>>> "core" parameter, and "Suppression.of()" seems sillier than
> > > > > >>>>>>>>>>>> "new Suppression()". I think that Suppression.of(duration)
> > > > > >>>>>>>>>>>> would be ambiguous, since there are many durations that we
> > > > > >>>>>>>>>>>> can configure.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> However, thinking about it again, I suppose that I can give
> > > > > >>>>>>>>>>>> each configuration method a static version, which would let
> > > > > >>>>>>>>>>>> you replace "new Suppression()." with "Suppression." in all
> > > > > >>>>>>>>>>>> the examples. Basically, instead of "of()", we'd support any
> > > > > >>>>>>>>>>>> of the methods I listed.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> For example:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> windowCounts
> > > > > >>>>>>>>>>>>     .suppress(
> > > > > >>>>>>>>>>>>         Suppression
> > > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > > > > >>>>>>>>>>>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > >>>>>>>>>>>>             )
> > > > > >>>>>>>>>>>>     );
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Does that seem better?
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>> -John
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > > > > >>> yuzhihong@gmail.com
> > > > > >>>>>>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> I started to read this KIP, which contains a lot of material.
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> One suggestion:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>     .suppress(
> > > > > >>>>>>>>>>>>>         new Suppression()
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Do you think it would be more consistent with the rest of
> > > > > >>>>>>>>>>>>> the Streams data structures to support `of`?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Cheers
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > > > > >>>>>> john@confluent.io
> > > > > >>>>>>>>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Hello devs and users,
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Please take some time to consider this proposal for
> > > > > >> Kafka
> > > > > >>>>>>> Streams:
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> The basic idea is to provide:
> > > > > >>>>>>>>>>>>>> * more usable control over update rate (vs the current
> > > > > >>>>>>>>>>>>>> state store caches)
> > > > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > > > > >>>>>>>>>>>>>> which several people have requested
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I look forward to your feedback!
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>>>> -John
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> --
> > > > > >>>>> -- Guozhang
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> -- Guozhang
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Thanks for the reply, Guozhang,

Good! I agree, that is also a good reason, and I actually made use of that
in my tests. I'll update the KIP.

By the way, I chose "allowedLateness" as I was trying to pick a better name
than "close", but I think it's actually the wrong name. We don't want to
bound the lateness of events in general, only with respect to the end of
their window.

If we have a window [0,10), with "allowedLateness" of 5, then if we get an
event with timestamp 3 at time 9, the name implies we'd reject it, which
seems silly. Really, we'd only want to start rejecting that event at stream
time 15.
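
To make the arithmetic concrete, here is a minimal sketch in plain Java (no
Streams dependency). The class name, the method names, and the
"closeAfterEnd" parameter are hypothetical illustrations of the semantics
described above, not proposed API:

```java
final class WindowCloseSketch {

    /** Stream time at which a window ending at windowEnd stops accepting events. */
    static long closeTime(long windowEnd, long closeAfterEnd) {
        return windowEnd + closeAfterEnd;
    }

    /**
     * An event is rejected only once stream time has reached the close time
     * of its window. How far the event's own timestamp lags behind stream
     * time does not matter.
     */
    static boolean isRejected(long windowEnd, long closeAfterEnd, long streamTime) {
        return streamTime >= closeTime(windowEnd, closeAfterEnd);
    }

    public static void main(String[] args) {
        long windowEnd = 10L;     // window [0,10)
        long closeAfterEnd = 5L;  // the value called "allowedLateness" above

        // Event with timestamp 3 seen at stream time 9: the window is still open.
        System.out.println(isRejected(windowEnd, closeAfterEnd, 9L));   // false
        // The same event seen at stream time 15: the window has closed.
        System.out.println(isRejected(windowEnd, closeAfterEnd, 15L));  // true
    }
}
```

Note that the event's timestamp (3) never enters the check; only the end of
its window and the current stream time do.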

What I meant was more like "allowedLatenessAfterWindowEnd", but that's too
verbose. I think that "close" + some documentation about what it means will
be better.

1: "Close" would be measured from the end of the window, so a reasonable
default would be "0". Recall that "close" really only needs to be specified
for final results, and a default of 0 would produce the most intuitive
results. If folks later discover that they are missing some late events,
they can adjust the parameter accordingly. IMHO, any other value would just
be a guess on our part.

2a:
I think you're saying to re-use "until" instead of adding "close" to the
window.

The downside here would be that the semantic change could be more confusing
than deprecating "until" and introducing window "close" and a
"retentionTime" on the store builder. The deprecation is a good, controlled
way for us to make sure people are getting the semantics they think they're
getting, as well as giving us an opportunity to link people to the API they
should use instead.

I didn't fully understand the second part, but it sounds like you're
suggesting to add a new "retentionTime" setter to Windows to bridge the gap
until we add it to the store builder? That seems kind of roundabout to me,
if that's what you meant. We could just immediately add it to the store
builders in the same PR.
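
As a rough sketch of that separation, in plain Java with no Streams
dependency: every name below (WindowDef, StoreDef, validate) is made up for
illustration. The check that retention must cover size + close is an
assumption on my part, following the suggestion earlier in the thread to
throw an exception rather than adjust values silently:

```java
import java.time.Duration;

final class RetentionSketch {

    /** Business-logic side: how big the window is and when it closes. */
    static final class WindowDef {
        final Duration size;
        final Duration close; // measured from the window end; defaults to 0

        WindowDef(Duration size) {
            this(size, Duration.ZERO);
        }

        WindowDef(Duration size, Duration close) {
            this.size = size;
            this.close = close;
        }

        WindowDef close(Duration newClose) {
            return new WindowDef(size, newClose);
        }
    }

    /** Storage side: how long windows stay in the store for queries. */
    static final class StoreDef {
        final Duration retentionTime;

        StoreDef(Duration retentionTime) {
            this.retentionTime = retentionTime;
        }
    }

    /** Assumed rule: the store must keep a window at least until it closes. */
    static void validate(WindowDef window, StoreDef store) {
        Duration minimum = window.size.plus(window.close);
        if (store.retentionTime.compareTo(minimum) < 0) {
            throw new IllegalArgumentException(
                "retentionTime " + store.retentionTime
                    + " is shorter than size + close " + minimum);
        }
    }

    public static void main(String[] args) {
        WindowDef window =
            new WindowDef(Duration.ofMinutes(1)).close(Duration.ofMinutes(5));

        validate(window, new StoreDef(Duration.ofDays(30))); // accepted
        try {
            validate(window, new StoreDef(Duration.ofMinutes(2)));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point is only that "close" lives with the window definition and
"retentionTime" lives with the store, so neither setter needs to change
meaning.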

2b: Sounds good to me!

Thanks again,
-John


On Mon, Jul 9, 2018 at 4:55 PM Guozhang Wang <wa...@gmail.com> wrote:

> John,
>
> Thanks for your replies. As for the two options of the API, I think I'm
> slightly inclined to the first option as well. My motivation is a bit
> different, as I think the first one may be more flexible, for example:
>
> KTable<Windowed<..>> table = ... count();
>
> // want to peek at the changelog stream, do not care about final results.
> table.toStream().peek(..);
>
> // sending to a topic, want to only send the final results.
> table.suppress().toStream().to("topic");
>
> --------------
>
> Besides that, I have a few more minor questions:
>
> 1. For "allowedLateness", what should be the default value? I.e., if users
> do not specify "allowedLateness" in TimeWindows, what value should we set?
>
> 2. For API names, some personal suggestions here:
>
> 2.a) "allowedLateness"  -> "until" (semantics changed, and also value is
> defined as delta on top of window length), where "until" ->
> "retentionPeriod", and the latter will be moved from `Windows` to
> `WindowStoreBuilder` in the future.
>
> 2.b) "BufferConfig" -> "Buffered" ?
>
>
>
> Guozhang
>
>
> On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Hey Matthias and Guozhang,
> >
> > Sorry for the slow reply. I was mulling over your feedback and weighing
> > some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337.
> >
> > Your thought about keeping suppression independent of business logic is a
> > very good one. I agree that it would make more sense to add some kind of
> > "window close" concept to the window definition.
> >
> > In fact, doing that immediately solves the inconsistency problem Guozhang
> > brought up. There's no need to add a "final results" or "emission" option
> > to the windowed aggregation.
> >
> > What do you think about an API more like this:
> >
> > final StreamsBuilder builder = new StreamsBuilder();
> >
> > builder
> >   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
> >   .groupBy(
> >     (String k1, String v1) -> k1,
> >     Serialized.with(STRING_SERDE, STRING_SERDE)
> >   )
> >   .windowedBy(TimeWindows
> >     .of(scaledTime(2L))
> >     .until(scaledTime(3L))
> >     .allowedLateness(scaledTime(1L))
> >   )
> >   .count(Materialized.as("counts"))
> >   .suppress(
> >     emitFinalResultsOnly(
> >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> >     )
> >   )
> >   .toStream()
> >   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
> >
> > Note that:
> >  * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
> > by the type system at compile time), and it determines the time to wait by
> > looking at "allowedLateness" on the TimeWindows config.
> >  * querying "counts" will produce results (eventually) consistent with
> > what's observable in "output-suppressed".
> >  * in all cases, "suppress" has no effect on business logic, just on event
> > suppression.
> >
> > Is this API straightforward? Or do you still prefer the version that you
> > both proposed:
> >
> >   ...
> >   .windowedBy(TimeWindows
> >     .of(scaledTime(2L))
> >     .until(scaledTime(3L))
> >     .allowedLateness(scaledTime(1L))
> >   )
> >   .count(
> >     Materialized.as("counts"),
> >     emitFinalResultsOnly(
> >       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
> >     )
> >   )
> >   ...
> >
> > To me, these two are practically identical, and I still vaguely prefer the
> > first one.
> >
> > The prototype has made it clearer to me that users of "final results for
> > windows" and users of "suppression for table events" both need to configure
> > the suppression buffer.
> >
> > This buffer configuration consists of:
> > 1. how many keys or bytes to keep in memory
> > 2. what to do if memory runs out (shut down, start using disk, ...)
> >
> > So it's not as simple as setting a "final results" flag. We'll either have
> > an "Emit" config object on the windowed aggregators that takes the same
> > BufferConfig as the "Suppress" config on the suppression operator, or we
> > just use the suppression operator for both.
> >
> > Perhaps it would sweeten the deal a little to point out that we have 2
> > overloads already for each windowed aggregator (with and without
> > Materialized). Adding "Emitted" or something would mean that we'd add a new
> > overload for each one, taking us up to 4 overloads each for "count",
> > "aggregate" and "reduce". Using "suppress" means that we don't add any new
> > overloads.
> >
> > Thanks again for helping to hash this out,
> > -John
> >
> > On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> > > I think I agree with Matthias on having dedicated APIs for the windowed
> > > operation final output scenario, PLUS separating the window close, which
> > > the "final output" would rely on, from the window retention time itself
> > > (admittedly it would make this KIP effort larger, but if we believe we need
> > > to do this separation anyways we could just do it now).
> > >
> > > And then we can have the `KTable#suppress()` for intermediate-suppression
> > > only, not for late-record-suppression, until we've seen that it becomes a
> > > common feature request, because our current design still allows it to be
> > > extended for that purpose.
> > >
> > >
> > > Guozhang
> > >
> > > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <
> matthias@confluent.io>
> > > wrote:
> > >
> > > > Thanks for the discussion. I am just catching up.
> > > >
> > > > In general, I think we have different use cases, and non-windowed and
> > > > windowed are quite different. For the non-windowed case, suppress() has
> > > > no (useful) close or retention time, no final semantics, and also no
> > > > business logic impact.
> > > >
> > > > On the other hand, for windowed aggregations, close time and final
> > > > result do have a meaning. IMHO, `close()` is part of business logic
> > > > while retention time is not. Also, suppression of intermediate results
> > > > is not a business rule, and there might be use cases for which either
> > > > only "early intermediate" results (before window end time) are
> > > > suppressed, or all intermediates are suppressed (maybe also something in
> > > > the middle, i.e., just reduce the load of intermediate updates). Thus,
> > > > window-suppression is much richer.
> > > >
> > > > IMHO, a generic `suppress()` operator that can be inserted into the data
> > > > flow at any point is useful. Maybe we should keep it as generic as
> > > > possible. However, it might be difficult to use with regard to
> > > > windowing, as the mental effort to use it is high.
> > > >
> > > > With regard to Guozhang's comment:
> > > >
> > > > > we will actually
> > > > > process data as old as 30 days as well, while most of the late updates
> > > > > beyond 5 minutes would be discarded anyways.
> > > >
> > > > If we use `suppress()` as a standalone operator, this is correct and
> > > > intended IMHO. To address the issue if the behavior is unwanted, I would
> > > > suggest adding a "suppress option" directly to the
> > > > `count()/reduce()/aggregate()` window operators, similar to
> > > > `Materialized`. This would be an "embedded suppress" and avoid the
> > > > issue. It would also address the issue about mental effort for the
> > > > "single final window result" use case.
> > > >
> > > > I also think that a shorter close-time than retention time is useful for
> > > > window aggregation. If we add close() to the window definition and
> > > > until() to `Materialized`, we can separate both correctly IMHO.
> > > >
> > > > About setting `close = min(close,retention)` I am not sure. We might
> > > > rather throw an exception than reduce the close time automatically.
> > > > Otherwise, I see many user questions about "I set close to X but it does
> > > > not get updated for some data that is with delay of X".
> > > >
> > > > The tricky question might be to design the API in a backward compatible
> > > > way though.
> > > >
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 7/3/18 5:38 AM, John Roesler wrote:
> > > > > Hi Guozhang,
> > > > >
> > > > > I see. It seems like if we want to decouple 1) and 2), we need to alter
> > > > > the definition of the window. Do you think it would close the gap if we
> > > > > added a "window close" time to the window definition?
> > > > >
> > > > > Such as:
> > > > >
> > > > > builder.stream("input")
> > > > > .groupByKey()
> > > > > .windowedBy(
> > > > >   TimeWindows
> > > > >     .of(60_000)                        // 1-minute windows (ms)
> > > > >     .closeAfter(10 * 60 * 1000)        // close 10 minutes after window end (ms)
> > > > >     .until(30L * 24 * 60 * 60 * 1000)  // retain for 30 days (ms)
> > > > > )
> > > > > .count()
> > > > > .suppress(Suppression.finalResultsOnly());
> > > > >
> > > > > Possibly called "finalResultsAtWindowClose" or something?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > > >
> > > > >> Hey John,
> > > > >>
> > > > > >> Obviously I'm too lazy on email replying diligence compared with you :)
> > > > > >> Will try to reply to them separately:
> > > > >>
> > > > >>
> > > > >> ------------------------------------------------------------
> > > > -----------------
> > > > >>
> > > > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > > > >>
> > > > >> I'm aware of this use case, but again, the concern is that, in
> this
> > > > setting
> > > > >> in order to let the window be queryable for 30 days, we will
> > actually
> > > > >> process data as old as 30 days as well, while most of the late
> > updates
> > > > >> beyond 5 minutes would be discarded anyways. Personally I think
> for
> > > the
> > > > >> final update scenario, the ideal situation users would want is
> that
> > > "do
> > > > not
> > > > >> process any data that is less than 5 minutes, and of course no
> > update
> > > > >> records to the downstream later than 5 minutes either; but retain
> > the
> > > > >> window to be queryable for 30 days". And by doing that the final
> > > window
> > > > >> snapshot would also be aligned with the update stream as well. In
> > > other
> > > > >> words, among these three periods:
> > > > >>
> > > > >> 1) the retention length of the window / table.
> > > > >> 2) the late records acceptance for updating the window.
> > > > >> 3) the late records update to be sent downstream.
> > > > >>
> > > > >> Final update use cases would naturally want 2) = 3), while 1) may
> be
> > > > >> different and larger, while what we provide now is that 1) = 2),
> > which
> > > > >> could be different and in practice larger than 3), hence not the
> > most
> > > > >> intuitive for their needs.
> > > > >>
> > > > >>
> > > > >>
> > > > >> ------------------------------------------------------------
> > > > -----------------
> > > > >>
> > > > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > > > >>
> > > > >> I'd like option 2) over option 1) better as well from programming
> > pov.
> > > > But
> > > > >> I'm wondering if option 2) would provide the above semantics or it
> > is
> > > > still
> > > > >> coupling 1) with 2) as well ?
> > > > >>
> > > > >>
> > > > >>
> > > > >> Guozhang
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io>
> > > wrote:
> > > > >>
> > > > > >>> In fact, to push the idea further (which IIRC is what Matthias
> > > > > >>> originally proposed), if we can accept "Suppression#finalResultsOnly"
> > > > > >>> in my last email, then we could also consider whether to eliminate
> > > > > >>> "suppressLateEvents" entirely.
> > > > > >>>
> > > > > >>> We could always add it later, but you've both expressed doubt that
> > > > > >>> there are practical use cases for it outside of final-results.
> > > > >>>
> > > > >>> -John
> > > > >>>
> > > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io>
> > > > wrote:
> > > > >>>
> > > > >>>> Hi again, Guozhang ;) Here's the second part of my response...
> > > > >>>>
> > > > > >>>> It seems like your main concern is: "if I'm a user who wants final
> > > > > >>>> update semantics, how complicated is it for me to get it?"
> > > > > >>>>
> > > > > >>>> I think we have to assume that people don't always have time to
> > > > > >>>> become deeply familiar with all the nuances of a programming
> > > > > >>>> environment before they use it. Especially if they're evaluating
> > > > > >>>> several frameworks for their use case, it's very valuable to make it
> > > > > >>>> as obvious as possible how to accomplish various computations with
> > > > > >>>> Streams.
> > > > > >>>>
> > > > > >>>> To me the biggest question is whether with a fresh perspective,
> > > > > >>>> people would say "oh, I get it, I have to bound my lateness and
> > > > > >>>> suppress intermediate updates, and of course I'll get only the final
> > > > > >>>> result!", or if it's more like "wtf? all I want is the final result,
> > > > > >>>> what are all these parameters?".
> > > > > >>>>
> > > > > >>>> I was talking with Matthias a while back, and he had an idea that I
> > > > > >>>> think can help, which is to essentially set up a final-result recipe
> > > > > >>>> in addition to the raw parameters. I previously thought that it
> > > > > >>>> wouldn't be possible to restrict its usage to Windowed KTables, but
> > > > > >>>> thinking about it again this weekend, I have a couple of ideas:
> > > > >>>>
> > > > >>>> ================
> > > > >>>> = 1. Static Wrapper =
> > > > >>>> ================
> > > > > >>>> We can define an extra static function that "wraps" a KTable with
> > > > > >>>> final-result semantics.
> > > > > >>>>
> > > > > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> > > > >>>>   final KTable<K, V> windowedKTable,
> > > > >>>>   final Duration maxAllowedLateness,
> > > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > > > >>>>     return windowedKTable.suppress(
> > > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > > > >>>>                    .suppressIntermediateEvents(
> > > > >>>>                      IntermediateSuppression
> > > > >>>>                        .emitAfter(maxAllowedLateness)
> > > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> > > > >>>>                    )
> > > > >>>>     );
> > > > >>>> }
> > > > >>>>
> > > > > >>>> Because windowedKTable is a parameter, the static function can easily
> > > > > >>>> impose an extra bound on the key type, that it extends Windowed. This
> > > > > >>>> would make "final results only" only available on windowed ktables.
> > > > >>>>
> > > > >>>> Here's how it would look to use:
> > > > >>>>
> > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > >>>>   finalResultsOnly(
> > > > >>>>     windowCounts,
> > > > >>>>     Duration.ofMinutes(10),
> > > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > > > >>>>   );
> > > > >>>>
> > > > >>>> Trying to use it on a non-windowed KTable yields:
> > > > >>>>
> > > > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > > > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
> > > > > >>>>> applied to given types;
> > > > > >>>>>   required:
> > > > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   found:
> > > > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > >>>>>   reason: inference variable K has incompatible bounds
> > > > >>>>>     equality constraints: java.lang.String
> > > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> =================================================
> > > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > > > >>>> =================================================
> > > > >>>>
> > > > > >>>> By adding K,V parameters to Suppression, we can provide a similarly
> > > > > >>>> bounded config method directly on the Suppression class:
> > > > >>>>
> > > > >>>> public static <K extends Windowed, V> Suppression<K, V>
> > > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> > > > >>>> BufferFullStrategy bufferFullStrategy) {
> > > > >>>>     return Suppression
> > > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > > > >>>>             .emitAfter(maxAllowedLateness)
> > > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > > > >>>>         );
> > > > >>>> }
> > > > >>>>
> > > > >>>> Then, here's how it would look to use it:
> > > > >>>>
> > > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > > >>>>   windowCounts.suppress(
> > > > >>>>     Suppression.finalResultsOnly(
> > > > > >>>>       Duration.ofMinutes(10),
> > > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > > > >>>>     )
> > > > >>>>   );
> > > > >>>>
> > > > >>>> Trying to use it on a non-windowed ktable yields:
> > > > >>>>
> > > > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > > > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> > > > > >>>>> given types;
> > > > > >>>>>   required:
> > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   found:
> > > > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > > > > >>>>>   reason: explicit type argument java.lang.String does not conform to
> > > > > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> ============
> > > > >>>> = Downsides =
> > > > >>>> ============
> > > > >>>>
> > > > >>>> Of course, there's a downside either way:
> > > > >>>> * for 1:  this "wrapper" interaction would be the first in the
> > DSL.
> > > Is
> > > > >> it
> > > > >>>> too strange, and how discoverable would it be?
> > > > >>>> * for 2: adding those type parameters to Suppression will force
> > all
> > > > >>>> callers to provide them in the event of a chained construction
> > > because
> > > > >>> Java
> > > > >>>> doesn't do RHS recursive type inference. This is already visible
> > in
> > > > >> other
> > > > >>>> parts of the Streams DSL. For example, often calls to
> Materialized
> > > > >>> builders
> > > > >>>> have to provide seemingly obvious type bounds.
> > > > >>>>
> > > > >>>> ============
> > > > >>>> = Conclusion =
> > > > >>>> ============
> > > > >>>>
> > > > >>>> I think option 2 is more "normal" and discoverable. It does
> have a
> > > > >>>> downside, but it's one that's pre-existing elsewhere in the DSL.
> > > > >>>>
> > > > >>>> WDYT? Would the addition of this "recipe" method to Suppression
> > > > resolve
> > > > >>>> your concern?
> > > > >>>>
> > > > >>>> Thanks again,
> > > > >>>> -John
> > > > >>>>
> > > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <
> wangguoz@gmail.com
> > >
> > > > >>> wrote:
> > > > >>>>
> > > > >>>>> Hi John,
> > > > >>>>>
> > > > >>>>> Regarding the metrics: yeah I think I'm with you that the
> dropped
> > > > >>> records
> > > > >>>>> due to window retention or emit suppression policies should be
> > > > >> recorded
> > > > >>>>> differently, and using this KIP's proposed metric would be
> fine.
> > If
> > > > >> you
> > > > >>>>> also think we can use this KIP's proposed metrics to cover the
> > > window
> > > > >>>>> retention caused skipping records, then we can include the
> changes
> > > in
> > > > >>> this
> > > > >>>>> KIP as well.
> > > > >>>>>
> > > > >>>>> Regarding the current proposal, I'm actually not too worried
> > about
> > > > the
> > > > >>>>> inconsistency between query semantics and downstream emit
> > > semantics.
> > > > >> For
> > > > >>>>> queries, we will always return the current running results of
> the
> > > > >>> windows,
> > > > >>>>> be they partial or final results depending on the window
> > retention
> > > > >> time
> > > > >>>>> anyways, which has nothing to do with whether the emitted stream
> > should
> > > be
> > > > >>> one
> > > > >>>>> final output per key or not. I also agree that having a unified
> > > > >>> operation
> > > > >>>>> is generally better for users to focus on leveraging that one
> > only
> > > > >> than
> > > > >>>>> learning about two sets of operations. The only question I had
> is,
> > > for
> > > > >>>>> final
> > > > >>>>> updates of window stores, if it is a bit awkward to understand
> > the
> > > > >>>>> configuration combo. Thinking about this more, I think my root
> > > worry
> > > > >> is in
> > > > >>>>> the
> > > > >>>>> "suppressLateEvents" call for windowed tables, since from a
> user
> > > > >>>>> perspective: if my retention time is X which means "pay the
> cost
> > to
> > > > >>> allow
> > > > >>>>> late records up to X to still be applied updating the tables",
> > why
> > > > >>> would I
> > > > >>>>> ever want to suppressLateEvents by Y ( < X), to say "do not
> send
> > > the
> > > > >>>>> updates up to Y, which means the downstream operator or sink
> > topic
> > > > for
> > > > >>>>> this
> > > > >>>>> stream would actually see a truncated update stream while I've
> > paid
> > > > >>> larger
> > > > >>>>> cost for that"; and of course, Y > X would not make sense
> either
> > as
> > > > >> you
> > > > >>>>> would not see any updates later than X anyways. So in all, my
> > > feeling
> > > > >> is
> > > > >>>>> that it makes less sense for windowed table's
> > "suppressLateEvents"
> > > > >> with
> > > > >>> a
> > > > >>>>> parameter that is not equal to the window retention, and
> opening
> > > the
> > > > >>> door
> > > > >>>>> in the current proposal may confuse people with that.
> > > > >>>>>
> > > > >>>>> Again, above is just a subjective opinion and probably we can
> > also
> > > > >> bring
> > > > >>>>> up
> > > > >>>>> some scenarios where users do want to set X != Y... but
> > personally
> > > I
> > > > >>> feel
> > > > >>>>> that even if the semantics for this scenario is intuitive for
> > user
> > > to
> > > > >>>>> understand, does that really make sense and should we really
> open
> > > the
> > > > >>> door
> > > > >>>>> for it. So I think maybe separating the final update in a
> > separate
> > > > >> API's
> > > > >>>>> benefits may overwhelm the advantage of having one uniform
> > > > definition.
> > > > >>> And
> > > > >>>>> for my alternative proposal, the rationale was from both my
> > concern
> > > > >>> about
> > > > >>>>> "suppressLateEvents" for windowed store, and Matthias' question
> > > about
> > > > >>>>> "suppressLateEvents" for non-windowed stores, that if it is
> less
> > > > >>>>> meaningful
> > > > >>>>> for both, we can consider removing it completely and only do
> > > > >>>>> "IntermediateSuppression" in Suppress instead.
> > > > >>>>>
> > > > >>>>> So I'd summarize my thoughts in the following questions:
> > > > >>>>>
> > > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
> > > retention
> > > > >>> time)
> > > > >>>>> for windowed stores make sense in practice?
> > > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
> > non-windowed
> > > > >>> stores
> > > > >>>>> make sense in practice?
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> Guozhang
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <
> bbejeck@gmail.com>
> > > > >> wrote:
> > > > >>>>>
> > > > >>>>>> Thanks for the explanation, that does make sense.  I have some
> > > > >>>>> questions on
> > > > >>>>>> operations, but I'll just wait for the PR and tests.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>> Bill
> > > > >>>>>>
> > > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <
> john@confluent.io
> > >
> > > > >>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi Bill,
> > > > >>>>>>>
> > > > >>>>>>> Thanks for the review!
> > > > >>>>>>>
> > > > >>>>>>> Your question is very much applicable to the KIP and not at
> all
> > > an
> > > > >>>>>>> implementation detail. Thanks for bringing it up.
> > > > >>>>>>>
> > > > >>>>>>> I'm proposing not to change the existing caches and
> > > configurations
> > > > >>> at
> > > > >>>>> all
> > > > >>>>>>> (for now).
> > > > >>>>>>>
> > > > >>>>>>> Imagine you have a topology like this:
> > > > >>>>>>> commit.interval.ms = 100
> > > > >>>>>>>
> > > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > > > >>>>>>>
> > > > >>>>>>> The first ktable (ktable1) will respect the commit interval
> and
> > > > >>> buffer
> > > > >>>>>>> events for 100ms before logging, storing, or forwarding them
> > > > >> (IIRC).
> > > > >>>>>>> Therefore, the second ktable (suppress) will only see the
> > events
> > > > >> at
> > > > >>> a
> > > > >>>>>> rate
> > > > >>>>>>> of once per 100ms. It will apply its own buffering, and emit
> > once
> > > > >>> per
> > > > >>>>>> 200ms.
> > > > >>>>>>> This case is pretty trivial because the suppress time is a
> > > > >> multiple
> > > > >>> of
> > > > >>>>>> the
> > > > >>>>>>> commit interval.
> > > > >>>>>>>
> > > > >>>>>>> When it's not an integer multiple, you'll get behavior like
> in
> > > > >> this
> > > > >>>>>> marble
> > > > >>>>>>> diagram:
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > > > >>>>>>>
> > > > >>>>>>> [ KTable caching with commit interval = 2 ]
> > > > >>>>>>>
> > > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > > > >>>>>>>
> > > > >>>>>>>       [ suppress with emitAfter = 3 ]
> > > > >>>>>>>
> > > > >>>>>>> <---------------(k:2)----------------(k:6)->
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> If this behavior isn't desired (for example, if you wanted to
> > > emit
> > > > >>>>> (k:3)
> > > > >>>>>> at
> > > > >>>>>>> time 3), I'd recommend setting the "cache.max.bytes.buffering"
> > to
> > > 0
> > > > >>> or
> > > > >>>>>>> modifying the topology to disable caching. Then, the behavior
> > is
> > > > >>> more
> > > > >>>>>>> simply determined just by the suppress operator.
> > > > >>>>>>>
> > > > >>>>>>> Does that seem right to you?
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Regarding the changelogs, because the suppression operator
> > hangs
> > > > >>> onto
> > > > >>>>>>> events for a while, it will need its own changelog. The
> > changelog
> > > > >>>>>>> should represent the current state of the buffer at all
> times.
> > So
> > > > >>> when
> > > > >>>>>> the
> > > > >>>>>>> suppress operator sees (k:2), for example, it will log (k:2).
> > > When
> > > > >>> it
> > > > >>>>>>> later gets to time 3, it's time to emit (k:2) downstream.
> > Because
> > > > >> k
> > > > >>>>> is no
> > > > >>>>>>> longer buffered, the suppress operator will log (k:null).
> Thus,
> > > > >> when
> > > > >>>>>>> recovering,
> > > > >>>>>>> it can rebuild the buffer by reading its changelog.
> > > > >>>>>>>
> > > > >>>>>>> What do you think about this?
> > > > >>>>>>>
> > > > >>>>>>> Thanks,
> > > > >>>>>>> -John
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <
> bbejeck@gmail.com
> > >
> > > > >>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi John,  thanks for the KIP.
> > > > >>>>>>>>
> > > > >>>>>>>> Early on in the KIP, you mention the current approaches for
> > > > >>>>> controlling
> > > > >>>>>>> the
> > > > >>>>>>>> rate of downstream records from a KTable, cache size
> > > > >> configuration
> > > > >>>>> and
> > > > >>>>>>>> commit time.
> > > > >>>>>>>>
> > > > >>>>>>>> Will these configuration parameters still be in effect for
> > > > >> tables
> > > > >>>>> that
> > > > >>>>>>>> don't use suppression?  For tables taking advantage of
> > > > >>> suppression,
> > > > >>>>>> will
> > > > >>>>>>>> these configurations have no impact?
> > > > >>>>>>>> This last question may be too implementation-specific, but if
> > the
> > > > >>>>>> requested
> > > > >>>>>>>> suppression time is longer than the specified commit time,
> > will
> > > > >>> the
> > > > >>>>>>> latest
> > > > >>>>>>>> record in the suppression buffer get stored in a changelog?
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>> Bill
> > > > >>>>>>>>
> > > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> > john@confluent.io
> > > > >>>
> > > > >>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Thanks for the feedback, Matthias,
> > > > >>>>>>>>>
> > > > >>>>>>>>> It seems like in straightforward relational processing
> cases,
> > > > >> it
> > > > >>>>>> would
> > > > >>>>>>>> not
> > > > >>>>>>>>> make sense to bound the lateness of KTables. In general, it
> > > > >>> seems
> > > > >>>>>>> better
> > > > >>>>>>>> to
> > > > >>>>>>>>> have "guard rails" in place that make it easier to write
> > > > >>> sensible
> > > > >>>>>>>> programs
> > > > >>>>>>>>> than insensible ones.
> > > > >>>>>>>>>
> > > > >>>>>>>>> But I'm still going to argue in favor of keeping it for all
> > > > >>>>> KTables
> > > > >>>>>> ;)
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. I believe it is simpler to understand the operator if it
> > > > >> has
> > > > >>>>> one
> > > > >>>>>>>> uniform
> > > > >>>>>>>>> definition, regardless of context. It's well defined and
> > > > >>> intuitive
> > > > >>>>>> what
> > > > >>>>>>>>> will happen when you use late-event suppression on a
> KTable,
> > > > >> so
> > > > >>> I
> > > > >>>>>> think
> > > > >>>>>>>>> nothing surprising or dangerous will happen in that case.
> > From
> > > > >>> my
> > > > >>>>>>>>> perspective, having two sets of allowed operations is
> > actually
> > > > >>> an
> > > > >>>>>>>> increase
> > > > >>>>>>>>> in cognitive complexity.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
> > > > >>>>> example,
> > > > >>>>>> in
> > > > >>>>>>>> lieu
> > > > >>>>>>>>> of full-featured timestamp semantics, I can implement MVCC
> > > > >>>>> behavior
> > > > >>>>>>> when
> > > > >>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I
> > > > >>>>> suspect
> > > > >>>>>>> that
> > > > >>>>>>>>> there are other, non-obvious applications of suppressing
> late
> > > > >>>>> events
> > > > >>>>>> on
> > > > >>>>>>>>> KTables.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 3. Not to get too much into implementation details in a KIP
> > > > >>>>>> discussion,
> > > > >>>>>>>> but
> > > > >>>>>>>>> if we did want to make late-event suppression available
> only
> > > > >> on
> > > > >>>>>>> windowed
> > > > >>>>>>>>> KTables, we have two enforcement options:
> > > > >>>>>>>>>   a. check when we build the topology - this would be
> simple
> > > > >> to
> > > > >>>>>>>> implement,
> > > > >>>>>>>>> but would be a runtime check. Hopefully, people write tests
> > > > >> for
> > > > >>>>> their
> > > > >>>>>>>>> topology before deploying them, so the feedback loop isn't
> > > > >>>>>>> instantaneous,
> > > > >>>>>>>>> but it's not too long either.
> > > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a
> compile
> > > > >>> time
> > > > >>>>>>> check,
> > > > >>>>>>>>> but would also be substantial increase of both interface
> and
> > > > >>> code
> > > > >>>>>>>>> complexity.
> > > > >>>>>>>>>
> > > > >>>>>>>>> We should definitely strive to have guard rails protecting
> > > > >>> against
> > > > >>>>>>>>> surprising or dangerous behavior. Protecting against
> programs
> > > > >>>>> that we
> > > > >>>>>>>> don't
> > > > >>>>>>>>> currently predict is a lesser benefit, and I think we can
> put
> > > > >> up
> > > > >>>>>> guard
> > > > >>>>>>>>> rails on a case-by-case basis for that. It seems like the
> > > > >>>>> increase in
> > > > >>>>>>>>> cognitive (and potentially code and interface) complexity
> > > > >> makes
> > > > >>> me
> > > > >>>>>>> think
> > > > >>>>>>>> we
> > > > >>>>>>>>> should skip this case.
> > > > >>>>>>>>>
> > > > >>>>>>>>> What do you think?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks,
> > > > >>>>>>>>> -John
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > > >>>>>>> matthias@confluent.io>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Thanks for the KIP John.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> One initial comment about the last example "Bounded
> > > > >>> lateness":
> > > > >>>>>> For a
> > > > >>>>>>>>>> non-windowed KTable bounding the lateness does not really
> > > > >> make
> > > > >>>>>> sense,
> > > > >>>>>>>>>> does it?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thus, I am wondering if we should allow
> > > > >> `suppressLateEvents()`
> > > > >>>>> for
> > > > >>>>>>> this
> > > > >>>>>>>>>> case? It seems to be better to only allow it for
> > > > >>>>> windowed-KTables.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> -Matthias
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> What you gave as new example is semantically the same as
> > > > >>> what
> > > > >>>>> I
> > > > >>>>>>>>>> suggested.
> > > > >>>>>>>>>>> So it is good by me.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > > > >>>>> john@confluent.io
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks for taking a look, Ted,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I agree this is a departure from the conventions of
> > > > >> Streams
> > > > >>>>> DSL.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Most of our config objects have one or two "required"
> > > > >>>>>> parameters,
> > > > >>>>>>>>> which
> > > > >>>>>>>>>> fit
> > > > >>>>>>>>>>>> naturally with the static factory method approach.
> > > > >>>>> TimeWindow,
> > > > >>>>>> for
> > > > >>>>>>>>>> example,
> > > > >>>>>>>>>>>> requires a size parameter, so we can naturally say
> > > > >>>>>>>>> TimeWindows.of(size).
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I think in the case of a suppression, there's really no
> > > > >>>>> "core"
> > > > >>>>>>>>>> parameter,
> > > > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> > > > >>>>> Suppression()". I
> > > > >>>>>>>> think
> > > > >>>>>>>>>> that
> > > > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there
> > > > >>> are
> > > > >>>>>> many
> > > > >>>>>>>>>> durations
> > > > >>>>>>>>>>>> that we can configure.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> However, thinking about it again, I suppose that I can
> > > > >> give
> > > > >>>>> each
> > > > >>>>>>>>>>>> configuration method a static version, which would let
> > > > >> you
> > > > >>>>>> replace
> > > > >>>>>>>>> "new
> > > > >>>>>>>>>>>> Suppression()." with "Suppression." in all the examples.
> > > > >>>>>>> Basically,
> > > > >>>>>>>>>> instead
> > > > >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> For example:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> windowCounts
> > > > >>>>>>>>>>>>     .suppress(
> > > > >>>>>>>>>>>>         Suppression
> > > > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > >>>>>>>>>>>>             )
> > > > >>>>>>>>>>>>     );
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Does that seem better?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>> -John
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > > > >>> yuzhihong@gmail.com
> > > > >>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I started to read this KIP which contains a lot of
> > > > >>>>> material.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> One suggestion:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>     .suppress(
> > > > >>>>>>>>>>>>>         new Suppression()
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Do you think it would be more consistent with the rest
> > > > >> of
> > > > >>>>>> Streams
> > > > >>>>>>>>> data
> > > > >>>>>>>>>>>>> structures by supporting `of` ?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Cheers
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > > > >>>>>> john@confluent.io
> > > > >>>>>>>>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Hello devs and users,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Please take some time to consider this proposal for
> > > > >> Kafka
> > > > >>>>>>> Streams:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> The basic idea is to provide:
> > > > >>>>>>>>>>>>>> * more usable control over update rate (vs the current
> > > > >>>>> state
> > > > >>>>>>> store
> > > > >>>>>>>>>>>>> caches)
> > > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > > > >>> which
> > > > >>>>>>> several
> > > > >>>>>>>>>>>> people
> > > > >>>>>>>>>>>>>> have requested
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> I look forward to your feedback!
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>> -John
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> -- Guozhang
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> -- Guozhang
> > > > >>
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

John,

Thanks for your replies. As for the two options of the API, I think I'm
slightly inclined to the first option as well. My motivation is a bit
different, as I think the first one may be more flexible, for example:

KTable<Windowed<..>> table = ... count();

table.toStream().peek(..);   // want to peek at the changelog stream, do
not care about final results.

table.suppress().toStream().to("topic");    // sending to a topic, want to
only send the final results.
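
A minimal sketch of the two branches above, in plain Java with no Kafka
dependency. The names FinalResultsSketch, rawUpdates and finalResults are
hypothetical; they stand in for the changelog stream and for whatever
final-results suppression the KIP ends up defining, and the sample counts are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Kafka Streams API.
class FinalResultsSketch {

    // Changelog of a windowed count in arrival order: {windowStart, count}.
    static List<long[]> rawUpdates() {
        List<long[]> updates = new ArrayList<>();
        updates.add(new long[] {0L, 1L});
        updates.add(new long[] {0L, 2L});
        updates.add(new long[] {60_000L, 1L});
        updates.add(new long[] {0L, 3L});
        return updates;
    }

    // What a final-results-only consumer would see: one value per window,
    // namely the last update (assuming every window here has closed).
    static Map<Long, Long> finalResults(List<long[]> updates) {
        Map<Long, Long> latest = new LinkedHashMap<>();
        for (long[] update : updates) {
            latest.put(update[0], update[1]);
        }
        return latest;
    }

    public static void main(String[] args) {
        // Like table.toStream().peek(..): every intermediate update is visible.
        for (long[] update : rawUpdates()) {
            System.out.println("peek   window=" + update[0] + " count=" + update[1]);
        }
        // Like table.suppress().toStream().to("topic"): one record per window.
        finalResults(rawUpdates()).forEach((window, count) ->
            System.out.println("final  window=" + window + " count=" + count));
    }
}
```

The point of the sketch is only that both views can hang off the same table:
the unsuppressed branch sees all four updates, the suppressed branch sees two.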

--------------

Besides that, I have a few more minor questions:

1. For "allowedLateness", what should be the default value? I.e. if users do
not specify "allowedLateness" in TimeWindows, what value should we set?

2. For API names, some personal suggestions here:

2.a) "allowedLateness"  -> "until" (semantics changed, and also value is
defined as delta on top of window length), where "until" ->
"retentionPeriod", and the latter will be moved from `Windows` to
`WindowStoreBuilder` in the future.

2.b) "BufferConfig" -> "Buffered" ?
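
To make 2.a concrete, here is a small arithmetic sketch, assuming the lateness
value is a delta added on top of the window length as described above. The
class WindowTimingSketch and its methods are hypothetical names, and the rule
that a window stops accepting updates once stream time reaches its close time
is an assumption for illustration, not something the KIP has settled.

```java
// Hypothetical illustration only; not the Kafka Streams API.
class WindowTimingSketch {

    // Close time = window start + window size + allowed lateness (all in ms).
    static long closeTime(long windowStart, long sizeMs, long allowedLatenessMs) {
        return windowStart + sizeMs + allowedLatenessMs;
    }

    // Assumed rule: a window accepts updates until stream time reaches close time.
    static boolean acceptsUpdates(long streamTime, long windowStart,
                                  long sizeMs, long allowedLatenessMs) {
        return streamTime < closeTime(windowStart, sizeMs, allowedLatenessMs);
    }

    public static void main(String[] args) {
        long size = 60_000L;          // 1-minute window
        long lateness = 5 * 60_000L;  // 5 minutes of allowed lateness
        System.out.println(closeTime(0L, size, lateness));
        System.out.println(acceptsUpdates(100_000L, 0L, size, lateness));
        System.out.println(acceptsUpdates(360_000L, 0L, size, lateness));
    }
}
```

Under this reading, the retention period would be a separate setting that only
governs how long a closed window stays queryable.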



Guozhang


On Mon, Jul 9, 2018 at 2:09 PM, John Roesler <jo...@confluent.io> wrote:

> Hey Matthias and Guozhang,
>
> Sorry for the slow reply. I was mulling about your feedback and weighing
> some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337.
>
> Your thought about keeping suppression independent of business logic is a
> very good one. I agree that it would make more sense to add some kind of
> "window close" concept to the window definition.
>
> In fact, doing that immediately solves the inconsistency problem Guozhang
> brought up. There's no need to add a "final results" or "emission" option
> to the windowed aggregation.
>
> What do you think about an API more like this:
>
> final StreamsBuilder builder = new StreamsBuilder();
>
> builder
>   .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
>   .groupBy(
>     (String k1, String v1) -> k1,
>     Serialized.with(STRING_SERDE, STRING_SERDE)
>   )
>   .windowedBy(TimeWindows
>     .of(scaledTime(2L))
>     .until(scaledTime(3L))
>     .allowedLateness(scaledTime(1L))
>   )
>   .count(Materialized.as("counts"))
>   .suppress(
>     emitFinalResultsOnly(
>       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>     )
>   )
>   .toStream()
>   .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));
>
> Note that:
>  * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
> by the type system at compile time), and it determines the time to wait by
> looking at "allowedLateness" on the TimeWindows config.
>  * querying "counts" will produce results (eventually) consistent with
> what's observable in "output-suppressed".
>  * in all cases, "suppress" has no effect on business logic, just on event
> suppression.
>
> Is this API straightforward? Or do you still prefer the version you both
> proposed:
>
>   ...
>   .windowedBy(TimeWindows
>     .of(scaledTime(2L))
>     .until(scaledTime(3L))
>     .allowedLateness(scaledTime(1L))
>   )
>   .count(
>     Materialized.as("counts"),
>     emitFinalResultsOnly(
>       BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
>     )
>   )
>   ...
>
> To me, these two are practically identical, and I still vaguely prefer the
> first one.
>
> The prototype has made it clearer to me that users of "final results for
> windows" and users of "suppression for table events" both need to configure
> the suppression buffer.
>
> This buffer configuration consists of:
> 1. how many keys or bytes to keep in memory
> 2. what to do if memory runs out (shut down, start using disk, ...)
>
> So it's not as simple as setting a "final results" flag. We'll either have
> an "Emit" config object on the windowed aggregators that takes the same
> BufferConfig as the "Suppress" config on the suppression operator, or we
> just use the suppression operator for both.
>
> Perhaps it would sweeten the deal a little to point out that we have 2
> overloads already for each windowed aggregator (with and without
> Materialized). Adding "Emitted" or something would mean that we'd add a new
> overload for each one, taking us up to 4 overloads each for "count",
> "aggregate" and "reduce". Using "suppress" means that we don't add any new
> overloads.
>
> Thanks again for helping to hash this out,
> -John
>
> On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > I think I agree with Matthias for having dedicated APIs for windowed
> > operation final output scenario, PLUS separating the window close which
> the
> > "final output" would rely on, from the window retention time itself
> > (admittedly it would make this KIP effort larger, but if we believe we
> need
> > to do this separation anyways we could just do it now).
> >
> > And then we can have the `KTable#suppress()` for intermediate-suppression
> > only, not for late-record-suppression, until we've seen that becomes a
> > common feature request because our current design still allows it to be
> > extended for that purpose.
> >
> >
> > Guozhang
> >
> > On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <ma...@confluent.io>
> > wrote:
> >
> > > Thanks for the discussion. I am just catching up.
> > >
> > > In general, I think we have different use cases and non-windowed and
> > > windowed is quite different. For the non-windowed case, suppress() has
> > > no (useful) close or retention time, no final semantics, and also no
> > > business logic impact.
> > >
> > > On the other hand, for windowed aggregations, close time and final
> > > result do have a meaning. IMHO, `close()` is part of business logic
> > > while retention time is not. Also, suppression of intermediate results
> is
> > > not a business rule and there might be use case for which either "early
> > > intermediate" (before window end time) are suppressed only, or all
> > > intermediates are suppressed (maybe also something in the middle, ie,
> > > just reduce the load of intermediate updates). Thus, window-suppression
> > > is much richer.
> > >
> > > IMHO, a generic `suppress()` operator that can be inserted into the
> data
> > > flow at any point is useful. Maybe we should keep it as generic as
> > > possible. However, it might be difficult to use with regard to
> > > windowing, as the mental effort to use it is high.
> > >
> > > With regard to Guozhang's comment:
> > >
> > > > we will actually
> > > > process data as old as 30 days as well, while most of the late
> updates
> > > > beyond 5 minutes would be discarded anyways.
> > >
> > > If we use `suppress()` as a standalone operator, this is correct and
> > > intended IMHO. To address the issue if the behavior is unwanted, I
> would
> > > suggest to add a "suppress option" directly to
> > > `count()/reduce()/aggregate()` window operator similar to
> > > `Materialized`. This would be an "embedded suppress" and avoid the
> > > issue. It would also address the issue about mental effort for "single
> > > final window result" use case.
> > >
> > > I also think that a shorter close-time than retention time is useful
> for
> > > window aggregation. If we add close() to the window definition and
> > > until() to `Materialized`, we can separate both correctly IMHO.
> > >
> > > About setting `close = min(close,retention)` I am not sure. We might
> > > rather throw an exception than reducing the close time automatically.
> > > Otherwise, I see many user questions about "I set close to X but it does
> > > not get updated for some data that is with delay of X".
> > >
> > > The tricky question might be to design the API in a backward compatible
> > > way though.
> > >
> > >
> > >
> > > -Matthias
> > >
> > > On 7/3/18 5:38 AM, John Roesler wrote:
> > > > Hi Guozhang,
> > > >
> > > > I see. It seems like if we want to decouple 1) and 2), we need to
> alter
> > > the
> > > > definition of the window. Do you think it would close the gap if we
> > > added a
> > > > "window close" time to the window definition?
> > > >
> > > > Such as:
> > > >
> > > > builder.stream("input")
> > > > .groupByKey()
> > > > .windowedBy(
> > > >   TimeWindows
> > > >     .of(60_000)
> > > >     .closeAfter(10 * 60 * 1000)
> > > >     .until(30L * 24 * 60 * 60 * 1000)
> > > > )
> > > > .count()
> > > > .suppress(Suppression.finalResultsOnly());
> > > >
> > > > Possibly called "finalResultsAtWindowClose" or something?
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > > >
> > > >> Hey John,
> > > >>
> > > >> Obviously I'm too lazy on email replying diligence compared with you
> > :)
> > > >> Will try to reply them separately:
> > > >>
> > > >>
> > > >> ------------------------------------------------------------
> > > -----------------
> > > >>
> > > >> To reply to your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > > >>
> > > >> I'm aware of this use case, but again, the concern is that, in this
> > > setting
> > > >> in order to let the window be queryable for 30 days, we will
> actually
> > > >> process data as old as 30 days as well, while most of the late
> updates
> > > >> beyond 5 minutes would be discarded anyways. Personally I think for
> > the
> > > >> final update scenario, the ideal situation users would want is that
> > "do
> > > not
> > > >> process any data that is later than 5 minutes, and of course no
> update
> > > >> records to the downstream later than 5 minutes either; but retain
> the
> > > >> window to be queryable for 30 days". And by doing that the final
> > window
> > > >> snapshot would also be aligned with the update stream as well. In
> > other
> > > >> words, among these three periods:
> > > >>
> > > >> 1) the retention length of the window / table.
> > > >> 2) the late records acceptance for updating the window.
> > > >> 3) the late records update to be sent downstream.
> > > >>
> > > >> Final update use cases would naturally want 2) = 3), while 1) may be
> > > >> different and larger, while what we provide now is that 1) = 2),
> which
> > > >> could be different and in practice larger than 3), hence not the
> most
> > > >> intuitive for their needs.
> > > >>
> > > >>
> > > >>
> > > >> ------------------------------------------------------------
> > > -----------------
> > > >>
> > > >> To reply to your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > > >>
> > > >> I also like option 2) better than option 1) from a programming
> pov.
> > > But
> > > >> I'm wondering if option 2) would provide the above semantics or it
> is
> > > still
> > > >> coupling 1) with 2) as well ?
> > > >>
> > > >>
> > > >>
> > > >> Guozhang
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io>
> > wrote:
> > > >>
> > > >>> In fact, to push the idea further (which IIRC is what Matthias
> > > originally
> > > >>> proposed), if we can accept "Suppression#finalResultsOnly" in my
> last
> > > >>> email, then we could also consider whether to eliminate
> > > >>> "suppressLateEvents" entirely.
> > > >>>
> > > >>> We could always add it later, but you've both expressed doubt that
> > > there
> > > >>> are practical use cases for it outside of final-results.
> > > >>>
> > > >>> -John
> > > >>>
> > > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io>
> > > wrote:
> > > >>>
> > > >>>> Hi again, Guozhang ;) Here's the second part of my response...
> > > >>>>
> > > >>>> It seems like your main concern is: "if I'm a user who wants final
> > > >> update
> > > >>>> semantics, how complicated is it for me to get it?"
> > > >>>>
> > > >>>> I think we have to assume that people don't always have time to
> > become
> > > >>>> deeply familiar with all the nuances of a programming environment
> > > >> before
> > > >>>> they use it. Especially if they're evaluating several frameworks
> for
> > > >>> their
> > > >>>> use case, it's very valuable to make it as obvious as possible how
> > to
> > > >>>> accomplish various computations with Streams.
> > > >>>>
> > > >>>> To me the biggest question is whether with a fresh perspective,
> > people
> > > >>>> would say "oh, I get it, I have to bound my lateness and suppress
> > > >>>> intermediate updates, and of course I'll get only the final
> > result!",
> > > >> or
> > > >>> if
> > > >>>> it's more like "wtf? all I want is the final result, what are all
> > > these
> > > >>>> parameters?".
> > > >>>>
> > > >>>> I was talking with Matthias a while back, and he had an idea that
> I
> > > >> think
> > > >>>> can help, which is to essentially set up a final-result recipe in
> > > >>> addition
> > > >>>> to the raw parameters. I previously thought that it wouldn't be
> > > >> possible
> > > >>> to
> > > >>>> restrict its usage to Windowed KTables, but thinking about it
> again
> > > >> this
> > > >>>> weekend, I have a couple of ideas:
> > > >>>>
> > > >>>> ================
> > > >>>> = 1. Static Wrapper =
> > > >>>> ================
> > > >>>> We can define an extra static function that "wraps" a KTable with
> > > >>>> final-result semantics.
> > > >>>>
> > > >>>> public static <K extends Windowed, V> KTable<K, V>
> finalResultsOnly(
> > > >>>>   final KTable<K, V> windowedKTable,
> > > >>>>   final Duration maxAllowedLateness,
> > > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > > >>>>     return windowedKTable.suppress(
> > > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > > >>>>                    .suppressIntermediateEvents(
> > > >>>>                      IntermediateSuppression
> > > >>>>                        .emitAfter(maxAllowedLateness)
> > > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> > > >>>>                    )
> > > >>>>     );
> > > >>>> }
> > > >>>>
> > > >>>> Because windowedKTable is a parameter, the static function can
> > easily
> > > >>>> impose an extra bound on the key type, that it extends Windowed.
> > This
> > > >>> would
> > > >>>> make "final results only" only available on windowed ktables.
> > > >>>>
> > > >>>> Here's how it would look to use:
> > > >>>>
> > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > >>>>   finalResultsOnly(
> > > >>>>     windowCounts,
> > > >>>>     Duration.ofMinutes(10),
> > > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > > >>>>   );
> > > >>>>
> > > >>>> Trying to use it on a non-windowed KTable yields:
> > > >>>>
> > > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
> > > cannot
> > > >>> be
> > > >>>>> applied to given types;
> > > >>>>>   required:
> > > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
> > > >>> Duration,org.apache.kafka.streams.kstream.Suppression.
> > > BufferFullStrategy
> > > >>>>>   found:
> > > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.
> > > >>> String,java.lang.String>,java.time.Duration,org.apache.
> > > >>> kafka.streams.kstream.Suppression.BufferFullStrategy
> > > >>>>>   reason: inference variable K has incompatible bounds
> > > >>>>>     equality constraints: java.lang.String
> > > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> =================================================
> > > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > > >>>> =================================================
> > > >>>>
> > > >>>> By adding K,V parameters to Suppression, we can provide a
> similarly
> > > >>>> bounded config method directly on the Suppression class:
> > > >>>>
> > > >>>> public static <K extends Windowed, V> Suppression<K, V>
> > > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> > > >>>> BufferFullStrategy bufferFullStrategy) {
> > > >>>>     return Suppression
> > > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > > >>>>             .emitAfter(maxAllowedLateness)
> > > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > > >>>>         );
> > > >>>> }
> > > >>>>
> > > >>>> Then, here's how it would look to use it:
> > > >>>>
> > > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > > >>>>   windowCounts.suppress(
> > > >>>>     Suppression.finalResultsOnly(
> > > >>>>       Duration.ofMinutes(10)
> > > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > > >>>>     )
> > > >>>>   );
> > > >>>>
> > > >>>> Trying to use it on a non-windowed ktable yields:
> > > >>>>
> > > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be
> applied
> > > to
> > > >>>>> given types;
> > > >>>>>   required:
> > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > > >>> Suppression.BufferFullStrategy
> > > >>>>>   found:
> > > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > > >>> Suppression.BufferFullStrategy
> > > >>>>>   reason: explicit type argument java.lang.String does not
> conform
> > to
> > > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> ============
> > > >>>> = Downsides =
> > > >>>> ============
> > > >>>>
> > > >>>> Of course, there's a downside either way:
> > > >>>> * for 1:  this "wrapper" interaction would be the first in the
> DSL.
> > Is
> > > >> it
> > > >>>> too strange, and how discoverable would it be?
> > > >>>> * for 2: adding those type parameters to Suppression will force
> all
> > > >>>> callers to provide them in the event of a chained construction
> > because
> > > >>> Java
> > > >>>> doesn't do RHS recursive type inference. This is already visible
> in
> > > >> other
> > > >>>> parts of the Streams DSL. For example, often calls to Materialized
> > > >>> builders
> > > >>>> have to provide seemingly obvious type bounds.
> > > >>>>
> > > >>>> ============
> > > >>>> = Conclusion =
> > > >>>> ============
> > > >>>>
> > > >>>> I think option 2 is more "normal" and discoverable. It does have a
> > > >>>> downside, but it's one that's pre-existing elsewhere in the DSL.
> > > >>>>
> > > >>>> WDYT? Would the addition of this "recipe" method to Suppression
> > > resolve
> > > >>>> your concern?
> > > >>>>
> > > >>>> Thanks again,
> > > >>>> -John
> > > >>>>
> > > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wangguoz@gmail.com
> >
> > > >>> wrote:
> > > >>>>
> > > >>>>> Hi John,
> > > >>>>>
> > > >>>>> Regarding the metrics: yeah I think I'm with you that the dropped
> > > >>> records
> > > >>>>> due to window retention or emit suppression policies should be
> > > >> recorded
> > > >>>>> differently, and using this KIP's proposed metric would be fine.
> If
> > > >> you
> > > >>>>> also think we can use this KIP's proposed metrics to cover the
> > window
> > > >>>>> retention cased skipping records, then we can include the changes
> > in
> > > >>> this
> > > >>>>> KIP as well.
> > > >>>>>
> > > >>>>> Regarding the current proposal, I'm actually not too worried
> about
> > > the
> > > >>>>> inconsistency between query semantics and downstream emit
> > semantics.
> > > >> For
> > > >>>>> queries, we will always return the current running results of the
> > > >>> windows,
> > > >>>>> being it partial or final results depending on the window
> retention
> > > >> time
> > > >>>>> anyways, which has nothing to do whether the emitted stream
> should
> > be
> > > >>> one
> > > >>>>> final output per key or not. I also agree that having a unified
> > > >>> operation
> > > >>>>> is generally better for users to focus on leveraging that one
> only
> > > >> than
> > > >>>>> learning about two set of operations. The only question I had is,
> > for
> > > >>>>> final
> > > >>>>> updates of window stores, if it is a bit awkward to understand
> the
> > > >>>>> configuration combo. Thinking about this more, I think my root
> > worry
> > > >> in
> > > >>>>> the
> > > >>>>> "suppressLateEvents" call for windowed tables, since from a user
> > > >>>>> perspective: if my retention time is X which means "pay the cost
> to
> > > >>> allow
> > > >>>>> late records up to X to still be applied updating the tables",
> why
> > > >>> would I
> > > >>>>> ever want to suppressLateEvents by Y ( < X), to say "do not send
> > the
> > > >>>>> updates up to Y, which means the downstream operator or sink
> topic
> > > for
> > > >>>>> this
> > > >>>>> stream would actually see a truncated update stream while I've
> paid
> > > >>> larger
> > > >>>>> cost for that"; and of course, Y > X would not make sense either
> as
> > > >> you
> > > >>>>> would not see any updates later than X anyways. So in all, my
> > feeling
> > > >> is
> > > >>>>> that it makes less sense for windowed table's
> "suppressLateEvents"
> > > >> with
> > > >>> a
> > > >>>>> parameter that is not equal to the window retention, and opening
> > the
> > > >>> door
> > > >>>>> in the current proposal may confuse people with that.
> > > >>>>>
> > > >>>>> Again, above is just a subjective opinion and probably we can
> also
> > > >> bring
> > > >>>>> up
> > > >>>>> some scenarios that users does want to set X != Y.. but
> personally
> > I
> > > >>> feel
> > > >>>>> that even if the semantics for this scenario if intuitive for
> user
> > to
> > > >>>>> understand, doe that really make sense and should we really open
> > the
> > > >>> door
> > > >>>>> for it. So I think maybe separating the final update in a
> separate
> > > >> API's
> > > >>>>> benefits may overwhelm the advantage of having one uniform
> > > definition.
> > > >>> And
> > > >>>>> for my alternative proposal, the rationale was from both my
> concern
> > > >>> about
> > > >>>>> "suppressLateEvents" for windowed store, and Matthias' question
> > about
> > > >>>>> "suppressLateEvents" for non-windowed stores, that if it is less
> > > >>>>> meaningful
> > > >>>>> for both, we can consider removing it completely and only do
> > > >>>>> "IntermediateSuppression" in Suppress instead.
> > > >>>>>
> > > >>>>> So I'd summarize my thoughts in the following questions:
> > > >>>>>
> > > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
> > retention
> > > >>> time)
> > > >>>>> for windowed stores make sense in practice?
> > > >>>>> 2. Does "suppressLateEvents" with any parameter Y for
> non-windowed
> > > >>> stores
> > > >>>>> make sense in practice?
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> Guozhang
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com>
> > > >> wrote:
> > > >>>>>
> > > >>>>>> Thanks for the explanation, that does make sense.  I have some
> > > >>>>> questions on
> > > >>>>>> operations, but I'll just wait for the PR and tests.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> Bill
> > > >>>>>>
> > > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <john@confluent.io
> >
> > > >>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi Bill,
> > > >>>>>>>
> > > >>>>>>> Thanks for the review!
> > > >>>>>>>
> > > >>>>>>> Your question is very much applicable to the KIP and not at all
> > an
> > > >>>>>>> implementation detail. Thanks for bringing it up.
> > > >>>>>>>
> > > >>>>>>> I'm proposing not to change the existing caches and
> > configurations
> > > >>> at
> > > >>>>> all
> > > >>>>>>> (for now).
> > > >>>>>>>
> > > >>>>>>> Imagine you have a topology like this:
> > > >>>>>>> commit.interval.ms = 100
> > > >>>>>>>
> > > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > > >>>>>>>
> > > >>>>>>> The first ktable (ktable1) will respect the commit interval and
> > > >>> buffer
> > > >>>>>>> events for 100ms before logging, storing, or forwarding them
> > > >> (IIRC).
> > > >>>>>>> Therefore, the second ktable (suppress) will only see the
> events
> > > >> at
> > > >>> a
> > > >>>>>> rate
> > > >>>>>>> of once per 100ms. It will apply its own buffering, and emit
> once
> > > >>> per
> > > >>>>>> 200ms
> > > >>>>>>> This case is pretty trivial because the suppress time is a
> > > >> multiple
> > > >>> of
> > > >>>>>> the
> > > >>>>>>> commit interval.
> > > >>>>>>>
> > > >>>>>>> When it's not an integer multiple, you'll get behavior like in
> > > >> this
> > > >>>>>> marble
> > > >>>>>>> diagram:
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > > >>>>>>>
> > > >>>>>>> [ KTable caching with commit interval = 2 ]
> > > >>>>>>>
> > > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > > >>>>>>>
> > > >>>>>>>       [ suppress with emitAfter = 3 ]
> > > >>>>>>>
> > > >>>>>>> <---------------(k:2)----------------(k:6)->
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> If this behavior isn't desired (for example, if you wanted to
> > emit
> > > >>>>> (k:3)
> > > >>>>>> at
> > > >>>>>>> time 3, I'd recommend setting the "cache.max.bytes.buffering"
> to
> > 0
> > > >>> or
> > > >>>>>>> modifying the topology to disable caching. Then, the behavior
> is
> > > >>> more
> > > >>>>>>> simply determined just by the suppress operator.
> > > >>>>>>>
> > > >>>>>>> Does that seem right to you?
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Regarding the changelogs, because the suppression operator
> hangs
> > > >>> onto
> > > >>>>>>> events for a while, it will need its own changelog. The
> changelog
> > > >>>>>>> should represent the current state of the buffer at all times.
> So
> > > >>> when
> > > >>>>>> the
> > > >>>>>>> suppress operator sees (k:2), for example, it will log (k:2).
> > When
> > > >>> it
> > > >>>>>>> later gets to time 3, it's time to emit (k:2) downstream.
> Because
> > > >> k
> > > >>>>> is no
> > > >>>>>>> longer buffered, the suppress operator will log (k:null). Thus,
> > > >> when
> > > >>>>>>> recovering,
> > > >>>>>>> it can rebuild the buffer by reading its changelog.
> > > >>>>>>>
> > > >>>>>>> What do you think about this?
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> -John
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bbejeck@gmail.com
> >
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi John,  thanks for the KIP.
> > > >>>>>>>>
> > > >>>>>>>> Early on in the KIP, you mention the current approaches for
> > > >>>>> controlling
> > > >>>>>>> the
> > > >>>>>>>> rate of downstream records from a KTable, cache size
> > > >> configuration
> > > >>>>> and
> > > >>>>>>>> commit time.
> > > >>>>>>>>
> > > >>>>>>>> Will these configuration parameters still be in effect for
> > > >> tables
> > > >>>>> that
> > > >>>>>>>> don't use suppression?  For tables taking advantage of
> > > >>> suppression,
> > > >>>>>> will
> > > >>>>>>>> these configurations have no impact?
> > > >>>>>>>> This last question may be to implementation specific but if
> the
> > > >>>>>> requested
> > > >>>>>>>> suppression time is longer than the specified commit time,
> will
> > > >>> the
> > > >>>>>>> latest
> > > >>>>>>>> record in the suppression buffer get stored in a changelog?
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Bill
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <
> john@confluent.io
> > > >>>
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Thanks for the feedback, Matthias,
> > > >>>>>>>>>
> > > >>>>>>>>> It seems like in straightforward relational processing cases,
> > > >> it
> > > >>>>>> would
> > > >>>>>>>> not
> > > >>>>>>>>> make sense to bound the lateness of KTables. In general, it
> > > >>> seems
> > > >>>>>>> better
> > > >>>>>>>> to
> > > >>>>>>>>> have "guard rails" in place that make it easier to write
> > > >>> sensible
> > > >>>>>>>> programs
> > > >>>>>>>>> than insensible ones.
> > > >>>>>>>>>
> > > >>>>>>>>> But I'm still going to argue in favor of keeping it for all
> > > >>>>> KTables
> > > >>>>>> ;)
> > > >>>>>>>>>
> > > >>>>>>>>> 1. I believe it is simpler to understand the operator if it
> > > >> has
> > > >>>>> one
> > > >>>>>>>> uniform
> > > >>>>>>>>> definition, regardless of context. It's well defined and
> > > >>> intuitive
> > > >>>>>> what
> > > >>>>>>>>> will happen when you use late-event suppression on a KTable,
> > > >> so
> > > >>> I
> > > >>>>>> think
> > > >>>>>>>>> nothing surprising or dangerous will happen in that case.
> From
> > > >>> my
> > > >>>>>>>>> perspective, having two sets of allowed operations is
> actually
> > > >>> an
> > > >>>>>>>> increase
> > > >>>>>>>>> in cognitive complexity.
> > > >>>>>>>>>
> > > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
> > > >>>>> example,
> > > >>>>>> in
> > > >>>>>>>> lieu
> > > >>>>>>>>> of full-featured timestamp semantics, I can implement MVCC
> > > >>>>> behavior
> > > >>>>>>> when
> > > >>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I
> > > >>>>> suspect
> > > >>>>>>> that
> > > >>>>>>>>> there are other, non-obvious applications of suppressing late
> > > >>>>> events
> > > >>>>>> on
> > > >>>>>>>>> KTables.
> > > >>>>>>>>>
> > > >>>>>>>>> 3. Not to get too much into implementation details in a KIP
> > > >>>>>> discussion,
> > > >>>>>>>> but
> > > >>>>>>>>> if we did want to make late-event suppression available only
> > > >> on
> > > >>>>>>> windowed
> > > >>>>>>>>> KTables, we have two enforcement options:
> > > >>>>>>>>>   a. check when we build the topology - this would be simple
> > > >> to
> > > >>>>>>>> implement,
> > > >>>>>>>>> but would be a runtime check. Hopefully, people write tests
> > > >> for
> > > >>>>> their
> > > >>>>>>>>> topology before deploying them, so the feedback loop isn't
> > > >>>>>>> instantaneous,
> > > >>>>>>>>> but it's not too long either.
> > > >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
> > > >>> time
> > > >>>>>>> check,
> > > >>>>>>>>> but would also be substantial increase of both interface and
> > > >>> code
> > > >>>>>>>>> complexity.
> > > >>>>>>>>>
> > > >>>>>>>>> We should definitely strive to have guard rails protecting
> > > >>> against
> > > >>>>>>>>> surprising or dangerous behavior. Protecting against programs
> > > >>>>> that we
> > > >>>>>>>> don't
> > > >>>>>>>>> currently predict is a lesser benefit, and I think we can put
> > > >> up
> > > >>>>>> guard
> > > >>>>>>>>> rails on a case-by-case basis for that. It seems like the
> > > >>>>> increase in
> > > >>>>>>>>> cognitive (and potentially code and interface) complexity
> > > >> makes
> > > >>> me
> > > >>>>>>> think
> > > >>>>>>>> we
> > > >>>>>>>>> should skip this case.
> > > >>>>>>>>>
> > > >>>>>>>>> What do you think?
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> -John
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > >>>>>>> matthias@confluent.io>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Thanks for the KIP John.
> > > >>>>>>>>>>
> > > >>>>>>>>>> One initial comments about the last example "Bounded
> > > >>> lateness":
> > > >>>>>> For a
> > > >>>>>>>>>> non-windowed KTable bounding the lateness does not really
> > > >> make
> > > >>>>>> sense,
> > > >>>>>>>>>> does it?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thus, I am wondering if we should allow
> > > >> `suppressLateEvents()`
> > > >>>>> for
> > > >>>>>>> this
> > > >>>>>>>>>> case? It seems to be better to only allow it for
> > > >>>>> windowed-KTables.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> -Matthias
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> What you gave as new example is semantically the same as
> > > >>> what
> > > >>>>> I
> > > >>>>>>>>>> suggested.
> > > >>>>>>>>>>> So it is good by me.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > > >>>>> john@confluent.io
> > > >>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks for taking look, Ted,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I agree this is a departure from the conventions of
> > > >> Streams
> > > >>>>> DSL.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Most of our config objects have one or two "required"
> > > >>>>>> parameters,
> > > >>>>>>>>> which
> > > >>>>>>>>>> fit
> > > >>>>>>>>>>>> naturally with the static factory method approach.
> > > >>>>> TimeWindow,
> > > >>>>>> for
> > > >>>>>>>>>> example,
> > > >>>>>>>>>>>> requires a size parameter, so we can naturally say
> > > >>>>>>>>> TimeWindows.of(size).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I think in the case of a suppression, there's really no
> > > >>>>> "core"
> > > >>>>>>>>>> parameter,
> > > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> > > >>>>> Suppression()". I
> > > >>>>>>>> think
> > > >>>>>>>>>> that
> > > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there
> > > >>> are
> > > >>>>>> many
> > > >>>>>>>>>> durations
> > > >>>>>>>>>>>> that we can configure.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> However, thinking about it again, I suppose that I can
> > > >> give
> > > >>>>> each
> > > >>>>>>>>>>>> configuration method a static version, which would let
> > > >> you
> > > >>>>>> replace
> > > >>>>>>>>> "new
> > > >>>>>>>>>>>> Suppression()." with "Suppression." in all the examples.
> > > >>>>>>> Basically,
> > > >>>>>>>>>> instead
> > > >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For example:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> windowCounts
> > > >>>>>>>>>>>>     .suppress(
> > > >>>>>>>>>>>>         Suppression
> > > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> > > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > >>>>>>>>>>>>             )
> > > >>>>>>>>>>>>     );
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Does that seem better?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>> -John
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > > >>> yuzhihong@gmail.com
> > > >>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> I started to read this KIP which contains a lot of
> > > >>>>> materials.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> One suggestion:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>     .suppress(
> > > >>>>>>>>>>>>>         new Suppression()
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Do you think it would be more consistent with the rest
> > > >> of
> > > >>>>>> Streams
> > > >>>>>>>>> data
> > > >>>>>>>>>>>>> structures by supporting `of` ?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Cheers
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > > >>>>>> john@confluent.io
> > > >>>>>>>>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hello devs and users,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Please take some time to consider this proposal for
> > > >> Kafka
> > > >>>>>>> Streams:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The basic idea is to provide:
> > > >>>>>>>>>>>>>> * more usable control over update rate (vs the current
> > > >>>>> state
> > > >>>>>>> store
> > > >>>>>>>>>>>>> caches)
> > > >>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > > >>> which
> > > >>>>>>> several
> > > >>>>>>>>>>>> people
> > > >>>>>>>>>>>>>> have requested
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I look forward to your feedback!
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>> -John
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> -- Guozhang
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -- Guozhang
> > > >>
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hey Matthias and Guozhang,

Sorry for the slow reply. I was mulling over your feedback and weighing
some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337.

Your thought about keeping suppression independent of business logic is a
very good one. I agree that it would make more sense to add some kind of
"window close" concept to the window definition.

In fact, doing that immediately solves the inconsistency problem Guozhang
brought up. There's no need to add a "final results" or "emission" option
to the windowed aggregation.
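
To make the distinction concrete, here is a minimal sketch of a window
lifecycle in which "close" (business logic) and "retention" (storage) are
separate settings. Everything in it is hypothetical: the class and method
names are made up for illustration, and it assumes that close is measured
from the window end, that retention is measured from the window start, and
that a close time beyond retention is rejected outright rather than clamped.

```java
import java.time.Duration;

// Hypothetical model of a time window's lifecycle; not a Kafka Streams class.
final class WindowLifecycleSketch {
    private final long sizeMs;
    private final long allowedLatenessMs;
    private final long retentionMs;

    WindowLifecycleSketch(Duration size, Duration allowedLateness, Duration retention) {
        // Assumption: closing after the window is dropped is an error, not silently clamped.
        if (size.plus(allowedLateness).compareTo(retention) > 0) {
            throw new IllegalArgumentException("window would close after it is dropped");
        }
        this.sizeMs = size.toMillis();
        this.allowedLatenessMs = allowedLateness.toMillis();
        this.retentionMs = retention.toMillis();
    }

    // Start of the window containing a (non-negative) record timestamp.
    long windowStart(long recordTimestampMs) {
        return recordTimestampMs - (recordTimestampMs % sizeMs);
    }

    // The window is closed once stream time reaches window end + allowed lateness.
    boolean isClosed(long windowStartMs, long streamTimeMs) {
        return streamTimeMs >= windowStartMs + sizeMs + allowedLatenessMs;
    }

    // Business logic: a record updates its window only while that window is open.
    boolean acceptsUpdate(long recordTimestampMs, long streamTimeMs) {
        return !isClosed(windowStart(recordTimestampMs), streamTimeMs);
    }

    // Storage concern: the window stays queryable until retention expires.
    boolean isQueryable(long windowStartMs, long streamTimeMs) {
        return streamTimeMs < windowStartMs + retentionMs;
    }
}
```

With 1-minute windows, 5 minutes of allowed lateness and 30 days of
retention, a window stops accepting updates 6 minutes after its start but
remains queryable for the full 30 days, so the final result and the queryable
state agree from the moment the window closes.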

What do you think about an API more like this:

final StreamsBuilder builder = new StreamsBuilder();

builder
  .stream("input", Consumed.with(STRING_SERDE, STRING_SERDE))
  .groupBy(
    (String k1, String v1) -> k1,
    Serialized.with(STRING_SERDE, STRING_SERDE)
  )
  .windowedBy(TimeWindows
    .of(scaledTime(2L))
    .until(scaledTime(3L))
    .allowedLateness(scaledTime(1L))
  )
  .count(Materialized.as("counts"))
  .suppress(
    emitFinalResultsOnly(
      BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
    )
  )
  .toStream()
  .to("output-suppressed", Produced.with(STRING_SERDE, LONG_SERDE));

Note that:
 * "emitFinalResultsOnly" is available *only* on windowed tables (enforced
by the type system at compile time), and it determines the time to wait by
looking at "allowedLateness" on the TimeWindows config.
 * querying "counts" will produce results (eventually) consistent with
what's observable in "output-suppressed".
 * in all cases, "suppress" has no effect on business logic, just on event
suppression.
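
For what it's worth, the compile-time restriction in the first bullet can be
sketched with an ordinary bounded type parameter. The types below are
stand-ins I made up for illustration (they are not the real Streams classes,
and the prototype may enforce this differently); the point is only that a
bound on the factory method is enough to keep "final results" off
non-windowed tables.

```java
import java.time.Duration;

// Hypothetical stand-in for a windowed key; not the real Streams class.
final class Windowed<K> {
    final K key;
    final long windowEndMs;

    Windowed(K key, long windowEndMs) {
        this.key = key;
        this.windowEndMs = windowEndMs;
    }
}

// Hypothetical suppression config, parameterized by the table's key type.
final class Suppress<K> {
    final String description;

    private Suppress(String description) {
        this.description = description;
    }

    // Plain rate limiting is available for any key type.
    static <K> Suppress<K> emitAfter(Duration interval) {
        return new Suppress<>("emitAfter(" + interval.toMillis() + "ms)");
    }

    // The bound on K restricts this factory to windowed keys.
    static <K extends Windowed<?>> Suppress<K> emitFinalResultsOnly() {
        return new Suppress<>("finalResultsOnly");
    }
}

// Hypothetical stand-in for a table; records which suppression was applied.
final class Table<K, V> {
    final String suppression;

    Table() {
        this(null);
    }

    private Table(String suppression) {
        this.suppression = suppression;
    }

    Table<K, V> suppress(Suppress<K> config) {
        return new Table<>(config.description);
    }
}
```

Given a Table<Windowed<String>, Long>, calling
suppress(Suppress.emitFinalResultsOnly()) compiles, while the same call on a
Table<String, Long> is rejected by the compiler because String does not
satisfy the Windowed bound. Suppress.emitAfter(...) works on both.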

Is this API straightforward? Or do you still prefer the version that you
both proposed:

  ...
  .windowedBy(TimeWindows
    .of(scaledTime(2L))
    .until(scaledTime(3L))
    .allowedLateness(scaledTime(1L))
  )
  .count(
    Materialized.as("counts"),
    emitFinalResultsOnly(
      BufferConfig.withBufferKeys(10_000L).bufferFullStrategy(SHUT_DOWN)
    )
  )
  ...

To me, these two are practically identical, and I still vaguely prefer the
first one.

The prototype has made it clearer to me that users of "final results for
windows" and users of "suppression for table events" both need to configure
the suppression buffer.

This buffer configuration consists of:
1. how many keys or bytes to keep in memory
2. what to do if memory runs out (shut down, start using disk, ...)
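
As a rough illustration of those two settings, here is a minimal in-memory
sketch of a key-bounded buffer with a pluggable "buffer full" strategy. It is
not the prototype's code: the class name, the EMIT strategy (evict and
forward the oldest key early) and the use of an exception to model SHUT_DOWN
are all assumptions made for the example, and a byte bound or a spill-to-disk
strategy is left out.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical suppression buffer; keeps only the latest value per key.
final class SuppressionBufferSketch<K, V> {
    enum BufferFullStrategy { EMIT, SHUT_DOWN }

    private final long maxKeys;
    private final BufferFullStrategy strategy;
    // Insertion-ordered, so the first entry is the longest-buffered key.
    private final LinkedHashMap<K, V> buffer = new LinkedHashMap<>();
    // Records forwarded downstream, in emission order.
    final List<Map.Entry<K, V>> emitted = new ArrayList<>();

    SuppressionBufferSketch(long maxKeys, BufferFullStrategy strategy) {
        this.maxKeys = maxKeys;
        this.strategy = strategy;
    }

    // Buffer an update; a newer value for a buffered key replaces the older one.
    void put(K key, V value) {
        buffer.put(key, value);
        if (buffer.size() > maxKeys) {
            if (strategy == BufferFullStrategy.SHUT_DOWN) {
                throw new IllegalStateException("suppression buffer is full");
            }
            // EMIT: forward the oldest buffered key early to make room.
            Iterator<Map.Entry<K, V>> it = buffer.entrySet().iterator();
            Map.Entry<K, V> oldest = it.next();
            emitted.add(new AbstractMap.SimpleImmutableEntry<>(oldest));
            it.remove();
        }
    }

    // Forward everything still buffered, e.g. when the emit time is reached.
    void flush() {
        for (Map.Entry<K, V> entry : buffer.entrySet()) {
            emitted.add(new AbstractMap.SimpleImmutableEntry<>(entry));
        }
        buffer.clear();
    }
}
```

Note that with an EMIT-style strategy the "final results only" guarantee can
no longer hold once the buffer overflows, which is presumably why the
examples above pair it with SHUT_DOWN.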

So it's not as simple as setting a "final results" flag. We'll either have
an "Emit" config object on the windowed aggregators that takes the same
BufferConfig as the "Suppress" config on the suppression operator, or we'll
just use the suppression operator for both.

Perhaps it would sweeten the deal a little to point out that we have 2
overloads already for each windowed aggregator (with and without
Materialized). Adding "Emitted" or something would mean that we'd add a new
overload for each one, taking us up to 4 overloads each for "count",
"aggregate" and "reduce". Using "suppress" means that we don't add any new
overloads.

Thanks again for helping to hash this out,
-John

On Fri, Jul 6, 2018 at 6:20 PM Guozhang Wang <wa...@gmail.com> wrote:

> I think I agree with Matthias on having dedicated APIs for the final-output
> scenario of windowed operations, PLUS separating the window close, which the
> "final output" would rely on, from the window retention time itself
> (admittedly it would make this KIP effort larger, but if we believe we need
> to do this separation anyways we could just do it now).
>
> And then we can have `KTable#suppress()` for intermediate suppression
> only, not for late-record suppression, until we've seen that become a
> common feature request, since our current design can still be
> extended for that purpose.
>
>
> Guozhang
>
> On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <ma...@confluent.io>
> wrote:
>
> > Thanks for the discussion. I am just catching up.
> >
> > In general, I think we have different use cases, and the non-windowed and
> > windowed cases are quite different. For the non-windowed case, suppress() has
> > no (useful) close or retention time, no final semantics, and also no
> > business logic impact.
> >
> > On the other hand, for windowed aggregations, close time and final
> > result do have a meaning. IMHO, `close()` is part of business logic
> > while retention time is not. Also, suppression of intermediate results is
> > not a business rule, and there might be use cases in which only "early
> > intermediate" results (before window end time) are suppressed, or all
> > intermediates are suppressed (or maybe something in the middle, i.e.,
> > just reducing the load of intermediate updates). Thus, window-suppression
> > is much richer.
> >
> > IMHO, a generic `suppress()` operator that can be inserted into the data
> > flow at any point is useful. Maybe we should keep it as generic as
> > possible. However, it might be difficult to use with regard to
> > windowing, as the mental effort to use it is high.
> >
> > With regard to Guozhang's comment:
> >
> > > we will actually
> > > process data as old as 30 days as well, while most of the late updates
> > > beyond 5 minutes would be discarded anyways.
> >
> > If we use `suppress()` as a standalone operator, this is correct and
> > intended IMHO. If the behavior is unwanted, I would suggest adding a
> > "suppress option" directly to the
> > `Materialized`. This would be an "embedded suppress" and avoid the
> > issue. It would also address the issue about mental effort for "single
> > final window result" use case.
> >
> > I also think that a shorter close-time than retention time is useful for
> > window aggregation. If we add close() to the window definition and
> > until() to `Materialized`, we can separate both correctly IMHO.
> >
> > About setting `close = min(close,retention)` I am not sure. We might
> > rather throw an exception than reduce the close time automatically.
> > Otherwise, I foresee many user questions like "I set close to X but it
> > does not get updated for some data that arrives with a delay of X".
> >
> > The tricky question might be to design the API in a backward compatible
> > way though.
> >
> >
> >
> > -Matthias
> >
> > On 7/3/18 5:38 AM, John Roesler wrote:
> > > Hi Guozhang,
> > >
> > > I see. It seems like if we want to decouple 1) and 2), we need to alter
> > > the definition of the window. Do you think it would close the gap if we
> > > added a "window close" time to the window definition?
> > >
> > > Such as:
> > >
> > > builder.stream("input")
> > > .groupByKey()
> > > .windowedBy(
> > >   TimeWindows
> > >     .of(60_000)
> > >     .closeAfter(10 * 60 * 1000)
> > >     .until(30L * 24 * 60 * 60 * 1000)
> > > )
> > > .count()
> > > .suppress(Suppression.finalResultsOnly());
> > >
> > > Possibly called "finalResultsAtWindowClose" or something?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> > >
> > >> Hey John,
> > >>
> > >> Obviously I'm too lazy on email replying diligence compared with you
> :)
> > >> Will try to reply them separately:
> > >>
> > >>
> > >> ------------------------------------------------------------
> > -----------------
> > >>
> > >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> > >>
> > >> I'm aware of this use case, but again, the concern is that, in this
> > setting
> > >> in order to let the window be queryable for 30 days, we will actually
> > >> process data as old as 30 days as well, while most of the late updates
> > >> beyond 5 minutes would be discarded anyways. Personally I think for
> the
> > >> final update scenario, the ideal situation users would want is that
> "do
> > not
> > >> process any data that is less than 5 minutes, and of course no update
> > >> records to the downstream later than 5 minutes either; but retain the
> > >> window to be queryable for 30 days". And by doing that the final
> window
> > >> snapshot would also be aligned with the update stream as well. In
> other
> > >> words, among these three periods:
> > >>
> > >> 1) the retention length of the window / table.
> > >> 2) the late records acceptance for updating the window.
> > >> 3) the late records update to be sent downstream.
> > >>
> > >> Final update use cases would naturally want 2) = 3), while 1) may be
> > >> different and larger, while what we provide now is that 1) = 2), which
> > >> could be different and in practice larger than 3), hence not the most
> > >> intuitive for their needs.
> > >>
> > >>
> > >>
> > >> ------------------------------------------------------------
> > -----------------
> > >>
> > >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> > >>
> > >> I'd like option 2) over option 1) better as well from programming pov.
> > But
> > >> I'm wondering if option 2) would provide the above semantics or it is
> > still
> > >> coupling 1) with 2) as well ?
> > >>
> > >>
> > >>
> > >> Guozhang
> > >>
> > >>
> > >>
> > >>
> > >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io>
> wrote:
> > >>
> > >>> In fact, to push the idea further (which IIRC is what Matthias
> > originally
> > >>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
> > >>> email, then we could also consider whether to eliminate
> > >>> "suppressLateEvents" entirely.
> > >>>
> > >>> We could always add it later, but you've both expressed doubt that
> > there
> > >>> are practical use cases for it outside of final-results.
> > >>>
> > >>> -John
> > >>>
> > >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io>
> > wrote:
> > >>>
> > >>>> Hi again, Guozhang ;) Here's the second part of my response...
> > >>>>
> > >>>> It seems like your main concern is: "if I'm a user who wants final
> > >> update
> > >>>> semantics, how complicated is it for me to get it?"
> > >>>>
> > >>>> I think we have to assume that people don't always have time to
> become
> > >>>> deeply familiar with all the nuances of a programming environment
> > >> before
> > >>>> they use it. Especially if they're evaluating several frameworks for
> > >>> their
> > >>>> use case, it's very valuable to make it as obvious as possible how
> to
> > >>>> accomplish various computations with Streams.
> > >>>>
> > >>>> To me the biggest question is whether with a fresh perspective,
> people
> > >>>> would say "oh, I get it, I have to bound my lateness and suppress
> > >>>> intermediate updates, and of course I'll get only the final
> result!",
> > >> or
> > >>> if
> > >>>> it's more like "wtf? all I want is the final result, what are all
> > these
> > >>>> parameters?".
> > >>>>
> > >>>> I was talking with Matthias a while back, and he had an idea that I
> > >> think
> > >>>> can help, which is to essentially set up a final-result recipe in
> > >>> addition
> > >>>> to the raw parameters. I previously thought that it wouldn't be
> > >> possible
> > >>> to
> > >>>> restrict its usage to Windowed KTables, but thinking about it again
> > >> this
> > >>>> weekend, I have a couple of ideas:
> > >>>>
> > >>>> ================
> > >>>> = 1. Static Wrapper =
> > >>>> ================
> > >>>> We can define an extra static function that "wraps" a KTable with
> > >>>> final-result semantics.
> > >>>>
> > >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> > >>>>   final KTable<K, V> windowedKTable,
> > >>>>   final Duration maxAllowedLateness,
> > >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > >>>>     return windowedKTable.suppress(
> > >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> > >>>>                    .suppressIntermediateEvents(
> > >>>>                      IntermediateSuppression
> > >>>>                        .emitAfter(maxAllowedLateness)
> > >>>>                        .bufferFullStrategy(bufferFullStrategy)
> > >>>>                    )
> > >>>>     );
> > >>>> }
> > >>>>
> > >>>> Because windowedKTable is a parameter, the static function can
> easily
> > >>>> impose an extra bound on the key type, that it extends Windowed.
> This
> > >>> would
> > >>>> make "final results only" only available on windowed ktables.
> > >>>>
> > >>>> Here's how it would look to use:
> > >>>>
> > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > >>>>   finalResultsOnly(
> > >>>>     windowCounts,
> > >>>>     Duration.ofMinutes(10),
> > >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> > >>>>   );
> > >>>>
> > >>>> Trying to use it on a non-windowed KTable yields:
> > >>>>
> > >>>>> Error:(129, 35) java: method finalResultsOnly in class
> > >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
> > cannot
> > >>> be
> > >>>>> applied to given types;
> > >>>>>   required:
> > >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
> > >>> Duration,org.apache.kafka.streams.kstream.Suppression.
> > BufferFullStrategy
> > >>>>>   found:
> > >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.
> > >>> String,java.lang.String>,java.time.Duration,org.apache.
> > >>> kafka.streams.kstream.Suppression.BufferFullStrategy
> > >>>>>   reason: inference variable K has incompatible bounds
> > >>>>>     equality constraints: java.lang.String
> > >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > >>>>
> > >>>>
> > >>>>
> > >>>> =================================================
> > >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> > >>>> =================================================
> > >>>>
> > >>>> By adding K,V parameters to Suppression, we can provide a similarly
> > >>>> bounded config method directly on the Suppression class:
> > >>>>
> > >>>> public static <K extends Windowed, V> Suppression<K, V>
> > >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> > >>>> BufferFullStrategy bufferFullStrategy) {
> > >>>>     return Suppression
> > >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> > >>>>         .suppressIntermediateEvents(IntermediateSuppression
> > >>>>             .emitAfter(maxAllowedLateness)
> > >>>>             .bufferFullStrategy(bufferFullStrategy)
> > >>>>         );
> > >>>> }
> > >>>>
> > >>>> Then, here's how it would look to use it:
> > >>>>
> > >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> > >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> > >>>>   windowCounts.suppress(
> > >>>>     Suppression.finalResultsOnly(
> > >>>>       Duration.ofMinutes(10),
> > >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> > >>>>     )
> > >>>>   );
> > >>>>
> > >>>> Trying to use it on a non-windowed ktable yields:
> > >>>>
> > >>>>> Error:(127, 35) java: method finalResultsOnly in class
> > >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied
> > to
> > >>>>> given types;
> > >>>>>   required:
> > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > >>> Suppression.BufferFullStrategy
> > >>>>>   found:
> > >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> > >>> Suppression.BufferFullStrategy
> > >>>>>   reason: explicit type argument java.lang.String does not conform
> to
> > >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > >>>>
> > >>>>
> > >>>>
> > >>>> ============
> > >>>> = Downsides =
> > >>>> ============
> > >>>>
> > >>>> Of course, there's a downside either way:
> > >>>> * for 1:  this "wrapper" interaction would be the first in the DSL.
> Is
> > >> it
> > >>>> too strange, and how discoverable would it be?
> > >>>> * for 2: adding those type parameters to Suppression will force all
> > >>>> callers to provide them in the event of a chained construction
> because
> > >>> Java
> > >>>> doesn't do RHS recursive type inference. This is already visible in
> > >> other
> > >>>> parts of the Streams DSL. For example, often calls to Materialized
> > >>> builders
> > >>>> have to provide seemingly obvious type bounds.
> > >>>>
> > >>>> ============
> > >>>> = Conclusion =
> > >>>> ============
> > >>>>
> > >>>> I think option 2 is more "normal" and discoverable. It does have a
> > >>>> downside, but it's one that's pre-existing elsewhere in the DSL.
> > >>>>
> > >>>> WDYT? Would the addition of this "recipe" method to Suppression
> > resolve
> > >>>> your concern?
> > >>>>
> > >>>> Thanks again,
> > >>>> -John
> > >>>>
> > >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com>
> > >>> wrote:
> > >>>>
> > >>>>> Hi John,
> > >>>>>
> > >>>>> Regarding the metrics: yeah I think I'm with you that the dropped
> > >>> records
> > >>>>> due to window retention or emit suppression policies should be
> > >> recorded
> > >>>>> differently, and using this KIP's proposed metric would be fine. If
> > >> you
> > >>>>> also think we can use this KIP's proposed metrics to cover the
> window
> > >>>>> retention caused skipping records, then we can include the changes
> in
> > >>> this
> > >>>>> KIP as well.
> > >>>>>
> > >>>>> Regarding the current proposal, I'm actually not too worried about
> > the
> > >>>>> inconsistency between query semantics and downstream emit
> semantics.
> > >> For
> > >>>>> queries, we will always return the current running results of the
> > >>> windows,
> > >>>>> be it partial or final results depending on the window retention
> > >> time
> > >>>>> anyways, which has nothing to do whether the emitted stream should
> be
> > >>> one
> > >>>>> final output per key or not. I also agree that having a unified
> > >>> operation
> > >>>>> is generally better for users to focus on leveraging that one only
> > >> than
> > >>>>> learning about two sets of operations. The only question I had is,
> for
> > >>>>> final
> > >>>>> updates of window stores, if it is a bit awkward to understand the
> > >>>>> configuration combo. Thinking about this more, I think my root
> worry
> > >> in
> > >>>>> the
> > >>>>> "suppressLateEvents" call for windowed tables, since from a user
> > >>>>> perspective: if my retention time is X which means "pay the cost to
> > >>> allow
> > >>>>> late records up to X to still be applied updating the tables", why
> > >>> would I
> > >>>>> ever want to suppressLateEvents by Y ( < X), to say "do not send
> the
> > >>>>> updates up to Y, which means the downstream operator or sink topic
> > for
> > >>>>> this
> > >>>>> stream would actually see a truncated update stream while I've paid
> > >>> larger
> > >>>>> cost for that"; and of course, Y > X would not make sense either as
> > >> you
> > >>>>> would not see any updates later than X anyways. So in all, my
> feeling
> > >> is
> > >>>>> that it makes less sense for windowed table's "suppressLateEvents"
> > >> with
> > >>> a
> > >>>>> parameter that is not equal to the window retention, and opening
> the
> > >>> door
> > >>>>> in the current proposal may confuse people with that.
> > >>>>>
> > >>>>> Again, above is just a subjective opinion and probably we can also
> > >> bring
> > >>>>> up
> > >>>>> some scenarios where users do want to set X != Y, but personally
> I
> > >>> feel
> > >>>>> that even if the semantics for this scenario are intuitive for users
> to
> > >>>>> understand, does that really make sense, and should we really open
> the
> > >>> door
> > >>>>> for it. So I think maybe separating the final update in a separate
> > >> API's
> > >>>>> benefits may overwhelm the advantage of having one uniform
> > definition.
> > >>> And
> > >>>>> for my alternative proposal, the rationale was from both my concern
> > >>> about
> > >>>>> "suppressLateEvents" for windowed store, and Matthias' question
> about
> > >>>>> "suppressLateEvents" for non-windowed stores, that if it is less
> > >>>>> meaningful
> > >>>>> for both, we can consider removing it completely and only do
> > >>>>> "IntermediateSuppression" in Suppress instead.
> > >>>>>
> > >>>>> So I'd summarize my thoughts in the following questions:
> > >>>>>
> > >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window
> retention
> > >>> time)
> > >>>>> for windowed stores make sense in practice?
> > >>>>> 2. Does "suppressLateEvents" with any parameter Y for non-windowed
> > >>> stores
> > >>>>> make sense in practice?
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> Guozhang
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com>
> > >> wrote:
> > >>>>>
> > >>>>>> Thanks for the explanation, that does make sense.  I have some
> > >>>>> questions on
> > >>>>>> operations, but I'll just wait for the PR and tests.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Bill
> > >>>>>>
> > >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io>
> > >>> wrote:
> > >>>>>>
> > >>>>>>> Hi Bill,
> > >>>>>>>
> > >>>>>>> Thanks for the review!
> > >>>>>>>
> > >>>>>>> Your question is very much applicable to the KIP and not at all
> an
> > >>>>>>> implementation detail. Thanks for bringing it up.
> > >>>>>>>
> > >>>>>>> I'm proposing not to change the existing caches and
> configurations
> > >>> at
> > >>>>> all
> > >>>>>>> (for now).
> > >>>>>>>
> > >>>>>>> Imagine you have a topology like this:
> > >>>>>>> commit.interval.ms = 100
> > >>>>>>>
> > >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> > >>>>>>>
> > >>>>>>> The first ktable (ktable1) will respect the commit interval and
> > >>> buffer
> > >>>>>>> events for 100ms before logging, storing, or forwarding them
> > >> (IIRC).
> > >>>>>>> Therefore, the second ktable (suppress) will only see the events
> > >> at
> > >>> a
> > >>>>>> rate
> > >>>>>>> of once per 100ms. It will apply its own buffering, and emit once
> > >>> per
> > >>>>>> 200ms.
> > >>>>>>> This case is pretty trivial because the suppress time is a
> > >> multiple
> > >>> of
> > >>>>>> the
> > >>>>>>> commit interval.
> > >>>>>>>
> > >>>>>>> When it's not an integer multiple, you'll get behavior like in
> > >> this
> > >>>>>> marble
> > >>>>>>> diagram:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >>>>>>>
> > >>>>>>> [ KTable caching with commit interval = 2 ]
> > >>>>>>>
> > >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> > >>>>>>>
> > >>>>>>>       [ suppress with emitAfter = 3 ]
> > >>>>>>>
> > >>>>>>> <---------------(k:2)----------------(k:6)->
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> If this behavior isn't desired (for example, if you wanted to
> emit
> > >>>>> (k:3)
> > >>>>>> at
> > >>>>>>> time 3), I'd recommend setting the "cache.max.bytes.buffering" to
> 0
> > >>> or
> > >>>>>>> modifying the topology to disable caching. Then, the behavior is
> > >>> more
> > >>>>>>> simply determined just by the suppress operator.
> > >>>>>>>
> > >>>>>>> Does that seem right to you?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Regarding the changelogs, because the suppression operator hangs
> > >>> onto
> > >>>>>>> events for a while, it will need its own changelog. The changelog
> > >>>>>>> should represent the current state of the buffer at all times. So
> > >>> when
> > >>>>>> the
> > >>>>>>> suppress operator sees (k:2), for example, it will log (k:2).
> When
> > >>> it
> > >>>>>>> later gets to time 3, it's time to emit (k:2) downstream. Because
> > >> k
> > >>>>> is no
> > >>>>>>> longer buffered, the suppress operator will log (k:null). Thus,
> > >> when
> > >>>>>>> recovering,
> > >>>>>>> it can rebuild the buffer by reading its changelog.
> > >>>>>>>
> > >>>>>>> What do you think about this?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> -John
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi John,  thanks for the KIP.
> > >>>>>>>>
> > >>>>>>>> Early on in the KIP, you mention the current approaches for
> > >>>>> controlling
> > >>>>>>> the
> > >>>>>>>> rate of downstream records from a KTable, cache size
> > >> configuration
> > >>>>> and
> > >>>>>>>> commit time.
> > >>>>>>>>
> > >>>>>>>> Will these configuration parameters still be in effect for
> > >> tables
> > >>>>> that
> > >>>>>>>> don't use suppression?  For tables taking advantage of
> > >>> suppression,
> > >>>>>> will
> > >>>>>>>> these configurations have no impact?
> > >>>>>>>> This last question may be too implementation-specific, but if the
> > >>>>>> requested
> > >>>>>>>> suppression time is longer than the specified commit time, will
> > >>> the
> > >>>>>>> latest
> > >>>>>>>> record in the suppression buffer get stored in a changelog?
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Bill
> > >>>>>>>>
> > >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <john@confluent.io
> > >>>
> > >>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for the feedback, Matthias,
> > >>>>>>>>>
> > >>>>>>>>> It seems like in straightforward relational processing cases,
> > >> it
> > >>>>>> would
> > >>>>>>>> not
> > >>>>>>>>> make sense to bound the lateness of KTables. In general, it
> > >>> seems
> > >>>>>>> better
> > >>>>>>>> to
> > >>>>>>>>> have "guard rails" in place that make it easier to write
> > >>> sensible
> > >>>>>>>> programs
> > >>>>>>>>> than insensible ones.
> > >>>>>>>>>
> > >>>>>>>>> But I'm still going to argue in favor of keeping it for all
> > >>>>> KTables
> > >>>>>> ;)
> > >>>>>>>>>
> > >>>>>>>>> 1. I believe it is simpler to understand the operator if it
> > >> has
> > >>>>> one
> > >>>>>>>> uniform
> > >>>>>>>>> definition, regardless of context. It's well defined and
> > >>> intuitive
> > >>>>>> what
> > >>>>>>>>> will happen when you use late-event suppression on a KTable,
> > >> so
> > >>> I
> > >>>>>> think
> > >>>>>>>>> nothing surprising or dangerous will happen in that case. From
> > >>> my
> > >>>>>>>>> perspective, having two sets of allowed operations is actually
> > >>> an
> > >>>>>>>> increase
> > >>>>>>>>> in cognitive complexity.
> > >>>>>>>>>
> > >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
> > >>>>> example,
> > >>>>>> in
> > >>>>>>>> lieu
> > >>>>>>>>> of full-featured timestamp semantics, I can implement MVCC
> > >>>>> behavior
> > >>>>>>> when
> > >>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I
> > >>>>> suspect
> > >>>>>>> that
> > >>>>>>>>> there are other, non-obvious applications of suppressing late
> > >>>>> events
> > >>>>>> on
> > >>>>>>>>> KTables.
> > >>>>>>>>>
> > >>>>>>>>> 3. Not to get too much into implementation details in a KIP
> > >>>>>> discussion,
> > >>>>>>>> but
> > >>>>>>>>> if we did want to make late-event suppression available only
> > >> on
> > >>>>>>> windowed
> > >>>>>>>>> KTables, we have two enforcement options:
> > >>>>>>>>>   a. check when we build the topology - this would be simple
> > >> to
> > >>>>>>>> implement,
> > >>>>>>>>> but would be a runtime check. Hopefully, people write tests
> > >> for
> > >>>>> their
> > >>>>>>>>> topology before deploying them, so the feedback loop isn't
> > >>>>>>> instantaneous,
> > >>>>>>>>> but it's not too long either.
> > >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
> > >>> time
> > >>>>>>> check,
> > >>>>>>>>> but would also be substantial increase of both interface and
> > >>> code
> > >>>>>>>>> complexity.
> > >>>>>>>>>
> > >>>>>>>>> We should definitely strive to have guard rails protecting
> > >>> against
> > >>>>>>>>> surprising or dangerous behavior. Protecting against programs
> > >>>>> that we
> > >>>>>>>> don't
> > >>>>>>>>> currently predict is a lesser benefit, and I think we can put
> > >> up
> > >>>>>> guard
> > >>>>>>>>> rails on a case-by-case basis for that. It seems like the
> > >>>>> increase in
> > >>>>>>>>> cognitive (and potentially code and interface) complexity
> > >> makes
> > >>> me
> > >>>>>>> think
> > >>>>>>>> we
> > >>>>>>>>> should skip this case.
> > >>>>>>>>>
> > >>>>>>>>> What do you think?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> -John
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > >>>>>>> matthias@confluent.io>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks for the KIP John.
> > >>>>>>>>>>
> > >>>>>>>>>> One initial comment about the last example "Bounded
> > >>> lateness":
> > >>>>>> For a
> > >>>>>>>>>> non-windowed KTable bounding the lateness does not really
> > >> make
> > >>>>>> sense,
> > >>>>>>>>>> does it?
> > >>>>>>>>>>
> > >>>>>>>>>> Thus, I am wondering if we should allow
> > >> `suppressLateEvents()`
> > >>>>> for
> > >>>>>>> this
> > >>>>>>>>>> case? It seems to be better to only allow it for
> > >>>>> windowed-KTables.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> -Matthias
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> > >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> > >>>>>>>>>>>
> > >>>>>>>>>>> What you gave as new example is semantically the same as
> > >>> what
> > >>>>> I
> > >>>>>>>>>> suggested.
> > >>>>>>>>>>> So it is good by me.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > >>>>> john@confluent.io
> > >>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Thanks for taking a look, Ted,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I agree this is a departure from the conventions of
> > >> Streams
> > >>>>> DSL.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Most of our config objects have one or two "required"
> > >>>>>> parameters,
> > >>>>>>>>> which
> > >>>>>>>>>> fit
> > >>>>>>>>>>>> naturally with the static factory method approach.
> > >>>>> TimeWindows,
> > >>>>>> for
> > >>>>>>>>>> example,
> > >>>>>>>>>>>> requires a size parameter, so we can naturally say
> > >>>>>>>>> TimeWindows.of(size).
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think in the case of a suppression, there's really no
> > >>>>> "core"
> > >>>>>>>>>> parameter,
> > >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> > >>>>> Suppression()". I
> > >>>>>>>> think
> > >>>>>>>>>> that
> > >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there
> > >>> are
> > >>>>>> many
> > >>>>>>>>>> durations
> > >>>>>>>>>>>> that we can configure.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> However, thinking about it again, I suppose that I can
> > >> give
> > >>>>> each
> > >>>>>>>>>>>> configuration method a static version, which would let
> > >> you
> > >>>>>> replace
> > >>>>>>>>> "new
> > >>>>>>>>>>>> Suppression()." with "Suppression." in all the examples.
> > >>>>>>> Basically,
> > >>>>>>>>>> instead
> > >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> For example:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> windowCounts
> > >>>>>>>>>>>>     .suppress(
> > >>>>>>>>>>>>         Suppression
> > >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> > >>>>>>>>>>>>             .suppressIntermediateEvents(
> > >>>>>>>>>>>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > >>>>>>>>>>>>             )
> > >>>>>>>>>>>>     );
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Does that seem better?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> -John
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > >>> yuzhihong@gmail.com
> > >>>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> I started to read this KIP which contains a lot of
> > >>>>> materials.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> One suggestion:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>     .suppress(
> > >>>>>>>>>>>>>         new Suppression()
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Do you think it would be more consistent with the rest
> > >> of
> > >>>>>> Streams
> > >>>>>>>>> data
> > >>>>>>>>>>>>> structures by supporting `of` ?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Cheers
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > >>>>>> john@confluent.io
> > >>>>>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hello devs and users,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Please take some time to consider this proposal for
> > >> Kafka
> > >>>>>>> Streams:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> The basic idea is to provide:
> > >>>>>>>>>>>>>> * more usable control over update rate (vs the current
> > >>>>> state
> > >>>>>>> store
> > >>>>>>>>>>>>> caches)
> > >>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > >>> which
> > >>>>>>> several
> > >>>>>>>>>>>> people
> > >>>>>>>>>>>>>> have requested
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I look forward to your feedback!
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>> -John
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> -- Guozhang
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> > >
> >
> >
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

I think I agree with Matthias on having dedicated APIs for the final-output
scenario of windowed operations, PLUS separating the window close time, which
the "final output" would rely on, from the window retention time itself
(admittedly this would make the KIP effort larger, but if we believe we need
to do this separation anyway, we could just do it now).
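
To make the separation concrete, here is a minimal, self-contained sketch.
It is only a model of the idea, not a Kafka Streams API: the class
WindowLifecycle, its method names, and the choice of exclusive boundaries
are all assumptions made for illustration. The fail-fast check follows
Matthias' suggestion to throw rather than silently shrink the close time.

```java
import java.time.Duration;

/**
 * Hypothetical model (NOT a Kafka Streams class) of a window whose close
 * time and retention time are configured independently.
 */
final class WindowLifecycle {
    // How long after the window end late records may still update the window.
    private final long closeMs;
    // How long after the window end the window stays in the store for queries.
    private final long retentionMs;

    WindowLifecycle(final Duration close, final Duration retention) {
        if (close.compareTo(retention) > 0) {
            // Assumption: reject the config instead of using min(close, retention).
            throw new IllegalArgumentException("close must not exceed retention");
        }
        this.closeMs = close.toMillis();
        this.retentionMs = retention.toMillis();
    }

    /** True while a late record may still update the window. */
    boolean acceptsUpdates(final long windowEndMs, final long streamTimeMs) {
        return streamTimeMs < windowEndMs + closeMs;
    }

    /** True once the window is closed, i.e. its result can no longer change. */
    boolean isFinal(final long windowEndMs, final long streamTimeMs) {
        return !acceptsUpdates(windowEndMs, streamTimeMs);
    }

    /** True while the window is still retained and therefore queryable. */
    boolean isQueryable(final long windowEndMs, final long streamTimeMs) {
        return streamTimeMs < windowEndMs + retentionMs;
    }
}
```

With close = 5 minutes and retention = 30 days, a window stops accepting
updates (and its result becomes final) 5 minutes after its end, yet it
remains queryable for the full 30 days.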

And then we can have `KTable#suppress()` for intermediate suppression
only, not for late-record suppression, until we see that it becomes a
common feature request; our current design still allows it to be
extended for that purpose.
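
As a rough illustration of what "intermediate suppression only" could mean,
the sketch below buffers the latest value per key and releases it once it
has been held for a configured time. Everything here is hypothetical: the
class IntermediateSuppressionBuffer, its put/poll methods, and the choice
to start the timer at the first buffered update for a key are assumptions,
not the KIP's design or an existing Streams class.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (NOT a Kafka Streams class) of an emit-after buffer. */
final class IntermediateSuppressionBuffer<K, V> {
    private final long emitAfterMs;
    // Latest value seen for each buffered key.
    private final Map<K, V> latest = new LinkedHashMap<>();
    // Time at which each key was first buffered since its last emit.
    private final Map<K, Long> firstBuffered = new LinkedHashMap<>();

    IntermediateSuppressionBuffer(final long emitAfterMs) {
        this.emitAfterMs = emitAfterMs;
    }

    /** Buffer an update, overwriting any earlier value for the same key. */
    void put(final K key, final V value, final long nowMs) {
        latest.put(key, value);
        firstBuffered.putIfAbsent(key, nowMs);
    }

    /** Remove and return the latest value of every key held for at least emitAfterMs. */
    List<Map.Entry<K, V>> poll(final long nowMs) {
        final List<Map.Entry<K, V>> out = new ArrayList<>();
        final Iterator<Map.Entry<K, Long>> it = firstBuffered.entrySet().iterator();
        while (it.hasNext()) {
            final Map.Entry<K, Long> e = it.next();
            if (nowMs - e.getValue() >= emitAfterMs) {
                out.add(new AbstractMap.SimpleImmutableEntry<>(
                    e.getKey(), latest.remove(e.getKey())));
                it.remove();
            }
        }
        return out;
    }
}
```

Note that nothing in this buffer drops late records; it only reduces how
often updates are forwarded, which is the narrower role proposed here for
`KTable#suppress()`.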


Guozhang

On Wed, Jul 4, 2018 at 12:53 PM, Matthias J. Sax <ma...@confluent.io>
wrote:

> Thanks for the discussion. I am just catching up.
>
> In general, I think we have different use cases, and non-windowed and
> windowed is quite different. For the non-windowed case, suppress() has
> no (useful) close or retention time, no final semantics, and also no
> business logic impact.
>
> On the other hand, for windowed aggregations, close time and final
> result do have a meaning. IMHO, `close()` is part of business logic
> while retention time is not. Also, suppression of intermediate results is
> not a business rule and there might be use cases for which either "early
> intermediate" (before window end time) are suppressed only, or all
> intermediates are suppressed (maybe also something in the middle, ie,
> just reduce the load of intermediate updates). Thus, window-suppression
> is much richer.
>
> IMHO, a generic `suppress()` operator that can be inserted into the data
> flow at any point is useful. Maybe we should keep it as generic as
> possible. However, it might be difficult to use with regard to
> windowing, as the mental effort to use it is high.
>
> With regard to Guozhang's comment:
>
> > we will actually
> > process data as old as 30 days as well, while most of the late updates
> > beyond 5 minutes would be discarded anyways.
>
> If we use `suppress()` as a standalone operator, this is correct and
> intended IMHO. To address the issue if the behavior is unwanted, I would
> suggest to add a "suppress option" directly to
> `count()/reduce()/aggregate()` window operator similar to
> `Materialized`. This would be an "embedded suppress" and avoid the
> issue. It would also address the issue about mental effort for "single
> final window result" use case.
>
> I also think that a shorter close-time than retention time is useful for
> window aggregation. If we add close() to the window definition and
> until() to `Materialized`, we can separate both correctly IMHO.
>
> About setting `close = min(close,retention)` I am not sure. We might
> rather throw an exception than reducing the close time automatically.
> Otherwise, I see many user questions about "I set close to X but it does
> not get updated for some data that is with delay of X".
>
> The tricky question might be to design the API in a backward compatible
> way though.
>
>
>
> -Matthias
>
> On 7/3/18 5:38 AM, John Roesler wrote:
> > Hi Guozhang,
> >
> > I see. It seems like if we want to decouple 1) and 2), we need to alter
> > the definition of the window. Do you think it would close the gap if we
> > added a "window close" time to the window definition?
> >
> > Such as:
> >
> > builder.stream("input")
> > .groupByKey()
> > .windowedBy(
> >   TimeWindows
> >     .of(60_000)
> >     .closeAfter(10 * 60 * 1000)
> >     .until(30L * 24 * 60 * 60 * 1000)
> > )
> > .count()
> > .suppress(Suppression.finalResultsOnly());
> >
> > Possibly called "finalResultsAtWindowClose" or something?
> >
> > Thanks,
> > -John
> >
> > On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com> wrote:
> >
> >> Hey John,
> >>
> >> Obviously I'm too lazy on email replying diligence compared with you :)
> >> Will try to reply them separately:
> >>
> >>
> >> ------------------------------------------------------------
> -----------------
> >>
> >> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
> >>
> >> I'm aware of this use case, but again, the concern is that, in this
> setting
> >> in order to let the window be queryable for 30 days, we will actually
> >> process data as old as 30 days as well, while most of the late updates
> >> beyond 5 minutes would be discarded anyways. Personally I think for the
> >> final update scenario, the ideal situation users would want is that "do
> not
> >> process any data that is less than 5 minutes, and of course no update
> >> records to the downstream later than 5 minutes either; but retain the
> >> window to be queryable for 30 days". And by doing that the final window
> >> snapshot would also be aligned with the update stream as well. In other
> >> words, among these three periods:
> >>
> >> 1) the retention length of the window / table.
> >> 2) the late records acceptance for updating the window.
> >> 3) the late records update to be sent downstream.
> >>
> >> Final update use cases would naturally want 2) = 3), while 1) may be
> >> different and larger, while what we provide now is that 1) = 2), which
> >> could be different and in practice larger than 3), hence not the most
> >> intuitive for their needs.
> >>
> >>
> >>
> >> ------------------------------------------------------------
> -----------------
> >>
> >> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
> >>
> >> I'd like option 2) over option 1) better as well from programming pov.
> But
> >> I'm wondering if option 2) would provide the above semantics or it is
> still
> >> coupling 1) with 2) as well ?
> >>
> >>
> >>
> >> Guozhang
> >>
> >>
> >>
> >>
> >> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io> wrote:
> >>
> >>> In fact, to push the idea further (which IIRC is what Matthias
> originally
> >>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
> >>> email, then we could also consider whether to eliminate
> >>> "suppressLateEvents" entirely.
> >>>
> >>> We could always add it later, but you've both expressed doubt that
> there
> >>> are practical use cases for it outside of final-results.
> >>>
> >>> -John
> >>>
> >>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io>
> wrote:
> >>>
> >>>> Hi again, Guozhang ;) Here's the second part of my response...
> >>>>
> >>>> It seems like your main concern is: "if I'm a user who wants final
> >> update
> >>>> semantics, how complicated is it for me to get it?"
> >>>>
> >>>> I think we have to assume that people don't always have time to become
> >>>> deeply familiar with all the nuances of a programming environment
> >> before
> >>>> they use it. Especially if they're evaluating several frameworks for
> >>> their
> >>>> use case, it's very valuable to make it as obvious as possible how to
> >>>> accomplish various computations with Streams.
> >>>>
> >>>> To me the biggest question is whether with a fresh perspective, people
> >>>> would say "oh, I get it, I have to bound my lateness and suppress
> >>>> intermediate updates, and of course I'll get only the final result!",
> >> or
> >>> if
> >>>> it's more like "wtf? all I want is the final result, what are all
> these
> >>>> parameters?".
> >>>>
> >>>> I was talking with Matthias a while back, and he had an idea that I
> >> think
> >>>> can help, which is to essentially set up a final-result recipe in
> >>> addition
> >>>> to the raw parameters. I previously thought that it wouldn't be
> >> possible
> >>> to
> >>>> restrict its usage to Windowed KTables, but thinking about it again
> >> this
> >>>> weekend, I have a couple of ideas:
> >>>>
> >>>> ================
> >>>> = 1. Static Wrapper =
> >>>> ================
> >>>> We can define an extra static function that "wraps" a KTable with
> >>>> final-result semantics.
> >>>>
> >>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> >>>>   final KTable<K, V> windowedKTable,
> >>>>   final Duration maxAllowedLateness,
> >>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
> >>>>     return windowedKTable.suppress(
> >>>>         Suppression.suppressLateEvents(maxAllowedLateness)
> >>>>                    .suppressIntermediateEvents(
> >>>>                      IntermediateSuppression
> >>>>                        .emitAfter(maxAllowedLateness)
> >>>>                        .bufferFullStrategy(bufferFullStrategy)
> >>>>                    )
> >>>>     );
> >>>> }
> >>>>
> >>>> Because windowedKTable is a parameter, the static function can easily
> >>>> impose an extra bound on the key type, that it extends Windowed. This
> >>> would
> >>>> make "final results only" only available on windowed ktables.
> >>>>
> >>>> Here's how it would look to use:
> >>>>
> >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> >>>>   finalResultsOnly(
> >>>>     windowCounts,
> >>>>     Duration.ofMinutes(10),
> >>>>     Suppression.BufferFullStrategy.SHUT_DOWN
> >>>>   );
> >>>>
> >>>> Trying to use it on a non-windowed KTable yields:
> >>>>
> >>>>> Error:(129, 35) java: method finalResultsOnly in class
> >>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest
> cannot
> >>> be
> >>>>> applied to given types;
> >>>>>   required:
> >>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
> >>> Duration,org.apache.kafka.streams.kstream.Suppression.
> BufferFullStrategy
> >>>>>   found:
> >>>>> org.apache.kafka.streams.kstream.KTable<java.lang.
> >>> String,java.lang.String>,java.time.Duration,org.apache.
> >>> kafka.streams.kstream.Suppression.BufferFullStrategy
> >>>>>   reason: inference variable K has incompatible bounds
> >>>>>     equality constraints: java.lang.String
> >>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> >>>>
> >>>>
> >>>>
> >>>> =================================================
> >>>> = 2. Add <K,V> parameters and recipe method to Suppression =
> >>>> =================================================
> >>>>
> >>>> By adding K,V parameters to Suppression, we can provide a similarly
> >>>> bounded config method directly on the Suppression class:
> >>>>
> >>>> public static <K extends Windowed, V> Suppression<K, V>
> >>>> finalResultsOnly(final Duration maxAllowedLateness, final
> >>>> BufferFullStrategy bufferFullStrategy) {
> >>>>     return Suppression
> >>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
> >>>>         .suppressIntermediateEvents(IntermediateSuppression
> >>>>             .emitAfter(maxAllowedLateness)
> >>>>             .bufferFullStrategy(bufferFullStrategy)
> >>>>         );
> >>>> }
> >>>>
> >>>> Then, here's how it would look to use it:
> >>>>
> >>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
> >>>> final KTable<Windowed<Integer>, Long> finalCounts =
> >>>>   windowCounts.suppress(
> >>>>     Suppression.finalResultsOnly(
> >>>>       Duration.ofMinutes(10),
> >>>>       Suppression.BufferFullStrategy.SHUT_DOWN
> >>>>     )
> >>>>   );
> >>>>
> >>>> Trying to use it on a non-windowed ktable yields:
> >>>>
> >>>>> Error:(127, 35) java: method finalResultsOnly in class
> >>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied
> to
> >>>>> given types;
> >>>>>   required:
> >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> >>> Suppression.BufferFullStrategy
> >>>>>   found:
> >>>>> java.time.Duration,org.apache.kafka.streams.kstream.
> >>> Suppression.BufferFullStrategy
> >>>>>   reason: explicit type argument java.lang.String does not conform to
> >>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> >>>>
> >>>>
> >>>>
> >>>> ============
> >>>> = Downsides =
> >>>> ============
> >>>>
> >>>> Of course, there's a downside either way:
> >>>> * for 1:  this "wrapper" interaction would be the first in the DSL. Is
> >> it
> >>>> too strange, and how discoverable would it be?
> >>>> * for 2: adding those type parameters to Suppression will force all
> >>>> callers to provide them in the event of a chained construction because
> >>> Java
> >>>> doesn't do RHS recursive type inference. This is already visible in
> >> other
> >>>> parts of the Streams DSL. For example, often calls to Materialized
> >>> builders
> >>>> have to provide seemingly obvious type bounds.
> >>>>
> >>>> ============
> >>>> = Conclusion =
> >>>> ============
> >>>>
> >>>> I think option 2 is more "normal" and discoverable. It does have a
> >>>> downside, but it's one that's pre-existing elsewhere in the DSL.
> >>>>
> >>>> WDYT? Would the addition of this "recipe" method to Suppression
> resolve
> >>>> your concern?
> >>>>
> >>>> Thanks again,
> >>>> -John
> >>>>
> >>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Hi John,
> >>>>>
> >>>>> Regarding the metrics: yeah, I think I'm with you that the records
> >>>>> dropped due to window retention or emit suppression policies should be
> >>>>> recorded differently, and using this KIP's proposed metric would be
> >>>>> fine. If you also think we can use this KIP's proposed metrics to cover
> >>>>> the records skipped because of window retention, then we can include
> >>>>> the changes in this KIP as well.
> >>>>>
> >>>>> Regarding the current proposal, I'm actually not too worried about the
> >>>>> inconsistency between query semantics and downstream emit semantics.
> >>>>> For queries, we will always return the current running results of the
> >>>>> windows, be they partial or final results depending on the window
> >>>>> retention time anyway, which has nothing to do with whether the emitted
> >>>>> stream should be one final output per key or not. I also agree that
> >>>>> having a unified operation is generally better, so that users can focus
> >>>>> on leveraging that one only rather than learning about two sets of
> >>>>> operations. The only question I had is whether, for final updates of
> >>>>> window stores, it is a bit awkward to understand the configuration
> >>>>> combo. Thinking about this more, I think my root worry is the
> >>>>> "suppressLateEvents" call for windowed tables. From a user perspective:
> >>>>> if my retention time is X, which means "pay the cost to allow late
> >>>>> records up to X to still be applied to update the tables", why would I
> >>>>> ever want to suppressLateEvents by Y ( < X), which says "do not send
> >>>>> updates later than Y", meaning the downstream operator or sink topic
> >>>>> for this stream would actually see a truncated update stream while I've
> >>>>> paid a larger cost for that? And of course, Y > X would not make sense
> >>>>> either, as you would not see any updates later than X anyway. So in
> >>>>> all, my feeling is that a windowed table's "suppressLateEvents" makes
> >>>>> less sense with a parameter that is not equal to the window retention,
> >>>>> and opening the door to that in the current proposal may confuse
> >>>>> people.
> >>>>>
> >>>>> Again, the above is just a subjective opinion, and we can probably also
> >>>>> come up with some scenarios in which users do want to set X != Y. But
> >>>>> personally I wonder, even if the semantics for this scenario are
> >>>>> intuitive for users to understand, does that really make sense, and
> >>>>> should we really open the door for it? So I think the benefits of
> >>>>> separating the final update into a separate API may outweigh the
> >>>>> advantage of having one uniform definition. As for my alternative
> >>>>> proposal, the rationale came from both my concern about
> >>>>> "suppressLateEvents" for windowed stores and Matthias' question about
> >>>>> "suppressLateEvents" for non-windowed stores: if it is less meaningful
> >>>>> for both, we can consider removing it completely and only do
> >>>>> "IntermediateSuppression" in Suppress instead.
> >>>>>
> >>>>> So I'd summarize my thoughts in the following questions:
> >>>>>
> >>>>> 1. Does "suppressLateEvents" with parameter Y != X (window retention
> >>> time)
> >>>>> for windowed stores make sense in practice?
> >>>>> 2. Does "suppressLateEvents" with any parameter Y for non-windowed
> >>> stores
> >>>>> make sense in practice?
> >>>>>
> >>>>>
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>>
> >>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com>
> >> wrote:
> >>>>>
> >>>>>> Thanks for the explanation, that does make sense.  I have some
> >>>>> questions on
> >>>>>> operations, but I'll just wait for the PR and tests.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Bill
> >>>>>>
> >>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io>
> >>> wrote:
> >>>>>>
> >>>>>>> Hi Bill,
> >>>>>>>
> >>>>>>> Thanks for the review!
> >>>>>>>
> >>>>>>> Your question is very much applicable to the KIP and not at all an
> >>>>>>> implementation detail. Thanks for bringing it up.
> >>>>>>>
> >>>>>>> I'm proposing not to change the existing caches and configurations
> >>> at
> >>>>> all
> >>>>>>> (for now).
> >>>>>>>
> >>>>>>> Imagine you have a topology like this:
> >>>>>>> commit.interval.ms = 100
> >>>>>>>
> >>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
> >>>>>>>
> >>>>>>> The first ktable (ktable1) will respect the commit interval and buffer
> >>>>>>> events for 100ms before logging, storing, or forwarding them (IIRC).
> >>>>>>> Therefore, the second ktable (suppress) will only see the events at a
> >>>>>>> rate of once per 100ms. It will apply its own buffering, and emit once
> >>>>>>> per 200ms. This case is pretty trivial because the suppress time is a
> >>>>>>> multiple of the commit interval.
> >>>>>>>
> >>>>>>> When it's not an integer multiple, you'll get behavior like in
> >> this
> >>>>>> marble
> >>>>>>> diagram:
> >>>>>>>
> >>>>>>>
> >>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> >>>>>>>
> >>>>>>> [ KTable caching with commit interval = 2 ]
> >>>>>>>
> >>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
> >>>>>>>
> >>>>>>>       [ suppress with emitAfter = 3 ]
> >>>>>>>
> >>>>>>> <---------------(k:2)----------------(k:6)->
> >>>>>>>
> >>>>>>>
> >>>>>>> If this behavior isn't desired (for example, if you wanted to emit
> >>>>>>> (k:3) at time 3), I'd recommend setting "cache.max.bytes.buffering"
> >>>>>>> to 0 or modifying the topology to disable caching. Then, the behavior
> >>>>>>> is determined solely by the suppress operator.
> >>>>>>>
> >>>>>>> Does that seem right to you?
> >>>>>>>
> >>>>>>>
> >>>>>>> Regarding the changelogs, because the suppression operator hangs
> >>> onto
> >>>>>>> events for a while, it will need its own changelog. The changelog
> >>>>>>> should represent the current state of the buffer at all times. So
> >>> when
> >>>>>> the
> >>>>>>> suppress operator sees (k:2), for example, it will log (k:2). When
> >>> it
> >>>>>>> later gets to time 3, it's time to emit (k:2) downstream. Because
> >> k
> >>>>> is no
> >>>>>>> longer buffered, the suppress operator will log (k:null). Thus,
> >> when
> >>>>>>> recovering,
> >>>>>>> it can rebuild the buffer by reading its changelog.
> >>>>>>>
> >>>>>>> What do you think about this?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi John,  thanks for the KIP.
> >>>>>>>>
> >>>>>>>> Early on in the KIP, you mention the current approaches for
> >>>>> controlling
> >>>>>>> the
> >>>>>>>> rate of downstream records from a KTable, cache size
> >> configuration
> >>>>> and
> >>>>>>>> commit time.
> >>>>>>>>
> >>>>>>>> Will these configuration parameters still be in effect for
> >> tables
> >>>>> that
> >>>>>>>> don't use suppression?  For tables taking advantage of
> >>> suppression,
> >>>>>> will
> >>>>>>>> these configurations have no impact?
> >>>>>>>> This last question may be to implementation specific but if the
> >>>>>> requested
> >>>>>>>> suppression time is longer than the specified commit time, will
> >>> the
> >>>>>>> latest
> >>>>>>>> record in the suppression buffer get stored in a changelog?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Bill
> >>>>>>>>
> >>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <john@confluent.io
> >>>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the feedback, Matthias,
> >>>>>>>>>
> >>>>>>>>> It seems like in straightforward relational processing cases,
> >> it
> >>>>>> would
> >>>>>>>> not
> >>>>>>>>> make sense to bound the lateness of KTables. In general, it
> >>> seems
> >>>>>>> better
> >>>>>>>> to
> >>>>>>>>> have "guard rails" in place that make it easier to write
> >>> sensible
> >>>>>>>> programs
> >>>>>>>>> than insensible ones.
> >>>>>>>>>
> >>>>>>>>> But I'm still going to argue in favor of keeping it for all
> >>>>> KTables
> >>>>>> ;)
> >>>>>>>>>
> >>>>>>>>> 1. I believe it is simpler to understand the operator if it
> >> has
> >>>>> one
> >>>>>>>> uniform
> >>>>>>>>> definition, regardless of context. It's well defined and
> >>> intuitive
> >>>>>> what
> >>>>>>>>> will happen when you use late-event suppression on a KTable,
> >> so
> >>> I
> >>>>>> think
> >>>>>>>>> nothing surprising or dangerous will happen in that case. From
> >>> my
> >>>>>>>>> perspective, having two sets of allowed operations is actually
> >>> an
> >>>>>>>> increase
> >>>>>>>>> in cognitive complexity.
> >>>>>>>>>
> >>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
> >>>>> example,
> >>>>>> in
> >>>>>>>> lieu
> >>>>>>>>> of full-featured timestamp semantics, I can implement MVCC
> >>>>> behavior
> >>>>>>> when
> >>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I
> >>>>> suspect
> >>>>>>> that
> >>>>>>>>> there are other, non-obvious applications of suppressing late
> >>>>> events
> >>>>>> on
> >>>>>>>>> KTables.
> >>>>>>>>>
> >>>>>>>>> 3. Not to get too much into implementation details in a KIP
> >>>>>> discussion,
> >>>>>>>> but
> >>>>>>>>> if we did want to make late-event suppression available only
> >> on
> >>>>>>> windowed
> >>>>>>>>> KTables, we have two enforcement options:
> >>>>>>>>>   a. check when we build the topology - this would be simple
> >> to
> >>>>>>>> implement,
> >>>>>>>>> but would be a runtime check. Hopefully, people write tests
> >> for
> >>>>> their
> >>>>>>>>> topology before deploying them, so the feedback loop isn't
> >>>>>>> instantaneous,
> >>>>>>>>> but it's not too long either.
> >>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
> >>> time
> >>>>>>> check,
> >>>>>>>>> but would also be substantial increase of both interface and
> >>> code
> >>>>>>>>> complexity.
> >>>>>>>>>
> >>>>>>>>> We should definitely strive to have guard rails protecting
> >>> against
> >>>>>>>>> surprising or dangerous behavior. Protecting against programs
> >>>>> that we
> >>>>>>>> don't
> >>>>>>>>> currently predict is a lesser benefit, and I think we can put
> >> up
> >>>>>> guard
> >>>>>>>>> rails on a case-by-case basis for that. It seems like the
> >>>>> increase in
> >>>>>>>>> cognitive (and potentially code and interface) complexity
> >> makes
> >>> me
> >>>>>>> think
> >>>>>>>> we
> >>>>>>>>> should skip this case.
> >>>>>>>>>
> >>>>>>>>> What do you think?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> >>>>>>> matthias@confluent.io>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks for the KIP John.
> >>>>>>>>>>
> >>>>>>>>>> One initial comments about the last example "Bounded
> >>> lateness":
> >>>>>> For a
> >>>>>>>>>> non-windowed KTable bounding the lateness does not really
> >> make
> >>>>>> sense,
> >>>>>>>>>> does it?
> >>>>>>>>>>
> >>>>>>>>>> Thus, I am wondering if we should allow
> >> `suppressLateEvents()`
> >>>>> for
> >>>>>>> this
> >>>>>>>>>> case? It seems to be better to only allow it for
> >>>>> windowed-KTables.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> -Matthias
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
> >>>>>>>>>>> I noticed this (lack of primary parameter) as well.
> >>>>>>>>>>>
> >>>>>>>>>>> What you gave as new example is semantically the same as
> >>> what
> >>>>> I
> >>>>>>>>>> suggested.
> >>>>>>>>>>> So it is good by me.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> >>>>> john@confluent.io
> >>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks for taking look, Ted,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I agree this is a departure from the conventions of
> >> Streams
> >>>>> DSL.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Most of our config objects have one or two "required"
> >>>>>> parameters,
> >>>>>>>>> which
> >>>>>>>>>> fit
> >>>>>>>>>>>> naturally with the static factory method approach.
> >>>>> TimeWindow,
> >>>>>> for
> >>>>>>>>>> example,
> >>>>>>>>>>>> requires a size parameter, so we can naturally say
> >>>>>>>>> TimeWindows.of(size).
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think in the case of a suppression, there's really no
> >>>>> "core"
> >>>>>>>>>> parameter,
> >>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
> >>>>> Suppression()". I
> >>>>>>>> think
> >>>>>>>>>> that
> >>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there
> >>> are
> >>>>>> many
> >>>>>>>>>> durations
> >>>>>>>>>>>> that we can configure.
> >>>>>>>>>>>>
> >>>>>>>>>>>> However, thinking about it again, I suppose that I can
> >> give
> >>>>> each
> >>>>>>>>>>>> configuration method a static version, which would let
> >> you
> >>>>>> replace
> >>>>>>>>> "new
> >>>>>>>>>>>> Suppression()." with "Suppression." in all the examples.
> >>>>>>> Basically,
> >>>>>>>>>> instead
> >>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For example:
> >>>>>>>>>>>>
> >>>>>>>>>>>> windowCounts
> >>>>>>>>>>>>     .suppress(
> >>>>>>>>>>>>         Suppression
> >>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
> >>>>>>>>>>>>             .suppressIntermediateEvents(
> >>>>>>>>>>>>
> >>>>>>>>>>  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> >>>>>>>>>>>>             )
> >>>>>>>>>>>>     );
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Does that seem better?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> -John
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> >>> yuzhihong@gmail.com
> >>>>>>
> >>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I started to read this KIP which contains a lot of
> >>>>> materials.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> One suggestion:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>     .suppress(
> >>>>>>>>>>>>>         new Suppression()
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do you think it would be more consistent with the rest
> >> of
> >>>>>> Streams
> >>>>>>>>> data
> >>>>>>>>>>>>> structures by supporting `of` ?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> >>>>>> john@confluent.io
> >>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello devs and users,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Please take some time to consider this proposal for
> >> Kafka
> >>>>>>> Streams:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The basic idea is to provide:
> >>>>>>>>>>>>>> * more usable control over update rate (vs the current
> >>>>> state
> >>>>>>> store
> >>>>>>>>>>>>> caches)
> >>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
> >>> which
> >>>>>>> several
> >>>>>>>>>>>> people
> >>>>>>>>>>>>>> have requested
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I look forward to your feedback!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> -- Guozhang
> >>>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by "Matthias J. Sax" <ma...@confluent.io>.

Thanks for the discussion. I am just catching up.

In general, I think we have different use cases, and the non-windowed and
windowed cases are quite different. For the non-windowed case, suppress() has
no (useful) close or retention time, no final semantics, and also no
business logic impact.

On the other hand, for windowed aggregations, close time and final
result do have a meaning. IMHO, `close()` is part of the business logic
while retention time is not. Also, suppression of intermediate results is
not a business rule, and there might be use cases in which only "early
intermediate" results (before window end time) are suppressed, or all
intermediate results are suppressed (or maybe something in the middle,
i.e., just reducing the load of intermediate updates). Thus,
window suppression is much richer.

IMHO, a generic `suppress()` operator that can be inserted into the data
flow at any point is useful. Maybe we should keep it as generic as
possible. However, it might be difficult to use with regard to
windowing, as the mental effort to use it is high.

With regard to Guozhang's comment:

> we will actually
> process data as old as 30 days as well, while most of the late updates
> beyond 5 minutes would be discarded anyways.

If we use `suppress()` as a standalone operator, this is correct and
intended IMHO. To address the issue when this behavior is unwanted, I would
suggest adding a "suppress option" directly to the
`count()/reduce()/aggregate()` window operators, similar to
`Materialized`. This would be an "embedded suppress" and would avoid the
issue. It would also reduce the mental effort for the "single
final window result" use case.
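
To make the "single final window result" idea concrete, here is a toy
sketch in plain Java. It is not the Kafka Streams API and not a proposed
interface: the class FinalResultCounter, its fields, and the single-key,
tumbling-window model are all assumptions made for illustration. The
aggregation itself stops accepting records once stream time passes
window end + close time, and it emits each window exactly once at that
point, so no late data is processed after the final result is sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical single-key windowed counter with an "embedded" final-result policy. */
final class FinalResultCounter {
    private final long sizeMs;   // tumbling window size
    private final long closeMs;  // how long after window end late records are accepted
    private final TreeMap<Long, Long> openWindows = new TreeMap<>(); // windowStart -> count
    private final List<long[]> emitted = new ArrayList<>();          // {windowStart, finalCount}
    private long streamTime = Long.MIN_VALUE;                        // max timestamp seen so far

    FinalResultCounter(long sizeMs, long closeMs) {
        this.sizeMs = sizeMs;
        this.closeMs = closeMs;
    }

    /** Processes one record with a non-negative timestamp. */
    void process(long recordTs) {
        streamTime = Math.max(streamTime, recordTs);
        long start = recordTs - (recordTs % sizeMs);
        if (start + sizeMs + closeMs > streamTime) {
            openWindows.merge(start, 1L, Long::sum);
        }
        // Otherwise the window is already closed and the late record is dropped.

        // Emit every window whose close time has passed, oldest first.
        while (!openWindows.isEmpty()) {
            Map.Entry<Long, Long> oldest = openWindows.firstEntry();
            if (oldest.getKey() + sizeMs + closeMs > streamTime) {
                break;
            }
            emitted.add(new long[] {oldest.getKey(), oldest.getValue()});
            openWindows.remove(oldest.getKey());
        }
    }

    List<long[]> emitted() {
        return emitted;
    }
}
```

With size 10 and close 5, a record at time 4 that arrives after one at
time 12 is still counted, while a record at time 2 that arrives after
one at time 16 is dropped, because window [0, 10) closed at time 15.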

I also think that a shorter close-time than retention time is useful for
window aggregation. If we add close() to the window definition and
until() to `Materialized`, we can separate both correctly IMHO.
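
As a rough illustration of keeping the two settings separate, the
following plain-Java sketch models a window that first stops accepting
updates (close) and only later disappears from the store (retention).
The class WindowLifecycle and its Phase enum are hypothetical names, and
measuring both durations from the window end is a simplifying
assumption, not a statement about how Streams does it.

```java
import java.time.Duration;

/** Hypothetical model of a tumbling window with separate close and retention times. */
final class WindowLifecycle {
    enum Phase { OPEN, CLOSED_BUT_QUERYABLE, EXPIRED }

    private final long sizeMs;
    private final long closeMs;      // business logic: how late a record may be
    private final long retentionMs;  // storage concern: how long the window is kept

    WindowLifecycle(Duration size, Duration close, Duration retention) {
        this.sizeMs = size.toMillis();
        this.closeMs = close.toMillis();
        this.retentionMs = retention.toMillis();
    }

    /** Start of the window containing a non-negative record timestamp. */
    long windowStart(long recordTs) {
        return recordTs - (recordTs % sizeMs);
    }

    /** Phase of the window starting at windowStart, as of the given stream time. */
    Phase phaseAt(long windowStart, long streamTimeMs) {
        long windowEnd = windowStart + sizeMs;
        if (streamTimeMs < windowEnd + closeMs) {
            return Phase.OPEN;                  // late records still update the window
        }
        if (streamTimeMs < windowEnd + retentionMs) {
            return Phase.CLOSED_BUT_QUERYABLE;  // final result, still readable via IQ
        }
        return Phase.EXPIRED;                   // dropped from the store
    }
}
```

For the metrics example from earlier in the thread (1-minute windows,
5 minutes of close time, 30 days of retention), a window would spend
5 minutes in OPEN and almost all of its lifetime in
CLOSED_BUT_QUERYABLE.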

About setting `close = min(close,retention)`, I am not sure. We might
rather throw an exception than reduce the close time automatically.
Otherwise, I foresee many user questions like "I set close to X but the
window does not get updated for some data that arrives with a delay of X".
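
The two behaviors can be contrasted in a few lines of plain Java. This
is only a sketch: WindowSettings and its factory methods strict() and
clamped() are made-up names, not existing or proposed Streams API.

```java
import java.time.Duration;

/** Hypothetical window settings holder; illustrates the two validation policies. */
final class WindowSettings {
    final Duration size;
    final Duration close;      // late records are accepted until window end + close
    final Duration retention;  // the window is kept in the store this long

    private WindowSettings(Duration size, Duration close, Duration retention) {
        this.size = size;
        this.close = close;
        this.retention = retention;
    }

    /** Strict policy: fail fast if the close time exceeds the retention time. */
    static WindowSettings strict(Duration size, Duration close, Duration retention) {
        if (close.compareTo(retention) > 0) {
            throw new IllegalArgumentException(
                "close time " + close + " must not exceed retention time " + retention);
        }
        return new WindowSettings(size, close, retention);
    }

    /** Lenient policy: silently use close = min(close, retention). */
    static WindowSettings clamped(Duration size, Duration close, Duration retention) {
        Duration effectiveClose = close.compareTo(retention) > 0 ? retention : close;
        return new WindowSettings(size, effectiveClose, retention);
    }
}
```

With the lenient policy, a user who asks for a close time of 31 days and
a retention of 30 days gets 30 days without any signal, which is exactly
the kind of surprise the exception would surface at configuration time.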

The tricky question, though, might be how to design the API in a
backward-compatible way.



-Matthias

On 7/3/18 5:38 AM, John Roesler wrote:
> Hi Guozhang,
> 
> I see. It seems like if we want to decouple 1) and 2), we need to alter the
> definition of the window. Do you think it would close the gap if we added a
> "window close" time to the window definition?
> 
> Such as:
> 
> builder.stream("input")
> .groupByKey()
> .windowedBy(
>   TimeWindows
>     .of(60_000)
>     .closeAfter(10 * 60 * 1000)
>     .until(30L * 24 * 60 * 60 * 1000)
> )
> .count()
> .suppress(Suppression.finalResultsOnly());
> 
> Possibly called "finalResultsAtWindowClose" or something?
> 
> Thanks,
> -John
> 
> On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com> wrote:
> 
>> Hey John,
>>
>> Obviously I'm too lazy on email replying diligence compared with you :)
>> Will try to reply them separately:
>>
>>
>> -----------------------------------------------------------------------------
>>
>> To reply to your email on "Mon, Jul 2, 2018 at 8:23 AM":
>>
>> I'm aware of this use case, but again, the concern is that, in this
>> setting, in order to let the window be queryable for 30 days, we will
>> actually process data as old as 30 days as well, while most of the late
>> updates beyond 5 minutes would be discarded anyway. Personally, I think
>> that for the final update scenario, the ideal situation users would want
>> is "do not process any data that is more than 5 minutes late, and of
>> course send no update records downstream later than 5 minutes either;
>> but retain the window to be queryable for 30 days". Doing that would also
>> align the final window snapshot with the update stream. In other words,
>> among these three periods:
>>
>> 1) the retention length of the window / table.
>> 2) the late records acceptance for updating the window.
>> 3) the late records update to be sent downstream.
>>
>> Final update use cases would naturally want 2) = 3), while 1) may be
>> different and larger, while what we provide now is that 1) = 2), which
>> could be different and in practice larger than 3), hence not the most
>> intuitive for their needs.
>>
>>
>>
>> -----------------------------------------------------------------------------
>>
>> To reply to your email on "Mon, Jul 2, 2018 at 10:27 AM":
>>
>> I like option 2) better than option 1) as well from a programming POV.
>> But I'm wondering whether option 2) would provide the above semantics, or
>> whether it still couples 1) with 2) as well?
>>
>>
>>
>> Guozhang
>>
>>
>>
>>
>> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io> wrote:
>>
>>> In fact, to push the idea further (which IIRC is what Matthias originally
>>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
>>> email, then we could also consider whether to eliminate
>>> "suppressLateEvents" entirely.
>>>
>>> We could always add it later, but you've both expressed doubt that there
>>> are practical use cases for it outside of final-results.
>>>
>>> -John
>>>
>>> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io> wrote:
>>>
>>>> Hi again, Guozhang ;) Here's the second part of my response...
>>>>
>>>> It seems like your main concern is: "if I'm a user who wants final
>> update
>>>> semantics, how complicated is it for me to get it?"
>>>>
>>>> I think we have to assume that people don't always have time to become
>>>> deeply familiar with all the nuances of a programming environment
>> before
>>>> they use it. Especially if they're evaluating several frameworks for
>>> their
>>>> use case, it's very valuable to make it as obvious as possible how to
>>>> accomplish various computations with Streams.
>>>>
>>>> To me the biggest question is whether with a fresh perspective, people
>>>> would say "oh, I get it, I have to bound my lateness and suppress
>>>> intermediate updates, and of course I'll get only the final result!",
>> or
>>> if
>>>> it's more like "wtf? all I want is the final result, what are all these
>>>> parameters?".
>>>>
>>>> I was talking with Matthias a while back, and he had an idea that I
>> think
>>>> can help, which is to essentially set up a final-result recipe in
>>> addition
>>>> to the raw parameters. I previously thought that it wouldn't be
>> possible
>>> to
>>>> restrict its usage to Windowed KTables, but thinking about it again
>> this
>>>> weekend, I have a couple of ideas:
>>>>
>>>> ================
>>>> = 1. Static Wrapper =
>>>> ================
>>>> We can define an extra static function that "wraps" a KTable with
>>>> final-result semantics.
>>>>
>>>> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
>>>>   final KTable<K, V> windowedKTable,
>>>>   final Duration maxAllowedLateness,
>>>>   final Suppression.BufferFullStrategy bufferFullStrategy) {
>>>>     return windowedKTable.suppress(
>>>>         Suppression.suppressLateEvents(maxAllowedLateness)
>>>>                    .suppressIntermediateEvents(
>>>>                      IntermediateSuppression
>>>>                        .emitAfter(maxAllowedLateness)
>>>>                        .bufferFullStrategy(bufferFullStrategy)
>>>>                    )
>>>>     );
>>>> }
>>>>
>>>> Because windowedKTable is a parameter, the static function can easily
>>>> impose an extra bound on the key type, that it extends Windowed. This
>>> would
>>>> make "final results only" only available on windowed ktables.
>>>>
>>>> Here's how it would look to use:
>>>>
>>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>>>> final KTable<Windowed<Integer>, Long> finalCounts =
>>>>   finalResultsOnly(
>>>>     windowCounts,
>>>>     Duration.ofMinutes(10),
>>>>     Suppression.BufferFullStrategy.SHUT_DOWN
>>>>   );
>>>>
>>>> Trying to use it on a non-windowed KTable yields:
>>>>
>>>>> Error:(129, 35) java: method finalResultsOnly in class
>>>>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot
>>> be
>>>>> applied to given types;
>>>>>   required:
>>>>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
>>> Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>   found:
>>>>> org.apache.kafka.streams.kstream.KTable<java.lang.
>>> String,java.lang.String>,java.time.Duration,org.apache.
>>> kafka.streams.kstream.Suppression.BufferFullStrategy
>>>>>   reason: inference variable K has incompatible bounds
>>>>>     equality constraints: java.lang.String
>>>>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
>>>>
>>>>
>>>>
>>>> =================================================
>>>> = 2. Add <K,V> parameters and recipe method to Suppression =
>>>> =================================================
>>>>
>>>> By adding K,V parameters to Suppression, we can provide a similarly
>>>> bounded config method directly on the Suppression class:
>>>>
>>>> public static <K extends Windowed, V> Suppression<K, V>
>>>> finalResultsOnly(final Duration maxAllowedLateness, final
>>>> BufferFullStrategy bufferFullStrategy) {
>>>>     return Suppression
>>>>         .<K, V>suppressLateEvents(maxAllowedLateness)
>>>>         .suppressIntermediateEvents(IntermediateSuppression
>>>>             .emitAfter(maxAllowedLateness)
>>>>             .bufferFullStrategy(bufferFullStrategy)
>>>>         );
>>>> }
>>>>
>>>> Then, here's how it would look to use it:
>>>>
>>>> final KTable<Windowed<Integer>, Long> windowCounts = ...
>>>> final KTable<Windowed<Integer>, Long> finalCounts =
>>>>   windowCounts.suppress(
>>>>     Suppression.finalResultsOnly(
>>>>       Duration.ofMinutes(10),
>>>>       Suppression.BufferFullStrategy.SHUT_DOWN
>>>>     )
>>>>   );
>>>>
>>>> Trying to use it on a non-windowed ktable yields:
>>>>
>>>>> Error:(127, 35) java: method finalResultsOnly in class
>>>>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
>>>>> given types;
>>>>>   required:
>>>>> java.time.Duration,org.apache.kafka.streams.kstream.
>>> Suppression.BufferFullStrategy
>>>>>   found:
>>>>> java.time.Duration,org.apache.kafka.streams.kstream.
>>> Suppression.BufferFullStrategy
>>>>>   reason: explicit type argument java.lang.String does not conform to
>>>>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
>>>>
>>>>
>>>>
>>>> ============
>>>> = Downsides =
>>>> ============
>>>>
>>>> Of course, there's a downside either way:
>>>> * for 1:  this "wrapper" interaction would be the first in the DSL. Is
>> it
>>>> too strange, and how discoverable would it be?
>>>> * for 2: adding those type parameters to Suppression will force all
>>>> callers to provide them in the event of a chained construction because
>>> Java
>>>> doesn't do RHS recursive type inference. This is already visible in
>> other
>>>> parts of the Streams DSL. For example, often calls to Materialized
>>> builders
>>>> have to provide seemingly obvious type bounds.
>>>>
>>>> ============
>>>> = Conclusion =
>>>> ============
>>>>
>>>> I think option 2 is more "normal" and discoverable. It does have a
>>>> downside, but it's one that's pre-existing elsewhere in the DSL.
>>>>
>>>> WDYT? Would the addition of this "recipe" method to Suppression resolve
>>>> your concern?
>>>>
>>>> Thanks again,
>>>> -John
>>>>
>>>> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com>
>>> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> Regarding the metrics: yeah I think I'm with you that the dropped
>>> records
>>>>> due to window retention or emit suppression policies should be
>> recorded
>>>>> differently, and using this KIP's proposed metric would be fine. If
>> you
>>>>> also think we can use this KIP's proposed metrics to cover records
>>>>> skipped due to window retention, then we can include the changes in
>>> this
>>>>> KIP as well.
>>>>>
>>>>> Regarding the current proposal, I'm actually not too worried about the
>>>>> inconsistency between query semantics and downstream emit semantics.
>> For
>>>>> queries, we will always return the current running results of the
>>> windows,
>>>>> being it partial or final results depending on the window retention
>> time
>>>>> anyways, which has nothing to do whether the emitted stream should be
>>> one
>>>>> final output per key or not. I also agree that having a unified
>>> operation
>>>>> is generally better for users to focus on leveraging that one only
>> than
>>>>> learning about two set of operations. The only question I had is, for
>>>>> final
>>>>> updates of window stores, if it is a bit awkward to understand the
>>>>> configuration combo. Thinking about this more, I think my root worry
>> is in
>>>>> the
>>>>> "suppressLateEvents" call for windowed tables, since from a user
>>>>> perspective: if my retention time is X which means "pay the cost to
>>> allow
>>>>> late records up to X to still be applied updating the tables", why
>>> would I
>>>>> ever want to suppressLateEvents by Y ( < X), to say "do not send the
>>>>> updates up to Y, which means the downstream operator or sink topic for
>>>>> this
>>>>> stream would actually see a truncated update stream while I've paid
>>> larger
>>>>> cost for that"; and of course, Y > X would not make sense either as
>> you
>>>>> would not see any updates later than X anyways. So in all, my feeling
>> is
>>>>> that it makes less sense for windowed table's "suppressLateEvents"
>> with
>>> a
>>>>> parameter that is not equal to the window retention, and opening the
>>> door
>>>>> in the current proposal may confuse people with that.
>>>>>
>>>>> Again, above is just a subjective opinion and probably we can also
>> bring
>>>>> up
>>>>> some scenarios where users do want to set X != Y.. but personally I
>>> feel
>>>>> that even if the semantics for this scenario are intuitive for users to
>>>>> understand, does that really make sense and should we really open the
>>> door
>>>>> for it. So I think maybe separating the final update in a separate
>> API's
>>>>> benefits may overwhelm the advantage of having one uniform definition.
>>> And
>>>>> for my alternative proposal, the rationale was from both my concern
>>> about
>>>>> "suppressLateEvents" for windowed store, and Matthias' question about
>>>>> "suppressLateEvents" for non-windowed stores, that if it is less
>>>>> meaningful
>>>>> for both, we can consider removing it completely and only do
>>>>> "IntermediateSuppression" in Suppress instead.
>>>>>
>>>>> So I'd summarize my thoughts in the following questions:
>>>>>
>>>>> 1. Does "suppressLateEvents" with parameter Y != X (window retention
>>> time)
>>>>> for windowed stores make sense in practice?
>>>>> 2. Does "suppressLateEvents" with any parameter Y for non-windowed
>>> stores
>>>>> make sense in practice?
>>>>>
>>>>>
>>>>>
>>>>> Guozhang
>>>>>
>>>>>
>>>>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com>
>> wrote:
>>>>>
>>>>>> Thanks for the explanation, that does make sense.  I have some
>>>>> questions on
>>>>>> operations, but I'll just wait for the PR and tests.
>>>>>>
>>>>>> Thanks,
>>>>>> Bill
>>>>>>
>>>>>> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io>
>>> wrote:
>>>>>>
>>>>>>> Hi Bill,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> Your question is very much applicable to the KIP and not at all an
>>>>>>> implementation detail. Thanks for bringing it up.
>>>>>>>
>>>>>>> I'm proposing not to change the existing caches and configurations
>>> at
>>>>> all
>>>>>>> (for now).
>>>>>>>
>>>>>>> Imagine you have a topology like this:
>>>>>>> commit.interval.ms = 100
>>>>>>>
>>>>>>> (ktable1 (cached)) -> (suppress emitAfter 200)
>>>>>>>
>>>>>>> The first ktable (ktable1) will respect the commit interval and
>>> buffer
>>>>>>> events for 100ms before logging, storing, or forwarding them
>> (IIRC).
>>>>>>> Therefore, the second ktable (suppress) will only see the events
>> at
>>> a
>>>>>> rate
>>>>>>> of once per 100ms. It will apply its own buffering, and emit once
>>> per
>>>>>> 200ms.
>>>>>>> This case is pretty trivial because the suppress time is a
>> multiple
>>> of
>>>>>> the
>>>>>>> commit interval.
>>>>>>>
>>>>>>> When it's not an integer multiple, you'll get behavior like in
>> this
>>>>>> marble
>>>>>>> diagram:
>>>>>>>
>>>>>>>
>>>>>>> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
>>>>>>>
>>>>>>> [ KTable caching with commit interval = 2 ]
>>>>>>>
>>>>>>> <--------(k:2)---------(k:4)---------(k:6)->
>>>>>>>
>>>>>>>       [ suppress with emitAfter = 3 ]
>>>>>>>
>>>>>>> <---------------(k:2)----------------(k:6)->
>>>>>>>
>>>>>>>
>>>>>>> If this behavior isn't desired (for example, if you wanted to emit
>>>>> (k:3)
>>>>>> at
>>>>>>> time 3), I'd recommend setting the "cache.max.bytes.buffering" to 0
>>> or
>>>>>>> modifying the topology to disable caching. Then, the behavior is
>>> more
>>>>>>> simply determined just by the suppress operator.
>>>>>>>
>>>>>>> Does that seem right to you?
>>>>>>>
>>>>>>>
>>>>>>> Regarding the changelogs, because the suppression operator hangs
>>> onto
>>>>>>> events for a while, it will need its own changelog. The changelog
>>>>>>> should represent the current state of the buffer at all times. So
>>> when
>>>>>> the
>>>>>>> suppress operator sees (k:2), for example, it will log (k:2). When
>>> it
>>>>>>> later gets to time 3, it's time to emit (k:2) downstream. Because
>> k
>>>>> is no
>>>>>>> longer buffered, the suppress operator will log (k:null). Thus,
>> when
>>>>>>> recovering,
>>>>>>> it can rebuild the buffer by reading its changelog.
>>>>>>>
>>>>>>> What do you think about this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -John
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Hi John,  thanks for the KIP.
>>>>>>>>
>>>>>>>> Early on in the KIP, you mention the current approaches for
>>>>> controlling
>>>>>>> the
>>>>>>>> rate of downstream records from a KTable, cache size
>> configuration
>>>>> and
>>>>>>>> commit time.
>>>>>>>>
>>>>>>>> Will these configuration parameters still be in effect for
>> tables
>>>>> that
>>>>>>>> don't use suppression?  For tables taking advantage of
>>> suppression,
>>>>>> will
>>>>>>>> these configurations have no impact?
>>>>>>>> This last question may be too implementation-specific, but if the
>>>>>> requested
>>>>>>>> suppression time is longer than the specified commit time, will
>>> the
>>>>>>> latest
>>>>>>>> record in the suppression buffer get stored in a changelog?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bill
>>>>>>>>
>>>>>>>> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <john@confluent.io
>>>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the feedback, Matthias,
>>>>>>>>>
>>>>>>>>> It seems like in straightforward relational processing cases,
>> it
>>>>>> would
>>>>>>>> not
>>>>>>>>> make sense to bound the lateness of KTables. In general, it
>>> seems
>>>>>>> better
>>>>>>>> to
>>>>>>>>> have "guard rails" in place that make it easier to write
>>> sensible
>>>>>>>> programs
>>>>>>>>> than insensible ones.
>>>>>>>>>
>>>>>>>>> But I'm still going to argue in favor of keeping it for all
>>>>> KTables
>>>>>> ;)
>>>>>>>>>
>>>>>>>>> 1. I believe it is simpler to understand the operator if it
>> has
>>>>> one
>>>>>>>> uniform
>>>>>>>>> definition, regardless of context. It's well defined and
>>> intuitive
>>>>>> what
>>>>>>>>> will happen when you use late-event suppression on a KTable,
>> so
>>> I
>>>>>> think
>>>>>>>>> nothing surprising or dangerous will happen in that case. From
>>> my
>>>>>>>>> perspective, having two sets of allowed operations is actually
>>> an
>>>>>>>> increase
>>>>>>>>> in cognitive complexity.
>>>>>>>>>
>>>>>>>>> 2. To me, it's not crazy to use the operator this way. For
>>>>> example,
>>>>>> in
>>>>>>>> lieu
>>>>>>>>> of full-featured timestamp semantics, I can implement MVCC
>>>>> behavior
>>>>>>> when
>>>>>>>>> building a KTable by "suppressLateEvents(Duration.ZERO)". I
>>>>> suspect
>>>>>>> that
>>>>>>>>> there are other, non-obvious applications of suppressing late
>>>>> events
>>>>>> on
>>>>>>>>> KTables.
>>>>>>>>>
>>>>>>>>> 3. Not to get too much into implementation details in a KIP
>>>>>> discussion,
>>>>>>>> but
>>>>>>>>> if we did want to make late-event suppression available only
>> on
>>>>>>> windowed
>>>>>>>>> KTables, we have two enforcement options:
>>>>>>>>>   a. check when we build the topology - this would be simple
>> to
>>>>>>>> implement,
>>>>>>>>> but would be a runtime check. Hopefully, people write tests
>> for
>>>>> their
>>>>>>>>> topology before deploying them, so the feedback loop isn't
>>>>>>> instantaneous,
>>>>>>>>> but it's not too long either.
>>>>>>>>>   b. add a new WindowedKTable type - this would be a compile
>>> time
>>>>>>> check,
>>>>>>>>> but would also be substantial increase of both interface and
>>> code
>>>>>>>>> complexity.
>>>>>>>>>
>>>>>>>>> We should definitely strive to have guard rails protecting
>>> against
>>>>>>>>> surprising or dangerous behavior. Protecting against programs
>>>>> that we
>>>>>>>> don't
>>>>>>>>> currently predict is a lesser benefit, and I think we can put
>> up
>>>>>> guard
>>>>>>>>> rails on a case-by-case basis for that. It seems like the
>>>>> increase in
>>>>>>>>> cognitive (and potentially code and interface) complexity
>> makes
>>> me
>>>>>>> think
>>>>>>>> we
>>>>>>>>> should skip this case.
>>>>>>>>>
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
>>>>>>> matthias@confluent.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the KIP John.
>>>>>>>>>>
>>>>>>>>>> One initial comment about the last example "Bounded
>>> lateness":
>>>>>> For a
>>>>>>>>>> non-windowed KTable bounding the lateness does not really
>> make
>>>>>> sense,
>>>>>>>>>> does it?
>>>>>>>>>>
>>>>>>>>>> Thus, I am wondering if we should allow
>> `suppressLateEvents()`
>>>>> for
>>>>>>> this
>>>>>>>>>> case? It seems to be better to only allow it for
>>>>> windowed-KTables.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -Matthias
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/27/18 8:53 AM, Ted Yu wrote:
>>>>>>>>>>> I noticed this (lack of primary parameter) as well.
>>>>>>>>>>>
>>>>>>>>>>> What you gave as new example is semantically the same as
>>> what
>>>>> I
>>>>>>>>>> suggested.
>>>>>>>>>>> So it is good by me.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
>>>>> john@confluent.io
>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for taking a look, Ted,
>>>>>>>>>>>>
>>>>>>>>>>>> I agree this is a departure from the conventions of
>> Streams
>>>>> DSL.
>>>>>>>>>>>>
>>>>>>>>>>>> Most of our config objects have one or two "required"
>>>>>> parameters,
>>>>>>>>> which
>>>>>>>>>> fit
>>>>>>>>>>>> naturally with the static factory method approach.
>>>>> TimeWindows,
>>>>>> for
>>>>>>>>>> example,
>>>>>>>>>>>> requires a size parameter, so we can naturally say
>>>>>>>>> TimeWindows.of(size).
>>>>>>>>>>>>
>>>>>>>>>>>> I think in the case of a suppression, there's really no
>>>>> "core"
>>>>>>>>>> parameter,
>>>>>>>>>>>> and "Suppression.of()" seems sillier than "new
>>>>> Suppression()". I
>>>>>>>> think
>>>>>>>>>> that
>>>>>>>>>>>> Suppression.of(duration) would be ambiguous, since there
>>> are
>>>>>> many
>>>>>>>>>> durations
>>>>>>>>>>>> that we can configure.
>>>>>>>>>>>>
>>>>>>>>>>>> However, thinking about it again, I suppose that I can
>> give
>>>>> each
>>>>>>>>>>>> configuration method a static version, which would let
>> you
>>>>>> replace
>>>>>>>>> "new
>>>>>>>>>>>> Suppression()." with "Suppression." in all the examples.
>>>>>>> Basically,
>>>>>>>>>> instead
>>>>>>>>>>>> of "of()", we'd support any of the methods I listed.
>>>>>>>>>>>>
>>>>>>>>>>>> For example:
>>>>>>>>>>>>
>>>>>>>>>>>> windowCounts
>>>>>>>>>>>>     .suppress(
>>>>>>>>>>>>         Suppression
>>>>>>>>>>>>             .suppressLateEvents(Duration.ofMinutes(10))
>>>>>>>>>>>>             .suppressIntermediateEvents(
>>>>>>>>>>>>
>>>>>>>>>>  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>>>>>>>>>>>>             )
>>>>>>>>>>>>     );
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Does that seem better?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
>>> yuzhihong@gmail.com
>>>>>>
>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I started to read this KIP, which contains a lot of
>>>>> material.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One suggestion:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     .suppress(
>>>>>>>>>>>>>         new Suppression()
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you think it would be more consistent with the rest
>> of
>>>>>> Streams
>>>>>>>>> data
>>>>>>>>>>>>> structures by supporting `of` ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suppression.of(Duration.ofMinutes(10))
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
>>>>>> john@confluent.io
>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello devs and users,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please take some time to consider this proposal for
>> Kafka
>>>>>>> Streams:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> KIP-328: Ability to suppress updates for KTables
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The basic idea is to provide:
>>>>>>>>>>>>>> * more usable control over update rate (vs the current
>>>>> state
>>>>>>> store
>>>>>>>>>>>>> caches)
>>>>>>>>>>>>>> * the final-result-for-windowed-computations feature
>>> which
>>>>>>> several
>>>>>>>>>>>> people
>>>>>>>>>>>>>> have requested
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I look forward to your feedback!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Guozhang
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> -- Guozhang
>>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Guozhang,

I see. It seems like if we want to decouple 1) and 2), we need to alter the
definition of the window. Do you think it would close the gap if we added a
"window close" time to the window definition?

Such as:

builder.stream("input")
.groupByKey()
.windowedBy(
  TimeWindows
    .of(60_000)
    .closeAfter(10 * 60 * 1000)
    .until(30L * 24 * 60 * 60 * 1000)
)
.count()
.suppress(Suppression.finalResultsOnly());

Possibly called "finalResultsAtWindowClose" or something?
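
To make the three durations concrete, here is a minimal sketch in plain Java.
It is only a model of the semantics discussed above, not Streams code: the
class WindowLifecycleSketch, its Phase enum, and the choice to measure both
the close time and the retention from the window end are assumptions made
purely for illustration.

```java
import java.time.Duration;

/** Hypothetical model of a window's lifecycle; not part of Kafka Streams. */
final class WindowLifecycleSketch {

    /** OPEN: still accepts updates. CLOSED_QUERYABLE: final, but retained. EXPIRED: dropped. */
    enum Phase { OPEN, CLOSED_QUERYABLE, EXPIRED }

    private final long sizeMs;
    private final long closeAfterMs;
    private final long retentionMs;

    WindowLifecycleSketch(final Duration size, final Duration closeAfter, final Duration retention) {
        // Assumption: a window cannot stay open longer than it is retained.
        if (closeAfter.compareTo(retention) > 0) {
            throw new IllegalArgumentException("closeAfter must not exceed retention");
        }
        this.sizeMs = size.toMillis();
        this.closeAfterMs = closeAfter.toMillis();
        this.retentionMs = retention.toMillis();
    }

    /** Phase of the window starting at windowStartMs, observed at streamTimeMs. */
    Phase phaseOf(final long windowStartMs, final long streamTimeMs) {
        final long windowEndMs = windowStartMs + sizeMs;
        if (streamTimeMs < windowEndMs + closeAfterMs) {
            return Phase.OPEN;
        }
        if (streamTimeMs < windowEndMs + retentionMs) {
            return Phase.CLOSED_QUERYABLE;
        }
        return Phase.EXPIRED;
    }

    /** A late record updates the window (and may be sent downstream) only while it is open. */
    boolean acceptsUpdate(final long windowStartMs, final long streamTimeMs) {
        return phaseOf(windowStartMs, streamTimeMs) == Phase.OPEN;
    }

    public static void main(final String[] args) {
        final WindowLifecycleSketch window = new WindowLifecycleSketch(
            Duration.ofMinutes(1), Duration.ofMinutes(10), Duration.ofDays(30));
        System.out.println(window.phaseOf(0L, Duration.ofMinutes(5).toMillis()));
        System.out.println(window.phaseOf(0L, Duration.ofHours(1).toMillis()));
        System.out.println(window.phaseOf(0L, Duration.ofDays(31).toMillis()));
    }
}
```

In this model, late-record acceptance and downstream updates share one bound
(the close time), while the retention is a separate and larger bound that
only governs how long the closed window stays queryable.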

Thanks,
-John

On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hey John,
>
> Obviously I'm too lazy on email replying diligence compared with you :)
> Will try to reply them separately:
>
>
> -----------------------------------------------------------------------------
>
> To reply to your email of "Mon, Jul 2, 2018 at 8:23 AM":
>
> I'm aware of this use case, but again, the concern is that, in this setting
> in order to let the window be queryable for 30 days, we will actually
> process data as old as 30 days as well, while most of the late updates
> beyond 5 minutes would be discarded anyways. Personally I think for the
> final update scenario, the ideal situation users would want is that "do not
> process any data that is more than 5 minutes late, and of course no update
> records to the downstream later than 5 minutes either; but retain the
> window to be queryable for 30 days". And by doing that the final window
> snapshot would also be aligned with the update stream as well. In other
> words, among these three periods:
>
> 1) the retention length of the window / table.
> 2) the late records acceptance for updating the window.
> 3) the late records update to be sent downstream.
>
> Final update use cases would naturally want 2) = 3), while 1) may be
> different and larger, while what we provide now is that 1) = 2), which
> could be different and in practice larger than 3), hence not the most
> intuitive for their needs.
>
>
>
> -----------------------------------------------------------------------------
>
> To reply to your email of "Mon, Jul 2, 2018 at 10:27 AM":
>
> I also prefer option 2) over option 1) from a programming pov. But I'm
> wondering if option 2) would provide the above semantics, or whether it
> still couples 1) with 2)?
>
>
>
> Guozhang
>
>
>
>
> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io> wrote:
>
> > In fact, to push the idea further (which IIRC is what Matthias originally
> > proposed), if we can accept "Suppression#finalResultsOnly" in my last
> > email, then we could also consider whether to eliminate
> > "suppressLateEvents" entirely.
> >
> > We could always add it later, but you've both expressed doubt that there
> > are practical use cases for it outside of final-results.
> >
> > -John
> >
> > On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi again, Guozhang ;) Here's the second part of my response...
> > >
> > > It seems like your main concern is: "if I'm a user who wants final
> update
> > > semantics, how complicated is it for me to get it?"
> > >
> > > I think we have to assume that people don't always have time to become
> > > deeply familiar with all the nuances of a programming environment
> before
> > > they use it. Especially if they're evaluating several frameworks for
> > their
> > > use case, it's very valuable to make it as obvious as possible how to
> > > accomplish various computations with Streams.
> > >
> > > To me the biggest question is whether with a fresh perspective, people
> > > would say "oh, I get it, I have to bound my lateness and suppress
> > > intermediate updates, and of course I'll get only the final result!",
> or
> > if
> > > it's more like "wtf? all I want is the final result, what are all these
> > > parameters?".
> > >
> > > I was talking with Matthias a while back, and he had an idea that I
> think
> > > can help, which is to essentially set up a final-result recipe in
> > addition
> > > to the raw parameters. I previously thought that it wouldn't be
> possible
> > to
> > > restrict its usage to Windowed KTables, but thinking about it again
> this
> > > weekend, I have a couple of ideas:
> > >
> > > ================
> > > = 1. Static Wrapper =
> > > ================
> > > We can define an extra static function that "wraps" a KTable with
> > > final-result semantics.
> > >
> > > public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> > >   final KTable<K, V> windowedKTable,
> > >   final Duration maxAllowedLateness,
> > >   final Suppression.BufferFullStrategy bufferFullStrategy) {
> > >     return windowedKTable.suppress(
> > >         Suppression.suppressLateEvents(maxAllowedLateness)
> > >                    .suppressIntermediateEvents(
> > >                      IntermediateSuppression
> > >                        .emitAfter(maxAllowedLateness)
> > >                        .bufferFullStrategy(bufferFullStrategy)
> > >                    )
> > >     );
> > > }
> > >
> > > Because windowedKTable is a parameter, the static function can easily
> > > impose an extra bound on the key type, that it extends Windowed. This
> > would
> > > make "final results only" only available on windowed ktables.
> > >
> > > Here's how it would look to use:
> > >
> > > final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > final KTable<Windowed<Integer>, Long> finalCounts =
> > >   finalResultsOnly(
> > >     windowCounts,
> > >     Duration.ofMinutes(10),
> > >     Suppression.BufferFullStrategy.SHUT_DOWN
> > >   );
> > >
> > > Trying to use it on a non-windowed KTable yields:
> > >
> > >> Error:(129, 35) java: method finalResultsOnly in class
> > >> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot
> > be
> > >> applied to given types;
> > >>   required:
> > >> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.
> > Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> > >>   found:
> > >> org.apache.kafka.streams.kstream.KTable<java.lang.
> > String,java.lang.String>,java.time.Duration,org.apache.
> > kafka.streams.kstream.Suppression.BufferFullStrategy
> > >>   reason: inference variable K has incompatible bounds
> > >>     equality constraints: java.lang.String
> > >>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> > >
> > >
> > >
> > > =================================================
> > > = 2. Add <K,V> parameters and recipe method to Suppression =
> > > =================================================
> > >
> > > By adding K,V parameters to Suppression, we can provide a similarly
> > > bounded config method directly on the Suppression class:
> > >
> > > public static <K extends Windowed, V> Suppression<K, V>
> > > finalResultsOnly(final Duration maxAllowedLateness, final
> > > BufferFullStrategy bufferFullStrategy) {
> > >     return Suppression
> > >         .<K, V>suppressLateEvents(maxAllowedLateness)
> > >         .suppressIntermediateEvents(IntermediateSuppression
> > >             .emitAfter(maxAllowedLateness)
> > >             .bufferFullStrategy(bufferFullStrategy)
> > >         );
> > > }
> > >
> > > Then, here's how it would look to use it:
> > >
> > > final KTable<Windowed<Integer>, Long> windowCounts = ...
> > > final KTable<Windowed<Integer>, Long> finalCounts =
> > >   windowCounts.suppress(
> > >     Suppression.finalResultsOnly(
> > >       Duration.ofMinutes(10),
> > >       Suppression.BufferFullStrategy.SHUT_DOWN
> > >     )
> > >   );
> > >
> > > Trying to use it on a non-windowed ktable yields:
> > >
> > >> Error:(127, 35) java: method finalResultsOnly in class
> > >> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> > >> given types;
> > >>   required:
> > >> java.time.Duration,org.apache.kafka.streams.kstream.
> > Suppression.BufferFullStrategy
> > >>   found:
> > >> java.time.Duration,org.apache.kafka.streams.kstream.
> > Suppression.BufferFullStrategy
> > >>   reason: explicit type argument java.lang.String does not conform to
> > >> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> > >
> > >
> > >
> > > ============
> > > = Downsides =
> > > ============
> > >
> > > Of course, there's a downside either way:
> > > * for 1:  this "wrapper" interaction would be the first in the DSL. Is
> it
> > > too strange, and how discoverable would it be?
> > > * for 2: adding those type parameters to Suppression will force all
> > > callers to provide them in the event of a chained construction because
> > Java
> > > doesn't do RHS recursive type inference. This is already visible in
> other
> > > parts of the Streams DSL. For example, often calls to Materialized
> > builders
> > > have to provide seemingly obvious type bounds.
> > >
> > > ============
> > > = Conclusion =
> > > ============
> > >
> > > I think option 2 is more "normal" and discoverable. It does have a
> > > downside, but it's one that's pre-existing elsewhere in the DSL.
> > >
> > > WDYT? Would the addition of this "recipe" method to Suppression resolve
> > > your concern?
> > >
> > > Thanks again,
> > > -John
> > >
> > > On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > >> Hi John,
> > >>
> > >> Regarding the metrics: yeah I think I'm with you that the dropped
> > records
> > >> due to window retention or emit suppression policies should be
> recorded
> > >> differently, and using this KIP's proposed metric would be fine. If
> you
> > >> also think we can use this KIP's proposed metrics to cover records
> > >> skipped due to window retention, then we can include the changes in
> > this
> > >> KIP as well.
> > >>
> > >> Regarding the current proposal, I'm actually not too worried about the
> > >> inconsistency between query semantics and downstream emit semantics.
> For
> > >> queries, we will always return the current running results of the
> > windows,
> > >> being it partial or final results depending on the window retention
> time
> > >> anyways, which has nothing to do whether the emitted stream should be
> > one
> > >> final output per key or not. I also agree that having a unified
> > operation
> > >> is generally better for users to focus on leveraging that one only
> than
> > >> learning about two set of operations. The only question I had is, for
> > >> final
> > >> updates of window stores, if it is a bit awkward to understand the
> > >> configuration combo. Thinking about this more, I think my root worry
> is in
> > >> the
> > >> "suppressLateEvents" call for windowed tables, since from a user
> > >> perspective: if my retention time is X which means "pay the cost to
> > allow
> > >> late records up to X to still be applied updating the tables", why
> > would I
> > >> ever want to suppressLateEvents by Y ( < X), to say "do not send the
> > >> updates up to Y, which means the downstream operator or sink topic for
> > >> this
> > >> stream would actually see a truncated update stream while I've paid
> > larger
> > >> cost for that"; and of course, Y > X would not make sense either as
> you
> > >> would not see any updates later than X anyways. So in all, my feeling
> is
> > >> that it makes less sense for windowed table's "suppressLateEvents"
> with
> > a
> > >> parameter that is not equal to the window retention, and opening the
> > door
> > >> in the current proposal may confuse people with that.
> > >>
> > >> Again, above is just a subjective opinion and probably we can also
> bring
> > >> up
> > >> some scenarios where users do want to set X != Y.. but personally I
> > feel
> > >> that even if the semantics for this scenario are intuitive for users to
> > >> understand, does that really make sense and should we really open the
> > door
> > >> for it. So I think maybe separating the final update in a separate
> API's
> > >> benefits may overwhelm the advantage of having one uniform definition.
> > And
> > >> for my alternative proposal, the rationale was from both my concern
> > about
> > >> "suppressLateEvents" for windowed store, and Matthias' question about
> > >> "suppressLateEvents" for non-windowed stores, that if it is less
> > >> meaningful
> > >> for both, we can consider removing it completely and only do
> > >> "IntermediateSuppression" in Suppress instead.
> > >>
> > >> So I'd summarize my thoughts in the following questions:
> > >>
> > >> 1. Does "suppressLateEvents" with parameter Y != X (window retention
> > time)
> > >> for windowed stores make sense in practice?
> > >> 2. Does "suppressLateEvents" with any parameter Y for non-windowed
> > stores
> > >> make sense in practice?
> > >>
> > >>
> > >>
> > >> Guozhang
> > >>
> > >>
> > >> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com>
> wrote:
> > >>
> > >> > Thanks for the explanation, that does make sense.  I have some
> > >> questions on
> > >> > operations, but I'll just wait for the PR and tests.
> > >> >
> > >> > Thanks,
> > >> > Bill
> > >> >
> > >> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io>
> > wrote:
> > >> >
> > >> > > Hi Bill,
> > >> > >
> > >> > > Thanks for the review!
> > >> > >
> > >> > > Your question is very much applicable to the KIP and not at all an
> > >> > > implementation detail. Thanks for bringing it up.
> > >> > >
> > >> > > I'm proposing not to change the existing caches and configurations
> > at
> > >> all
> > >> > > (for now).
> > >> > >
> > >> > > Imagine you have a topology like this:
> > >> > > commit.interval.ms = 100
> > >> > >
> > >> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> > >> > >
> > >> > > The first ktable (ktable1) will respect the commit interval and
> > buffer
> > >> > > events for 100ms before logging, storing, or forwarding them
> (IIRC).
> > >> > > Therefore, the second ktable (suppress) will only see the events
> at
> > a
> > >> > rate
> > >> > > of once per 100ms. It will apply its own buffering, and emit once
> > per
> > >> > 200ms.
> > >> > > This case is pretty trivial because the suppress time is a
> multiple
> > of
> > >> > the
> > >> > > commit interval.
> > >> > >
> > >> > > When it's not an integer multiple, you'll get behavior like in
> this
> > >> > marble
> > >> > > diagram:
> > >> > >
> > >> > >
> > >> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >> > >
> > >> > > [ KTable caching with commit interval = 2 ]
> > >> > >
> > >> > > <--------(k:2)---------(k:4)---------(k:6)->
> > >> > >
> > >> > >       [ suppress with emitAfter = 3 ]
> > >> > >
> > >> > > <---------------(k:2)----------------(k:6)->
> > >> > >
> > >> > >
> > >> > > If this behavior isn't desired (for example, if you wanted to emit
> > >> (k:3)
> > >> > at
> > >> > > time 3), I'd recommend setting the "cache.max.bytes.buffering" to 0
> > or
> > >> > > modifying the topology to disable caching. Then, the behavior is
> > more
> > >> > > simply determined just by the suppress operator.
> > >> > >
> > >> > > Does that seem right to you?
> > >> > >
> > >> > >
> > >> > > Regarding the changelogs, because the suppression operator hangs
> > onto
> > >> > > events for a while, it will need its own changelog. The changelog
> > >> > > should represent the current state of the buffer at all times. So
> > when
> > >> > the
> > >> > > suppress operator sees (k:2), for example, it will log (k:2). When
> > it
> > >> > > later gets to time 3, it's time to emit (k:2) downstream. Because
> k
> > >> is no
> > >> > > longer buffered, the suppress operator will log (k:null). Thus,
> when
> > >> > > recovering,
> > >> > > it can rebuild the buffer by reading its changelog.
> > >> > >
> > >> > > What do you think about this?
> > >> > >
> > >> > > Thanks,
> > >> > > -John
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
> > >> wrote:
> > >> > >
> > >> > > > Hi John,  thanks for the KIP.
> > >> > > >
> > >> > > > Early on in the KIP, you mention the current approaches for
> > >> controlling
> > >> > > the
> > >> > > > rate of downstream records from a KTable, cache size
> configuration
> > >> and
> > >> > > > commit time.
> > >> > > >
> > >> > > > Will these configuration parameters still be in effect for
> tables
> > >> that
> > >> > > > don't use suppression?  For tables taking advantage of
> > suppression,
> > >> > will
> > >> > > > these configurations have no impact?
> > >> > > > This last question may be to implementation specific but if the
> > >> > requested
> > >> > > > suppression time is longer than the specified commit time, will
> > the
> > >> > > latest
> > >> > > > record in the suppression buffer get stored in a changelog?
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Bill
> > >> > > >
> > >> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <john@confluent.io
> >
> > >> > wrote:
> > >> > > >
> > >> > > > > Thanks for the feedback, Matthias,
> > >> > > > >
> > >> > > > > It seems like in straightforward relational processing cases,
> it
> > >> > would
> > >> > > > not
> > >> > > > > make sense to bound the lateness of KTables. In general, it
> > seems
> > >> > > better
> > >> > > > to
> > >> > > > > have "guard rails" in place that make it easier to write
> > sensible
> > >> > > > programs
> > >> > > > > than insensible ones.
> > >> > > > >
> > >> > > > > But I'm still going to argue in favor of keeping it for all
> > >> KTables
> > >> > ;)
> > >> > > > >
> > >> > > > > 1. I believe it is simpler to understand the operator if it
> has
> > >> one
> > >> > > > uniform
> > >> > > > > definition, regardless of context. It's well defined and
> > intuitive
> > >> > what
> > >> > > > > will happen when you use late-event suppression on a KTable,
> so
> > I
> > >> > think
> > >> > > > > nothing surprising or dangerous will happen in that case. From
> > my
> > >> > > > > perspective, having two sets of allowed operations is actually
> > an
> > >> > > > increase
> > >> > > > > in cognitive complexity.
> > >> > > > >
> > >> > > > > 2. To me, it's not crazy to use the operator this way. For
> > >> example,
> > >> > in
> > >> > > > lieu
> > >> > > > > of full-featured timestamp semantics, I can implement MVCC
> > >> behavior
> > >> > > when
> > >> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I
> > >> suspect
> > >> > > that
> > >> > > > > there are other, non-obvious applications of suppressing late
> > >> events
> > >> > on
> > >> > > > > KTables.
> > >> > > > >
> > >> > > > > 3. Not to get too much into implementation details in a KIP
> > >> > discussion,
> > >> > > > but
> > >> > > > > if we did want to make late-event suppression available only
> on
> > >> > > windowed
> > >> > > > > KTables, we have two enforcement options:
> > >> > > > >   a. check when we build the topology - this would be simple
> to
> > >> > > > implement,
> > >> > > > > but would be a runtime check. Hopefully, people write tests
> for
> > >> their
> > >> > > > > topology before deploying them, so the feedback loop isn't
> > >> > > instantaneous,
> > >> > > > > but it's not too long either.
> > >> > > > >   b. add a new WindowedKTable type - this would be a compile
> > time
> > >> > > check,
> > >> > > > > but would also be substantial increase of both interface and
> > code
> > >> > > > > complexity.
> > >> > > > >
> > >> > > > > We should definitely strive to have guard rails protecting
> > against
> > >> > > > > surprising or dangerous behavior. Protecting against programs
> > >> that we
> > >> > > > don't
> > >> > > > > currently predict is a lesser benefit, and I think we can put
> up
> > >> > guard
> > >> > > > > rails on a case-by-case basis for that. It seems like the
> > >> increase in
> > >> > > > > cognitive (and potentially code and interface) complexity
> makes
> > me
> > >> > > think
> > >> > > > we
> > >> > > > > should skip this case.
> > >> > > > >
> > >> > > > > What do you think?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > -John
> > >> > > > >
> > >> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > >> > > matthias@confluent.io>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Thanks for the KIP John.
> > >> > > > > >
> > >> > > > > > One initial comments about the last example "Bounded
> > lateness":
> > >> > For a
> > >> > > > > > non-windowed KTable bounding the lateness does not really
> make
> > >> > sense,
> > >> > > > > > does it?
> > >> > > > > >
> > >> > > > > > Thus, I am wondering if we should allow
> `suppressLateEvents()`
> > >> for
> > >> > > this
> > >> > > > > > case? It seems to be better to only allow it for
> > >> windowed-KTables.
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > -Matthias
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > >> > > > > > > I noticed this (lack of primary parameter) as well.
> > >> > > > > > >
> > >> > > > > > > What you gave as new example is semantically the same as
> > what
> > >> I
> > >> > > > > > suggested.
> > >> > > > > > > So it is good by me.
> > >> > > > > > >
> > >> > > > > > > Thanks
> > >> > > > > > >
> > >> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> > >> john@confluent.io
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > >
> > >> > > > > > >> Thanks for taking look, Ted,
> > >> > > > > > >>
> > >> > > > > > >> I agree this is a departure from the conventions of
> Streams
> > >> DSL.
> > >> > > > > > >>
> > >> > > > > > >> Most of our config objects have one or two "required"
> > >> > parameters,
> > >> > > > > which
> > >> > > > > > fit
> > >> > > > > > >> naturally with the static factory method approach.
> > >> TimeWindow,
> > >> > for
> > >> > > > > > example,
> > >> > > > > > >> requires a size parameter, so we can naturally say
> > >> > > > > TimeWindows.of(size).
> > >> > > > > > >>
> > >> > > > > > >> I think in the case of a suppression, there's really no
> > >> "core"
> > >> > > > > > parameter,
> > >> > > > > > >> and "Suppression.of()" seems sillier than "new
> > >> Suppression()". I
> > >> > > > think
> > >> > > > > > that
> > >> > > > > > >> Suppression.of(duration) would be ambiguous, since there
> > are
> > >> > many
> > >> > > > > > durations
> > >> > > > > > >> that we can configure.
> > >> > > > > > >>
> > >> > > > > > >> However, thinking about it again, I suppose that I can
> give
> > >> each
> > >> > > > > > >> configuration method a static version, which would let
> you
> > >> > replace
> > >> > > > > "new
> > >> > > > > > >> Suppression()." with "Suppression." in all the examples.
> > >> > > Basically,
> > >> > > > > > instead
> > >> > > > > > >> of "of()", we'd support any of the methods I listed.
> > >> > > > > > >>
> > >> > > > > > >> For example:
> > >> > > > > > >>
> > >> > > > > > >> windowCounts
> > >> > > > > > >>     .suppress(
> > >> > > > > > >>         Suppression
> > >> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > >> > > > > > >>             .suppressIntermediateEvents(
> > >> > > > > > >>
> > >> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > >> > > > > > >>             )
> > >> > > > > > >>     );
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> Does that seem better?
> > >> > > > > > >>
> > >> > > > > > >> Thanks,
> > >> > > > > > >> -John
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> > yuzhihong@gmail.com
> > >> >
> > >> > > > wrote:
> > >> > > > > > >>
> > >> > > > > > >>> I started to read this KIP which contains a lot of
> > >> materials.
> > >> > > > > > >>>
> > >> > > > > > >>> One suggestion:
> > >> > > > > > >>>
> > >> > > > > > >>>     .suppress(
> > >> > > > > > >>>         new Suppression()
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>> Do you think it would be more consistent with the rest
> of
> > >> > Streams
> > >> > > > > data
> > >> > > > > > >>> structures by supporting `of` ?
> > >> > > > > > >>>
> > >> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>> Cheers
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > >> > john@confluent.io
> > >> > > >
> > >> > > > > > wrote:
> > >> > > > > > >>>
> > >> > > > > > >>>> Hello devs and users,
> > >> > > > > > >>>>
> > >> > > > > > >>>> Please take some time to consider this proposal for
> Kafka
> > >> > > Streams:
> > >> > > > > > >>>>
> > >> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > >> > > > > > >>>>
> > >> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >> > > > > > >>>>
> > >> > > > > > >>>> The basic idea is to provide:
> > >> > > > > > >>>> * more usable control over update rate (vs the current
> > >> state
> > >> > > store
> > >> > > > > > >>> caches)
> > >> > > > > > >>>> * the final-result-for-windowed-computations feature
> > which
> > >> > > several
> > >> > > > > > >> people
> > >> > > > > > >>>> have requested
> > >> > > > > > >>>>
> > >> > > > > > >>>> I look forward to your feedback!
> > >> > > > > > >>>>
> > >> > > > > > >>>> Thanks,
> > >> > > > > > >>>> -John
> > >> > > > > > >>>>
> > >> > > > > > >>>
> > >> > > > > > >>
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

Hey John,

Obviously I'm not as diligent about replying to email as you are :)
I'll try to reply to each of them separately:

-----------------------------------------------------------------------------

To reply to your email of "Mon, Jul 2, 2018 at 8:23 AM":

I'm aware of this use case, but again, the concern is that in this setting,
in order to keep the window queryable for 30 days, we will also process
data that is up to 30 days late, even though most of the late updates
beyond 5 minutes would be discarded anyway. Personally, I think that for the
final-update scenario, what users would ideally want is: "do not process
any data that is more than 5 minutes late, and of course do not send any
update records downstream later than 5 minutes either; but keep the window
queryable for 30 days". That way, the final window snapshot would also be
aligned with the update stream. In other words, consider these three
periods:

1) the retention period of the window / table.
2) how late a record may be and still be accepted to update the window.
3) how late an update may be and still be sent downstream.

Final-update use cases would naturally want 2) = 3), while 1) may be
different and larger. What we provide now is 1) = 2), which can differ
from, and in practice be larger than, 3), so it is not the most intuitive
fit for their needs.
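
To make the three periods concrete, here is a rough sketch. It is not
Streams code: the class, enum, and method names are made up purely for
illustration, and it only models how a record's lateness would be treated
under a given choice of 1), 2), and 3).

```java
import java.time.Duration;

// Illustrative model only: none of these names exist in Kafka Streams.
class LatenessPeriods {
    enum Outcome { UPDATE_AND_EMIT, UPDATE_ONLY, DROPPED }

    // acceptance = period 2): how late a record may be and still update the window.
    // emitBound  = period 3): how late an update may be and still be sent downstream.
    static Outcome classify(Duration lateness, Duration acceptance, Duration emitBound) {
        if (lateness.compareTo(acceptance) > 0) {
            return Outcome.DROPPED;         // too late to touch the window at all
        }
        if (lateness.compareTo(emitBound) > 0) {
            return Outcome.UPDATE_ONLY;     // processed, but the update is not forwarded
        }
        return Outcome.UPDATE_AND_EMIT;
    }

    // retention = period 1): how long after the window end the window stays queryable.
    static boolean queryable(Duration ageSinceWindowEnd, Duration retention) {
        return ageSinceWindowEnd.compareTo(retention) <= 0;
    }

    public static void main(String[] args) {
        Duration thirtyDays = Duration.ofDays(30);
        Duration fiveMinutes = Duration.ofMinutes(5);
        Duration lateness = Duration.ofHours(1);

        // Current behavior as described above: 1) = 2) = 30 days, 3) = 5 minutes.
        System.out.println(classify(lateness, thirtyDays, fiveMinutes));  // UPDATE_ONLY

        // What final-update users would want: 2) = 3) = 5 minutes, 1) = 30 days.
        System.out.println(classify(lateness, fiveMinutes, fiveMinutes)); // DROPPED
        System.out.println(queryable(Duration.ofDays(10), thirtyDays));   // true
    }
}
```

In the first call the record is still processed even though nothing is
sent downstream, which is the cost I'm concerned about; in the second the
record is dropped while the window itself stays queryable.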


-----------------------------------------------------------------------------

To reply to your email of "Mon, Jul 2, 2018 at 10:27 AM":

From a programming point of view, I also like option 2) better than option
1). But I'm wondering whether option 2) would provide the above semantics,
or whether it still couples 1) with 2) as well?
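
For reference, this is roughly how I read the shape of option 2), reduced
to stand-in types. Windowed and Suppression below are simplified
placeholders written for this sketch, not the real Streams classes, and I
have left out the BufferFullStrategy argument for brevity.

```java
import java.time.Duration;

// Stand-in for a windowed key; NOT the Kafka Streams class.
class Windowed<K> {
    final K key;
    Windowed(K key) { this.key = key; }
}

// Stand-in for the proposed config object; NOT the Kafka Streams class.
class Suppression<K, V> {
    final Duration lateEventBound; // period 2) in the list above
    final Duration emitAfter;      // period 3) in the list above

    private Suppression(Duration lateEventBound, Duration emitAfter) {
        this.lateEventBound = lateEventBound;
        this.emitAfter = emitAfter;
    }

    // The bound on K restricts the recipe to windowed keys at compile time.
    static <K extends Windowed<?>, V> Suppression<K, V> finalResultsOnly(
            Duration maxAllowedLateness) {
        // One parameter drives both bounds, so 2) = 3) by construction.
        return new Suppression<>(maxAllowedLateness, maxAllowedLateness);
    }
}

class Option2Sketch {
    public static void main(String[] args) {
        Suppression<Windowed<Integer>, Long> ok =
            Suppression.finalResultsOnly(Duration.ofMinutes(10));
        System.out.println(ok.lateEventBound.equals(ok.emitAfter));

        // Does not compile, because String is not within the bound Windowed<?>:
        // Suppression<String, Long> bad =
        //     Suppression.finalResultsOnly(Duration.ofMinutes(10));
    }
}
```

The signature by itself only guarantees 2) = 3). Whether 1) stays coupled
to 2) depends on how the window store treats retention, which this sketch
cannot show.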



Guozhang




On Mon, Jul 2, 2018 at 1:08 PM, John Roesler <jo...@confluent.io> wrote:

> In fact, to push the idea further (which IIRC is what Matthias originally
> proposed), if we can accept "Suppression#finalResultsOnly" in my last
> email, then we could also consider whether to eliminate
> "suppressLateEvents" entirely.
>
> We could always add it later, but you've both expressed doubt that there
> are practical use cases for it outside of final-results.
>
> -John
>
> On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io> wrote:
>
> > Hi again, Guozhang ;) Here's the second part of my response...
> >
> > It seems like your main concern is: "if I'm a user who wants final update
> > semantics, how complicated is it for me to get it?"
> >
> > I think we have to assume that people don't always have time to become
> > deeply familiar with all the nuances of a programming environment before
> > they use it. Especially if they're evaluating several frameworks for their
> > use case, it's very valuable to make it as obvious as possible how to
> > accomplish various computations with Streams.
> >
> > To me the biggest question is whether with a fresh perspective, people
> > would say "oh, I get it, I have to bound my lateness and suppress
> > intermediate updates, and of course I'll get only the final result!", or if
> > it's more like "wtf? all I want is the final result, what are all these
> > parameters?".
> >
> > I was talking with Matthias a while back, and he had an idea that I think
> > can help, which is to essentially set up a final-result recipe in addition
> > to the raw parameters. I previously thought that it wouldn't be possible to
> > restrict its usage to Windowed KTables, but thinking about it again this
> > weekend, I have a couple of ideas:
> >
> > ================
> > = 1. Static Wrapper =
> > ================
> > We can define an extra static function that "wraps" a KTable with
> > final-result semantics.
> >
> > public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
> >   final KTable<K, V> windowedKTable,
> >   final Duration maxAllowedLateness,
> >   final Suppression.BufferFullStrategy bufferFullStrategy) {
> >     return windowedKTable.suppress(
> >         Suppression.suppressLateEvents(maxAllowedLateness)
> >                    .suppressIntermediateEvents(
> >                      IntermediateSuppression
> >                        .emitAfter(maxAllowedLateness)
> >                        .bufferFullStrategy(bufferFullStrategy)
> >                    )
> >     );
> > }
> >
> > Because windowedKTable is a parameter, the static function can easily
> > impose an extra bound on the key type, that it extends Windowed. This would
> > make "final results only" only available on windowed ktables.
> >
> > Here's how it would look to use:
> >
> > final KTable<Windowed<Integer>, Long> windowCounts = ...
> > final KTable<Windowed<Integer>, Long> finalCounts =
> >   finalResultsOnly(
> >     windowCounts,
> >     Duration.ofMinutes(10),
> >     Suppression.BufferFullStrategy.SHUT_DOWN
> >   );
> >
> > Trying to use it on a non-windowed KTable yields:
> >
> >> Error:(129, 35) java: method finalResultsOnly in class
> >> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
> >> applied to given types;
> >>   required:
> >> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >>   found:
> >> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >>   reason: inference variable K has incompatible bounds
> >>     equality constraints: java.lang.String
> >>     upper bounds: org.apache.kafka.streams.kstream.Windowed
> >
> >
> >
> > =================================================
> > = 2. Add <K,V> parameters and recipe method to Suppression =
> > =================================================
> >
> > By adding K,V parameters to Suppression, we can provide a similarly
> > bounded config method directly on the Suppression class:
> >
> > public static <K extends Windowed, V> Suppression<K, V>
> > finalResultsOnly(final Duration maxAllowedLateness, final
> > BufferFullStrategy bufferFullStrategy) {
> >     return Suppression
> >         .<K, V>suppressLateEvents(maxAllowedLateness)
> >         .suppressIntermediateEvents(IntermediateSuppression
> >             .emitAfter(maxAllowedLateness)
> >             .bufferFullStrategy(bufferFullStrategy)
> >         );
> > }
> >
> > Then, here's how it would look to use it:
> >
> > final KTable<Windowed<Integer>, Long> windowCounts = ...
> > final KTable<Windowed<Integer>, Long> finalCounts =
> >   windowCounts.suppress(
> >     Suppression.finalResultsOnly(
> >       Duration.ofMinutes(10),
> >       Suppression.BufferFullStrategy.SHUT_DOWN
> >     )
> >   );
> >
> > Trying to use it on a non-windowed ktable yields:
> >
> >> Error:(127, 35) java: method finalResultsOnly in class
> >> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> >> given types;
> >>   required:
> >> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >>   found:
> >> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
> >>   reason: explicit type argument java.lang.String does not conform to
> >> declared bound(s) org.apache.kafka.streams.kstream.Windowed
> >
> >
> >
> > ============
> > = Downsides =
> > ============
> >
> > Of course, there's a downside either way:
> > * for 1:  this "wrapper" interaction would be the first in the DSL. Is it
> > too strange, and how discoverable would it be?
> > * for 2: adding those type parameters to Suppression will force all
> > callers to provide them in the event of a chained construction because Java
> > doesn't do RHS recursive type inference. This is already visible in other
> > parts of the Streams DSL. For example, often calls to Materialized builders
> > have to provide seemingly obvious type bounds.
> >
> > ============
> > = Conclusion =
> > ============
> >
> > I think option 2 is more "normal" and discoverable. It does have a
> > downside, but it's one that's pre-existing elsewhere in the DSL.
> >
> > WDYT? Would the addition of this "recipe" method to Suppression resolve
> > your concern?
> >
> > Thanks again,
> > -John
> >
> > On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> >> Hi John,
> >>
> >> Regarding the metrics: yeah, I think I'm with you that the records dropped
> >> due to window retention or emit suppression policies should be recorded
> >> differently, and using this KIP's proposed metric would be fine. If you
> >> also think we can use this KIP's proposed metrics to cover the records
> >> skipped due to window retention, then we can include the changes in this
> >> KIP as well.
> >>
> >> Regarding the current proposal, I'm actually not too worried about the
> >> inconsistency between query semantics and downstream emit semantics. For
> >> queries, we will always return the current running results of the windows,
> >> be they partial or final results depending on the window retention time
> >> anyway, which has nothing to do with whether the emitted stream should be
> >> one final output per key or not. I also agree that having a unified
> >> operation is generally better, since users can focus on that one operation
> >> rather than learning two sets of operations. The only question I had is
> >> whether, for final updates of window stores, the configuration combo is a
> >> bit awkward to understand. Thinking about this more, I think my root worry
> >> is the "suppressLateEvents" call for windowed tables. From a user
> >> perspective: if my retention time is X, which means "pay the cost to allow
> >> late records up to X to still be applied when updating the tables", why
> >> would I ever want to suppressLateEvents by Y ( < X), to say "do not send
> >> the updates later than Y", which means the downstream operator or sink
> >> topic for this stream would actually see a truncated update stream while
> >> I've paid a larger cost for that? And of course, Y > X would not make sense
> >> either, as you would not see any updates later than X anyway. So in all,
> >> my feeling is that it makes less sense to call "suppressLateEvents" on a
> >> windowed table with a parameter that is not equal to the window retention,
> >> and opening the door to it in the current proposal may confuse people.
> >>
> >> Again, the above is just a subjective opinion, and we can probably also
> >> come up with some scenarios where users do want to set X != Y. But
> >> personally I feel that even if the semantics for this scenario are
> >> intuitive for users to understand, does it really make sense, and should
> >> we really open the door for it? So I think the benefits of separating the
> >> final update into a separate API may outweigh the advantage of having one
> >> uniform definition. As for my alternative proposal, the rationale came
> >> from both my concern about "suppressLateEvents" for windowed stores and
> >> Matthias' question about "suppressLateEvents" for non-windowed stores: if
> >> it is less meaningful for both, we can consider removing it completely and
> >> only do "IntermediateSuppression" in Suppress instead.
> >>
> >> So I'd summarize my thoughts in the following questions:
> >>
> >> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
> >> for windowed stores make sense in practice?
> >> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
> >> make sense in practice?
> >>
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
> >>
> >> > Thanks for the explanation, that does make sense.  I have some
> >> questions on
> >> > operations, but I'll just wait for the PR and tests.
> >> >
> >> > Thanks,
> >> > Bill
> >> >
> >> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io>
> wrote:
> >> >
> >> > > Hi Bill,
> >> > >
> >> > > Thanks for the review!
> >> > >
> >> > > Your question is very much applicable to the KIP and not at all an
> >> > > implementation detail. Thanks for bringing it up.
> >> > >
> >> > > I'm proposing not to change the existing caches and configurations
> at
> >> all
> >> > > (for now).
> >> > >
> >> > > Imagine you have a topology like this:
> >> > > commit.interval.ms = 100
> >> > >
> >> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> >> > >
> >> > > The first ktable (ktable1) will respect the commit interval and
> buffer
> >> > > events for 100ms before logging, storing, or forwarding them (IIRC).
> >> > > Therefore, the second ktable (suppress) will only see the events at
> a
> >> > rate
> >> > > of once per 100ms. It will apply its own buffering, and emit once
> per
> >> > 200ms
> >> > > This case is pretty trivial because the suppress time is a multiple
> of
> >> > the
> >> > > commit interval.
> >> > >
> >> > > When it's not an integer multiple, you'll get behavior like in this
> >> > marble
> >> > > diagram:
> >> > >
> >> > >
> >> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> >> > >
> >> > > [ KTable caching with commit interval = 2 ]
> >> > >
> >> > > <--------(k:2)---------(k:4)---------(k:6)->
> >> > >
> >> > >       [ suppress with emitAfter = 3 ]
> >> > >
> >> > > <---------------(k:2)----------------(k:6)->
> >> > >
> >> > >
> >> > > If this behavior isn't desired (for example, if you wanted to emit
> >> (k:3)
> >> > at
> >> > > time 3, I'd recommend setting the "cache.max.bytes.buffering" to 0
> or
> >> > > modifying the topology to disable caching. Then, the behavior is
> more
> >> > > simply determined just by the suppress operator.
> >> > >
> >> > > Does that seem right to you?
> >> > >
> >> > >
> >> > > Regarding the changelogs, because the suppression operator hangs
> onto
> >> > > events for a while, it will need its own changelog. The changelog
> >> > > should represent the current state of the buffer at all times. So
> when
> >> > the
> >> > > suppress operator sees (k:2), for example, it will log (k:2). When
> it
> >> > > later gets to time 3, it's time to emit (k:2) downstream. Because k
> >> is no
> >> > > longer buffered, the suppress operator will log (k:null). Thus, when
> >> > > recovering,
> >> > > it can rebuild the buffer by reading its changelog.
> >> > >
> >> > > What do you think about this?
> >> > >
> >> > > Thanks,
> >> > > -John
> >> > >
> >> > >
> >> > >
> >> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
> >> wrote:
> >> > >
> >> > > > Hi John,  thanks for the KIP.
> >> > > >
> >> > > > Early on in the KIP, you mention the current approaches for
> >> controlling
> >> > > the
> >> > > > rate of downstream records from a KTable, cache size configuration
> >> and
> >> > > > commit time.
> >> > > >
> >> > > > Will these configuration parameters still be in effect for tables
> >> that
> >> > > > don't use suppression?  For tables taking advantage of
> suppression,
> >> > will
> >> > > > these configurations have no impact?
> >> > > > This last question may be to implementation specific but if the
> >> > requested
> >> > > > suppression time is longer than the specified commit time, will
> the
> >> > > latest
> >> > > > record in the suppression buffer get stored in a changelog?
> >> > > >
> >> > > > Thanks,
> >> > > > Bill
> >> > > >
> >> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
> >> > wrote:
> >> > > >
> >> > > > > Thanks for the feedback, Matthias,
> >> > > > >
> >> > > > > It seems like in straightforward relational processing cases, it
> >> > would
> >> > > > not
> >> > > > > make sense to bound the lateness of KTables. In general, it
> seems
> >> > > better
> >> > > > to
> >> > > > > have "guard rails" in place that make it easier to write
> sensible
> >> > > > programs
> >> > > > > than insensible ones.
> >> > > > >
> >> > > > > But I'm still going to argue in favor of keeping it for all
> >> KTables
> >> > ;)
> >> > > > >
> >> > > > > 1. I believe it is simpler to understand the operator if it has
> >> one
> >> > > > uniform
> >> > > > > definition, regardless of context. It's well defined and
> intuitive
> >> > what
> >> > > > > will happen when you use late-event suppression on a KTable, so
> I
> >> > think
> >> > > > > nothing surprising or dangerous will happen in that case. From
> my
> >> > > > > perspective, having two sets of allowed operations is actually
> an
> >> > > > increase
> >> > > > > in cognitive complexity.
> >> > > > >
> >> > > > > 2. To me, it's not crazy to use the operator this way. For
> >> example,
> >> > in
> >> > > > lieu
> >> > > > > of full-featured timestamp semantics, I can implement MVCC
> >> behavior
> >> > > when
> >> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I
> >> suspect
> >> > > that
> >> > > > > there are other, non-obvious applications of suppressing late
> >> events
> >> > on
> >> > > > > KTables.
> >> > > > >
> >> > > > > 3. Not to get too much into implementation details in a KIP
> >> > discussion,
> >> > > > but
> >> > > > > if we did want to make late-event suppression available only on
> >> > > windowed
> >> > > > > KTables, we have two enforcement options:
> >> > > > >   a. check when we build the topology - this would be simple to
> >> > > > implement,
> >> > > > > but would be a runtime check. Hopefully, people write tests for
> >> their
> >> > > > > topology before deploying them, so the feedback loop isn't
> >> > > instantaneous,
> >> > > > > but it's not too long either.
> >> > > > >   b. add a new WindowedKTable type - this would be a compile
> time
> >> > > check,
> >> > > > > but would also be substantial increase of both interface and
> code
> >> > > > > complexity.
> >> > > > >
> >> > > > > We should definitely strive to have guard rails protecting
> against
> >> > > > > surprising or dangerous behavior. Protecting against programs
> >> that we
> >> > > > don't
> >> > > > > currently predict is a lesser benefit, and I think we can put up
> >> > guard
> >> > > > > rails on a case-by-case basis for that. It seems like the
> >> increase in
> >> > > > > cognitive (and potentially code and interface) complexity makes
> me
> >> > > think
> >> > > > we
> >> > > > > should skip this case.
> >> > > > >
> >> > > > > What do you think?
> >> > > > >
> >> > > > > Thanks,
> >> > > > > -John
> >> > > > >
> >> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> >> > > matthias@confluent.io>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Thanks for the KIP John.
> >> > > > > >
> >> > > > > > One initial comments about the last example "Bounded
> lateness":
> >> > For a
> >> > > > > > non-windowed KTable bounding the lateness does not really make
> >> > sense,
> >> > > > > > does it?
> >> > > > > >
> >> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
> >> for
> >> > > this
> >> > > > > > case? It seems to be better to only allow it for
> >> windowed-KTables.
> >> > > > > >
> >> > > > > >
> >> > > > > > -Matthias
> >> > > > > >
> >> > > > > >
> >> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> >> > > > > > > I noticed this (lack of primary parameter) as well.
> >> > > > > > >
> >> > > > > > > What you gave as new example is semantically the same as
> what
> >> I
> >> > > > > > suggested.
> >> > > > > > > So it is good by me.
> >> > > > > > >
> >> > > > > > > Thanks
> >> > > > > > >
> >> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> >> john@confluent.io
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > >> Thanks for taking look, Ted,
> >> > > > > > >>
> >> > > > > > >> I agree this is a departure from the conventions of Streams
> >> DSL.
> >> > > > > > >>
> >> > > > > > >> Most of our config objects have one or two "required"
> >> > parameters,
> >> > > > > which
> >> > > > > > fit
> >> > > > > > >> naturally with the static factory method approach.
> >> TimeWindow,
> >> > for
> >> > > > > > example,
> >> > > > > > >> requires a size parameter, so we can naturally say
> >> > > > > TimeWindows.of(size).
> >> > > > > > >>
> >> > > > > > >> I think in the case of a suppression, there's really no
> >> "core"
> >> > > > > > parameter,
> >> > > > > > >> and "Suppression.of()" seems sillier than "new
> >> Suppression()". I
> >> > > > think
> >> > > > > > that
> >> > > > > > >> Suppression.of(duration) would be ambiguous, since there
> are
> >> > many
> >> > > > > > durations
> >> > > > > > >> that we can configure.
> >> > > > > > >>
> >> > > > > > >> However, thinking about it again, I suppose that I can give
> >> each
> >> > > > > > >> configuration method a static version, which would let you
> >> > replace
> >> > > > > "new
> >> > > > > > >> Suppression()." with "Suppression." in all the examples.
> >> > > Basically,
> >> > > > > > instead
> >> > > > > > >> of "of()", we'd support any of the methods I listed.
> >> > > > > > >>
> >> > > > > > >> For example:
> >> > > > > > >>
> >> > > > > > >> windowCounts
> >> > > > > > >>     .suppress(
> >> > > > > > >>         Suppression
> >> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> >> > > > > > >>             .suppressIntermediateEvents(
> >> > > > > > >>
> >> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> >> > > > > > >>             )
> >> > > > > > >>     );
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> Does that seem better?
> >> > > > > > >>
> >> > > > > > >> Thanks,
> >> > > > > > >> -John
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <
> yuzhihong@gmail.com
> >> >
> >> > > > wrote:
> >> > > > > > >>
> >> > > > > > >>> I started to read this KIP which contains a lot of
> >> materials.
> >> > > > > > >>>
> >> > > > > > >>> One suggestion:
> >> > > > > > >>>
> >> > > > > > >>>     .suppress(
> >> > > > > > >>>         new Suppression()
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>> Do you think it would be more consistent with the rest of
> >> > Streams
> >> > > > > data
> >> > > > > > >>> structures by supporting `of` ?
> >> > > > > > >>>
> >> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>> Cheers
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> >> > john@confluent.io
> >> > > >
> >> > > > > > wrote:
> >> > > > > > >>>
> >> > > > > > >>>> Hello devs and users,
> >> > > > > > >>>>
> >> > > > > > >>>> Please take some time to consider this proposal for Kafka
> >> > > Streams:
> >> > > > > > >>>>
> >> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> >> > > > > > >>>>
> >> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> >> > > > > > >>>>
> >> > > > > > >>>> The basic idea is to provide:
> >> > > > > > >>>> * more usable control over update rate (vs the current
> >> state
> >> > > store
> >> > > > > > >>> caches)
> >> > > > > > >>>> * the final-result-for-windowed-computations feature
> which
> >> > > several
> >> > > > > > >> people
> >> > > > > > >>>> have requested
> >> > > > > > >>>>
> >> > > > > > >>>> I look forward to your feedback!
> >> > > > > > >>>>
> >> > > > > > >>>> Thanks,
> >> > > > > > >>>> -John
> >> > > > > > >>>>
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

In fact, to push the idea further (which IIRC is what Matthias originally
proposed), if we can accept "Suppression#finalResultsOnly" in my last
email, then we could also consider whether to eliminate
"suppressLateEvents" entirely.

We could always add it later, but you've both expressed doubt that there
are practical use cases for it outside of final-results.
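
For illustration, here is a minimal, self-contained sketch of how the bounded
recipe method keeps "final results only" restricted to windowed tables at
compile time. The Windowed, Suppression, and Table types below are simplified
stand-ins invented for this example, not the real Kafka Streams classes, and
the single-argument finalResultsOnly signature is an assumption:

```java
import java.time.Duration;

// Stand-in types for illustration only; these are NOT the Kafka Streams classes.
final class FinalResultsSketch {

    /** Placeholder for a windowed key (hypothetical, simplified). */
    static final class Windowed<T> {
        final T key;

        Windowed(final T key) {
            this.key = key;
        }
    }

    /** Placeholder suppression config that carries the key/value types. */
    static final class Suppression<K, V> {
        final Duration maxAllowedLateness;

        private Suppression(final Duration maxAllowedLateness) {
            this.maxAllowedLateness = maxAllowedLateness;
        }

        // The bound on K means this recipe can only be used with windowed keys.
        static <K extends Windowed<?>, V> Suppression<K, V> finalResultsOnly(
                final Duration maxAllowedLateness) {
            return new Suppression<>(maxAllowedLateness);
        }
    }

    /** Placeholder table with a suppress operator. */
    static final class Table<K, V> {
        Suppression<K, V> applied;

        Table<K, V> suppress(final Suppression<K, V> suppression) {
            final Table<K, V> result = new Table<>();
            result.applied = suppression;
            return result;
        }
    }

    public static void main(final String[] args) {
        final Table<Windowed<Integer>, Long> windowCounts = new Table<>();

        // Compiles: the key type Windowed<Integer> satisfies the bound.
        final Table<Windowed<Integer>, Long> finalCounts =
            windowCounts.suppress(Suppression.finalResultsOnly(Duration.ofMinutes(10)));

        System.out.println(finalCounts.applied.maxAllowedLateness); // prints PT10M

        // Does not compile: String does not extend Windowed.
        // final Table<String, Long> plain = new Table<>();
        // plain.suppress(Suppression.finalResultsOnly(Duration.ofMinutes(10)));
    }
}
```

In this sketch the type arguments are inferred from the suppress() call, so
the caller does not have to spell them out in the simple, unchained case.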

-John

On Mon, Jul 2, 2018 at 12:27 PM John Roesler <jo...@confluent.io> wrote:

> Hi again, Guozhang ;) Here's the second part of my response...
>
> It seems like your main concern is: "if I'm a user who wants final update
> semantics, how complicated is it for me to get it?"
>
> I think we have to assume that people don't always have time to become
> deeply familiar with all the nuances of a programming environment before
> they use it. Especially if they're evaluating several frameworks for their
> use case, it's very valuable to make it as obvious as possible how to
> accomplish various computations with Streams.
>
> To me the biggest question is whether with a fresh perspective, people
> would say "oh, I get it, I have to bound my lateness and suppress
> intermediate updates, and of course I'll get only the final result!", or if
> it's more like "wtf? all I want is the final result, what are all these
> parameters?".
>
> I was talking with Matthias a while back, and he had an idea that I think
> can help, which is to essentially set up a final-result recipe in addition
> to the raw parameters. I previously thought that it wouldn't be possible to
> restrict its usage to Windowed KTables, but thinking about it again this
> weekend, I have a couple of ideas:
>
> ================
> = 1. Static Wrapper =
> ================
> We can define an extra static function that "wraps" a KTable with
> final-result semantics.
>
> public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
>   final KTable<K, V> windowedKTable,
>   final Duration maxAllowedLateness,
>   final Suppression.BufferFullStrategy bufferFullStrategy) {
>     return windowedKTable.suppress(
>         Suppression.suppressLateEvents(maxAllowedLateness)
>                    .suppressIntermediateEvents(
>                      IntermediateSuppression
>                        .emitAfter(maxAllowedLateness)
>                        .bufferFullStrategy(bufferFullStrategy)
>                    )
>     );
> }
>
> Because windowedKTable is a parameter, the static function can easily
> impose an extra bound on the key type, that it extends Windowed. This would
> make "final results only" only available on windowed ktables.
>
> Here's how it would look to use:
>
> final KTable<Windowed<Integer>, Long> windowCounts = ...
> final KTable<Windowed<Integer>, Long> finalCounts =
>   finalResultsOnly(
>     windowCounts,
>     Duration.ofMinutes(10),
>     Suppression.BufferFullStrategy.SHUT_DOWN
>   );
>
> Trying to use it on a non-windowed KTable yields:
>
>> Error:(129, 35) java: method finalResultsOnly in class
>> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
>> applied to given types;
>>   required:
>> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>   found:
>> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>   reason: inference variable K has incompatible bounds
>>     equality constraints: java.lang.String
>>     upper bounds: org.apache.kafka.streams.kstream.Windowed
>
>
>
> =================================================
> = 2. Add <K,V> parameters and recipe method to Suppression =
> =================================================
>
> By adding K,V parameters to Suppression, we can provide a similarly
> bounded config method directly on the Suppression class:
>
> public static <K extends Windowed, V> Suppression<K, V>
> finalResultsOnly(final Duration maxAllowedLateness, final
> BufferFullStrategy bufferFullStrategy) {
>     return Suppression
>         .<K, V>suppressLateEvents(maxAllowedLateness)
>         .suppressIntermediateEvents(IntermediateSuppression
>             .emitAfter(maxAllowedLateness)
>             .bufferFullStrategy(bufferFullStrategy)
>         );
> }
>
> Then, here's how it would look to use it:
>
> final KTable<Windowed<Integer>, Long> windowCounts = ...
> final KTable<Windowed<Integer>, Long> finalCounts =
>   windowCounts.suppress(
>     Suppression.finalResultsOnly(
>       Duration.ofMinutes(10),
>       Suppression.BufferFullStrategy.SHUT_DOWN
>     )
>   );
>
> Trying to use it on a non-windowed ktable yields:
>
>> Error:(127, 35) java: method finalResultsOnly in class
>> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
>> given types;
>>   required:
>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>   found:
>> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>>   reason: explicit type argument java.lang.String does not conform to
>> declared bound(s) org.apache.kafka.streams.kstream.Windowed
>
>
>
> ============
> = Downsides =
> ============
>
> Of course, there's a downside either way:
> * for 1:  this "wrapper" interaction would be the first in the DSL. Is it
> too strange, and how discoverable would it be?
> * for 2: adding those type parameters to Suppression will force all
> callers to provide them in the event of a chained construction because Java
> doesn't do RHS recursive type inference. This is already visible in other
> parts of the Streams DSL. For example, often calls to Materialized builders
> have to provide seemingly obvious type bounds.
>
> ============
> = Conclusion =
> ============
>
> I think option 2 is more "normal" and discoverable. It does have a
> downside, but it's one that's pre-existing elsewhere in the DSL.
>
> WDYT? Would the addition of this "recipe" method to Suppression resolve
> your concern?
>
> Thanks again,
> -John
>
> On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com> wrote:
>
>> Hi John,
>>
>> Regarding the metrics: yeah I think I'm with you that the dropped records
>> due to window retention or emit suppression policies should be recorded
>> differently, and using this KIP's proposed metric would be fine. If you
>> also think we can use this KIP's proposed metrics to cover records skipped
>> due to window retention, then we can include the changes in this
>> KIP as well.
>>
>> Regarding the current proposal, I'm actually not too worried about the
>> inconsistency between query semantics and downstream emit semantics. For
>> queries, we will always return the current running results of the windows,
>> be they partial or final depending on the window retention time, which
>> has nothing to do with whether the emitted stream should be one final
>> output per key or not. I also agree that having a unified operation is
>> generally better, since users can focus on that one rather than learning
>> two sets of operations. The only question I had is whether, for final
>> updates of window stores, the configuration combo is a bit awkward to
>> understand. Thinking about this more, I think my root worry is the
>> "suppressLateEvents" call for windowed tables. From a user's perspective:
>> if my retention time is X, which means "pay the cost to allow late
>> records up to X to still be applied to the tables", why would I ever
>> want to suppressLateEvents by Y ( < X), to say "do not send the updates
>> up to Y"? That means the downstream operator or sink topic for this
>> stream would actually see a truncated update stream even though I've
>> paid the larger cost. And of course, Y > X would not make sense either,
>> as you would not see any updates later than X anyway. So in all, my
>> feeling is that "suppressLateEvents" on a windowed table makes little
>> sense with a parameter that is not equal to the window retention, and
>> opening that door in the current proposal may confuse people.
>>
>> Again, the above is just a subjective opinion, and we can probably come up
>> with some scenarios where users do want to set X != Y. But personally I
>> feel that even if the semantics for this scenario are intuitive for users
>> to understand, does it really make sense, and should we really open the
>> door for it? So I think the benefits of separating the final update into
>> a separate API may outweigh the advantage of having one uniform
>> definition. As for my alternative proposal, the rationale came from both
>> my concern about "suppressLateEvents" for windowed stores and Matthias'
>> question about "suppressLateEvents" for non-windowed stores: if it is
>> less meaningful for both, we can consider removing it completely and
>> only do "IntermediateSuppression" in Suppress instead.
>>
>> So I'd summarize my thoughts in the following questions:
>>
>> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
>> for windowed stores make sense in practice?
>> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
>> make sense in practice?
>>
>>
>>
>> Guozhang
>>
>>
>> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
>>
>> > Thanks for the explanation, that does make sense.  I have some questions
>> > on operations, but I'll just wait for the PR and tests.
>> >
>> > Thanks,
>> > Bill
>> >
>> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
>> >
>> > > Hi Bill,
>> > >
>> > > Thanks for the review!
>> > >
>> > > Your question is very much applicable to the KIP and not at all an
>> > > implementation detail. Thanks for bringing it up.
>> > >
>> > > I'm proposing not to change the existing caches and configurations at
>> all
>> > > (for now).
>> > >
>> > > Imagine you have a topology like this:
>> > > commit.interval.ms = 100
>> > >
>> > > (ktable1 (cached)) -> (suppress emitAfter 200)
>> > >
>> > > The first ktable (ktable1) will respect the commit interval and buffer
>> > > events for 100ms before logging, storing, or forwarding them (IIRC).
>> > > Therefore, the second ktable (suppress) will only see the events at a
>> > > rate of once per 100ms. It will apply its own buffering, and emit once
>> > > per 200ms. This case is pretty trivial because the suppress time is a
>> > > multiple of the commit interval.
>> > > When it's not an integer multiple, you'll get behavior like in this
>> > marble
>> > > diagram:
>> > >
>> > >
>> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
>> > >
>> > > [ KTable caching with commit interval = 2 ]
>> > >
>> > > <--------(k:2)---------(k:4)---------(k:6)->
>> > >
>> > >       [ suppress with emitAfter = 3 ]
>> > >
>> > > <---------------(k:2)----------------(k:6)->
>> > >
>> > >
>> > > If this behavior isn't desired (for example, if you wanted to emit (k:3)
>> > > at time 3), I'd recommend setting "cache.max.bytes.buffering" to 0 or
>> > > modifying the topology to disable caching. Then, the behavior is
>> > > determined simply by the suppress operator.
>> > >
>> > > Does that seem right to you?
>> > >
>> > >
>> > > Regarding the changelogs, because the suppression operator hangs onto
>> > > events for a while, it will need its own changelog. The changelog
>> > > should represent the current state of the buffer at all times. So when
>> > > the suppress operator sees (k:2), for example, it will log (k:2). When
>> > > it later gets to time 3, it's time to emit (k:2) downstream. Because k
>> > > is no longer buffered, the suppress operator will log (k:null). Thus,
>> > > when recovering, it can rebuild the buffer by reading its changelog.
>> > >
>> > > What do you think about this?
>> > >
>> > > Thanks,
>> > > -John
>> > >
>> > >
>> > >
>> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
>> wrote:
>> > >
>> > > > Hi John,  thanks for the KIP.
>> > > >
>> > > > Early on in the KIP, you mention the current approaches for
>> controlling
>> > > the
>> > > > rate of downstream records from a KTable, cache size configuration
>> and
>> > > > commit time.
>> > > >
>> > > > Will these configuration parameters still be in effect for tables
>> that
>> > > > don't use suppression?  For tables taking advantage of suppression,
>> > will
>> > > > these configurations have no impact?
>> > > > This last question may be too implementation-specific, but if the
>> > > > requested suppression time is longer than the specified commit time,
>> > > > will the latest record in the suppression buffer get stored in a
>> > > > changelog?
>> > > >
>> > > > Thanks,
>> > > > Bill
>> > > >
>> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
>> > wrote:
>> > > >
>> > > > > Thanks for the feedback, Matthias,
>> > > > >
>> > > > > It seems like in straightforward relational processing cases, it
>> > would
>> > > > not
>> > > > > make sense to bound the lateness of KTables. In general, it seems
>> > > better
>> > > > to
>> > > > > have "guard rails" in place that make it easier to write sensible
>> > > > programs
>> > > > > than insensible ones.
>> > > > >
>> > > > > But I'm still going to argue in favor of keeping it for all
>> KTables
>> > ;)
>> > > > >
>> > > > > 1. I believe it is simpler to understand the operator if it has
>> one
>> > > > uniform
>> > > > > definition, regardless of context. It's well defined and intuitive
>> > what
>> > > > > will happen when you use late-event suppression on a KTable, so I
>> > think
>> > > > > nothing surprising or dangerous will happen in that case. From my
>> > > > > perspective, having two sets of allowed operations is actually an
>> > > > increase
>> > > > > in cognitive complexity.
>> > > > >
>> > > > > 2. To me, it's not crazy to use the operator this way. For
>> example,
>> > in
>> > > > lieu
>> > > > > of full-featured timestamp semantics, I can implement MVCC
>> behavior
>> > > when
>> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I
>> suspect
>> > > that
>> > > > > there are other, non-obvious applications of suppressing late
>> events
>> > on
>> > > > > KTables.
>> > > > >
>> > > > > 3. Not to get too much into implementation details in a KIP
>> > discussion,
>> > > > but
>> > > > > if we did want to make late-event suppression available only on
>> > > windowed
>> > > > > KTables, we have two enforcement options:
>> > > > >   a. check when we build the topology - this would be simple to
>> > > > implement,
>> > > > > but would be a runtime check. Hopefully, people write tests for
>> their
>> > > > > topology before deploying them, so the feedback loop isn't
>> > > instantaneous,
>> > > > > but it's not too long either.
>> > > > >   b. add a new WindowedKTable type - this would be a compile time
>> > > check,
>> > > > > but would also be substantial increase of both interface and code
>> > > > > complexity.
>> > > > >
>> > > > > We should definitely strive to have guard rails protecting against
>> > > > > surprising or dangerous behavior. Protecting against programs that
>> > > > > we don't currently predict is a lesser benefit, and I think we can
>> > > > > put up guard rails on a case-by-case basis for that. The increase
>> > > > > in cognitive (and potentially code and interface) complexity makes
>> > > > > me think we should skip this case.
>> > > > >
>> > > > > What do you think?
>> > > > >
>> > > > > Thanks,
>> > > > > -John
>> > > > >
>> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
>> > > matthias@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks for the KIP John.
>> > > > > >
>> > > > > > One initial comment about the last example, "Bounded lateness":
>> > > > > > for a non-windowed KTable, bounding the lateness does not really
>> > > > > > make sense, does it?
>> > > > > >
>> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
>> for
>> > > this
>> > > > > > case? It seems to be better to only allow it for
>> windowed-KTables.
>> > > > > >
>> > > > > >
>> > > > > > -Matthias
>> > > > > >
>> > > > > >
>> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
>> > > > > > > I noticed this (lack of primary parameter) as well.
>> > > > > > >
>> > > > > > > What you gave as new example is semantically the same as what
>> I
>> > > > > > suggested.
>> > > > > > > So it is good by me.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
>> john@confluent.io
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > >> Thanks for taking a look, Ted,
>> > > > > > >>
>> > > > > > >> I agree this is a departure from the conventions of Streams
>> DSL.
>> > > > > > >>
>> > > > > > >> Most of our config objects have one or two "required"
>> > parameters,
>> > > > > which
>> > > > > > fit
>> > > > > > >> naturally with the static factory method approach.
>> TimeWindow,
>> > for
>> > > > > > example,
>> > > > > > >> requires a size parameter, so we can naturally say
>> > > > > TimeWindows.of(size).
>> > > > > > >>
>> > > > > > >> I think in the case of a suppression, there's really no
>> "core"
>> > > > > > parameter,
>> > > > > > >> and "Suppression.of()" seems sillier than "new
>> Suppression()". I
>> > > > think
>> > > > > > that
>> > > > > > >> Suppression.of(duration) would be ambiguous, since there are
>> > many
>> > > > > > durations
>> > > > > > >> that we can configure.
>> > > > > > >>
>> > > > > > >> However, thinking about it again, I suppose that I can give
>> each
>> > > > > > >> configuration method a static version, which would let you
>> > replace
>> > > > > "new
>> > > > > > >> Suppression()." with "Suppression." in all the examples.
>> > > Basically,
>> > > > > > instead
>> > > > > > >> of "of()", we'd support any of the methods I listed.
>> > > > > > >>
>> > > > > > >> For example:
>> > > > > > >>
>> > > > > > >> windowCounts
>> > > > > > >>     .suppress(
>> > > > > > >>         Suppression
>> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
>> > > > > > >>             .suppressIntermediateEvents(
>> > > > > > >>
>> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>> > > > > > >>             )
>> > > > > > >>     );
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> Does that seem better?
>> > > > > > >>
>> > > > > > >> Thanks,
>> > > > > > >> -John
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yuzhihong@gmail.com
>> >
>> > > > wrote:
>> > > > > > >>
>> > > > > > >>> I started to read this KIP which contains a lot of
>> materials.
>> > > > > > >>>
>> > > > > > >>> One suggestion:
>> > > > > > >>>
>> > > > > > >>>     .suppress(
>> > > > > > >>>         new Suppression()
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> Do you think it would be more consistent with the rest of
>> > Streams
>> > > > > data
>> > > > > > >>> structures by supporting `of` ?
>> > > > > > >>>
>> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> Cheers
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
>> > john@confluent.io
>> > > >
>> > > > > > wrote:
>> > > > > > >>>
>> > > > > > >>>> Hello devs and users,
>> > > > > > >>>>
>> > > > > > >>>> Please take some time to consider this proposal for Kafka
>> > > Streams:
>> > > > > > >>>>
>> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
>> > > > > > >>>>
>> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>> > > > > > >>>>
>> > > > > > >>>> The basic idea is to provide:
>> > > > > > >>>> * more usable control over update rate (vs the current
>> state
>> > > store
>> > > > > > >>> caches)
>> > > > > > >>>> * the final-result-for-windowed-computations feature which
>> > > several
>> > > > > > >> people
>> > > > > > >>>> have requested
>> > > > > > >>>>
>> > > > > > >>>> I look forward to your feedback!
>> > > > > > >>>>
>> > > > > > >>>> Thanks,
>> > > > > > >>>> -John
>> > > > > > >>>>
>> > > > > > >>>
>> > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>

>> > > recovering,
>> > > it can rebuild the buffer by reading its changelog.
>> > >
>> > > What do you think about this?
>> > >
>> > > Thanks,
>> > > -John
>> > >
>> > >
>> > >
>> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com>
>> wrote:
>> > >
>> > > > Hi John,  thanks for the KIP.
>> > > >
>> > > > Early on in the KIP, you mention the current approaches for
>> controlling
>> > > the
>> > > > rate of downstream records from a KTable, cache size configuration
>> and
>> > > > commit time.
>> > > >
>> > > > Will these configuration parameters still be in effect for tables
>> that
>> > > > don't use suppression?  For tables taking advantage of suppression,
>> > will
>> > > > these configurations have no impact?
>> > > > This last question may be to implementation specific but if the
>> > requested
>> > > > suppression time is longer than the specified commit time, will the
>> > > latest
>> > > > record in the suppression buffer get stored in a changelog?
>> > > >
>> > > > Thanks,
>> > > > Bill
>> > > >
>> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
>> > wrote:
>> > > >
>> > > > > Thanks for the feedback, Matthias,
>> > > > >
>> > > > > It seems like in straightforward relational processing cases, it
>> > would
>> > > > not
>> > > > > make sense to bound the lateness of KTables. In general, it seems
>> > > better
>> > > > to
>> > > > > have "guard rails" in place that make it easier to write sensible
>> > > > programs
>> > > > > than insensible ones.
>> > > > >
>> > > > > But I'm still going to argue in favor of keeping it for all
>> KTables
>> > ;)
>> > > > >
>> > > > > 1. I believe it is simpler to understand the operator if it has
>> one
>> > > > uniform
>> > > > > definition, regardless of context. It's well defined and intuitive
>> > what
>> > > > > will happen when you use late-event suppression on a KTable, so I
>> > think
>> > > > > nothing surprising or dangerous will happen in that case. From my
>> > > > > perspective, having two sets of allowed operations is actually an
>> > > > increase
>> > > > > in cognitive complexity.
>> > > > >
>> > > > > 2. To me, it's not crazy to use the operator this way. For
>> example,
>> > in
>> > > > lieu
>> > > > > of full-featured timestamp semantics, I can implement MVCC
>> behavior
>> > > when
>> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I
>> suspect
>> > > that
>> > > > > there are other, non-obvious applications of suppressing late
>> events
>> > on
>> > > > > KTables.
>> > > > >
>> > > > > 3. Not to get too much into implementation details in a KIP
>> > discussion,
>> > > > but
>> > > > > if we did want to make late-event suppression available only on
>> > > windowed
>> > > > > KTables, we have two enforcement options:
>> > > > >   a. check when we build the topology - this would be simple to
>> > > > implement,
>> > > > > but would be a runtime check. Hopefully, people write tests for
>> their
>> > > > > topology before deploying them, so the feedback loop isn't
>> > > instantaneous,
>> > > > > but it's not too long either.
>> > > > >   b. add a new WindowedKTable type - this would be a compile time
>> > > check,
>> > > > > but would also be substantial increase of both interface and code
>> > > > > complexity.
>> > > > >
>> > > > > We should definitely strive to have guard rails protecting against
>> > > > > surprising or dangerous behavior. Protecting against programs
>> that we
>> > > > don't
>> > > > > currently predict is a lesser benefit, and I think we can put up
>> > guard
>> > > > > rails on a case-by-case basis for that. It seems like the
>> increase in
>> > > > > cognitive (and potentially code and interface) complexity makes me
>> > > think
>> > > > we
>> > > > > should skip this case.
>> > > > >
>> > > > > What do you think?
>> > > > >
>> > > > > Thanks,
>> > > > > -John
>> > > > >
>> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
>> > > matthias@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks for the KIP John.
>> > > > > >
>> > > > > > One initial comments about the last example "Bounded lateness":
>> > For a
>> > > > > > non-windowed KTable bounding the lateness does not really make
>> > sense,
>> > > > > > does it?
>> > > > > >
>> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
>> for
>> > > this
>> > > > > > case? It seems to be better to only allow it for
>> windowed-KTables.
>> > > > > >
>> > > > > >
>> > > > > > -Matthias
>> > > > > >
>> > > > > >
>> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
>> > > > > > > I noticed this (lack of primary parameter) as well.
>> > > > > > >
>> > > > > > > What you gave as new example is semantically the same as what
>> I
>> > > > > > suggested.
>> > > > > > > So it is good by me.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
>> john@confluent.io
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > >> Thanks for taking look, Ted,
>> > > > > > >>
>> > > > > > >> I agree this is a departure from the conventions of Streams
>> DSL.
>> > > > > > >>
>> > > > > > >> Most of our config objects have one or two "required"
>> > parameters,
>> > > > > which
>> > > > > > fit
>> > > > > > >> naturally with the static factory method approach.
>> TimeWindow,
>> > for
>> > > > > > example,
>> > > > > > >> requires a size parameter, so we can naturally say
>> > > > > TimeWindows.of(size).
>> > > > > > >>
>> > > > > > >> I think in the case of a suppression, there's really no
>> "core"
>> > > > > > parameter,
>> > > > > > >> and "Suppression.of()" seems sillier than "new
>> Suppression()". I
>> > > > think
>> > > > > > that
>> > > > > > >> Suppression.of(duration) would be ambiguous, since there are
>> > many
>> > > > > > durations
>> > > > > > >> that we can configure.
>> > > > > > >>
>> > > > > > >> However, thinking about it again, I suppose that I can give
>> each
>> > > > > > >> configuration method a static version, which would let you
>> > replace
>> > > > > "new
>> > > > > > >> Suppression()." with "Suppression." in all the examples.
>> > > Basically,
>> > > > > > instead
>> > > > > > >> of "of()", we'd support any of the methods I listed.
>> > > > > > >>
>> > > > > > >> For example:
>> > > > > > >>
>> > > > > > >> windowCounts
>> > > > > > >>     .suppress(
>> > > > > > >>         Suppression
>> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
>> > > > > > >>             .suppressIntermediateEvents(
>> > > > > > >>
>> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>> > > > > > >>             )
>> > > > > > >>     );
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> Does that seem better?
>> > > > > > >>
>> > > > > > >> Thanks,
>> > > > > > >> -John
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yuzhihong@gmail.com
>> >
>> > > > wrote:
>> > > > > > >>
>> > > > > > >>> I started to read this KIP which contains a lot of
>> materials.
>> > > > > > >>>
>> > > > > > >>> One suggestion:
>> > > > > > >>>
>> > > > > > >>>     .suppress(
>> > > > > > >>>         new Suppression()
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> Do you think it would be more consistent with the rest of
>> > Streams
>> > > > > data
>> > > > > > >>> structures by supporting `of` ?
>> > > > > > >>>
>> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> Cheers
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
>> > john@confluent.io
>> > > >
>> > > > > > wrote:
>> > > > > > >>>
>> > > > > > >>>> Hello devs and users,
>> > > > > > >>>>
>> > > > > > >>>> Please take some time to consider this proposal for Kafka
>> > > Streams:
>> > > > > > >>>>
>> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
>> > > > > > >>>>
>> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>> > > > > > >>>>
>> > > > > > >>>> The basic idea is to provide:
>> > > > > > >>>> * more usable control over update rate (vs the current
>> state
>> > > store
>> > > > > > >>> caches)
>> > > > > > >>>> * the final-result-for-windowed-computations feature which
>> > > several
>> > > > > > >> people
>> > > > > > >>>> have requested
>> > > > > > >>>>
>> > > > > > >>>> I look forward to your feedback!
>> > > > > > >>>>
>> > > > > > >>>> Thanks,
>> > > > > > >>>> -John
>> > > > > > >>>>
>> > > > > > >>>
>> > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi again, Guozhang ;) Here's the second part of my response...

It seems like your main concern is: "if I'm a user who wants final update
semantics, how complicated is it for me to get it?"

I think we have to assume that people don't always have time to become
deeply familiar with all the nuances of a programming environment before
they use it. Especially if they're evaluating several frameworks for their
use case, it's very valuable to make it as obvious as possible how to
accomplish various computations with Streams.

To me the biggest question is whether with a fresh perspective, people
would say "oh, I get it, I have to bound my lateness and suppress
intermediate updates, and of course I'll get only the final result!", or if
it's more like "wtf? all I want is the final result, what are all these
parameters?".

I was talking with Matthias a while back, and he had an idea that I think
can help, which is to essentially set up a final-result recipe in addition
to the raw parameters. I previously thought that it wouldn't be possible to
restrict its usage to Windowed KTables, but thinking about it again this
weekend, I have a couple of ideas:

================
= 1. Static Wrapper =
================
We can define an extra static function that "wraps" a KTable with
final-result semantics.

public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
  final KTable<K, V> windowedKTable,
  final Duration maxAllowedLateness,
  final Suppression.BufferFullStrategy bufferFullStrategy) {
    return windowedKTable.suppress(
        Suppression.suppressLateEvents(maxAllowedLateness)
                   .suppressIntermediateEvents(
                     IntermediateSuppression
                       .emitAfter(maxAllowedLateness)
                       .bufferFullStrategy(bufferFullStrategy)
                   )
    );
}

Because windowedKTable is a parameter, the static function can easily
impose an extra bound on the key type, that it extends Windowed. This would
make "final results only" only available on windowed ktables.

Here's how it would look to use:

final KTable<Windowed<Integer>, Long> windowCounts = ...
final KTable<Windowed<Integer>, Long> finalCounts =
  finalResultsOnly(
    windowCounts,
    Duration.ofMinutes(10),
    Suppression.BufferFullStrategy.SHUT_DOWN
  );

Trying to use it on a non-windowed KTable yields:

> Error:(129, 35) java: method finalResultsOnly in class
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
> applied to given types;
>   required:
> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   found:
> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   reason: inference variable K has incompatible bounds
>     equality constraints: java.lang.String
>     upper bounds: org.apache.kafka.streams.kstream.Windowed
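
To make the bound concrete, here is a minimal, self-contained sketch of the same idea. Note the assumptions: Windowed and KTable below are hypothetical stand-ins rather than the real Streams classes, the buffer-full strategy is left out, and the method body is a placeholder that only records what was requested. The only point is that the bound on K is what keeps the recipe off non-windowed tables.

```java
import java.time.Duration;

public class FinalResultsBoundSketch {
    // Hypothetical stand-in for org.apache.kafka.streams.kstream.Windowed.
    static final class Windowed<K> {
        final K key;
        Windowed(final K key) { this.key = key; }
    }

    // Hypothetical stand-in for KTable; it only carries a description string.
    static final class KTable<K, V> {
        final String description;
        KTable(final String description) { this.description = description; }
    }

    // The bound "K extends Windowed<?>" is what restricts callers to windowed tables.
    static <K extends Windowed<?>, V> KTable<K, V> finalResultsOnly(
            final KTable<K, V> windowedKTable,
            final Duration maxAllowedLateness) {
        // Placeholder body: a real implementation would configure suppression here.
        return new KTable<>(windowedKTable.description
            + " -> finalResultsOnly(" + maxAllowedLateness + ")");
    }

    public static void main(final String[] args) {
        final KTable<Windowed<Integer>, Long> windowCounts = new KTable<>("windowCounts");
        final KTable<Windowed<Integer>, Long> finalCounts =
            finalResultsOnly(windowCounts, Duration.ofMinutes(10));
        // prints: windowCounts -> finalResultsOnly(PT10M)
        System.out.println(finalCounts.description);

        // The following would be rejected at compile time, because String
        // does not satisfy the bound on K:
        // final KTable<String, String> plain = new KTable<>("plain");
        // finalResultsOnly(plain, Duration.ofMinutes(10));
    }
}
```

The sketch uses Windowed<?> rather than the raw type to avoid a raw-type warning; the compile-time restriction is the same either way.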



=================================================
= 2. Add <K,V> parameters and recipe method to Suppression =
=================================================

By adding K,V parameters to Suppression, we can provide a similarly bounded
config method directly on the Suppression class:

public static <K extends Windowed, V> Suppression<K, V> finalResultsOnly(
  final Duration maxAllowedLateness,
  final BufferFullStrategy bufferFullStrategy) {
    return Suppression
        .<K, V>suppressLateEvents(maxAllowedLateness)
        .suppressIntermediateEvents(IntermediateSuppression
            .emitAfter(maxAllowedLateness)
            .bufferFullStrategy(bufferFullStrategy)
        );
}

Then, here's how it would look to use it:

final KTable<Windowed<Integer>, Long> windowCounts = ...
final KTable<Windowed<Integer>, Long> finalCounts =
  windowCounts.suppress(
    Suppression.finalResultsOnly(
      Duration.ofMinutes(10),
      Suppression.BufferFullStrategy.SHUT_DOWN
    )
  );

Trying to use it on a non-windowed ktable yields:

> Error:(127, 35) java: method finalResultsOnly in class
> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> given types;
>   required:
> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   found:
> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   reason: explicit type argument java.lang.String does not conform to
> declared bound(s) org.apache.kafka.streams.kstream.Windowed



============
= Downsides =
============

Of course, there's a downside either way:
* for 1: this "wrapper" interaction would be the first of its kind in the
DSL. Is it too strange, and how discoverable would it be?
* for 2: adding those type parameters to Suppression will force all callers
to provide them explicitly in a chained construction, because Java doesn't
carry the target type back through a method chain when inferring type
arguments. This is already visible in other parts of the Streams DSL. For
example, calls to Materialized builders often have to provide seemingly
obvious type arguments.
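
The inference limitation behind the second downside can be reproduced without Streams at all. The following is only a sketch under stated assumptions: Config is a hypothetical generic builder standing in for a parameterized Suppression<K, V>, and its method names are made up for illustration.

```java
public class ChainedInferenceSketch {
    // Hypothetical stand-in for a generic config object such as Suppression<K, V>.
    static final class Config<K, V> {
        final String settings;
        private Config(final String settings) { this.settings = settings; }

        // Static entry point, analogous to a static configuration method.
        static <K, V> Config<K, V> lateEvents(final long millis) {
            return new Config<>("late=" + millis);
        }

        // Instance method used to continue the chain.
        Config<K, V> emitAfter(final long millis) {
            return new Config<>(settings + ",emitAfter=" + millis);
        }
    }

    // A consumer with fixed type arguments, analogous to KTable<K, V>.suppress(...).
    static String describe(final Config<String, Long> config) {
        return config.settings;
    }

    public static void main(final String[] args) {
        // Single call: the parameter type of describe() is the target type,
        // so the compiler infers <String, Long>.
        System.out.println(describe(Config.lateEvents(10))); // prints: late=10

        // Chained call: the receiver of emitAfter() has no target type, so K and V
        // are inferred as Object and this line would not compile:
        // describe(Config.lateEvents(10).emitAfter(20));

        // Supplying the type arguments explicitly makes the chain compile.
        // prints: late=10,emitAfter=20
        System.out.println(describe(Config.<String, Long>lateEvents(10).emitAfter(20)));
    }
}
```

So a caller who only uses the single recipe method would not notice the type parameters, while anyone chaining the raw configuration methods would have to spell them out.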

============
= Conclusion =
============

I think option 2 is more "normal" and discoverable. It does have a
downside, but it's one that's pre-existing elsewhere in the DSL.

WDYT? Would the addition of this "recipe" method to Suppression resolve
your concern?

Thanks again,
-John

On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hi John,
>
> Regarding the metrics: yeah, I think I'm with you that the records dropped
> due to window retention or emit suppression policies should be recorded
> differently, and using this KIP's proposed metric would be fine. If you
> also think we can use this KIP's proposed metrics to cover the records
> skipped because of window retention, then we can include the changes in
> this KIP as well.
>
> Regarding the current proposal, I'm actually not too worried about the
> inconsistency between query semantics and downstream emit semantics. For
> queries, we will always return the current running results of the windows,
> be they partial or final results depending on the window retention time,
> which has nothing to do with whether the emitted stream should be one
> final output per key or not. I also agree that having a unified operation
> is generally better, since users can focus on that one operation rather
> than learning two sets of operations. The only question I had is whether,
> for final updates of window stores, the configuration combo is a bit
> awkward to understand. Thinking about this more, I think my root worry is
> the "suppressLateEvents" call for windowed tables. From a user
> perspective: if my retention time is X, which means "pay the cost to allow
> late records up to X to still be applied when updating the tables", why
> would I ever want to suppressLateEvents by Y (< X), which says "do not
> send the updates up to Y", so that the downstream operator or sink topic
> for this stream actually sees a truncated update stream even though I've
> paid the larger cost? And of course, Y > X would not make sense either, as
> you would not see any updates later than X anyway. So all in all, my
> feeling is that "suppressLateEvents" on a windowed table makes little
> sense with a parameter that is not equal to the window retention, and
> opening that door in the current proposal may confuse people.
>
> Again, the above is just a subjective opinion, and we can probably come up
> with some scenarios where users do want to set X != Y. But personally I
> feel that even if the semantics for this scenario are intuitive for users
> to understand, does it really make sense, and should we really open the
> door for it? So I think the benefits of putting the final update in a
> separate API may outweigh the advantage of having one uniform definition.
> As for my alternative proposal, the rationale came from both my concern
> about "suppressLateEvents" for windowed stores and Matthias' question
> about "suppressLateEvents" for non-windowed stores: if it is less
> meaningful for both, we can consider removing it completely and only do
> "IntermediateSuppression" in Suppress instead.
>
> So I'd summarize my thoughts in the following questions:
>
> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
> for windowed stores make sense in practice?
> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
> make sense in practice?
>
>
>
> Guozhang
>
>
> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
>
> > Thanks for the explanation, that does make sense.  I have some questions
> on
> > operations, but I'll just wait for the PR and tests.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Bill,
> > >
> > > Thanks for the review!
> > >
> > > Your question is very much applicable to the KIP and not at all an
> > > implementation detail. Thanks for bringing it up.
> > >
> > > I'm proposing not to change the existing caches and configurations at
> all
> > > (for now).
> > >
> > > Imagine you have a topology like this:
> > > commit.interval.ms = 100
> > >
> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> > >
> > > The first ktable (ktable1) will respect the commit interval and buffer
> > > events for 100ms before logging, storing, or forwarding them (IIRC).
> > > Therefore, the second ktable (suppress) will only see the events at a
> > > rate of once per 100ms. It will apply its own buffering, and emit once
> > > per 200ms. This case is pretty trivial because the suppress time is a
> > > multiple of the commit interval.
> > >
> > > When it's not an integer multiple, you'll get behavior like in this
> > marble
> > > diagram:
> > >
> > >
> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >
> > > [ KTable caching with commit interval = 2 ]
> > >
> > > <--------(k:2)---------(k:4)---------(k:6)->
> > >
> > >       [ suppress with emitAfter = 3 ]
> > >
> > > <---------------(k:2)----------------(k:6)->
> > >
> > >
> > > If this behavior isn't desired (for example, if you wanted to emit
> > > (k:3) at time 3), I'd recommend setting "cache.max.bytes.buffering" to
> > > 0 or modifying the topology to disable caching. Then, the behavior is
> > > determined solely by the suppress operator.
> > >
> > > Does that seem right to you?
> > >
> > >
> > > Regarding the changelogs, because the suppression operator hangs onto
> > > events for a while, it will need its own changelog. The changelog
> > > should represent the current state of the buffer at all times. So when
> > the
> > > suppress operator sees (k:2), for example, it will log (k:2). When it
> > > later gets to time 3, it's time to emit (k:2) downstream. Because k is
> no
> > > longer buffered, the suppress operator will log (k:null). Thus, when
> > > recovering,
> > > it can rebuild the buffer by reading its changelog.
> > >
> > > What do you think about this?
> > >
> > > Thanks,
> > > -John
> > >
> > >
> > >
> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
> > >
> > > > Hi John,  thanks for the KIP.
> > > >
> > > > Early on in the KIP, you mention the current approaches for
> controlling
> > > the
> > > > rate of downstream records from a KTable, cache size configuration
> and
> > > > commit time.
> > > >
> > > > Will these configuration parameters still be in effect for tables
> that
> > > > don't use suppression?  For tables taking advantage of suppression,
> > will
> > > > these configurations have no impact?
> > > > This last question may be to implementation specific but if the
> > requested
> > > > suppression time is longer than the specified commit time, will the
> > > latest
> > > > record in the suppression buffer get stored in a changelog?
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > > > Thanks for the feedback, Matthias,
> > > > >
> > > > > It seems like in straightforward relational processing cases, it
> > would
> > > > not
> > > > > make sense to bound the lateness of KTables. In general, it seems
> > > better
> > > > to
> > > > > have "guard rails" in place that make it easier to write sensible
> > > > programs
> > > > > than insensible ones.
> > > > >
> > > > > But I'm still going to argue in favor of keeping it for all KTables
> > ;)
> > > > >
> > > > > 1. I believe it is simpler to understand the operator if it has one
> > > > uniform
> > > > > definition, regardless of context. It's well defined and intuitive
> > what
> > > > > will happen when you use late-event suppression on a KTable, so I
> > think
> > > > > nothing surprising or dangerous will happen in that case. From my
> > > > > perspective, having two sets of allowed operations is actually an
> > > > increase
> > > > > in cognitive complexity.
> > > > >
> > > > > 2. To me, it's not crazy to use the operator this way. For example,
> > in
> > > > lieu
> > > > > of full-featured timestamp semantics, I can implement MVCC behavior
> > > when
> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect
> > > that
> > > > > there are other, non-obvious applications of suppressing late
> events
> > on
> > > > > KTables.
> > > > >
> > > > > 3. Not to get too much into implementation details in a KIP
> > discussion,
> > > > but
> > > > > if we did want to make late-event suppression available only on
> > > windowed
> > > > > KTables, we have two enforcement options:
> > > > >   a. check when we build the topology - this would be simple to
> > > > implement,
> > > > > but would be a runtime check. Hopefully, people write tests for
> their
> > > > > topology before deploying them, so the feedback loop isn't
> > > instantaneous,
> > > > > but it's not too long either.
> > > > >   b. add a new WindowedKTable type - this would be a compile time
> > > check,
> > > > > but would also be substantial increase of both interface and code
> > > > > complexity.
> > > > >
> > > > > We should definitely strive to have guard rails protecting against
> > > > > surprising or dangerous behavior. Protecting against programs that
> we
> > > > don't
> > > > > currently predict is a lesser benefit, and I think we can put up
> > guard
> > > > > rails on a case-by-case basis for that. It seems like the increase
> in
> > > > > cognitive (and potentially code and interface) complexity makes me
> > > think
> > > > we
> > > > > should skip this case.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > matthias@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the KIP John.
> > > > > >
> > > > > > One initial comments about the last example "Bounded lateness":
> > For a
> > > > > > non-windowed KTable bounding the lateness does not really make
> > sense,
> > > > > > does it?
> > > > > >
> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
> for
> > > this
> > > > > > case? It seems to be better to only allow it for
> windowed-KTables.
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > >
> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > > I noticed this (lack of primary parameter) as well.
> > > > > > >
> > > > > > > What you gave as new example is semantically the same as what I
> > > > > > suggested.
> > > > > > > So it is good by me.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> john@confluent.io
> > >
> > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for taking look, Ted,
> > > > > > >>
> > > > > > >> I agree this is a departure from the conventions of Streams
> DSL.
> > > > > > >>
> > > > > > >> Most of our config objects have one or two "required"
> > parameters,
> > > > > which
> > > > > > fit
> > > > > > >> naturally with the static factory method approach. TimeWindow,
> > for
> > > > > > example,
> > > > > > >> requires a size parameter, so we can naturally say
> > > > > TimeWindows.of(size).
> > > > > > >>
> > > > > > >> I think in the case of a suppression, there's really no "core"
> > > > > > parameter,
> > > > > > >> and "Suppression.of()" seems sillier than "new
> Suppression()". I
> > > > think
> > > > > > that
> > > > > > >> Suppression.of(duration) would be ambiguous, since there are
> > many
> > > > > > durations
> > > > > > >> that we can configure.
> > > > > > >>
> > > > > > >> However, thinking about it again, I suppose that I can give
> each
> > > > > > >> configuration method a static version, which would let you
> > replace
> > > > > "new
> > > > > > >> Suppression()." with "Suppression." in all the examples.
> > > Basically,
> > > > > > instead
> > > > > > >> of "of()", we'd support any of the methods I listed.
> > > > > > >>
> > > > > > >> For example:
> > > > > > >>
> > > > > > >> windowCounts
> > > > > > >>     .suppress(
> > > > > > >>         Suppression
> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > > >>             .suppressIntermediateEvents(
> > > > > > >>
> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > > >>             )
> > > > > > >>     );
> > > > > > >>
> > > > > > >>
> > > > > > >> Does that seem better?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> -John
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> > > > wrote:
> > > > > > >>
> > > > > > >>> I started to read this KIP which contains a lot of materials.
> > > > > > >>>
> > > > > > >>> One suggestion:
> > > > > > >>>
> > > > > > >>>     .suppress(
> > > > > > >>>         new Suppression()
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Do you think it would be more consistent with the rest of
> > Streams
> > > > > data
> > > > > > >>> structures by supporting `of` ?
> > > > > > >>>
> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Cheers
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > john@confluent.io
> > > >
> > > > > > wrote:
> > > > > > >>>
> > > > > > >>>> Hello devs and users,
> > > > > > >>>>
> > > > > > >>>> Please take some time to consider this proposal for Kafka
> > > Streams:
> > > > > > >>>>
> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > > > >>>>
> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > > >>>>
> > > > > > >>>> The basic idea is to provide:
> > > > > > >>>> * more usable control over update rate (vs the current state
> > > store
> > > > > > >>> caches)
> > > > > > >>>> * the final-result-for-windowed-computations feature which
> > > several
> > > > > > >> people
> > > > > > >>>> have requested
> > > > > > >>>>
> > > > > > >>>> I look forward to your feedback!
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> -John
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi again, Guozhang ;) Here's the second part of my response...

It seems like your main concern is: "if I'm a user who wants final update
semantics, how complicated is it for me to get it?"

I think we have to assume that people don't always have time to become
deeply familiar with all the nuances of a programming environment before
they use it. Especially if they're evaluating several frameworks for their
use case, it's very valuable to make it as obvious as possible how to
accomplish various computations with Streams.

To me the biggest question is whether with a fresh perspective, people
would say "oh, I get it, I have to bound my lateness and suppress
intermediate updates, and of course I'll get only the final result!", or if
it's more like "wtf? all I want is the final result, what are all these
parameters?".

I was talking with Matthias a while back, and he had an idea that I think
can help, which is to essentially set up a final-result recipe in addition
to the raw parameters. I previously thought that it wouldn't be possible to
restrict its usage to Windowed KTables, but thinking about it again this
weekend, I have a couple of ideas:

================
= 1. Static Wrapper =
================
We can define an extra static function that "wraps" a KTable with
final-result semantics.

public static <K extends Windowed, V> KTable<K, V> finalResultsOnly(
  final KTable<K, V> windowedKTable,
  final Duration maxAllowedLateness,
  final Suppression.BufferFullStrategy bufferFullStrategy) {
    return windowedKTable.suppress(
        Suppression.suppressLateEvents(maxAllowedLateness)
                   .suppressIntermediateEvents(
                     IntermediateSuppression
                       .emitAfter(maxAllowedLateness)
                       .bufferFullStrategy(bufferFullStrategy)
                   )
    );
}

Because windowedKTable is a parameter, the static function can easily
impose an extra bound on the key type, that it extends Windowed. This would
make "final results only" only available on windowed ktables.

Here's how it would look to use:

final KTable<Windowed<Integer>, Long> windowCounts = ...
final KTable<Windowed<Integer>, Long> finalCounts =
  finalResultsOnly(
    windowCounts,
    Duration.ofMinutes(10),
    Suppression.BufferFullStrategy.SHUT_DOWN
  );

Trying to use it on a non-windowed KTable yields:

> Error:(129, 35) java: method finalResultsOnly in class
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest cannot be
> applied to given types;
>   required:
> org.apache.kafka.streams.kstream.KTable<K,V>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   found:
> org.apache.kafka.streams.kstream.KTable<java.lang.String,java.lang.String>,java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   reason: inference variable K has incompatible bounds
>     equality constraints: java.lang.String
>     upper bounds: org.apache.kafka.streams.kstream.Windowed



============================================================
= 2. Add <K,V> parameters and recipe method to Suppression =
============================================================

By adding K,V parameters to Suppression, we can provide a similarly bounded
config method directly on the Suppression class:

public static <K extends Windowed, V> Suppression<K, V> finalResultsOnly(
  final Duration maxAllowedLateness,
  final BufferFullStrategy bufferFullStrategy) {
    return Suppression
        .<K, V>suppressLateEvents(maxAllowedLateness)
        .suppressIntermediateEvents(IntermediateSuppression
            .emitAfter(maxAllowedLateness)
            .bufferFullStrategy(bufferFullStrategy)
        );
}

Then, here's how it would look to use it:

final KTable<Windowed<Integer>, Long> windowCounts = ...
final KTable<Windowed<Integer>, Long> finalCounts =
  windowCounts.suppress(
    Suppression.finalResultsOnly(
      Duration.ofMinutes(10),
      Suppression.BufferFullStrategy.SHUT_DOWN
    )
  );

Trying to use it on a non-windowed ktable yields:

> Error:(127, 35) java: method finalResultsOnly in class
> org.apache.kafka.streams.kstream.Suppression<K,V> cannot be applied to
> given types;
>   required:
> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   found:
> java.time.Duration,org.apache.kafka.streams.kstream.Suppression.BufferFullStrategy
>   reason: explicit type argument java.lang.String does not conform to
> declared bound(s) org.apache.kafka.streams.kstream.Windowed



=============
= Downsides =
=============

Of course, there's a downside either way:
* for 1: this "wrapper" interaction would be the first of its kind in the
DSL. Is it too strange, and how discoverable would it be?
* for 2: adding those type parameters to Suppression will force callers to
provide them explicitly in a chained construction, because Java infers the
type arguments of the first call in a chain without looking at the target
type of the whole expression. This is already visible in other parts of
the Streams DSL. For example, calls to Materialized builders often have to
provide seemingly obvious type arguments.
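
To make the downside of option 2 concrete, here's a minimal, self-contained
sketch of the inference problem. The Suppression class below is a
hypothetical stand-in that I wrote only for this illustration (it is not the
proposed API); all it needs to show is a generic static factory followed by
a chained call.

```java
import java.time.Duration;

public class ChainedInferenceSketch {

    // Hypothetical stand-in for a generic Suppression<K, V> config object.
    public static final class Suppression<K, V> {
        public final Duration maxLateness;
        public final Duration emitAfter;

        private Suppression(final Duration maxLateness, final Duration emitAfter) {
            this.maxLateness = maxLateness;
            this.emitAfter = emitAfter;
        }

        // Generic static factory: K and V must be inferred or given explicitly.
        public static <K, V> Suppression<K, V> suppressLateEvents(final Duration maxLateness) {
            return new Suppression<>(maxLateness, null);
        }

        // Chained builder step; it keeps whatever K and V the receiver has.
        public Suppression<K, V> emitAfter(final Duration emitAfter) {
            return new Suppression<>(maxLateness, emitAfter);
        }
    }

    public static Suppression<String, Long> unchained() {
        // A single call: the return type lets javac infer <String, Long>.
        return Suppression.suppressLateEvents(Duration.ofMinutes(10));
    }

    public static Suppression<String, Long> chained() {
        // Without the explicit <String, Long>, the first call is inferred as
        // Suppression<Object, Object> and this return statement does not compile.
        return Suppression.<String, Long>suppressLateEvents(Duration.ofMinutes(10))
                          .emitAfter(Duration.ofMinutes(10));
    }

    public static void main(final String[] args) {
        System.out.println(unchained().maxLateness);
        System.out.println(chained().emitAfter);
    }
}
```

The explicit type witness in chained() is the extra noise that callers of
the recipe method would sometimes have to write.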

==============
= Conclusion =
==============

I think option 2 is more "normal" and discoverable. It does have a
downside, but it's one that's pre-existing elsewhere in the DSL.

WDYT? Would the addition of this "recipe" method to Suppression resolve
your concern?

Thanks again,
-John

On Sun, Jul 1, 2018 at 11:24 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hi John,
>
> Regarding the metrics: yeah, I think I'm with you that records dropped
> due to window retention or emit suppression policies should be recorded
> differently, and using this KIP's proposed metric would be fine. If you
> also think we can use this KIP's proposed metrics to cover records
> skipped because of window retention, then we can include the changes in
> this KIP as well.
>
> Regarding the current proposal, I'm actually not too worried about the
> inconsistency between query semantics and downstream emit semantics. For
> queries, we will always return the current running results of the
> windows, be they partial or final depending on the window retention time,
> which has nothing to do with whether the emitted stream should be one
> final output per key or not. I also agree that having a unified operation
> is generally better, since users can focus on that one operation rather
> than learning two sets of operations. The only question I had is whether,
> for final updates of window stores, the configuration combo is a bit
> awkward to understand. Thinking about this more, I think my root worry is
> the "suppressLateEvents" call for windowed tables. From a user's
> perspective: if my retention time is X, which means "pay the cost to
> allow late records up to X to still be applied to update the tables", why
> would I ever want to suppressLateEvents by Y (< X), which says "do not
> send the updates that are later than Y"? That means the downstream
> operator or sink topic for this stream would actually see a truncated
> update stream even though I've paid the larger cost. And of course, Y > X
> would not make sense either, as you would not see any updates later than
> X anyway. So in all, my feeling is that it makes little sense to call
> "suppressLateEvents" on a windowed table with a parameter that is not
> equal to the window retention, and opening the door to that in the
> current proposal may confuse people.
>
> Again, the above is just a subjective opinion, and we can probably also
> bring up some scenarios where users do want to set X != Y. But personally
> I feel that even if the semantics for this scenario are intuitive for
> users to understand, does it really make sense, and should we really open
> the door for it? So I think the benefits of separating the final update
> into a separate API may outweigh the advantage of having one uniform
> definition. As for my alternative proposal, the rationale came from both
> my concern about "suppressLateEvents" for windowed stores and Matthias'
> question about "suppressLateEvents" for non-windowed stores: if it is
> less meaningful for both, we can consider removing it completely and only
> doing "IntermediateSuppression" in Suppress instead.
>
> So I'd summarize my thoughts in the following questions:
>
> 1. Does "suppressLateEvents" with parameter Y != X (window retention time)
> for windowed stores make sense in practice?
> 2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
> make sense in practice?
>
>
>
> Guozhang
>
>
> On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:
>
> > Thanks for the explanation, that does make sense.  I have some questions
> on
> > operations, but I'll just wait for the PR and tests.
> >
> > Thanks,
> > Bill
> >
> > On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Hi Bill,
> > >
> > > Thanks for the review!
> > >
> > > Your question is very much applicable to the KIP and not at all an
> > > implementation detail. Thanks for bringing it up.
> > >
> > > I'm proposing not to change the existing caches and configurations at
> all
> > > (for now).
> > >
> > > Imagine you have a topology like this:
> > > commit.interval.ms = 100
> > >
> > > (ktable1 (cached)) -> (suppress emitAfter 200)
> > >
> > > The first ktable (ktable1) will respect the commit interval and buffer
> > > events for 100ms before logging, storing, or forwarding them (IIRC).
> > > Therefore, the second ktable (suppress) will only see the events at a
> > > rate of once per 100ms. It will apply its own buffering, and emit once
> > > per 200ms. This case is pretty trivial because the suppress time is a
> > > multiple of the commit interval.
> > >
> > > When it's not an integer multiple, you'll get behavior like in this
> > marble
> > > diagram:
> > >
> > >
> > > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> > >
> > > [ KTable caching with commit interval = 2 ]
> > >
> > > <--------(k:2)---------(k:4)---------(k:6)->
> > >
> > >       [ suppress with emitAfter = 3 ]
> > >
> > > <---------------(k:2)----------------(k:6)->
> > >
> > >
> > > If this behavior isn't desired (for example, if you wanted to emit
> > > (k:3) at time 3), I'd recommend setting "cache.max.bytes.buffering"
> > > to 0 or modifying the topology to disable caching. Then, the behavior
> > > is determined by the suppress operator alone.
> > >
> > > Does that seem right to you?
> > >
> > >
> > > Regarding the changelogs, because the suppression operator hangs onto
> > > events for a while, it will need its own changelog. The changelog
> > > should represent the current state of the buffer at all times. So when
> > the
> > > suppress operator sees (k:2), for example, it will log (k:2). When it
> > > later gets to time 3, it's time to emit (k:2) downstream. Because k is
> no
> > > longer buffered, the suppress operator will log (k:null). Thus, when
> > > recovering,
> > > it can rebuild the buffer by reading its changelog.
> > >
> > > What do you think about this?
> > >
> > > Thanks,
> > > -John
> > >
> > >
> > >
> > > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
> > >
> > > > Hi John,  thanks for the KIP.
> > > >
> > > > Early on in the KIP, you mention the current approaches for
> controlling
> > > the
> > > > rate of downstream records from a KTable, cache size configuration
> and
> > > > commit time.
> > > >
> > > > Will these configuration parameters still be in effect for tables
> that
> > > > don't use suppression?  For tables taking advantage of suppression,
> > will
> > > > these configurations have no impact?
> > > > This last question may be too implementation-specific, but if the
> > > > requested suppression time is longer than the specified commit time,
> > > > will the latest record in the suppression buffer get stored in a
> > > > changelog?
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > > > Thanks for the feedback, Matthias,
> > > > >
> > > > > It seems like in straightforward relational processing cases, it
> > would
> > > > not
> > > > > make sense to bound the lateness of KTables. In general, it seems
> > > better
> > > > to
> > > > > have "guard rails" in place that make it easier to write sensible
> > > > programs
> > > > > than insensible ones.
> > > > >
> > > > > But I'm still going to argue in favor of keeping it for all KTables
> > ;)
> > > > >
> > > > > 1. I believe it is simpler to understand the operator if it has one
> > > > uniform
> > > > > definition, regardless of context. It's well defined and intuitive
> > what
> > > > > will happen when you use late-event suppression on a KTable, so I
> > think
> > > > > nothing surprising or dangerous will happen in that case. From my
> > > > > perspective, having two sets of allowed operations is actually an
> > > > increase
> > > > > in cognitive complexity.
> > > > >
> > > > > 2. To me, it's not crazy to use the operator this way. For example,
> > in
> > > > lieu
> > > > > of full-featured timestamp semantics, I can implement MVCC behavior
> > > when
> > > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect
> > > that
> > > > > there are other, non-obvious applications of suppressing late
> events
> > on
> > > > > KTables.
> > > > >
> > > > > 3. Not to get too much into implementation details in a KIP
> > discussion,
> > > > but
> > > > > if we did want to make late-event suppression available only on
> > > windowed
> > > > > KTables, we have two enforcement options:
> > > > >   a. check when we build the topology - this would be simple to
> > > > implement,
> > > > > but would be a runtime check. Hopefully, people write tests for
> their
> > > > > topology before deploying them, so the feedback loop isn't
> > > instantaneous,
> > > > > but it's not too long either.
> > > > >   b. add a new WindowedKTable type - this would be a compile time
> > > check,
> > > > > but would also be substantial increase of both interface and code
> > > > > complexity.
> > > > >
> > > > > We should definitely strive to have guard rails protecting against
> > > > > surprising or dangerous behavior. Protecting against programs that
> we
> > > > don't
> > > > > currently predict is a lesser benefit, and I think we can put up
> > guard
> > > > > rails on a case-by-case basis for that. It seems like the increase
> in
> > > > > cognitive (and potentially code and interface) complexity makes me
> > > think
> > > > we
> > > > > should skip this case.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > > matthias@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the KIP John.
> > > > > >
> > > > > > One initial comments about the last example "Bounded lateness":
> > For a
> > > > > > non-windowed KTable bounding the lateness does not really make
> > sense,
> > > > > > does it?
> > > > > >
> > > > > > Thus, I am wondering if we should allow `suppressLateEvents()`
> for
> > > this
> > > > > > case? It seems to be better to only allow it for
> windowed-KTables.
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > >
> > > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > > I noticed this (lack of primary parameter) as well.
> > > > > > >
> > > > > > > What you gave as new example is semantically the same as what I
> > > > > > suggested.
> > > > > > > So it is good by me.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <
> john@confluent.io
> > >
> > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for taking look, Ted,
> > > > > > >>
> > > > > > >> I agree this is a departure from the conventions of Streams
> DSL.
> > > > > > >>
> > > > > > >> Most of our config objects have one or two "required"
> > parameters,
> > > > > which
> > > > > > fit
> > > > > > >> naturally with the static factory method approach. TimeWindow,
> > for
> > > > > > example,
> > > > > > >> requires a size parameter, so we can naturally say
> > > > > TimeWindows.of(size).
> > > > > > >>
> > > > > > >> I think in the case of a suppression, there's really no "core"
> > > > > > parameter,
> > > > > > >> and "Suppression.of()" seems sillier than "new
> Suppression()". I
> > > > think
> > > > > > that
> > > > > > >> Suppression.of(duration) would be ambiguous, since there are
> > many
> > > > > > durations
> > > > > > >> that we can configure.
> > > > > > >>
> > > > > > >> However, thinking about it again, I suppose that I can give
> each
> > > > > > >> configuration method a static version, which would let you
> > replace
> > > > > "new
> > > > > > >> Suppression()." with "Suppression." in all the examples.
> > > Basically,
> > > > > > instead
> > > > > > >> of "of()", we'd support any of the methods I listed.
> > > > > > >>
> > > > > > >> For example:
> > > > > > >>
> > > > > > >> windowCounts
> > > > > > >>     .suppress(
> > > > > > >>         Suppression
> > > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > > >>             .suppressIntermediateEvents(
> > > > > > >>
> > > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > > >>             )
> > > > > > >>     );
> > > > > > >>
> > > > > > >>
> > > > > > >> Does that seem better?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> -John
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> > > > wrote:
> > > > > > >>
> > > > > > >>> I started to read this KIP which contains a lot of materials.
> > > > > > >>>
> > > > > > >>> One suggestion:
> > > > > > >>>
> > > > > > >>>     .suppress(
> > > > > > >>>         new Suppression()
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Do you think it would be more consistent with the rest of
> > Streams
> > > > > data
> > > > > > >>> structures by supporting `of` ?
> > > > > > >>>
> > > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Cheers
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> > john@confluent.io
> > > >
> > > > > > wrote:
> > > > > > >>>
> > > > > > >>>> Hello devs and users,
> > > > > > >>>>
> > > > > > >>>> Please take some time to consider this proposal for Kafka
> > > Streams:
> > > > > > >>>>
> > > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > > > >>>>
> > > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > > >>>>
> > > > > > >>>> The basic idea is to provide:
> > > > > > >>>> * more usable control over update rate (vs the current state
> > > store
> > > > > > >>> caches)
> > > > > > >>>> * the final-result-for-windowed-computations feature which
> > > several
> > > > > > >> people
> > > > > > >>>> have requested
> > > > > > >>>>
> > > > > > >>>> I look forward to your feedback!
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> -John
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

Hi John,

Regarding the metrics: yeah, I think I'm with you that records dropped
due to window retention or emit suppression policies should be recorded
differently, and using this KIP's proposed metric would be fine. If you
also think we can use this KIP's proposed metrics to cover records skipped
because of window retention, then we can include the changes in this
KIP as well.

Regarding the current proposal, I'm actually not too worried about the
inconsistency between query semantics and downstream emit semantics. For
queries, we will always return the current running results of the windows,
be they partial or final depending on the window retention time, which has
nothing to do with whether the emitted stream should be one final output
per key or not. I also agree that having a unified operation is generally
better, since users can focus on that one operation rather than learning
two sets of operations. The only question I had is whether, for final
updates of window stores, the configuration combo is a bit awkward to
understand. Thinking about this more, I think my root worry is the
"suppressLateEvents" call for windowed tables. From a user's perspective:
if my retention time is X, which means "pay the cost to allow late records
up to X to still be applied to update the tables", why would I ever want to
suppressLateEvents by Y (< X), which says "do not send the updates that are
later than Y"? That means the downstream operator or sink topic for this
stream would actually see a truncated update stream even though I've paid
the larger cost. And of course, Y > X would not make sense either, as you
would not see any updates later than X anyway. So in all, my feeling is
that it makes little sense to call "suppressLateEvents" on a windowed table
with a parameter that is not equal to the window retention, and opening the
door to that in the current proposal may confuse people.

Again, the above is just a subjective opinion, and we can probably also
bring up some scenarios where users do want to set X != Y. But personally I
feel that even if the semantics for this scenario are intuitive for users to
understand, does it really make sense, and should we really open the door
for it? So I think the benefits of separating the final update into a
separate API may outweigh the advantage of having one uniform definition. As
for my alternative proposal, the rationale came from both my concern about
"suppressLateEvents" for windowed stores and Matthias' question about
"suppressLateEvents" for non-windowed stores: if it is less meaningful
for both, we can consider removing it completely and only doing
"IntermediateSuppression" in Suppress instead.

So I'd summarize my thoughts in the following questions:

1. Does "suppressLateEvents" with parameter Y != X (window retention time)
for windowed stores make sense in practice?
2. Does "suppressLateEvents" with any parameter Y for non-windowed stores
make sense in practice?



Guozhang


On Fri, Jun 29, 2018 at 2:26 PM, Bill Bejeck <bb...@gmail.com> wrote:

> Thanks for the explanation, that does make sense.  I have some questions on
> operations, but I'll just wait for the PR and tests.
>
> Thanks,
> Bill
>
> On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:
>
> > Hi Bill,
> >
> > Thanks for the review!
> >
> > Your question is very much applicable to the KIP and not at all an
> > implementation detail. Thanks for bringing it up.
> >
> > I'm proposing not to change the existing caches and configurations at all
> > (for now).
> >
> > Imagine you have a topology like this:
> > commit.interval.ms = 100
> >
> > (ktable1 (cached)) -> (suppress emitAfter 200)
> >
> > The first ktable (ktable1) will respect the commit interval and buffer
> > events for 100ms before logging, storing, or forwarding them (IIRC).
> > Therefore, the second ktable (suppress) will only see the events at a
> > rate of once per 100ms. It will apply its own buffering, and emit once
> > per 200ms. This case is pretty trivial because the suppress time is a
> > multiple of the commit interval.
> >
> > When it's not an integer multiple, you'll get behavior like in this
> marble
> > diagram:
> >
> >
> > <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
> >
> > [ KTable caching with commit interval = 2 ]
> >
> > <--------(k:2)---------(k:4)---------(k:6)->
> >
> >       [ suppress with emitAfter = 3 ]
> >
> > <---------------(k:2)----------------(k:6)->
> >
> >
> > If this behavior isn't desired (for example, if you wanted to emit (k:3)
> > at time 3), I'd recommend setting "cache.max.bytes.buffering" to 0 or
> > modifying the topology to disable caching. Then, the behavior is
> > determined by the suppress operator alone.
> >
> > Does that seem right to you?
> >
> >
> > Regarding the changelogs, because the suppression operator hangs onto
> > events for a while, it will need its own changelog. The changelog
> > should represent the current state of the buffer at all times. So when
> the
> > suppress operator sees (k:2), for example, it will log (k:2). When it
> > later gets to time 3, it's time to emit (k:2) downstream. Because k is no
> > longer buffered, the suppress operator will log (k:null). Thus, when
> > recovering,
> > it can rebuild the buffer by reading its changelog.
> >
> > What do you think about this?
> >
> > Thanks,
> > -John
> >
> >
> >
> > On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
> >
> > > Hi John,  thanks for the KIP.
> > >
> > > Early on in the KIP, you mention the current approaches for controlling
> > the
> > > rate of downstream records from a KTable, cache size configuration and
> > > commit time.
> > >
> > > Will these configuration parameters still be in effect for tables that
> > > don't use suppression?  For tables taking advantage of suppression,
> will
> > > these configurations have no impact?
> > > This last question may be too implementation-specific, but if the
> > > requested suppression time is longer than the specified commit time,
> > > will the latest record in the suppression buffer get stored in a
> > > changelog?
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io>
> wrote:
> > >
> > > > Thanks for the feedback, Matthias,
> > > >
> > > > It seems like in straightforward relational processing cases, it
> would
> > > not
> > > > make sense to bound the lateness of KTables. In general, it seems
> > better
> > > to
> > > > have "guard rails" in place that make it easier to write sensible
> > > programs
> > > > than insensible ones.
> > > >
> > > > But I'm still going to argue in favor of keeping it for all KTables
> ;)
> > > >
> > > > 1. I believe it is simpler to understand the operator if it has one
> > > uniform
> > > > definition, regardless of context. It's well defined and intuitive
> what
> > > > will happen when you use late-event suppression on a KTable, so I
> think
> > > > nothing surprising or dangerous will happen in that case. From my
> > > > perspective, having two sets of allowed operations is actually an
> > > increase
> > > > in cognitive complexity.
> > > >
> > > > 2. To me, it's not crazy to use the operator this way. For example,
> in
> > > lieu
> > > > of full-featured timestamp semantics, I can implement MVCC behavior
> > when
> > > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect
> > that
> > > > there are other, non-obvious applications of suppressing late events
> on
> > > > KTables.
> > > >
> > > > 3. Not to get too much into implementation details in a KIP
> discussion,
> > > but
> > > > if we did want to make late-event suppression available only on
> > windowed
> > > > KTables, we have two enforcement options:
> > > >   a. check when we build the topology - this would be simple to
> > > implement,
> > > > but would be a runtime check. Hopefully, people write tests for their
> > > > topology before deploying them, so the feedback loop isn't
> > instantaneous,
> > > > but it's not too long either.
> > > >   b. add a new WindowedKTable type - this would be a compile time
> > check,
> > > > but would also be substantial increase of both interface and code
> > > > complexity.
> > > >
> > > > We should definitely strive to have guard rails protecting against
> > > > surprising or dangerous behavior. Protecting against programs that we
> > > don't
> > > > currently predict is a lesser benefit, and I think we can put up
> guard
> > > > rails on a case-by-case basis for that. It seems like the increase in
> > > > cognitive (and potentially code and interface) complexity makes me
> > think
> > > we
> > > > should skip this case.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> > matthias@confluent.io>
> > > > wrote:
> > > >
> > > > > Thanks for the KIP John.
> > > > >
> > > > > One initial comments about the last example "Bounded lateness":
> For a
> > > > > non-windowed KTable bounding the lateness does not really make
> sense,
> > > > > does it?
> > > > >
> > > > > Thus, I am wondering if we should allow `suppressLateEvents()` for
> > this
> > > > > case? It seems to be better to only allow it for windowed-KTables.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > > I noticed this (lack of primary parameter) as well.
> > > > > >
> > > > > > What you gave as new example is semantically the same as what I
> > > > > suggested.
> > > > > > So it is good by me.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <john@confluent.io
> >
> > > > wrote:
> > > > > >
> > > > > >> Thanks for taking look, Ted,
> > > > > >>
> > > > > >> I agree this is a departure from the conventions of Streams DSL.
> > > > > >>
> > > > > >> Most of our config objects have one or two "required"
> parameters,
> > > > which
> > > > > fit
> > > > > >> naturally with the static factory method approach. TimeWindow,
> for
> > > > > example,
> > > > > >> requires a size parameter, so we can naturally say
> > > > TimeWindows.of(size).
> > > > > >>
> > > > > >> I think in the case of a suppression, there's really no "core"
> > > > > parameter,
> > > > > >> and "Suppression.of()" seems sillier than "new Suppression()". I
> > > think
> > > > > that
> > > > > >> Suppression.of(duration) would be ambiguous, since there are
> many
> > > > > durations
> > > > > >> that we can configure.
> > > > > >>
> > > > > >> However, thinking about it again, I suppose that I can give each
> > > > > >> configuration method a static version, which would let you
> replace
> > > > "new
> > > > > >> Suppression()." with "Suppression." in all the examples.
> > Basically,
> > > > > instead
> > > > > >> of "of()", we'd support any of the methods I listed.
> > > > > >>
> > > > > >> For example:
> > > > > >>
> > > > > >> windowCounts
> > > > > >>     .suppress(
> > > > > >>         Suppression
> > > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > > >>             .suppressIntermediateEvents(
> > > > > >>
> > > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > > >>             )
> > > > > >>     );
> > > > > >>
> > > > > >>
> > > > > >> Does that seem better?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> -John
> > > > > >>
> > > > > >>
> > > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> > > wrote:
> > > > > >>
> > > > > >>> I started to read this KIP which contains a lot of materials.
> > > > > >>>
> > > > > >>> One suggestion:
> > > > > >>>
> > > > > >>>     .suppress(
> > > > > >>>         new Suppression()
> > > > > >>>
> > > > > >>>
> > > > > >>> Do you think it would be more consistent with the rest of
> Streams
> > > > data
> > > > > >>> structures by supporting `of` ?
> > > > > >>>
> > > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > > >>>
> > > > > >>>
> > > > > >>> Cheers
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <
> john@confluent.io
> > >
> > > > > wrote:
> > > > > >>>
> > > > > >>>> Hello devs and users,
> > > > > >>>>
> > > > > >>>> Please take some time to consider this proposal for Kafka
> > Streams:
> > > > > >>>>
> > > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > > >>>>
> > > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > > >>>>
> > > > > >>>> The basic idea is to provide:
> > > > > >>>> * more usable control over update rate (vs the current state
> > store
> > > > > >>> caches)
> > > > > >>>> * the final-result-for-windowed-computations feature which
> > several
> > > > > >> people
> > > > > >>>> have requested
> > > > > >>>>
> > > > > >>>> I look forward to your feedback!
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> -John
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Bill Bejeck <bb...@gmail.com>.

Thanks for the explanation, that does make sense.  I have some questions on
operations, but I'll just wait for the PR and tests.
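
To check my understanding of the changelog behavior described in the quoted
explanation below, here is a minimal sketch. It assumes an in-memory map as
the buffer and a plain list as the changelog; the class and method names are
hypothetical and are not part of the KIP or of Kafka Streams.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SuppressBufferSketch {

    // Records currently held back by the (hypothetical) suppress operator.
    private final Map<String, Integer> buffer = new LinkedHashMap<>();

    // Append-only log; an entry with a null value means "key no longer buffered".
    private final List<Map.Entry<String, Integer>> changelog = new ArrayList<>();

    // Buffer an update and log it, so the log mirrors the buffer.
    public void buffer(final String key, final int value) {
        buffer.put(key, value);
        changelog.add(new SimpleImmutableEntry<>(key, value));
    }

    // Emit the buffered value downstream and log a tombstone for the key.
    public Integer emit(final String key) {
        final Integer value = buffer.remove(key);
        if (value != null) {
            changelog.add(new SimpleImmutableEntry<>(key, null));
        }
        return value;
    }

    public List<Map.Entry<String, Integer>> changelog() {
        return changelog;
    }

    // Recovery: replay the log in order to rebuild the buffer contents.
    public static Map<String, Integer> restore(final List<Map.Entry<String, Integer>> log) {
        final Map<String, Integer> rebuilt = new LinkedHashMap<>();
        for (final Map.Entry<String, Integer> entry : log) {
            if (entry.getValue() == null) {
                rebuilt.remove(entry.getKey());
            } else {
                rebuilt.put(entry.getKey(), entry.getValue());
            }
        }
        return rebuilt;
    }

    public static void main(final String[] args) {
        final SuppressBufferSketch suppress = new SuppressBufferSketch();
        suppress.buffer("k", 2);                            // sees (k:2), logs (k:2)
        System.out.println(restore(suppress.changelog()));  // prints {k=2}
        suppress.emit("k");                                 // emits (k:2), logs (k:null)
        System.out.println(restore(suppress.changelog()));  // prints {}
    }
}
```

If that matches the intent, then replaying the changelog at any point gives
back exactly the records that were still being held when the task stopped.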

Thanks,
Bill

On Wed, Jun 27, 2018 at 8:14 PM John Roesler <jo...@confluent.io> wrote:

> Hi Bill,
>
> Thanks for the review!
>
> Your question is very much applicable to the KIP and not at all an
> implementation detail. Thanks for bringing it up.
>
> I'm proposing not to change the existing caches and configurations at all
> (for now).
>
> Imagine you have a topology like this:
> commit.interval.ms = 100
>
> (ktable1 (cached)) -> (suppress emitAfter 200)
>
> The first ktable (ktable1) will respect the commit interval and buffer
> events for 100ms before logging, storing, or forwarding them (IIRC).
> Therefore, the second ktable (suppress) will only see the events at a rate
> of once per 100ms. It will apply its own buffering, and emit once per 200ms.
> This case is pretty trivial because the suppress time is a multiple of the
> commit interval.
>
> When it's not an integer multiple, you'll get behavior like in this marble
> diagram:
>
>
> <-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->
>
> [ KTable caching with commit interval = 2 ]
>
> <--------(k:2)---------(k:4)---------(k:6)->
>
>       [ suppress with emitAfter = 3 ]
>
> <---------------(k:2)----------------(k:6)->
>
>
> If this behavior isn't desired (for example, if you wanted to emit (k:3) at
> time 3), I'd recommend setting "cache.max.bytes.buffering" to 0 or
> modifying the topology to disable caching. Then, the behavior is
> determined by the suppress operator alone.
>
> Does that seem right to you?
>
>
> Regarding the changelogs, because the suppression operator hangs onto
> events for a while, it will need its own changelog. The changelog
> should represent the current state of the buffer at all times. So when the
> suppress operator sees (k:2), for example, it will log (k:2). When it
> later gets to time 3, it's time to emit (k:2) downstream. Because k is no
> longer buffered, the suppress operator will log (k:null). Thus, when
> recovering,
> it can rebuild the buffer by reading its changelog.
>
> What do you think about this?
>
> Thanks,
> -John
>
>
>
> On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:
>
> > Hi John,  thanks for the KIP.
> >
> > Early on in the KIP, you mention the current approaches for controlling
> the
> > rate of downstream records from a KTable, cache size configuration and
> > commit time.
> >
> > Will these configuration parameters still be in effect for tables that
> > don't use suppression?  For tables taking advantage of suppression, will
> > these configurations have no impact?
> > This last question may be too implementation-specific, but if the requested
> > suppression time is longer than the specified commit time, will the
> latest
> > record in the suppression buffer get stored in a changelog?
> >
> > Thanks,
> > Bill
> >
> > On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io> wrote:
> >
> > > Thanks for the feedback, Matthias,
> > >
> > > It seems like in straightforward relational processing cases, it would
> > not
> > > make sense to bound the lateness of KTables. In general, it seems
> better
> > to
> > > have "guard rails" in place that make it easier to write sensible
> > programs
> > > than insensible ones.
> > >
> > > But I'm still going to argue in favor of keeping it for all KTables ;)
> > >
> > > 1. I believe it is simpler to understand the operator if it has one
> > uniform
> > > definition, regardless of context. It's well defined and intuitive what
> > > will happen when you use late-event suppression on a KTable, so I think
> > > nothing surprising or dangerous will happen in that case. From my
> > > perspective, having two sets of allowed operations is actually an
> > increase
> > > in cognitive complexity.
> > >
> > > 2. To me, it's not crazy to use the operator this way. For example, in
> > lieu
> > > of full-featured timestamp semantics, I can implement MVCC behavior
> when
> > > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect
> that
> > > there are other, non-obvious applications of suppressing late events on
> > > KTables.
> > >
> > > 3. Not to get too much into implementation details in a KIP discussion,
> > but
> > > if we did want to make late-event suppression available only on
> windowed
> > > KTables, we have two enforcement options:
> > >   a. check when we build the topology - this would be simple to
> > implement,
> > > but would be a runtime check. Hopefully, people write tests for their
> > > topology before deploying them, so the feedback loop isn't
> instantaneous,
> > > but it's not too long either.
> > >   b. add a new WindowedKTable type - this would be a compile time
> check,
> > > but would also be substantial increase of both interface and code
> > > complexity.
> > >
> > > We should definitely strive to have guard rails protecting against
> > > surprising or dangerous behavior. Protecting against programs that we
> > don't
> > > currently predict is a lesser benefit, and I think we can put up guard
> > > rails on a case-by-case basis for that. It seems like the increase in
> > > cognitive (and potentially code and interface) complexity makes me
> think
> > we
> > > should skip this case.
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <
> matthias@confluent.io>
> > > wrote:
> > >
> > > > Thanks for the KIP John.
> > > >
> > > > One initial comments about the last example "Bounded lateness": For a
> > > > non-windowed KTable bounding the lateness does not really make sense,
> > > > does it?
> > > >
> > > > Thus, I am wondering if we should allow `suppressLateEvents()` for
> this
> > > > case? It seems to be better to only allow it for windowed-KTables.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > > I noticed this (lack of primary parameter) as well.
> > > > >
> > > > > What you gave as new example is semantically the same as what I
> > > > suggested.
> > > > > So it is good by me.
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io>
> > > wrote:
> > > > >
> > > > >> Thanks for taking look, Ted,
> > > > >>
> > > > >> I agree this is a departure from the conventions of Streams DSL.
> > > > >>
> > > > >> Most of our config objects have one or two "required" parameters,
> > > which
> > > > fit
> > > > >> naturally with the static factory method approach. TimeWindow, for
> > > > example,
> > > > >> requires a size parameter, so we can naturally say
> > > TimeWindows.of(size).
> > > > >>
> > > > >> I think in the case of a suppression, there's really no "core"
> > > > parameter,
> > > > >> and "Suppression.of()" seems sillier than "new Suppression()". I
> > think
> > > > that
> > > > >> Suppression.of(duration) would be ambiguous, since there are many
> > > > durations
> > > > >> that we can configure.
> > > > >>
> > > > >> However, thinking about it again, I suppose that I can give each
> > > > >> configuration method a static version, which would let you replace
> > > "new
> > > > >> Suppression()." with "Suppression." in all the examples.
> Basically,
> > > > instead
> > > > >> of "of()", we'd support any of the methods I listed.
> > > > >>
> > > > >> For example:
> > > > >>
> > > > >> windowCounts
> > > > >>     .suppress(
> > > > >>         Suppression
> > > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > > >>             .suppressIntermediateEvents(
> > > > >>
> > > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > > >>             )
> > > > >>     );
> > > > >>
> > > > >>
> > > > >> Does that seem better?
> > > > >>
> > > > >> Thanks,
> > > > >> -John
> > > > >>
> > > > >>
> > > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> > wrote:
> > > > >>
> > > > >>> I started to read this KIP which contains a lot of materials.
> > > > >>>
> > > > >>> One suggestion:
> > > > >>>
> > > > >>>     .suppress(
> > > > >>>         new Suppression()
> > > > >>>
> > > > >>>
> > > > >>> Do you think it would be more consistent with the rest of Streams
> > > data
> > > > >>> structures by supporting `of` ?
> > > > >>>
> > > > >>> Suppression.of(Duration.ofMinutes(10))
> > > > >>>
> > > > >>>
> > > > >>> Cheers
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <john@confluent.io
> >
> > > > wrote:
> > > > >>>
> > > > >>>> Hello devs and users,
> > > > >>>>
> > > > >>>> Please take some time to consider this proposal for Kafka
> Streams:
> > > > >>>>
> > > > >>>> KIP-328: Ability to suppress updates for KTables
> > > > >>>>
> > > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > > >>>>
> > > > >>>> The basic idea is to provide:
> > > > >>>> * more usable control over update rate (vs the current state
> store
> > > > >>> caches)
> > > > >>>> * the final-result-for-windowed-computations feature which
> several
> > > > >> people
> > > > >>>> have requested
> > > > >>>>
> > > > >>>> I look forward to your feedback!
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> -John
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Bill,

Thanks for the review!

Your question is very much applicable to the KIP and not at all an
implementation detail. Thanks for bringing it up.

I'm proposing not to change the existing caches and configurations at all
(for now).

Imagine you have a topology like this:
commit.interval.ms = 100

(ktable1 (cached)) -> (suppress emitAfter 200)

The first ktable (ktable1) will respect the commit interval and buffer
events for 100ms before logging, storing, or forwarding them (IIRC).
Therefore, the second ktable (suppress) will only see the events at a rate
of once per 100ms. It will apply its own buffering, and emit once per 200ms.
This case is pretty trivial because the suppress time is a multiple of the
commit interval.

When it's not an integer multiple, you'll get behavior like in this marble
diagram:


<-(k:1)--(k:2)--(k:3)--(k:4)--(k:5)--(k:6)->

[ KTable caching with commit interval = 2 ]

<--------(k:2)---------(k:4)---------(k:6)->

      [ suppress with emitAfter = 3 ]

<---------------(k:2)----------------(k:6)->


If this behavior isn't desired (for example, if you wanted to emit (k:3) at
time 3), I'd recommend setting the "cache.max.bytes.buffering" to 0 or
modifying the topology to disable caching. Then, the behavior is more
simply determined just by the suppress operator.
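
The marble diagram above can be reproduced with a toy model. This is only a
sketch under a simplifying assumption: each stage keeps the latest value for
the single key k and flushes it whenever the clock reaches a multiple of its
interval. The names MarbleSketch and flushEvery are made up for illustration
and are not part of Kafka Streams or the KIP.

```java
import java.util.Map;
import java.util.TreeMap;

final class MarbleSketch {

    /**
     * Toy model of one buffering stage for a single key.
     * input maps arrival time -> value; the result maps emit time -> value.
     * Assumption: the stage flushes its latest buffered value at every
     * multiple of `interval`, after handling the record arriving at that time.
     */
    static TreeMap<Integer, Integer> flushEvery(Map<Integer, Integer> input, int interval, int endTime) {
        TreeMap<Integer, Integer> emitted = new TreeMap<>();
        Integer buffered = null;
        for (int t = 1; t <= endTime; t++) {
            Integer arrived = input.get(t);
            if (arrived != null) {
                buffered = arrived; // a newer value replaces the older one
            }
            if (t % interval == 0 && buffered != null) {
                emitted.put(t, buffered);
                buffered = null; // nothing left to emit until a new record arrives
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> source = new TreeMap<>();
        for (int t = 1; t <= 6; t++) {
            source.put(t, t); // (k:1) at time 1, (k:2) at time 2, ...
        }
        TreeMap<Integer, Integer> cached = flushEvery(source, 2, 6);
        System.out.println("cache (interval 2):       " + cached);
        System.out.println("suppress (emitAfter 3):   " + flushEvery(cached, 3, 6));
        System.out.println("suppress without caching: " + flushEvery(source, 3, 6));
    }
}
```

Under this model the cached pipeline emits the value 2 at time 3 and the value
6 at time 6, as in the diagram, while the uncached pipeline emits the value 3
at time 3.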

Does that seem right to you?


Regarding the changelogs, because the suppression operator hangs onto
events for a while, it will need its own changelog. The changelog
should represent the current state of the buffer at all times. So when the
suppress operator sees (k:2), for example, it will log (k:2). When it
later gets to time 3, it's time to emit (k:2) downstream. Because k is no
longer buffered, the suppress operator will log (k:null). Thus, when
recovering, it can rebuild the buffer by reading its changelog.
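
The changelog behavior described above can be sketched in a few lines. This
is a hypothetical in-memory model, not the proposed implementation: the class
SuppressBufferSketch and its methods are invented for illustration, and a
plain list stands in for the changelog topic.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SuppressBufferSketch {

    // Records currently held back by the (hypothetical) suppress operator.
    final Map<String, Integer> buffer = new HashMap<>();
    // Stand-in for the changelog topic; a null value acts as a tombstone.
    final List<Map.Entry<String, Integer>> changelog = new ArrayList<>();

    /** Buffer a record and log it, so the changelog mirrors the buffer. */
    void onRecord(String key, int value) {
        buffer.put(key, value);
        changelog.add(new SimpleEntry<String, Integer>(key, value));
    }

    /** Emit a buffered record downstream and log (key:null) since it is no longer buffered. */
    Integer emit(String key) {
        Integer value = buffer.remove(key);
        if (value != null) {
            changelog.add(new SimpleEntry<String, Integer>(key, null));
        }
        return value;
    }

    /** Rebuild the buffer contents by replaying the changelog from the start. */
    static Map<String, Integer> restore(List<Map.Entry<String, Integer>> changelog) {
        Map<String, Integer> rebuilt = new HashMap<>();
        for (Map.Entry<String, Integer> entry : changelog) {
            if (entry.getValue() == null) {
                rebuilt.remove(entry.getKey());
            } else {
                rebuilt.put(entry.getKey(), entry.getValue());
            }
        }
        return rebuilt;
    }
}
```

Replaying the log after onRecord("k", 2) yields a buffer holding k=2; replaying
it after the later emit("k") yields an empty buffer, which matches the live
state in both cases.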

What do you think about this?

Thanks,
-John



On Wed, Jun 27, 2018 at 4:16 PM Bill Bejeck <bb...@gmail.com> wrote:

> Hi John,  thanks for the KIP.
>
> Early on in the KIP, you mention the current approaches for controlling the
> rate of downstream records from a KTable, cache size configuration and
> commit time.
>
> Will these configuration parameters still be in effect for tables that
> don't use suppression?  For tables taking advantage of suppression, will
> these configurations have no impact?
> This last question may be too implementation-specific, but if the requested
> suppression time is longer than the specified commit time, will the latest
> record in the suppression buffer get stored in a changelog?
>
> Thanks,
> Bill
>
> On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io> wrote:
>
> > Thanks for the feedback, Matthias,
> >
> > It seems like in straightforward relational processing cases, it would
> not
> > make sense to bound the lateness of KTables. In general, it seems better
> to
> > have "guard rails" in place that make it easier to write sensible
> programs
> > than insensible ones.
> >
> > But I'm still going to argue in favor of keeping it for all KTables ;)
> >
> > 1. I believe it is simpler to understand the operator if it has one
> uniform
> > definition, regardless of context. It's well defined and intuitive what
> > will happen when you use late-event suppression on a KTable, so I think
> > nothing surprising or dangerous will happen in that case. From my
> > perspective, having two sets of allowed operations is actually an
> increase
> > in cognitive complexity.
> >
> > 2. To me, it's not crazy to use the operator this way. For example, in
> lieu
> > of full-featured timestamp semantics, I can implement MVCC behavior when
> > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> > there are other, non-obvious applications of suppressing late events on
> > KTables.
> >
> > 3. Not to get too much into implementation details in a KIP discussion,
> but
> > if we did want to make late-event suppression available only on windowed
> > KTables, we have two enforcement options:
> >   a. check when we build the topology - this would be simple to
> implement,
> > but would be a runtime check. Hopefully, people write tests for their
> > topology before deploying them, so the feedback loop isn't instantaneous,
> > but it's not too long either.
> >   b. add a new WindowedKTable type - this would be a compile time check,
> > but would also be substantial increase of both interface and code
> > complexity.
> >
> > We should definitely strive to have guard rails protecting against
> > surprising or dangerous behavior. Protecting against programs that we
> don't
> > currently predict is a lesser benefit, and I think we can put up guard
> > rails on a case-by-case basis for that. It seems like the increase in
> > cognitive (and potentially code and interface) complexity makes me think
> we
> > should skip this case.
> >
> > What do you think?
> >
> > Thanks,
> > -John
> >
> > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <ma...@confluent.io>
> > wrote:
> >
> > > Thanks for the KIP John.
> > >
> > > One initial comments about the last example "Bounded lateness": For a
> > > non-windowed KTable bounding the lateness does not really make sense,
> > > does it?
> > >
> > > Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > > case? It seems to be better to only allow it for windowed-KTables.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > I noticed this (lack of primary parameter) as well.
> > > >
> > > > What you gave as new example is semantically the same as what I
> > > suggested.
> > > > So it is good by me.
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > >> Thanks for taking look, Ted,
> > > >>
> > > >> I agree this is a departure from the conventions of Streams DSL.
> > > >>
> > > >> Most of our config objects have one or two "required" parameters,
> > which
> > > fit
> > > >> naturally with the static factory method approach. TimeWindow, for
> > > example,
> > > >> requires a size parameter, so we can naturally say
> > TimeWindows.of(size).
> > > >>
> > > >> I think in the case of a suppression, there's really no "core"
> > > parameter,
> > > >> and "Suppression.of()" seems sillier than "new Suppression()". I
> think
> > > that
> > > >> Suppression.of(duration) would be ambiguous, since there are many
> > > durations
> > > >> that we can configure.
> > > >>
> > > >> However, thinking about it again, I suppose that I can give each
> > > >> configuration method a static version, which would let you replace
> > "new
> > > >> Suppression()." with "Suppression." in all the examples. Basically,
> > > instead
> > > >> of "of()", we'd support any of the methods I listed.
> > > >>
> > > >> For example:
> > > >>
> > > >> windowCounts
> > > >>     .suppress(
> > > >>         Suppression
> > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > >>             .suppressIntermediateEvents(
> > > >>
> > >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > >>             )
> > > >>     );
> > > >>
> > > >>
> > > >> Does that seem better?
> > > >>
> > > >> Thanks,
> > > >> -John
> > > >>
> > > >>
> > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> wrote:
> > > >>
> > > >>> I started to read this KIP which contains a lot of materials.
> > > >>>
> > > >>> One suggestion:
> > > >>>
> > > >>>     .suppress(
> > > >>>         new Suppression()
> > > >>>
> > > >>>
> > > >>> Do you think it would be more consistent with the rest of Streams
> > data
> > > >>> structures by supporting `of` ?
> > > >>>
> > > >>> Suppression.of(Duration.ofMinutes(10))
> > > >>>
> > > >>>
> > > >>> Cheers
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io>
> > > wrote:
> > > >>>
> > > >>>> Hello devs and users,
> > > >>>>
> > > >>>> Please take some time to consider this proposal for Kafka Streams:
> > > >>>>
> > > >>>> KIP-328: Ability to suppress updates for KTables
> > > >>>>
> > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > >>>>
> > > >>>> The basic idea is to provide:
> > > >>>> * more usable control over update rate (vs the current state store
> > > >>> caches)
> > > >>>> * the final-result-for-windowed-computations feature which several
> > > >> people
> > > >>>> have requested
> > > >>>>
> > > >>>> I look forward to your feedback!
> > > >>>>
> > > >>>> Thanks,
> > > >>>> -John
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Bill Bejeck <bb...@gmail.com>.

Hi John,  thanks for the KIP.

Early on in the KIP, you mention the current approaches for controlling the
rate of downstream records from a KTable, cache size configuration and
commit time.

Will these configuration parameters still be in effect for tables that
don't use suppression?  For tables taking advantage of suppression, will
these configurations have no impact?
This last question may be too implementation-specific, but if the requested
suppression time is longer than the specified commit time, will the latest
record in the suppression buffer get stored in a changelog?

Thanks,
Bill

On Wed, Jun 27, 2018 at 3:04 PM John Roesler <jo...@confluent.io> wrote:

> Thanks for the feedback, Matthias,
>
> It seems like in straightforward relational processing cases, it would not
> make sense to bound the lateness of KTables. In general, it seems better to
> have "guard rails" in place that make it easier to write sensible programs
> than insensible ones.
>
> But I'm still going to argue in favor of keeping it for all KTables ;)
>
> 1. I believe it is simpler to understand the operator if it has one uniform
> definition, regardless of context. It's well defined and intuitive what
> will happen when you use late-event suppression on a KTable, so I think
> nothing surprising or dangerous will happen in that case. From my
> perspective, having two sets of allowed operations is actually an increase
> in cognitive complexity.
>
> 2. To me, it's not crazy to use the operator this way. For example, in lieu
> of full-featured timestamp semantics, I can implement MVCC behavior when
> building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> there are other, non-obvious applications of suppressing late events on
> KTables.
>
> 3. Not to get too much into implementation details in a KIP discussion, but
> if we did want to make late-event suppression available only on windowed
> KTables, we have two enforcement options:
>   a. check when we build the topology - this would be simple to implement,
> but would be a runtime check. Hopefully, people write tests for their
> topology before deploying them, so the feedback loop isn't instantaneous,
> but it's not too long either.
>   b. add a new WindowedKTable type - this would be a compile time check,
> but would also be substantial increase of both interface and code
> complexity.
>
> We should definitely strive to have guard rails protecting against
> surprising or dangerous behavior. Protecting against programs that we don't
> currently predict is a lesser benefit, and I think we can put up guard
> rails on a case-by-case basis for that. It seems like the increase in
> cognitive (and potentially code and interface) complexity makes me think we
> should skip this case.
>
> What do you think?
>
> Thanks,
> -John
>
> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <ma...@confluent.io>
> wrote:
>
> > Thanks for the KIP John.
> >
> > One initial comments about the last example "Bounded lateness": For a
> > non-windowed KTable bounding the lateness does not really make sense,
> > does it?
> >
> > Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > case? It seems to be better to only allow it for windowed-KTables.
> >
> >
> > -Matthias
> >
> >
> > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > I noticed this (lack of primary parameter) as well.
> > >
> > > What you gave as new example is semantically the same as what I
> > suggested.
> > > So it is good by me.
> > >
> > > Thanks
> > >
> > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io>
> wrote:
> > >
> > >> Thanks for taking look, Ted,
> > >>
> > >> I agree this is a departure from the conventions of Streams DSL.
> > >>
> > >> Most of our config objects have one or two "required" parameters,
> which
> > fit
> > >> naturally with the static factory method approach. TimeWindow, for
> > example,
> > >> requires a size parameter, so we can naturally say
> TimeWindows.of(size).
> > >>
> > >> I think in the case of a suppression, there's really no "core"
> > parameter,
> > >> and "Suppression.of()" seems sillier than "new Suppression()". I think
> > that
> > >> Suppression.of(duration) would be ambiguous, since there are many
> > durations
> > >> that we can configure.
> > >>
> > >> However, thinking about it again, I suppose that I can give each
> > >> configuration method a static version, which would let you replace
> "new
> > >> Suppression()." with "Suppression." in all the examples. Basically,
> > instead
> > >> of "of()", we'd support any of the methods I listed.
> > >>
> > >> For example:
> > >>
> > >> windowCounts
> > >>     .suppress(
> > >>         Suppression
> > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > >>             .suppressIntermediateEvents(
> > >>
> >  IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > >>             )
> > >>     );
> > >>
> > >>
> > >> Does that seem better?
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >>
> > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
> > >>
> > >>> I started to read this KIP which contains a lot of materials.
> > >>>
> > >>> One suggestion:
> > >>>
> > >>>     .suppress(
> > >>>         new Suppression()
> > >>>
> > >>>
> > >>> Do you think it would be more consistent with the rest of Streams
> data
> > >>> structures by supporting `of` ?
> > >>>
> > >>> Suppression.of(Duration.ofMinutes(10))
> > >>>
> > >>>
> > >>> Cheers
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io>
> > wrote:
> > >>>
> > >>>> Hello devs and users,
> > >>>>
> > >>>> Please take some time to consider this proposal for Kafka Streams:
> > >>>>
> > >>>> KIP-328: Ability to suppress updates for KTables
> > >>>>
> > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >>>>
> > >>>> The basic idea is to provide:
> > >>>> * more usable control over update rate (vs the current state store
> > >>> caches)
> > >>>> * the final-result-for-windowed-computations feature which several
> > >> people
> > >>>> have requested
> > >>>>
> > >>>> I look forward to your feedback!
> > >>>>
> > >>>> Thanks,
> > >>>> -John
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Hi Guozhang,

Thanks for the review and the alternative idea.


A quick note on the metrics. I actually do think that it should be true
that "Skipped records are records that are for one
reason or another invalid." I recently added the change to record a
skipped-record when we get a record for a window that is beyond the
retention period. I think this was a mistake, since such records aren't
invalid. I'd like to draft a separate KIP to correct that with a new
metric. For clarity, the behavior in the currently released version
actually doesn't record any metric at all and instead silently drops events
for no-longer-retained windows.



It seems, even with the metric issue aside, the idea about directly
configuring the emit behavior of windowed aggregations might be a more
straightforward way to achieve "final results" for windowed computations.

Regarding the consistency of the aggregation state vs. the result of
suppress, the KIP's suppress is a separate operation that produces a new KTable. It
seems actually strange to me to expect the state *upstream* of suppress to
be consistent with the downstream *result* of suppress. This would be like
doing a filter after the aggregation and then expecting a query of the
aggregation to reflect the filter!

I think that if a user wishes to apply suppress to a KTable *and* offer a
queryable view of the suppressed result, there is a solution within the
existing KTable framework: to offer a variant of suppress taking
Materialized. In other words, since the suppress operator actually produces
a new KTable, we could allow users to make the resulting table queryable
(instead of the upstream one), in which case the queryable state can again
be consistent with the downstream. (I anticipate a question about the
implementation efficiency, but I think there's a good solution for that.)


About your proposal: suppose I have that window of size 10 min, and `until`
20 min and use Emitted.onlyOnceAfterWindowClosed(late-period-allowed = 5
min). If I get an event at stream time t16 with record time t4, will the
Emitted config prevent both the state update and the emission, or only the
emission? It seems like you're proposing that it would actually not update
the state store or emit, so that IQ results would be consistent with the
downstream. I'm not sure if it's intuitive for a parameter that controls
the emission pattern to also control the computation itself.
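
To make the t16/t4 case concrete, here is a small sketch of the three
outcomes as Guozhang describes them for the suppress-based proposal. The
class and method names are hypothetical, and the strict comparisons at the
exact boundaries (t15 and t20) are my assumption, since the thread does not
pin them down.

```java
final class LateRecordSketch {

    enum Outcome { UPDATE_AND_EMIT, UPDATE_ONLY, DROP }

    /**
     * Classify a record that falls into the window starting at windowStart,
     * given the current stream time. All arguments use the same time unit.
     */
    static Outcome classify(long windowStart, long windowSize, long until,
                            long lateAllowed, long streamTime) {
        long closeTime = windowStart + windowSize + lateAllowed; // t15 in the example
        long retentionEnd = windowStart + until;                 // t20 in the example
        if (streamTime > retentionEnd) {
            return Outcome.DROP;        // window no longer retained
        }
        if (streamTime > closeTime) {
            return Outcome.UPDATE_ONLY; // state updated, downstream update suppressed
        }
        return Outcome.UPDATE_AND_EMIT;
    }

    public static void main(String[] args) {
        // Window [t0, t10), until = 20, late period = 5, stream time t16.
        System.out.println(classify(0, 10, 20, 5, 16));
    }
}
```

With the suppress-based reading, the t16 record lands in UPDATE_ONLY. If I
understand the Emitted proposal correctly, that same record would instead be
treated like DROP, which is the difference I'm asking about.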

Alternatively, we could move the "close" parameter to the window
definition, allowing Emitted to just be onlyOnceAfterWindowClosed() or
wheneverWindowUpdated().

It seems like Emitted might open the door to follow-on requests for other
types of fine-grained control over the emission pattern, and we would have
to make a case-by-case call on whether each one belongs with Emitted or
with Suppress, or both. I think this is actually an advantage to preferring
one operation: it's a simpler structure for both users and developers to
reason about shaping the emission pattern.


All in all, it remains simpler to me to have just one operation responsible
for "suppressing" updates. But obviously I can't speak for everyone. I would
very much like to hear more feedback about this.

What do you think about these ideas?

Thanks again,
-John




On Wed, Jun 27, 2018 at 5:10 PM Guozhang Wang <wa...@gmail.com> wrote:

> Hello John, thanks for putting up the KIP. I have a meta comment:
>
> We need to clarify the difference between late event suppression semantics
> and the window retention semantics that result in a windowed KTable. More
> specifically, say you have a window of size 10 min, and `until` 20 min, and
> the resulting windowed KTable is suppressed for events later than 5 min. My
> understanding is that:
>
> a) for a specific window starting at t0 and ending at t10, if an upstream
> record is received with timestamp falling into [t0, t10) but the current
> stream time has been larger than t20, this record will be dropped on the
> floor and not be used to update the window KTable. Also we record it in the
> "skipped-record" metric.
> b) as above, if an upstream record is received with timestamp falling into
> [t0, t10), and the current stream time is larger than t15 but smaller than
> t20, this record will still be used to update the windowed table, but the
> updated result will not be sent downstream, and it is recorded in the
> "late-event-suppression" metric.
> c) as a result, if we query the windowed KTable's result, we will see
> updates up to stream time t20, but from its resulted changelog stream, we
> will only see results up to stream time t15.
>
> If that is correct, I felt it is really complicated for users to
> comprehend: why should I want to have my windowed aggregations to take late
> records up to 20 minutes, while not seeing their resulted updates in the
> changelog stream? And the semantic difference between "skipped record due
> to window retention" and "late-event-suppression" is quite obscure (btw I
> am not sure it is true that "Skipped records are records that are for one
> reason or another invalid.", since "skipped record due to window retention
> time" is not really due to an invalid record, but some window store
> implementation details, right?)
>
> Thinking about this further, although I understand the intention to propose
> an unified API for all three motivation requests, I feel the "Final value
> of a window" request may better be handled in a more restricted interface.
>
>
> So just throwing out a bold / controversial idea to this proposal: instead
> of using a unified suppress() for all three motivation scenarios, we have:
>
> 1) KTable#suppress() for "Request shaping" and "Easier config", and it will
> only be for intermediate-event-suppression, and in this case, for both
> windowed and non-windowed KTable, the suppression semantics can be
> dependent on each key's record timestamp plus the byte buffer size limit /
> buffer strategy.
>
> 2) In TimeWindowedStream / SessionWindowedStream#aggregate() that result in
> a windowed KTable (note that although KGroupedTable#aggregate can also
> result in a windowed KTable, its window semantics are not very well defined;
> I'd suggest we defer its discussion until later), we add another config
> object, e.g.:
>
> TimeWindowedStream#aggregate(final Initializer<VR> initializer,
>                              final Aggregator<? super K, ? super V, VR> aggregator,
>                              final Materialized<K, VR, WindowStore<Bytes, byte[]>> materialized,
>                              final Emitted emitted);
>
> public class Emitted {
>
>     static Emitted onlyOnceAfterWindowClosed(final long late-period-allowed);
>
>     // this may still be subject to caching effects, so not exactly every update..
>     static Emitted wheneverWindowUpdated();
>
> }
>
> ------------
>
> The Emitted config option is going to be much less expressive than
> `Suppressed`, intentionally, to only cover the "Final value of a window"
> case. Note that the resulting windowed KTable can still be suppressed
> programmatically, but if it has already been emitted only once, then the
> suppress function will have no effect.
>
> In this case, the difference of "late-period-allowed" vs.
> "Windows.until()" is that the former determines whether or not a record will
> be applied to update the window, and it is controlled in the
> WindowedStreamAggregateProcessor, and whenever an event gets dropped
> because of it we record it in a new, say "too-late-records" metric (same as
> "late-event-suppression" actually, just using a different name), while the
> latter only controls how long at least each window will be retained for
> queries and should normally be larger than (window size +
> late-period-allowed). From the implementation's pov, if the retention time
> of a window is less than (window size + late-period-allowed), the Processor
> may not be able to find any matching window when first trying to get it from
> the store, and it then needs to tell if it is because the key has never been
> updated for this window or because the window retention has elapsed, hence
> it needs to be aware of the window retention time. And in the latter case,
> it will drop it on the floor and also record it in the "too-late-records"
> metric. And also this emit policy would not need any buffering, since the
> original store's cache already contains the record context needed for
> flushing downstream.
>
> My primary motivation is that, from the user's perspective, this may be
> easier to comprehend and reason about from the metrics. But if people think
> it actually does not make things better, I'm happy to rethink the current
> proposal.
>
>
>
>
> Guozhang
>
>
> On Wed, Jun 27, 2018 at 12:04 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Thanks for the feedback, Matthias,
> >
> > It seems like in straightforward relational processing cases, it would
> not
> > make sense to bound the lateness of KTables. In general, it seems better
> to
> > have "guard rails" in place that make it easier to write sensible
> programs
> > than insensible ones.
> >
> > But I'm still going to argue in favor of keeping it for all KTables ;)
> >
> > 1. I believe it is simpler to understand the operator if it has one
> uniform
> > definition, regardless of context. It's well defined and intuitive what
> > will happen when you use late-event suppression on a KTable, so I think
> > nothing surprising or dangerous will happen in that case. From my
> > perspective, having two sets of allowed operations is actually an
> increase
> > in cognitive complexity.
> >
> > 2. To me, it's not crazy to use the operator this way. For example, in
> lieu
> > of full-featured timestamp semantics, I can implement MVCC behavior when
> > building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> > there are other, non-obvious applications of suppressing late events on
> > KTables.
> >
> > 3. Not to get too much into implementation details in a KIP discussion,
> but
> > if we did want to make late-event suppression available only on windowed
> > KTables, we have two enforcement options:
> >   a. check when we build the topology - this would be simple to
> implement,
> > but would be a runtime check. Hopefully, people write tests for their
> > topology before deploying them, so the feedback loop isn't instantaneous,
> > but it's not too long either.
> >   b. add a new WindowedKTable type - this would be a compile time check,
> > but would also be substantial increase of both interface and code
> > complexity.
> >
> > We should definitely strive to have guard rails protecting against
> > surprising or dangerous behavior. Protecting against programs that we
> don't
> > currently predict is a lesser benefit, and I think we can put up guard
> > rails on a case-by-case basis for that. It seems like the increase in
> > cognitive (and potentially code and interface) complexity makes me think
> we
> > should skip this case.
> >
> > What do you think?
> >
> > Thanks,
> > -John
> >
> > On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <ma...@confluent.io>
> > wrote:
> >
> > > Thanks for the KIP John.
> > >
> > > One initial comments about the last example "Bounded lateness": For a
> > > non-windowed KTable bounding the lateness does not really make sense,
> > > does it?
> > >
> > > Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > > case? It seems to be better to only allow it for windowed-KTables.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > > I noticed this (lack of primary parameter) as well.
> > > >
> > > > What you gave as new example is semantically the same as what I
> > > suggested.
> > > > So it is good by me.
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io>
> > wrote:
> > > >
> > > >> Thanks for taking look, Ted,
> > > >>
> > > >> I agree this is a departure from the conventions of Streams DSL.
> > > >>
> > > >> Most of our config objects have one or two "required" parameters,
> > which
> > > fit
> > > >> naturally with the static factory method approach. TimeWindow, for
> > > example,
> > > >> requires a size parameter, so we can naturally say
> > TimeWindows.of(size).
> > > >>
> > > >> I think in the case of a suppression, there's really no "core"
> > > parameter,
> > > >> and "Suppression.of()" seems sillier than "new Suppression()". I
> think
> > > that
> > > >> Suppression.of(duration) would be ambiguous, since there are many
> > > durations
> > > >> that we can configure.
> > > >>
> > > >> However, thinking about it again, I suppose that I can give each
> > > >> configuration method a static version, which would let you replace
> > "new
> > > >> Suppression()." with "Suppression." in all the examples. Basically,
> > > instead
> > > >> of "of()", we'd support any of the methods I listed.
> > > >>
> > > >> For example:
> > > >>
> > > >> windowCounts
> > > >>     .suppress(
> > > >>         Suppression
> > > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > > >>             .suppressIntermediateEvents(
> > > >>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > > >>             )
> > > >>     );
> > > >>
> > > >>
> > > >> Does that seem better?
> > > >>
> > > >> Thanks,
> > > >> -John
> > > >>
> > > >>
> > > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com>
> wrote:
> > > >>
> > > >>> I started to read this KIP which contains a lot of materials.
> > > >>>
> > > >>> One suggestion:
> > > >>>
> > > >>>     .suppress(
> > > >>>         new Suppression()
> > > >>>
> > > >>>
> > > >>> Do you think it would be more consistent with the rest of Streams
> > data
> > > >>> structures by supporting `of` ?
> > > >>>
> > > >>> Suppression.of(Duration.ofMinutes(10))
> > > >>>
> > > >>>
> > > >>> Cheers
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io>
> > > wrote:
> > > >>>
> > > >>>> Hello devs and users,
> > > >>>>
> > > >>>> Please take some time to consider this proposal for Kafka Streams:
> > > >>>>
> > > >>>> KIP-328: Ability to suppress updates for KTables
> > > >>>>
> > > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > > >>>>
> > > >>>> The basic idea is to provide:
> > > >>>> * more usable control over update rate (vs the current state store
> > > >>> caches)
> > > >>>> * the final-result-for-windowed-computations feature which several
> > > >> people
> > > >>>> have requested
> > > >>>>
> > > >>>> I look forward to your feedback!
> > > >>>>
> > > >>>> Thanks,
> > > >>>> -John
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Guozhang Wang <wa...@gmail.com>.

Hello John, thanks for putting up the KIP. I have a meta comment:

We need to clarify the difference between late event suppression semantics
and the window retention semantics that result in a windowed KTable. More
specifically, say you have a window of size 10 min, and `until` 20 min, and
the resulting windowed KTable is suppressed for events later than 5 min. My
understanding is that:

a) for a specific window starting at t0 and ending at t10, if an upstream
record is received with a timestamp falling into [t0, t10) but the current
stream time is already larger than t20, this record will be dropped on the
floor and not be used to update the windowed KTable. Also we record it in the
"skipped-record" metric.
b) as above, if an upstream record is received with a timestamp falling into
[t0, t10), and the current stream time is larger than t15 but smaller than
t20, this record will still be used to update the windowed table, but the
updated result will not be sent downstream, and it is recorded in the
"late-event-suppression" metric.
c) as a result, if we query the windowed KTable's result, we will see
updates up to stream time t20, but from its resulting changelog stream, we
will only see results up to stream time t15.
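
To make cases a) and b) concrete, here is a minimal Java sketch of the
decision as I read it from the description above. LateRecordClassifier and
its Outcome enum are hypothetical names used only for illustration; they are
not part of Kafka Streams or of the KIP, and the handling of the exact
boundary values (stream time equal to t15 or t20) is an assumption.

```java
import java.time.Duration;

/** Hypothetical helper for illustration only; not a Kafka Streams class. */
final class LateRecordClassifier {

    enum Outcome { APPLIED_AND_FORWARDED, APPLIED_BUT_SUPPRESSED, DROPPED }

    private final long windowSizeMs;
    private final long latenessMs;   // late-event suppression bound, measured from window end
    private final long retentionMs;  // `until`, measured from window start

    LateRecordClassifier(Duration windowSize, Duration lateness, Duration retention) {
        this.windowSizeMs = windowSize.toMillis();
        this.latenessMs = lateness.toMillis();
        this.retentionMs = retention.toMillis();
    }

    /** Classifies a record that belongs to the window starting at windowStartMs. */
    Outcome classify(long windowStartMs, long streamTimeMs) {
        long windowEndMs = windowStartMs + windowSizeMs;
        if (streamTimeMs > windowStartMs + retentionMs) {
            // case a): past retention, the record is dropped ("skipped-record")
            return Outcome.DROPPED;
        }
        if (streamTimeMs > windowEndMs + latenessMs) {
            // case b): the store is updated, but nothing is sent downstream
            return Outcome.APPLIED_BUT_SUPPRESSED;
        }
        return Outcome.APPLIED_AND_FORWARDED;
    }
}
```

With a 10 min window, a 5 min lateness bound and `until` of 20 min, a record
for [t0, t10) is forwarded at stream time t12, suppressed at t17 and dropped
at t21.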

If that is correct, I feel it is really complicated for users to
comprehend: why would I want my windowed aggregations to accept late
records up to 20 minutes, while not seeing the resulting updates in the
changelog stream? And the semantic difference between "skipped record due
to window retention" and "late-event-suppression" is quite obscure (btw I
am not sure it is true that "Skipped records are records that are for one
reason or another invalid.", since "skipped record due to window retention
time" is not really due to an invalid record, but to some window store
implementation details, right?)

Thinking about this further, although I understand the intention to propose
a unified API for all three motivating requests, I feel the "Final value
of a window" request may better be handled in a more restricted interface.

So just throwing out a bold / controversial idea for this proposal: instead
of using a unified suppress() for all three motivating scenarios, we have:

1) KTable#suppress() for "Request shaping" and "Easier config", and it will
only be for intermediate-event suppression. In this case, for both
windowed and non-windowed KTables, the suppression semantics can be
dependent on each key's record timestamp plus the byte buffer size limit /
buffer strategy.

2) In TimeWindowedStream / SessionWindowedStream#aggregate(), which result in
a windowed KTable (note that although KGroupedTable#aggregate can also
result in a windowed KTable, its window semantics are not very well defined,
so I'd suggest we defer its discussion until later), we add another config
object, e.g.:

TimeWindowedStream#aggregate(final Initializer<VR> initializer,
                             final Aggregator<? super K, ? super V, VR> aggregator,
                             final Materialized<K, VR, WindowStore<Bytes, byte[]>> materialized,
                             final Emitted emitted);

public class Emitted {

    // latePeriodAllowed is the "late-period-allowed" discussed below
    static Emitted onlyOnceAfterWindowClosed(final long latePeriodAllowed);

    // this may still be subject to caching effects, so not exactly every update
    static Emitted wheneverWindowUpdated();

}

------------

The Emitted config option is going to be much less expressive than
`Suppressed`, intentionally, to only cover the "Final value of a window"
case. Note that the resulting windowed KTable can still be suppressed
programmatically, but if it has already been emitted only once, then the
suppress function will have no effect.

In this case, the difference between "late-period-allowed" and
"Windows.until()" is that the former determines whether or not a record will
be applied to update the window. It is controlled in the
WindowedStreamAggregateProcessor, and whenever an event gets dropped
because of it we record it in a new, say, "too-late-records" metric (the same
as "late-event-suppression" actually, just using a different name). The
latter only controls how long, at minimum, each window will be retained for
queries, and should normally be larger than (window size +
late-period-allowed). From the implementation's pov, if the retention time of
a window is less than (window size + late-period-allowed), the Processor may
not be able to find any matching window when first trying to get it from the
store. It then needs to tell whether that is because the key has never been
updated for this window or because the window retention has elapsed, hence
it needs to be aware of the window retention time. In the latter case,
it will drop the record on the floor and also record it in the
"too-late-records" metric. Also, this emit policy would not need any
buffering, since the original store's cache already contains the record
context needed for flushing downstream.
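
A rough Java sketch of the two drop conditions described above, under
simplifying assumptions: tumbling windows, non-negative timestamps, a plain
in-memory map standing in for the window store, and no eviction modelled.
FinalResultWindowSketch and all of its members are hypothetical names, not
the actual WindowedStreamAggregateProcessor.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch for illustration only; not Kafka Streams code. */
final class FinalResultWindowSketch {
    private final long windowSizeMs;
    private final long latePeriodAllowedMs;
    private final long retentionMs;
    // stand-in for the window store: window start -> record count
    private final Map<Long, Long> countsByWindowStart = new HashMap<>();
    private long tooLateRecords = 0;

    FinalResultWindowSketch(long windowSizeMs, long latePeriodAllowedMs, long retentionMs) {
        this.windowSizeMs = windowSizeMs;
        this.latePeriodAllowedMs = latePeriodAllowedMs;
        this.retentionMs = retentionMs;
    }

    /** Returns true if the record updated its window, false if it was dropped. */
    boolean process(long recordTimestampMs, long streamTimeMs) {
        long windowStart = recordTimestampMs - (recordTimestampMs % windowSizeMs);
        long windowEnd = windowStart + windowSizeMs;

        // Condition 1: the window closed more than late-period-allowed ago.
        if (streamTimeMs > windowEnd + latePeriodAllowedMs) {
            tooLateRecords++;
            return false;
        }
        // Condition 2: no window found, and retention has already elapsed, so
        // the window may have existed and expired rather than never existed.
        Long current = countsByWindowStart.get(windowStart);
        if (current == null && streamTimeMs > windowStart + retentionMs) {
            tooLateRecords++;
            return false;
        }
        countsByWindowStart.put(windowStart, current == null ? 1L : current + 1L);
        return true;
    }

    long tooLateRecords() {
        return tooLateRecords;
    }
}
```

The second condition can only trigger when retention is smaller than
(window size + late-period-allowed), which is why the processor would need to
know the retention time.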

My primary motivation is that, from the user's perspective, this may be
easier to comprehend and reason about from the metrics. But if people think
it actually does not make things better, I'm happy to rethink the current
proposal.

Guozhang

On Wed, Jun 27, 2018 at 12:04 PM, John Roesler <jo...@confluent.io> wrote:

> Thanks for the feedback, Matthias,
>
> It seems like in straightforward relational processing cases, it would not
> make sense to bound the lateness of KTables. In general, it seems better to
> have "guard rails" in place that make it easier to write sensible programs
> than insensible ones.
>
> But I'm still going to argue in favor of keeping it for all KTables ;)
>
> 1. I believe it is simpler to understand the operator if it has one uniform
> definition, regardless of context. It's well defined and intuitive what
> will happen when you use late-event suppression on a KTable, so I think
> nothing surprising or dangerous will happen in that case. From my
> perspective, having two sets of allowed operations is actually an increase
> in cognitive complexity.
>
> 2. To me, it's not crazy to use the operator this way. For example, in lieu
> of full-featured timestamp semantics, I can implement MVCC behavior when
> building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
> there are other, non-obvious applications of suppressing late events on
> KTables.
>
> 3. Not to get too much into implementation details in a KIP discussion, but
> if we did want to make late-event suppression available only on windowed
> KTables, we have two enforcement options:
>   a. check when we build the topology - this would be simple to implement,
> but would be a runtime check. Hopefully, people write tests for their
> topology before deploying them, so the feedback loop isn't instantaneous,
> but it's not too long either.
>   b. add a new WindowedKTable type - this would be a compile time check,
> but would also be substantial increase of both interface and code
> complexity.
>
> We should definitely strive to have guard rails protecting against
> surprising or dangerous behavior. Protecting against programs that we don't
> currently predict is a lesser benefit, and I think we can put up guard
> rails on a case-by-case basis for that. It seems like the increase in
> cognitive (and potentially code and interface) complexity makes me think we
> should skip this case.
>
> What do you think?
>
> Thanks,
> -John
>
> On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <ma...@confluent.io>
> wrote:
>
> > Thanks for the KIP John.
> >
> > One initial comments about the last example "Bounded lateness": For a
> > non-windowed KTable bounding the lateness does not really make sense,
> > does it?
> >
> > Thus, I am wondering if we should allow `suppressLateEvents()` for this
> > case? It seems to be better to only allow it for windowed-KTables.
> >
> >
> > -Matthias
> >
> >
> > On 6/27/18 8:53 AM, Ted Yu wrote:
> > > I noticed this (lack of primary parameter) as well.
> > >
> > > What you gave as new example is semantically the same as what I
> > suggested.
> > > So it is good by me.
> > >
> > > Thanks
> > >
> > > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io>
> wrote:
> > >
> > >> Thanks for taking look, Ted,
> > >>
> > >> I agree this is a departure from the conventions of Streams DSL.
> > >>
> > >> Most of our config objects have one or two "required" parameters,
> which
> > fit
> > >> naturally with the static factory method approach. TimeWindow, for
> > example,
> > >> requires a size parameter, so we can naturally say
> TimeWindows.of(size).
> > >>
> > >> I think in the case of a suppression, there's really no "core"
> > parameter,
> > >> and "Suppression.of()" seems sillier than "new Suppression()". I think
> > that
> > >> Suppression.of(duration) would be ambiguous, since there are many
> > durations
> > >> that we can configure.
> > >>
> > >> However, thinking about it again, I suppose that I can give each
> > >> configuration method a static version, which would let you replace
> "new
> > >> Suppression()." with "Suppression." in all the examples. Basically,
> > instead
> > >> of "of()", we'd support any of the methods I listed.
> > >>
> > >> For example:
> > >>
> > >> windowCounts
> > >>     .suppress(
> > >>         Suppression
> > >>             .suppressLateEvents(Duration.ofMinutes(10))
> > >>             .suppressIntermediateEvents(
> > >>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> > >>             )
> > >>     );
> > >>
> > >>
> > >> Does that seem better?
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >>
> > >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
> > >>
> > >>> I started to read this KIP which contains a lot of materials.
> > >>>
> > >>> One suggestion:
> > >>>
> > >>>     .suppress(
> > >>>         new Suppression()
> > >>>
> > >>>
> > >>> Do you think it would be more consistent with the rest of Streams
> data
> > >>> structures by supporting `of` ?
> > >>>
> > >>> Suppression.of(Duration.ofMinutes(10))
> > >>>
> > >>>
> > >>> Cheers
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io>
> > wrote:
> > >>>
> > >>>> Hello devs and users,
> > >>>>
> > >>>> Please take some time to consider this proposal for Kafka Streams:
> > >>>>
> > >>>> KIP-328: Ability to suppress updates for KTables
> > >>>>
> > >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >>>>
> > >>>> The basic idea is to provide:
> > >>>> * more usable control over update rate (vs the current state store
> > >>> caches)
> > >>>> * the final-result-for-windowed-computations feature which several
> > >> people
> > >>>> have requested
> > >>>>
> > >>>> I look forward to your feedback!
> > >>>>
> > >>>> Thanks,
> > >>>> -John
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>

-- 
-- Guozhang

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Thanks for the feedback, Matthias,

It seems like in straightforward relational processing cases, it would not
make sense to bound the lateness of KTables. In general, it seems better to
have "guard rails" in place that make it easier to write sensible programs
than insensible ones.

But I'm still going to argue in favor of keeping it for all KTables ;)

1. I believe it is simpler to understand the operator if it has one uniform
definition, regardless of context. It's well defined and intuitive what
will happen when you use late-event suppression on a KTable, so I think
nothing surprising or dangerous will happen in that case. From my
perspective, having two sets of allowed operations is actually an increase
in cognitive complexity.

2. To me, it's not crazy to use the operator this way. For example, in lieu
of full-featured timestamp semantics, I can implement MVCC behavior when
building a KTable by "suppressLateEvents(Duration.ZERO)". I suspect that
there are other, non-obvious applications of suppressing late events on
KTables.

3. Not to get too much into implementation details in a KIP discussion, but
if we did want to make late-event suppression available only on windowed
KTables, we have two enforcement options:
  a. check when we build the topology - this would be simple to implement,
but would be a runtime check. Hopefully, people write tests for their
topologies before deploying them, so the feedback loop isn't instantaneous,
but it's not too long either.
  b. add a new WindowedKTable type - this would be a compile time check,
but would also be a substantial increase in both interface and code
complexity.
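
To illustrate point 2, here is a minimal Java sketch of the "latest
timestamp wins" behavior I have in mind for a zero lateness bound. It
assumes lateness is measured per key against the highest timestamp seen so
far for that key, which is my reading and not something the KIP text pins
down here. LatestWinsTable is a hypothetical name, not a Streams class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch for illustration only; not Kafka Streams code. */
final class LatestWinsTable<K, V> {
    private final Map<K, Long> latestTimestampMs = new HashMap<>();
    private final Map<K, V> values = new HashMap<>();

    /** Applies the update unless it is older than what was already seen for the key. */
    boolean update(K key, V value, long timestampMs) {
        Long seen = latestTimestampMs.get(key);
        if (seen != null && timestampMs < seen) {
            return false; // late by more than zero: suppressed
        }
        latestTimestampMs.put(key, timestampMs);
        values.put(key, value);
        return true;
    }

    V get(K key) {
        return values.get(key);
    }
}
```

An update with an older timestamp than the current one for the same key is
ignored, so the table never moves "backwards" in version order.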

We should definitely strive to have guard rails protecting against
surprising or dangerous behavior. Protecting against programs that we don't
currently predict is a lesser benefit, and I think we can put up guard
rails on a case-by-case basis for that. The increase in cognitive (and
potentially code and interface) complexity makes me think we should skip
this case.

What do you think?

Thanks,
-John

On Wed, Jun 27, 2018 at 11:59 AM Matthias J. Sax <ma...@confluent.io>
wrote:

> Thanks for the KIP John.
>
> One initial comments about the last example "Bounded lateness": For a
> non-windowed KTable bounding the lateness does not really make sense,
> does it?
>
> Thus, I am wondering if we should allow `suppressLateEvents()` for this
> case? It seems to be better to only allow it for windowed-KTables.
>
>
> -Matthias
>
>
> On 6/27/18 8:53 AM, Ted Yu wrote:
> > I noticed this (lack of primary parameter) as well.
> >
> > What you gave as new example is semantically the same as what I
> suggested.
> > So it is good by me.
> >
> > Thanks
> >
> > On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io> wrote:
> >
> >> Thanks for taking look, Ted,
> >>
> >> I agree this is a departure from the conventions of Streams DSL.
> >>
> >> Most of our config objects have one or two "required" parameters, which
> fit
> >> naturally with the static factory method approach. TimeWindow, for
> example,
> >> requires a size parameter, so we can naturally say TimeWindows.of(size).
> >>
> >> I think in the case of a suppression, there's really no "core"
> parameter,
> >> and "Suppression.of()" seems sillier than "new Suppression()". I think
> that
> >> Suppression.of(duration) would be ambiguous, since there are many
> durations
> >> that we can configure.
> >>
> >> However, thinking about it again, I suppose that I can give each
> >> configuration method a static version, which would let you replace "new
> >> Suppression()." with "Suppression." in all the examples. Basically,
> instead
> >> of "of()", we'd support any of the methods I listed.
> >>
> >> For example:
> >>
> >> windowCounts
> >>     .suppress(
> >>         Suppression
> >>             .suppressLateEvents(Duration.ofMinutes(10))
> >>             .suppressIntermediateEvents(
> >>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> >>             )
> >>     );
> >>
> >>
> >> Does that seem better?
> >>
> >> Thanks,
> >> -John
> >>
> >>
> >> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
> >>
> >>> I started to read this KIP which contains a lot of materials.
> >>>
> >>> One suggestion:
> >>>
> >>>     .suppress(
> >>>         new Suppression()
> >>>
> >>>
> >>> Do you think it would be more consistent with the rest of Streams data
> >>> structures by supporting `of` ?
> >>>
> >>> Suppression.of(Duration.ofMinutes(10))
> >>>
> >>>
> >>> Cheers
> >>>
> >>>
> >>>
> >>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io>
> wrote:
> >>>
> >>>> Hello devs and users,
> >>>>
> >>>> Please take some time to consider this proposal for Kafka Streams:
> >>>>
> >>>> KIP-328: Ability to suppress updates for KTables
> >>>>
> >>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
> >>>>
> >>>> The basic idea is to provide:
> >>>> * more usable control over update rate (vs the current state store
> >>> caches)
> >>>> * the final-result-for-windowed-computations feature which several
> >> people
> >>>> have requested
> >>>>
> >>>> I look forward to your feedback!
> >>>>
> >>>> Thanks,
> >>>> -John
> >>>>
> >>>
> >>
> >
>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by "Matthias J. Sax" <ma...@confluent.io>.

Thanks for the KIP John.

One initial comment about the last example "Bounded lateness": For a
non-windowed KTable bounding the lateness does not really make sense,
does it?

Thus, I am wondering if we should allow `suppressLateEvents()` for this
case? It seems to be better to only allow it for windowed-KTables.


-Matthias


On 6/27/18 8:53 AM, Ted Yu wrote:
> I noticed this (lack of primary parameter) as well.
> 
> What you gave as new example is semantically the same as what I suggested.
> So it is good by me.
> 
> Thanks
> 
> On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io> wrote:
> 
>> Thanks for taking look, Ted,
>>
>> I agree this is a departure from the conventions of Streams DSL.
>>
>> Most of our config objects have one or two "required" parameters, which fit
>> naturally with the static factory method approach. TimeWindow, for example,
>> requires a size parameter, so we can naturally say TimeWindows.of(size).
>>
>> I think in the case of a suppression, there's really no "core" parameter,
>> and "Suppression.of()" seems sillier than "new Suppression()". I think that
>> Suppression.of(duration) would be ambiguous, since there are many durations
>> that we can configure.
>>
>> However, thinking about it again, I suppose that I can give each
>> configuration method a static version, which would let you replace "new
>> Suppression()." with "Suppression." in all the examples. Basically, instead
>> of "of()", we'd support any of the methods I listed.
>>
>> For example:
>>
>> windowCounts
>>     .suppress(
>>         Suppression
>>             .suppressLateEvents(Duration.ofMinutes(10))
>>             .suppressIntermediateEvents(
>>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>>             )
>>     );
>>
>>
>> Does that seem better?
>>
>> Thanks,
>> -John
>>
>>
>> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
>>
>>> I started to read this KIP which contains a lot of materials.
>>>
>>> One suggestion:
>>>
>>>     .suppress(
>>>         new Suppression()
>>>
>>>
>>> Do you think it would be more consistent with the rest of Streams data
>>> structures by supporting `of` ?
>>>
>>> Suppression.of(Duration.ofMinutes(10))
>>>
>>>
>>> Cheers
>>>
>>>
>>>
>>> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io> wrote:
>>>
>>>> Hello devs and users,
>>>>
>>>> Please take some time to consider this proposal for Kafka Streams:
>>>>
>>>> KIP-328: Ability to suppress updates for KTables
>>>>
>>>> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>>>>
>>>> The basic idea is to provide:
>>>> * more usable control over update rate (vs the current state store
>>> caches)
>>>> * the final-result-for-windowed-computations feature which several
>> people
>>>> have requested
>>>>
>>>> I look forward to your feedback!
>>>>
>>>> Thanks,
>>>> -John
>>>>
>>>
>>
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Ted Yu <yu...@gmail.com>.

I noticed this (lack of primary parameter) as well.

What you gave as new example is semantically the same as what I suggested.
So it is good by me.

Thanks

On Wed, Jun 27, 2018 at 7:31 AM, John Roesler <jo...@confluent.io> wrote:

> Thanks for taking look, Ted,
>
> I agree this is a departure from the conventions of Streams DSL.
>
> Most of our config objects have one or two "required" parameters, which fit
> naturally with the static factory method approach. TimeWindow, for example,
> requires a size parameter, so we can naturally say TimeWindows.of(size).
>
> I think in the case of a suppression, there's really no "core" parameter,
> and "Suppression.of()" seems sillier than "new Suppression()". I think that
> Suppression.of(duration) would be ambiguous, since there are many durations
> that we can configure.
>
> However, thinking about it again, I suppose that I can give each
> configuration method a static version, which would let you replace "new
> Suppression()." with "Suppression." in all the examples. Basically, instead
> of "of()", we'd support any of the methods I listed.
>
> For example:
>
> windowCounts
>     .suppress(
>         Suppression
>             .suppressLateEvents(Duration.ofMinutes(10))
>             .suppressIntermediateEvents(
>                 IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
>             )
>     );
>
>
> Does that seem better?
>
> Thanks,
> -John
>
>
> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:
>
> > I started to read this KIP, which contains a lot of material.
> >
> > One suggestion:
> >
> >     .suppress(
> >         new Suppression()
> >
> > Do you think it would be more consistent with the rest of the Streams data
> > structures to support `of`?
> >
> > Suppression.of(Duration.ofMinutes(10))
> >
> > Cheers
> >
> > On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io> wrote:
> >
> > > Hello devs and users,
> > >
> > > Please take some time to consider this proposal for Kafka Streams:
> > >
> > > KIP-328: Ability to suppress updates for KTables
> > >
> > > link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >
> > > The basic idea is to provide:
> > > * more usable control over update rate (vs the current state store caches)
> > > * the final-result-for-windowed-computations feature which several people
> > > have requested
> > >
> > > I look forward to your feedback!
> > >
> > > Thanks,
> > > -John
> > >
> >
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.

Thanks for taking a look, Ted,

I agree this is a departure from the conventions of the Streams DSL.

Most of our config objects have one or two "required" parameters, which fit
naturally with the static factory method approach. TimeWindows, for example,
requires a size parameter, so we can naturally say TimeWindows.of(size).

I think in the case of a suppression, there's really no "core" parameter,
and "Suppression.of()" seems sillier than "new Suppression()". I think that
Suppression.of(duration) would be ambiguous, since there are many durations
that we can configure.

However, thinking about it again, I suppose that I can give each
configuration method a static version, which would let you replace "new
Suppression()." with "Suppression." in all the examples. Basically, instead
of "of()", we'd support any of the methods I listed.

For example:

windowCounts
    .suppress(
        Suppression
            .suppressLateEvents(Duration.ofMinutes(10))
            .suppressIntermediateEvents(
                IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
            )
    );
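
A minimal sketch of how the static variants could be wired up, under some
assumptions. Java does not allow a static method and an instance method with
the same signature in one class, so in this sketch the static entry points on
Suppression hand off to a hypothetical nested type, here called Config, which
carries the chained instance methods. The Config type, the field names, and
the reduced IntermediateSuppression stand-in are illustrative only and are not
taken from the KIP.

```java
import java.time.Duration;

// Hypothetical stand-in for the intermediate-suppression config object.
final class IntermediateSuppression {
    final Duration delay;

    private IntermediateSuppression(Duration delay) {
        this.delay = delay;
    }

    static IntermediateSuppression emitAfter(Duration delay) {
        return new IntermediateSuppression(delay);
    }
}

// Hypothetical sketch: static entry points replace "new Suppression()".
final class Suppression {
    private Suppression() {}

    // Each configuration method gets a static version that starts a chain.
    static Config suppressLateEvents(Duration grace) {
        return new Config(null, null).suppressLateEvents(grace);
    }

    static Config suppressIntermediateEvents(IntermediateSuppression intermediate) {
        return new Config(null, null).suppressIntermediateEvents(intermediate);
    }

    // Immutable config; every chained call returns a new copy.
    static final class Config {
        final Duration lateEventGrace;               // null means "not configured"
        final IntermediateSuppression intermediate;  // null means "not configured"

        private Config(Duration lateEventGrace, IntermediateSuppression intermediate) {
            this.lateEventGrace = lateEventGrace;
            this.intermediate = intermediate;
        }

        Config suppressLateEvents(Duration grace) {
            return new Config(grace, intermediate);
        }

        Config suppressIntermediateEvents(IntermediateSuppression newIntermediate) {
            return new Config(lateEventGrace, newIntermediate);
        }
    }
}

class SuppressionDemo {
    public static void main(String[] args) {
        // Mirrors the example above, minus the windowCounts.suppress(...) call.
        Suppression.Config config = Suppression
            .suppressLateEvents(Duration.ofMinutes(10))
            .suppressIntermediateEvents(
                IntermediateSuppression.emitAfter(Duration.ofMinutes(10)));
        System.out.println(config.lateEventGrace);
        System.out.println(config.intermediate.delay);
    }
}
```

Either static method can start the chain, so no single parameter has to be
treated as the "core" one.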

Does that seem better?

Thanks,
-John

On Wed, Jun 27, 2018 at 12:44 AM Ted Yu <yu...@gmail.com> wrote:

> I started to read this KIP, which contains a lot of material.
>
> One suggestion:
>
>     .suppress(
>         new Suppression()
>
> Do you think it would be more consistent with the rest of the Streams data
> structures to support `of`?
>
> Suppression.of(Duration.ofMinutes(10))
>
> Cheers
>
> On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io> wrote:
>
> > Hello devs and users,
> >
> > Please take some time to consider this proposal for Kafka Streams:
> >
> > KIP-328: Ability to suppress updates for KTables
> >
> > link: https://cwiki.apache.org/confluence/x/sQU0BQ
> >
> > The basic idea is to provide:
> > * more usable control over update rate (vs the current state store caches)
> > * the final-result-for-windowed-computations feature which several people
> > have requested
> >
> > I look forward to your feedback!
> >
> > Thanks,
> > -John
> >
>

Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

Posted by Ted Yu <yu...@gmail.com>.

I started to read this KIP, which contains a lot of material.

One suggestion:

    .suppress(
        new Suppression()

Do you think it would be more consistent with the rest of the Streams data
structures to support `of`?

Suppression.of(Duration.ofMinutes(10))

Cheers

On Tue, Jun 26, 2018 at 1:11 PM, John Roesler <jo...@confluent.io> wrote:

> Hello devs and users,
>
> Please take some time to consider this proposal for Kafka Streams:
>
> KIP-328: Ability to suppress updates for KTables
>
> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>
> The basic idea is to provide:
> * more usable control over update rate (vs the current state store caches)
> * the final-result-for-windowed-computations feature which several people
> have requested
>
> I look forward to your feedback!
>
> Thanks,
> -John
>
