You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by John Roesler <jo...@confluent.io> on 2018/10/04 16:12:54 UTC

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

Hello again, all,

During review, we realized that there is a relationship between this
(KIP-328) and KIP-372.

KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds a
new internal topic (the changelog for the suppression buffer).

However, we didn't consider this relationship in either KIP discussion,
possibly since they were discussed and accepted concurrently.

I have updated KIP-328 to effectively "merge" the two KIPs by adding a
`withName` builder to Suppressed in the style of the other builders added
in KIP-372:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
.

I think this should be uncontroversial, but as always, let me know of any
objections you may have.


Also, note that I'll be updating the KIP to document the exact buffer
eviction behavior. I previously treated this as an internal implementation
detail, but after consideration, I think users would want to know the
eviction semantics, especially if they are debugging their applications and
scrutinizing the sequence of emitted records.

Thanks,
-John

On Thu, Sep 20, 2018 at 5:34 PM John Roesler <jo...@confluent.io> wrote:

> Hello all,
>
> During review of https://github.com/apache/kafka/pull/5567 for KIP-328,
> the reviewers raised many good suggestions for the API.
>
> The basic design of the suppress operation remains the same, but the
> config object is (in my opinion) far more ergonomic with their suggestions.
>
> I have updated the KIP to reflect the new config (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
> )
>
> Please let me know if anyone wishes to change their vote, and we call for
> a recast.
>
> Thanks,
> -John
>
> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <ma...@confluent.io>
> wrote:
>
>> It seems nobody has any objections against the change.
>>
>> That's for the KIP improvement. I'll go ahead and merge the PR.
>>
>>
>> -Matthias
>>
>> On 8/21/18 2:44 PM, John Roesler wrote:
>> > Hello again, all,
>> >
>> > I belatedly had a better idea for adding grace period to the Windows
>> class
>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
>> > providing the grace-setter in the abstract class and having to retract
>> it
>> > in UnlimitedWindows, I've made the getter abstract method in Windows and
>> > only added setters to Time and Join windows.
>> >
>> > This should not only improve the ergonomics of grace period, but make
>> the
>> > whole class hierarchy more maintainable.
>> >
>> > See the PR for more details: https://github.com/apache/kafka/pull/5536
>> >
>> > I've updated the KIP accordingly. Here's the diff:
>> >
>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
>> >
>> > Please let me know if this changes your vote.
>> >
>> > Thanks,
>> > -John
>> >
>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <jo...@confluent.io> wrote:
>> >
>> >> Hey all,
>> >>
>> >> I just wanted to let you know that a few small issues surfaced during
>> >> implementation and review. I've updated the KIP. Here's the diff:
>> >>
>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
>> >>
>> >> Basically:
>> >> * the metrics named "*-event-*" are inconsistent with existing
>> >> nomenclature, and will be "*-record-*" instead (late records instead of
>> >> late events, for example)
>> >> * the apis taking and returning Duration will use long millis instead.
>> We
>> >> do want to transition to Duration in the future, but we shouldn't do it
>> >> piecemeal.
>> >>
>> >> Thanks,
>> >> -John
>> >>
>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <jo...@confluent.io>
>> wrote:
>> >>
>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang,
>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
>> >>>
>> >>> Thanks for your time,
>> >>> -John
>> >>>
>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <matthias@confluent.io
>> >
>> >>> wrote:
>> >>>
>> >>>> +1 (binding)
>> >>>>
>> >>>> Thanks for the KIP.
>> >>>>
>> >>>>
>> >>>> -Matthias
>> >>>>
>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
>> >>>>> Thanks John! +1
>> >>>>>
>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wa...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>>> Yes, the addendum lgtm as well. Thanks!
>> >>>>>>
>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <jo...@confluent.io>
>> >>>> wrote:
>> >>>>>>
>> >>>>>>> Another thing that came up after I started working on an
>> >>>> implementation
>> >>>>>> is
>> >>>>>>> that in addition to deprecating "retention" from the Windows
>> >>>> interface,
>> >>>>>> we
>> >>>>>>> also need to deprecate "segmentInterval", for the same reasons. I
>> >>>> simply
>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
>> >>>>>>>
>> >>>>>>> Hopefully, this doesn't change anyone's vote.
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> -John
>> >>>>>>>
>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <jo...@confluent.io>
>> >>>> wrote:
>> >>>>>>>
>> >>>>>>>> Thanks Guozhang,
>> >>>>>>>>
>> >>>>>>>> Thanks for that catch. to clarify, currently, events are "late"
>> only
>> >>>>>> when
>> >>>>>>>> they are older than the retention period. Currently, we detect
>> this
>> >>>> in
>> >>>>>>> the
>> >>>>>>>> processor and record it as a "skipped-record". We then do not
>> >>>> attempt
>> >>>>>> to
>> >>>>>>>> store the event in the window store. If a user provided a
>> >>>>>> pre-configured
>> >>>>>>>> window store with a retention period smaller than the one they
>> >>>> specify
>> >>>>>>> via
>> >>>>>>>> Windows#until, the segmented store will drop the update with no
>> >>>> metric
>> >>>>>>> and
>> >>>>>>>> record a debug-level log.
>> >>>>>>>>
>> >>>>>>>> With KIP-328, with the introduction of "grace period" and moving
>> >>>>>>> retention
>> >>>>>>>> fully into the state store, we need to have metrics for both
>> "late
>> >>>>>>> events"
>> >>>>>>>> (new records older than the grace period) and "expired window
>> >>>> events"
>> >>>>>>> (new
>> >>>>>>>> records for windows that are no longer retained in the state
>> >>>> store). I
>> >>>>>>>> already proposed metrics for the late events, and I've just
>> updated
>> >>>> the
>> >>>>>>> KIP
>> >>>>>>>> with metrics for the expired window events. I also updated the
>> KIP
>> >>>> to
>> >>>>>>> make
>> >>>>>>>> it clear that neither late nor expired events will count as
>> >>>>>>>> "skipped-records" any more.
>> >>>>>>>>
>> >>>>>>>> -John
>> >>>>>>>>
>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
>> wangguoz@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi John,
>> >>>>>>>>>
>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
>> suggestion:
>> >>>>>>>>>
>> >>>>>>>>> Following your suggestion of the differentiation of
>> >>>> `skipped-records`
>> >>>>>>> v.s.
>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
>> scenarios
>> >>>>>>> where
>> >>>>>>>>> records got ignored due the window not being available any more
>> in
>> >>>>>>>>> windowed
>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
>> recording
>> >>>> to
>> >>>>>>> the
>> >>>>>>>>> `late-event-drop` metrics recording.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Guozhang
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <bbejeck@gmail.com
>> >
>> >>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Thanks for the KIP!
>> >>>>>>>>>>
>> >>>>>>>>>> +1
>> >>>>>>>>>>
>> >>>>>>>>>> -Bill
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <yu...@gmail.com>
>> >>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> +1
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
>> john@confluent.io
>> >>>>>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hello devs,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new
>> >>>>>> comments,
>> >>>>>>>>> so I
>> >>>>>>>>>>> am
>> >>>>>>>>>>>> calling for a vote!
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/sQU0BQ
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The basic idea is to provide:
>> >>>>>>>>>>>> * more usable control over update rate (vs the current state
>> >>>>>> store
>> >>>>>>>>>>> caches)
>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature which
>> >>>>>>> several
>> >>>>>>>>>> people
>> >>>>>>>>>>>> have requested
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>> -John
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> -- Guozhang
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> -- Guozhang
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >
>>
>>

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.
Hi again, all,

Casey Green pointed out that we overlooked the scala api when we added
Suppress. He was kind enough to send
https://github.com/apache/kafka/pull/6314 to correct this, and we also
updated the KIP. Since it's essentially just copying the existing Java API
over to Scala, we didn't create a new KIP.

Note that we don't plan to treat this as a bug, and therefore don't
currently plan to backport the Scala Suppress API to 2.1 or 2.2.

Thanks,
-John

On Fri, Nov 16, 2018 at 3:54 PM Bill Bejeck <bb...@gmail.com> wrote:

> Hi John,
>
> Thanks for the update, I'm +1 on changes and my +1 vote stands.
>
> -Bill
>
> On Fri, Nov 16, 2018 at 4:19 PM John Roesler <jo...@confluent.io> wrote:
>
> > Hi all, sorry to do this again, but during review of the code to add the
> > metrics proposed in this KIP, the reviewers and I noticed some
> > inconsistencies and drawbacks of the metrics I proposed in the KIP.
> >
> > Here's the diff:
> >
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=24&selectedPageVersions=23
> >
> > * The proposed metrics were all INFO level, but they would be updated on
> > every record, creating a performance concern. If we can refactor the
> > metrics framework in the future to resolve this concern, we may move the
> > metrics back to INFO level.
> > * having separate metrics for memory and disk buffers is unnecessarily
> > complex. The main utility is determining how close the buffer is to the
> > configured limit, which makes a single metric more useful. I've combined
> > them into one "suppression-buffer-size-*" metric.
> > * The "intermediate-result-suppression-*" metric would be equivalent to
> the
> > "process-*" metric which is already available on the ProcessorNode. I've
> > removed it from the KIP.
> > * The "suppression-mem-buffer-evict-*" metric had been proposed as a
> buffer
> > metric, but it makes more sense as a processor node metric, since its
> > counterpart is the "process-*" metric. I've replaced it with a processor
> > node metric, "suppression-emit-*"
> >
> > Let me know if you want to recast votes in response to this change.
> >
> > -John
> >
> > On Thu, Oct 4, 2018 at 11:26 AM John Roesler <jo...@confluent.io> wrote:
> >
> > > Update: Here's a link to the documented eviction behavior:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)
> > >
> > > On Thu, Oct 4, 2018 at 11:12 AM John Roesler <jo...@confluent.io>
> wrote:
> > >
> > >> Hello again, all,
> > >>
> > >> During review, we realized that there is a relationship between this
> > >> (KIP-328) and KIP-372.
> > >>
> > >> KIP-372 proposed to allow naming *all* internal topics, and KIP-328
> adds
> > >> a new internal topic (the changelog for the suppression buffer).
> > >>
> > >> However, we didn't consider this relationship in either KIP
> discussion,
> > >> possibly since they were discussed and accepted concurrently.
> > >>
> > >> I have updated KIP-328 to effectively "merge" the two KIPs by adding a
> > >> `withName` builder to Suppressed in the style of the other builders
> > added
> > >> in KIP-372:
> > >>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
> > >> .
> > >>
> > >> I think this should be uncontroversial, but as always, let me know of
> > any
> > >> objections you may have.
> > >>
> > >>
> > >> Also, note that I'll be updating the KIP to document the exact buffer
> > >> eviction behavior. I previously treated this as an internal
> > implementation
> > >> detail, but after consideration, I think users would want to know the
> > >> eviction semantics, especially if they are debugging their
> applications
> > and
> > >> scrutinizing the sequence of emitted records.
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <jo...@confluent.io>
> wrote:
> > >>
> > >>> Hello all,
> > >>>
> > >>> During review of https://github.com/apache/kafka/pull/5567 for
> > KIP-328,
> > >>> the reviewers raised many good suggestions for the API.
> > >>>
> > >>> The basic design of the suppress operation remains the same, but the
> > >>> config object is (in my opinion) far more ergonomic with their
> > >>> suggestions.
> > >>>
> > >>> I have updated the KIP to reflect the new config (
> > >>>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
> > >>> )
> > >>>
> > >>> Please let me know if anyone wishes to change their vote, and we call
> > >>> for a recast.
> > >>>
> > >>> Thanks,
> > >>> -John
> > >>>
> > >>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <
> > matthias@confluent.io>
> > >>> wrote:
> > >>>
> > >>>> It seems nobody has any objections against the change.
> > >>>>
> > >>>> That's for the KIP improvement. I'll go ahead and merge the PR.
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 8/21/18 2:44 PM, John Roesler wrote:
> > >>>> > Hello again, all,
> > >>>> >
> > >>>> > I belatedly had a better idea for adding grace period to the
> Windows
> > >>>> class
> > >>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
> > >>>> > providing the grace-setter in the abstract class and having to
> > >>>> retract it
> > >>>> > in UnlimitedWindows, I've made the getter abstract method in
> Windows
> > >>>> and
> > >>>> > only added setters to Time and Join windows.
> > >>>> >
> > >>>> > This should not only improve the ergonomics of grace period, but
> > make
> > >>>> the
> > >>>> > whole class hierarchy more maintainable.
> > >>>> >
> > >>>> > See the PR for more details:
> > >>>> https://github.com/apache/kafka/pull/5536
> > >>>> >
> > >>>> > I've updated the KIP accordingly. Here's the diff:
> > >>>> >
> > >>>>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
> > >>>> >
> > >>>> > Please let me know if this changes your vote.
> > >>>> >
> > >>>> > Thanks,
> > >>>> > -John
> > >>>> >
> > >>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <jo...@confluent.io>
> > >>>> wrote:
> > >>>> >
> > >>>> >> Hey all,
> > >>>> >>
> > >>>> >> I just wanted to let you know that a few small issues surfaced
> > during
> > >>>> >> implementation and review. I've updated the KIP. Here's the diff:
> > >>>> >>
> > >>>>
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
> > >>>> >>
> > >>>> >> Basically:
> > >>>> >> * the metrics named "*-event-*" are inconsistent with existing
> > >>>> >> nomenclature, and will be "*-record-*" instead (late records
> > instead
> > >>>> of
> > >>>> >> late events, for example)
> > >>>> >> * the apis taking and returning Duration will use long millis
> > >>>> instead. We
> > >>>> >> do want to transition to Duration in the future, but we shouldn't
> > do
> > >>>> it
> > >>>> >> piecemeal.
> > >>>> >>
> > >>>> >> Thanks,
> > >>>> >> -John
> > >>>> >>
> > >>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <jo...@confluent.io>
> > >>>> wrote:
> > >>>> >>
> > >>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes
> > (Guozhang,
> > >>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
> > >>>> >>>
> > >>>> >>> Thanks for your time,
> > >>>> >>> -John
> > >>>> >>>
> > >>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <
> > >>>> matthias@confluent.io>
> > >>>> >>> wrote:
> > >>>> >>>
> > >>>> >>>> +1 (binding)
> > >>>> >>>>
> > >>>> >>>> Thanks for the KIP.
> > >>>> >>>>
> > >>>> >>>>
> > >>>> >>>> -Matthias
> > >>>> >>>>
> > >>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
> > >>>> >>>>> Thanks John! +1
> > >>>> >>>>>
> > >>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <
> wangguoz@gmail.com>
> > >>>> wrote:
> > >>>> >>>>>
> > >>>> >>>>>> Yes, the addendum lgtm as well. Thanks!
> > >>>> >>>>>>
> > >>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <
> > john@confluent.io
> > >>>> >
> > >>>> >>>> wrote:
> > >>>> >>>>>>
> > >>>> >>>>>>> Another thing that came up after I started working on an
> > >>>> >>>> implementation
> > >>>> >>>>>> is
> > >>>> >>>>>>> that in addition to deprecating "retention" from the Windows
> > >>>> >>>> interface,
> > >>>> >>>>>> we
> > >>>> >>>>>>> also need to deprecate "segmentInterval", for the same
> > reasons.
> > >>>> I
> > >>>> >>>> simply
> > >>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
> > >>>> >>>>>>>
> > >>>> >>>>>>> Hopefully, this doesn't change anyone's vote.
> > >>>> >>>>>>>
> > >>>> >>>>>>> Thanks,
> > >>>> >>>>>>> -John
> > >>>> >>>>>>>
> > >>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <
> > john@confluent.io
> > >>>> >
> > >>>> >>>> wrote:
> > >>>> >>>>>>>
> > >>>> >>>>>>>> Thanks Guozhang,
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are
> > >>>> "late" only
> > >>>> >>>>>> when
> > >>>> >>>>>>>> they are older than the retention period. Currently, we
> > detect
> > >>>> this
> > >>>> >>>> in
> > >>>> >>>>>>> the
> > >>>> >>>>>>>> processor and record it as a "skipped-record". We then do
> not
> > >>>> >>>> attempt
> > >>>> >>>>>> to
> > >>>> >>>>>>>> store the event in the window store. If a user provided a
> > >>>> >>>>>> pre-configured
> > >>>> >>>>>>>> window store with a retention period smaller than the one
> > they
> > >>>> >>>> specify
> > >>>> >>>>>>> via
> > >>>> >>>>>>>> Windows#until, the segmented store will drop the update
> with
> > no
> > >>>> >>>> metric
> > >>>> >>>>>>> and
> > >>>> >>>>>>>> record a debug-level log.
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and
> > >>>> moving
> > >>>> >>>>>>> retention
> > >>>> >>>>>>>> fully into the state store, we need to have metrics for
> both
> > >>>> "late
> > >>>> >>>>>>> events"
> > >>>> >>>>>>>> (new records older than the grace period) and "expired
> window
> > >>>> >>>> events"
> > >>>> >>>>>>> (new
> > >>>> >>>>>>>> records for windows that are no longer retained in the
> state
> > >>>> >>>> store). I
> > >>>> >>>>>>>> already proposed metrics for the late events, and I've just
> > >>>> updated
> > >>>> >>>> the
> > >>>> >>>>>>> KIP
> > >>>> >>>>>>>> with metrics for the expired window events. I also updated
> > the
> > >>>> KIP
> > >>>> >>>> to
> > >>>> >>>>>>> make
> > >>>> >>>>>>>> it clear that neither late nor expired events will count as
> > >>>> >>>>>>>> "skipped-records" any more.
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> -John
> > >>>> >>>>>>>>
> > >>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
> > >>>> wangguoz@gmail.com>
> > >>>> >>>>>>> wrote:
> > >>>> >>>>>>>>
> > >>>> >>>>>>>>> Hi John,
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
> > >>>> suggestion:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Following your suggestion of the differentiation of
> > >>>> >>>> `skipped-records`
> > >>>> >>>>>>> v.s.
> > >>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
> > >>>> scenarios
> > >>>> >>>>>>> where
> > >>>> >>>>>>>>> records got ignored due the window not being available any
> > >>>> more in
> > >>>> >>>>>>>>> windowed
> > >>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
> > >>>> recording
> > >>>> >>>> to
> > >>>> >>>>>>> the
> > >>>> >>>>>>>>> `late-event-drop` metrics recording.
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Guozhang
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <
> > >>>> bbejeck@gmail.com>
> > >>>> >>>>>> wrote:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>> Thanks for the KIP!
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> +1
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> -Bill
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <
> > yuzhihong@gmail.com>
> > >>>> >>>> wrote:
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>>> +1
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
> > >>>> john@confluent.io
> > >>>> >>>>>
> > >>>> >>>>>>>>> wrote:
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Hello devs,
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no
> new
> > >>>> >>>>>> comments,
> > >>>> >>>>>>>>> so I
> > >>>> >>>>>>>>>>> am
> > >>>> >>>>>>>>>>>> calling for a vote!
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Here's the KIP:
> > >>>> https://cwiki.apache.org/confluence/x/sQU0BQ
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> The basic idea is to provide:
> > >>>> >>>>>>>>>>>> * more usable control over update rate (vs the current
> > >>>> state
> > >>>> >>>>>> store
> > >>>> >>>>>>>>>>> caches)
> > >>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature
> > which
> > >>>> >>>>>>> several
> > >>>> >>>>>>>>>> people
> > >>>> >>>>>>>>>>>> have requested
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>> Thanks,
> > >>>> >>>>>>>>>>>> -John
> > >>>> >>>>>>>>>>>>
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> --
> > >>>> >>>>>>>>> -- Guozhang
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>
> > >>>> >>>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>>
> > >>>> >>>>>> --
> > >>>> >>>>>> -- Guozhang
> > >>>> >>>>>>
> > >>>> >>>>>
> > >>>> >>>>
> > >>>> >>>>
> > >>>> >
> > >>>>
> > >>>>
> >
>

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

Posted by Bill Bejeck <bb...@gmail.com>.
Hi John,

Thanks for the update, I'm +1 on changes and my +1 vote stands.

-Bill

On Fri, Nov 16, 2018 at 4:19 PM John Roesler <jo...@confluent.io> wrote:

> Hi all, sorry to do this again, but during review of the code to add the
> metrics proposed in this KIP, the reviewers and I noticed some
> inconsistencies and drawbacks of the metrics I proposed in the KIP.
>
> Here's the diff:
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=24&selectedPageVersions=23
>
> * The proposed metrics were all INFO level, but they would be updated on
> every record, creating a performance concern. If we can refactor the
> metrics framework in the future to resolve this concern, we may move the
> metrics back to INFO level.
> * having separate metrics for memory and disk buffers is unnecessarily
> complex. The main utility is determining how close the buffer is to the
> configured limit, which makes a single metric more useful. I've combined
> them into one "suppression-buffer-size-*" metric.
> * The "intermediate-result-suppression-*" metric would be equivalent to the
> "process-*" metric which is already available on the ProcessorNode. I've
> removed it from the KIP.
> * The "suppression-mem-buffer-evict-*" metric had been proposed as a buffer
> metric, but it makes more sense as a processor node metric, since its
> counterpart is the "process-*" metric. I've replaced it with a processor
> node metric, "suppression-emit-*"
>
> Let me know if you want to recast votes in response to this change.
>
> -John
>
> On Thu, Oct 4, 2018 at 11:26 AM John Roesler <jo...@confluent.io> wrote:
>
> > Update: Here's a link to the documented eviction behavior:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)
> >
> > On Thu, Oct 4, 2018 at 11:12 AM John Roesler <jo...@confluent.io> wrote:
> >
> >> Hello again, all,
> >>
> >> During review, we realized that there is a relationship between this
> >> (KIP-328) and KIP-372.
> >>
> >> KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds
> >> a new internal topic (the changelog for the suppression buffer).
> >>
> >> However, we didn't consider this relationship in either KIP discussion,
> >> possibly since they were discussed and accepted concurrently.
> >>
> >> I have updated KIP-328 to effectively "merge" the two KIPs by adding a
> >> `withName` builder to Suppressed in the style of the other builders
> added
> >> in KIP-372:
> >>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
> >> .
> >>
> >> I think this should be uncontroversial, but as always, let me know of
> any
> >> objections you may have.
> >>
> >>
> >> Also, note that I'll be updating the KIP to document the exact buffer
> >> eviction behavior. I previously treated this as an internal
> implementation
> >> detail, but after consideration, I think users would want to know the
> >> eviction semantics, especially if they are debugging their applications
> and
> >> scrutinizing the sequence of emitted records.
> >>
> >> Thanks,
> >> -John
> >>
> >> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <jo...@confluent.io> wrote:
> >>
> >>> Hello all,
> >>>
> >>> During review of https://github.com/apache/kafka/pull/5567 for
> KIP-328,
> >>> the reviewers raised many good suggestions for the API.
> >>>
> >>> The basic design of the suppress operation remains the same, but the
> >>> config object is (in my opinion) far more ergonomic with their
> >>> suggestions.
> >>>
> >>> I have updated the KIP to reflect the new config (
> >>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
> >>> )
> >>>
> >>> Please let me know if anyone wishes to change their vote, and we call
> >>> for a recast.
> >>>
> >>> Thanks,
> >>> -John
> >>>
> >>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <
> matthias@confluent.io>
> >>> wrote:
> >>>
> >>>> It seems nobody has any objections against the change.
> >>>>
> >>>> That's for the KIP improvement. I'll go ahead and merge the PR.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 8/21/18 2:44 PM, John Roesler wrote:
> >>>> > Hello again, all,
> >>>> >
> >>>> > I belatedly had a better idea for adding grace period to the Windows
> >>>> class
> >>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
> >>>> > providing the grace-setter in the abstract class and having to
> >>>> retract it
> >>>> > in UnlimitedWindows, I've made the getter abstract method in Windows
> >>>> and
> >>>> > only added setters to Time and Join windows.
> >>>> >
> >>>> > This should not only improve the ergonomics of grace period, but
> make
> >>>> the
> >>>> > whole class hierarchy more maintainable.
> >>>> >
> >>>> > See the PR for more details:
> >>>> https://github.com/apache/kafka/pull/5536
> >>>> >
> >>>> > I've updated the KIP accordingly. Here's the diff:
> >>>> >
> >>>>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
> >>>> >
> >>>> > Please let me know if this changes your vote.
> >>>> >
> >>>> > Thanks,
> >>>> > -John
> >>>> >
> >>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <jo...@confluent.io>
> >>>> wrote:
> >>>> >
> >>>> >> Hey all,
> >>>> >>
> >>>> >> I just wanted to let you know that a few small issues surfaced
> during
> >>>> >> implementation and review. I've updated the KIP. Here's the diff:
> >>>> >>
> >>>>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
> >>>> >>
> >>>> >> Basically:
> >>>> >> * the metrics named "*-event-*" are inconsistent with existing
> >>>> >> nomenclature, and will be "*-record-*" instead (late records
> instead
> >>>> of
> >>>> >> late events, for example)
> >>>> >> * the apis taking and returning Duration will use long millis
> >>>> instead. We
> >>>> >> do want to transition to Duration in the future, but we shouldn't
> do
> >>>> it
> >>>> >> piecemeal.
> >>>> >>
> >>>> >> Thanks,
> >>>> >> -John
> >>>> >>
> >>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <jo...@confluent.io>
> >>>> wrote:
> >>>> >>
> >>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes
> (Guozhang,
> >>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
> >>>> >>>
> >>>> >>> Thanks for your time,
> >>>> >>> -John
> >>>> >>>
> >>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <
> >>>> matthias@confluent.io>
> >>>> >>> wrote:
> >>>> >>>
> >>>> >>>> +1 (binding)
> >>>> >>>>
> >>>> >>>> Thanks for the KIP.
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> -Matthias
> >>>> >>>>
> >>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
> >>>> >>>>> Thanks John! +1
> >>>> >>>>>
> >>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wa...@gmail.com>
> >>>> wrote:
> >>>> >>>>>
> >>>> >>>>>> Yes, the addendum lgtm as well. Thanks!
> >>>> >>>>>>
> >>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <
> john@confluent.io
> >>>> >
> >>>> >>>> wrote:
> >>>> >>>>>>
> >>>> >>>>>>> Another thing that came up after I started working on an
> >>>> >>>> implementation
> >>>> >>>>>> is
> >>>> >>>>>>> that in addition to deprecating "retention" from the Windows
> >>>> >>>> interface,
> >>>> >>>>>> we
> >>>> >>>>>>> also need to deprecate "segmentInterval", for the same
> reasons.
> >>>> I
> >>>> >>>> simply
> >>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
> >>>> >>>>>>>
> >>>> >>>>>>> Hopefully, this doesn't change anyone's vote.
> >>>> >>>>>>>
> >>>> >>>>>>> Thanks,
> >>>> >>>>>>> -John
> >>>> >>>>>>>
> >>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <
> john@confluent.io
> >>>> >
> >>>> >>>> wrote:
> >>>> >>>>>>>
> >>>> >>>>>>>> Thanks Guozhang,
> >>>> >>>>>>>>
> >>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are
> >>>> "late" only
> >>>> >>>>>> when
> >>>> >>>>>>>> they are older than the retention period. Currently, we
> detect
> >>>> this
> >>>> >>>> in
> >>>> >>>>>>> the
> >>>> >>>>>>>> processor and record it as a "skipped-record". We then do not
> >>>> >>>> attempt
> >>>> >>>>>> to
> >>>> >>>>>>>> store the event in the window store. If a user provided a
> >>>> >>>>>> pre-configured
> >>>> >>>>>>>> window store with a retention period smaller than the one
> they
> >>>> >>>> specify
> >>>> >>>>>>> via
> >>>> >>>>>>>> Windows#until, the segmented store will drop the update with
> no
> >>>> >>>> metric
> >>>> >>>>>>> and
> >>>> >>>>>>>> record a debug-level log.
> >>>> >>>>>>>>
> >>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and
> >>>> moving
> >>>> >>>>>>> retention
> >>>> >>>>>>>> fully into the state store, we need to have metrics for both
> >>>> "late
> >>>> >>>>>>> events"
> >>>> >>>>>>>> (new records older than the grace period) and "expired window
> >>>> >>>> events"
> >>>> >>>>>>> (new
> >>>> >>>>>>>> records for windows that are no longer retained in the state
> >>>> >>>> store). I
> >>>> >>>>>>>> already proposed metrics for the late events, and I've just
> >>>> updated
> >>>> >>>> the
> >>>> >>>>>>> KIP
> >>>> >>>>>>>> with metrics for the expired window events. I also updated
> the
> >>>> KIP
> >>>> >>>> to
> >>>> >>>>>>> make
> >>>> >>>>>>>> it clear that neither late nor expired events will count as
> >>>> >>>>>>>> "skipped-records" any more.
> >>>> >>>>>>>>
> >>>> >>>>>>>> -John
> >>>> >>>>>>>>
> >>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
> >>>> wangguoz@gmail.com>
> >>>> >>>>>>> wrote:
> >>>> >>>>>>>>
> >>>> >>>>>>>>> Hi John,
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
> >>>> suggestion:
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Following your suggestion of the differentiation of
> >>>> >>>> `skipped-records`
> >>>> >>>>>>> v.s.
> >>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
> >>>> scenarios
> >>>> >>>>>>> where
> >>>> >>>>>>>>> records got ignored due the window not being available any
> >>>> more in
> >>>> >>>>>>>>> windowed
> >>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
> >>>> recording
> >>>> >>>> to
> >>>> >>>>>>> the
> >>>> >>>>>>>>> `late-event-drop` metrics recording.
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> Guozhang
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <
> >>>> bbejeck@gmail.com>
> >>>> >>>>>> wrote:
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>> Thanks for the KIP!
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> +1
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> -Bill
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <
> yuzhihong@gmail.com>
> >>>> >>>> wrote:
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>>> +1
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
> >>>> john@confluent.io
> >>>> >>>>>
> >>>> >>>>>>>>> wrote:
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>>> Hello devs,
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new
> >>>> >>>>>> comments,
> >>>> >>>>>>>>> so I
> >>>> >>>>>>>>>>> am
> >>>> >>>>>>>>>>>> calling for a vote!
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> Here's the KIP:
> >>>> https://cwiki.apache.org/confluence/x/sQU0BQ
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> The basic idea is to provide:
> >>>> >>>>>>>>>>>> * more usable control over update rate (vs the current
> >>>> state
> >>>> >>>>>> store
> >>>> >>>>>>>>>>> caches)
> >>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature
> which
> >>>> >>>>>>> several
> >>>> >>>>>>>>>> people
> >>>> >>>>>>>>>>>> have requested
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>> Thanks,
> >>>> >>>>>>>>>>>> -John
> >>>> >>>>>>>>>>>>
> >>>> >>>>>>>>>>>
> >>>> >>>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>>
> >>>> >>>>>>>>> --
> >>>> >>>>>>>>> -- Guozhang
> >>>> >>>>>>>>>
> >>>> >>>>>>>>
> >>>> >>>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>>
> >>>> >>>>>> --
> >>>> >>>>>> -- Guozhang
> >>>> >>>>>>
> >>>> >>>>>
> >>>> >>>>
> >>>> >>>>
> >>>> >
> >>>>
> >>>>
>

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.
Hi all, sorry to do this again, but during review of the code to add the
metrics proposed in this KIP, the reviewers and I noticed some
inconsistencies and drawbacks of the metrics I proposed in the KIP.

Here's the diff:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=24&selectedPageVersions=23

* The proposed metrics were all INFO level, but they would be updated on
every record, creating a performance concern. If we can refactor the
metrics framework in the future to resolve this concern, we may move the
metrics back to INFO level.
* having separate metrics for memory and disk buffers is unnecessarily
complex. The main utility is determining how close the buffer is to the
configured limit, which makes a single metric more useful. I've combined
them into one "suppression-buffer-size-*" metric.
* The "intermediate-result-suppression-*" metric would be equivalent to the
"process-*" metric which is already available on the ProcessorNode. I've
removed it from the KIP.
* The "suppression-mem-buffer-evict-*" metric had been proposed as a buffer
metric, but it makes more sense as a processor node metric, since its
counterpart is the "process-*" metric. I've replaced it with a processor
node metric, "suppression-emit-*"

Let me know if you want to recast votes in response to this change.

-John

On Thu, Oct 4, 2018 at 11:26 AM John Roesler <jo...@confluent.io> wrote:

> Update: Here's a link to the documented eviction behavior:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)
>
> On Thu, Oct 4, 2018 at 11:12 AM John Roesler <jo...@confluent.io> wrote:
>
>> Hello again, all,
>>
>> During review, we realized that there is a relationship between this
>> (KIP-328) and KIP-372.
>>
>> KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds
>> a new internal topic (the changelog for the suppression buffer).
>>
>> However, we didn't consider this relationship in either KIP discussion,
>> possibly since they were discussed and accepted concurrently.
>>
>> I have updated KIP-328 to effectively "merge" the two KIPs by adding a
>> `withName` builder to Suppressed in the style of the other builders added
>> in KIP-372:
>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
>> .
>>
>> I think this should be uncontroversial, but as always, let me know of any
>> objections you may have.
>>
>>
>> Also, note that I'll be updating the KIP to document the exact buffer
>> eviction behavior. I previously treated this as an internal implementation
>> detail, but after consideration, I think users would want to know the
>> eviction semantics, especially if they are debugging their applications and
>> scrutinizing the sequence of emitted records.
>>
>> Thanks,
>> -John
>>
>> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <jo...@confluent.io> wrote:
>>
>>> Hello all,
>>>
>>> During review of https://github.com/apache/kafka/pull/5567 for KIP-328,
>>> the reviewers raised many good suggestions for the API.
>>>
>>> The basic design of the suppress operation remains the same, but the
>>> config object is (in my opinion) far more ergonomic with their
>>> suggestions.
>>>
>>> I have updated the KIP to reflect the new config (
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
>>> )
>>>
>>> Please let me know if anyone wishes to change their vote, and we call
>>> for a recast.
>>>
>>> Thanks,
>>> -John
>>>
>>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <ma...@confluent.io>
>>> wrote:
>>>
>>>> It seems nobody has any objections against the change.
>>>>
>>>> That's for the KIP improvement. I'll go ahead and merge the PR.
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 8/21/18 2:44 PM, John Roesler wrote:
>>>> > Hello again, all,
>>>> >
>>>> > I belatedly had a better idea for adding grace period to the Windows
>>>> class
>>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
>>>> > providing the grace-setter in the abstract class and having to
>>>> retract it
>>>> > in UnlimitedWindows, I've made the getter abstract method in Windows
>>>> and
>>>> > only added setters to Time and Join windows.
>>>> >
>>>> > This should not only improve the ergonomics of grace period, but make
>>>> the
>>>> > whole class hierarchy more maintainable.
>>>> >
>>>> > See the PR for more details:
>>>> https://github.com/apache/kafka/pull/5536
>>>> >
>>>> > I've updated the KIP accordingly. Here's the diff:
>>>> >
>>>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
>>>> >
>>>> > Please let me know if this changes your vote.
>>>> >
>>>> > Thanks,
>>>> > -John
>>>> >
>>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <jo...@confluent.io>
>>>> wrote:
>>>> >
>>>> >> Hey all,
>>>> >>
>>>> >> I just wanted to let you know that a few small issues surfaced during
>>>> >> implementation and review. I've updated the KIP. Here's the diff:
>>>> >>
>>>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
>>>> >>
>>>> >> Basically:
>>>> >> * the metrics named "*-event-*" are inconsistent with existing
>>>> >> nomenclature, and will be "*-record-*" instead (late records instead
>>>> of
>>>> >> late events, for example)
>>>> >> * the apis taking and returning Duration will use long millis
>>>> instead. We
>>>> >> do want to transition to Duration in the future, but we shouldn't do
>>>> it
>>>> >> piecemeal.
>>>> >>
>>>> >> Thanks,
>>>> >> -John
>>>> >>
>>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <jo...@confluent.io>
>>>> wrote:
>>>> >>
>>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang,
>>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
>>>> >>>
>>>> >>> Thanks for your time,
>>>> >>> -John
>>>> >>>
>>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <
>>>> matthias@confluent.io>
>>>> >>> wrote:
>>>> >>>
>>>> >>>> +1 (binding)
>>>> >>>>
>>>> >>>> Thanks for the KIP.
>>>> >>>>
>>>> >>>>
>>>> >>>> -Matthias
>>>> >>>>
>>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
>>>> >>>>> Thanks John! +1
>>>> >>>>>
>>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wa...@gmail.com>
>>>> wrote:
>>>> >>>>>
>>>> >>>>>> Yes, the addendum lgtm as well. Thanks!
>>>> >>>>>>
>>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <john@confluent.io
>>>> >
>>>> >>>> wrote:
>>>> >>>>>>
>>>> >>>>>>> Another thing that came up after I started working on an
>>>> >>>> implementation
>>>> >>>>>> is
>>>> >>>>>>> that in addition to deprecating "retention" from the Windows
>>>> >>>> interface,
>>>> >>>>>> we
>>>> >>>>>>> also need to deprecate "segmentInterval", for the same reasons.
>>>> I
>>>> >>>> simply
>>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
>>>> >>>>>>>
>>>> >>>>>>> Hopefully, this doesn't change anyone's vote.
>>>> >>>>>>>
>>>> >>>>>>> Thanks,
>>>> >>>>>>> -John
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <john@confluent.io
>>>> >
>>>> >>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>>> Thanks Guozhang,
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are
>>>> "late" only
>>>> >>>>>> when
>>>> >>>>>>>> they are older than the retention period. Currently, we detect
>>>> this
>>>> >>>> in
>>>> >>>>>>> the
>>>> >>>>>>>> processor and record it as a "skipped-record". We then do not
>>>> >>>> attempt
>>>> >>>>>> to
>>>> >>>>>>>> store the event in the window store. If a user provided a
>>>> >>>>>> pre-configured
>>>> >>>>>>>> window store with a retention period smaller than the one they
>>>> >>>> specify
>>>> >>>>>>> via
>>>> >>>>>>>> Windows#until, the segmented store will drop the update with no
>>>> >>>> metric
>>>> >>>>>>> and
>>>> >>>>>>>> record a debug-level log.
>>>> >>>>>>>>
>>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and
>>>> moving
>>>> >>>>>>> retention
>>>> >>>>>>>> fully into the state store, we need to have metrics for both
>>>> "late
>>>> >>>>>>> events"
>>>> >>>>>>>> (new records older than the grace period) and "expired window
>>>> >>>> events"
>>>> >>>>>>> (new
>>>> >>>>>>>> records for windows that are no longer retained in the state
>>>> >>>> store). I
>>>> >>>>>>>> already proposed metrics for the late events, and I've just
>>>> updated
>>>> >>>> the
>>>> >>>>>>> KIP
>>>> >>>>>>>> with metrics for the expired window events. I also updated the
>>>> KIP
>>>> >>>> to
>>>> >>>>>>> make
>>>> >>>>>>>> it clear that neither late nor expired events will count as
>>>> >>>>>>>> "skipped-records" any more.
>>>> >>>>>>>>
>>>> >>>>>>>> -John
>>>> >>>>>>>>
>>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
>>>> wangguoz@gmail.com>
>>>> >>>>>>> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>>> Hi John,
>>>> >>>>>>>>>
>>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
>>>> suggestion:
>>>> >>>>>>>>>
>>>> >>>>>>>>> Following your suggestion of the differentiation of
>>>> >>>> `skipped-records`
>>>> >>>>>>> v.s.
>>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
>>>> scenarios
>>>> >>>>>>> where
>>>> >>>>>>>>> records got ignored due the window not being available any
>>>> more in
>>>> >>>>>>>>> windowed
>>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
>>>> recording
>>>> >>>> to
>>>> >>>>>>> the
>>>> >>>>>>>>> `late-event-drop` metrics recording.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> Guozhang
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <
>>>> bbejeck@gmail.com>
>>>> >>>>>> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>>> Thanks for the KIP!
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> +1
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> -Bill
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <yu...@gmail.com>
>>>> >>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>> +1
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
>>>> john@confluent.io
>>>> >>>>>
>>>> >>>>>>>>> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>> Hello devs,
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new
>>>> >>>>>> comments,
>>>> >>>>>>>>> so I
>>>> >>>>>>>>>>> am
>>>> >>>>>>>>>>>> calling for a vote!
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Here's the KIP:
>>>> https://cwiki.apache.org/confluence/x/sQU0BQ
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> The basic idea is to provide:
>>>> >>>>>>>>>>>> * more usable control over update rate (vs the current
>>>> state
>>>> >>>>>> store
>>>> >>>>>>>>>>> caches)
>>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature which
>>>> >>>>>>> several
>>>> >>>>>>>>>> people
>>>> >>>>>>>>>>>> have requested
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Thanks,
>>>> >>>>>>>>>>>> -John
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> --
>>>> >>>>>>>>> -- Guozhang
>>>> >>>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> --
>>>> >>>>>> -- Guozhang
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>>
>>>> >
>>>>
>>>>

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

Posted by John Roesler <jo...@confluent.io>.
Update: Here's a link to the documented eviction behavior:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)

On Thu, Oct 4, 2018 at 11:12 AM John Roesler <jo...@confluent.io> wrote:

> Hello again, all,
>
> During review, we realized that there is a relationship between this
> (KIP-328) and KIP-372.
>
> KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds a
> new internal topic (the changelog for the suppression buffer).
>
> However, we didn't consider this relationship in either KIP discussion,
> possibly since they were discussed and accepted concurrently.
>
> I have updated KIP-328 to effectively "merge" the two KIPs by adding a
> `withName` builder to Suppressed in the style of the other builders added
> in KIP-372:
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19
> .
>
> I think this should be uncontroversial, but as always, let me know of any
> objections you may have.
>
>
> Also, note that I'll be updating the KIP to document the exact buffer
> eviction behavior. I previously treated this as an internal implementation
> detail, but after consideration, I think users would want to know the
> eviction semantics, especially if they are debugging their applications and
> scrutinizing the sequence of emitted records.
>
> Thanks,
> -John
>
> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <jo...@confluent.io> wrote:
>
>> Hello all,
>>
>> During review of https://github.com/apache/kafka/pull/5567 for KIP-328,
>> the reviewers raised many good suggestions for the API.
>>
>> The basic design of the suppress operation remains the same, but the
>> config object is (in my opinion) far more ergonomic with their
>> suggestions.
>>
>> I have updated the KIP to reflect the new config (
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator
>> )
>>
>> Please let me know if anyone wishes to change their vote, and we call for
>> a recast.
>>
>> Thanks,
>> -John
>>
>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <ma...@confluent.io>
>> wrote:
>>
>>> It seems nobody has any objections against the change.
>>>
>>> That's for the KIP improvement. I'll go ahead and merge the PR.
>>>
>>>
>>> -Matthias
>>>
>>> On 8/21/18 2:44 PM, John Roesler wrote:
>>> > Hello again, all,
>>> >
>>> > I belatedly had a better idea for adding grace period to the Windows
>>> class
>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of
>>> > providing the grace-setter in the abstract class and having to retract
>>> it
>>> > in UnlimitedWindows, I've made the getter abstract method in Windows
>>> and
>>> > only added setters to Time and Join windows.
>>> >
>>> > This should not only improve the ergonomics of grace period, but make
>>> the
>>> > whole class hierarchy more maintainable.
>>> >
>>> > See the PR for more details: https://github.com/apache/kafka/pull/5536
>>> >
>>> > I've updated the KIP accordingly. Here's the diff:
>>> >
>>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9
>>> >
>>> > Please let me know if this changes your vote.
>>> >
>>> > Thanks,
>>> > -John
>>> >
>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <jo...@confluent.io>
>>> wrote:
>>> >
>>> >> Hey all,
>>> >>
>>> >> I just wanted to let you know that a few small issues surfaced during
>>> >> implementation and review. I've updated the KIP. Here's the diff:
>>> >>
>>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8
>>> >>
>>> >> Basically:
>>> >> * the metrics named "*-event-*" are inconsistent with existing
>>> >> nomenclature, and will be "*-record-*" instead (late records instead
>>> of
>>> >> late events, for example)
>>> >> * the apis taking and returning Duration will use long millis
>>> instead. We
>>> >> do want to transition to Duration in the future, but we shouldn't do
>>> it
>>> >> piecemeal.
>>> >>
>>> >> Thanks,
>>> >> -John
>>> >>
>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <jo...@confluent.io>
>>> wrote:
>>> >>
>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang,
>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
>>> >>>
>>> >>> Thanks for your time,
>>> >>> -John
>>> >>>
>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax <
>>> matthias@confluent.io>
>>> >>> wrote:
>>> >>>
>>> >>>> +1 (binding)
>>> >>>>
>>> >>>> Thanks for the KIP.
>>> >>>>
>>> >>>>
>>> >>>> -Matthias
>>> >>>>
>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote:
>>> >>>>> Thanks John! +1
>>> >>>>>
>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wa...@gmail.com>
>>> wrote:
>>> >>>>>
>>> >>>>>> Yes, the addendum lgtm as well. Thanks!
>>> >>>>>>
>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <jo...@confluent.io>
>>> >>>> wrote:
>>> >>>>>>
>>> >>>>>>> Another thing that came up after I started working on an
>>> >>>> implementation
>>> >>>>>> is
>>> >>>>>>> that in addition to deprecating "retention" from the Windows
>>> >>>> interface,
>>> >>>>>> we
>>> >>>>>>> also need to deprecate "segmentInterval", for the same reasons. I
>>> >>>> simply
>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly.
>>> >>>>>>>
>>> >>>>>>> Hopefully, this doesn't change anyone's vote.
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>> -John
>>> >>>>>>>
>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <jo...@confluent.io>
>>> >>>> wrote:
>>> >>>>>>>
>>> >>>>>>>> Thanks Guozhang,
>>> >>>>>>>>
>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are "late"
>>> only
>>> >>>>>> when
>>> >>>>>>>> they are older than the retention period. Currently, we detect
>>> this
>>> >>>> in
>>> >>>>>>> the
>>> >>>>>>>> processor and record it as a "skipped-record". We then do not
>>> >>>> attempt
>>> >>>>>> to
>>> >>>>>>>> store the event in the window store. If a user provided a
>>> >>>>>> pre-configured
>>> >>>>>>>> window store with a retention period smaller than the one they
>>> >>>> specify
>>> >>>>>>> via
>>> >>>>>>>> Windows#until, the segmented store will drop the update with no
>>> >>>> metric
>>> >>>>>>> and
>>> >>>>>>>> record a debug-level log.
>>> >>>>>>>>
>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and moving
>>> >>>>>>> retention
>>> >>>>>>>> fully into the state store, we need to have metrics for both
>>> "late
>>> >>>>>>> events"
>>> >>>>>>>> (new records older than the grace period) and "expired window
>>> >>>> events"
>>> >>>>>>> (new
>>> >>>>>>>> records for windows that are no longer retained in the state
>>> >>>> store). I
>>> >>>>>>>> already proposed metrics for the late events, and I've just
>>> updated
>>> >>>> the
>>> >>>>>>> KIP
>>> >>>>>>>> with metrics for the expired window events. I also updated the
>>> KIP
>>> >>>> to
>>> >>>>>>> make
>>> >>>>>>>> it clear that neither late nor expired events will count as
>>> >>>>>>>> "skipped-records" any more.
>>> >>>>>>>>
>>> >>>>>>>> -John
>>> >>>>>>>>
>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang <
>>> wangguoz@gmail.com>
>>> >>>>>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi John,
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor
>>> suggestion:
>>> >>>>>>>>>
>>> >>>>>>>>> Following your suggestion of the differentiation of
>>> >>>> `skipped-records`
>>> >>>>>>> v.s.
>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the
>>> scenarios
>>> >>>>>>> where
>>> >>>>>>>>> records got ignored due the window not being available any
>>> more in
>>> >>>>>>>>> windowed
>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics
>>> recording
>>> >>>> to
>>> >>>>>>> the
>>> >>>>>>>>> `late-event-drop` metrics recording.
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> Guozhang
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck <
>>> bbejeck@gmail.com>
>>> >>>>>> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Thanks for the KIP!
>>> >>>>>>>>>>
>>> >>>>>>>>>> +1
>>> >>>>>>>>>>
>>> >>>>>>>>>> -Bill
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <yu...@gmail.com>
>>> >>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> +1
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler <
>>> john@confluent.io
>>> >>>>>
>>> >>>>>>>>> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> Hello devs,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new
>>> >>>>>> comments,
>>> >>>>>>>>> so I
>>> >>>>>>>>>>> am
>>> >>>>>>>>>>>> calling for a vote!
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Here's the KIP:
>>> https://cwiki.apache.org/confluence/x/sQU0BQ
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> The basic idea is to provide:
>>> >>>>>>>>>>>> * more usable control over update rate (vs the current state
>>> >>>>>> store
>>> >>>>>>>>>>> caches)
>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature which
>>> >>>>>>> several
>>> >>>>>>>>>> people
>>> >>>>>>>>>>>> have requested
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>> -John
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> --
>>> >>>>>>>>> -- Guozhang
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> -- Guozhang
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >
>>>
>>>