Posted to dev@kafka.apache.org by Kyle Winkelman <wi...@gmail.com> on 2017/06/08 01:37:32 UTC

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Just bringing people's attention to the vote thread for my KIP. I started
it before another round of discussion happened. I'm not sure of the protocol,
so someone please let me know if I am supposed to restart the vote.
Thanks,
Kyle

On May 24, 2017 8:49 AM, "Bill Bejeck" <bb...@gmail.com> wrote:

> +1 for the KIP and +1 to what Xavier said as well.
>
> On Wed, May 24, 2017 at 3:57 AM, Damian Guy <da...@gmail.com> wrote:
>
> > Also, +1 for the KIP
> >
> > On Wed, 24 May 2017 at 08:57 Damian Guy <da...@gmail.com> wrote:
> >
> > > +1 to what Xavier said
> > >
> > > On Wed, 24 May 2017 at 06:45 Xavier Léauté <xa...@confluent.io> wrote:
> > > >
> > > >> I don't think we should wait for entries from each stream, since that
> > > >> might limit the usefulness of the cogroup operator. There are instances
> > > >> where it can be useful to compute something based on data from one or
> > > >> more streams, without having to wait for all the streams to produce
> > > >> something for the group. In the example I gave in the discussion, it is
> > > >> possible to compute impression/auction statistics without having to wait
> > > >> for click data, which can typically arrive several minutes late.
> > > >>
> > > >> We could have a separate discussion around adding inner / outer
> > > >> modifiers to each of the streams to decide which fields are optional /
> > > >> required before sending updates if we think that might be useful.
> > >>
> > >>
> > >>
> > > >> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
> > > >>
> > > >> > The proposal LGTM, +1
> > > >> >
> > > >> > One question I have is about when to send the record to the resulting
> > > >> > KTable changelog. For example, in your code snippet on the wiki page,
> > > >> > before you see the end result of
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01], Item[no:03], Item[no:04]},
> > >> >                       purchases:{Item[no:07], Item[no:08]},
> > >> >                       wishList:{Item[no:11]}
> > >> >       ]
> > >> >
> > >> >
> > > >> > You will first see
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{},
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{Item[no:07],Item[no:08]},
> > >> >
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{Item[no:07],Item[no:08]},
> > >> >
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > ...
> > >> >
> > >> >
> > > >> > I'm wondering if it makes more sense to only start sending the update
> > > >> > if the corresponding agg-key has seen at least one input from each of
> > > >> > the input streams? Maybe it is out of the scope of this KIP and we can
> > > >> > make it a more general discussion in a separate one.
> > >> >
> > >> >
> > >> > Guozhang
> > >> >
> > >> >
> > > >> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xavier@confluent.io> wrote:
> > > >> >
> > > >> > > Hi Kyle, I left a few more comments in the discussion thread, if you
> > > >> > > wouldn't mind taking a look
> > > >> > >
> > > >> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <winkelman.kyle@gmail.com> wrote:
> > > >> > >
> > > >> > > > Hello all,
> > > >> > > >
> > > >> > > > I would like to start the vote on KIP-150.
> > > >> > > >
> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Kyle
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > -- Guozhang
> > >> >
> > >>
> > >
> >
>
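
The emit-timing question in the quoted exchange above (send an update to
the result table on every input, or hold updates until the key has seen
input from each stream) can be made concrete with a small simulation.
This is only a sketch in plain Python, not Kafka Streams code: the
`cogroup` function, the `require_all_streams` flag, and the order of the
events are assumptions made for illustration. Only the Customer fields
and the item numbers come from the quoted example.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    """Per-key aggregate, mirroring the Customer record quoted in the thread."""
    cart: list = field(default_factory=list)
    purchases: list = field(default_factory=list)
    wish_list: list = field(default_factory=list)


STREAMS = ("cart", "purchases", "wish_list")


def cogroup(events, require_all_streams=False):
    """Fold (stream, key, item) events into per-key aggregates.

    Returns the list of (key, snapshot) updates that would be emitted.
    With require_all_streams=False, every input produces an update.
    With require_all_streams=True (hypothetical flag), updates for a key
    are held back until each stream has contributed at least one item.
    """
    state = {}    # key -> Customer aggregate
    seen = {}     # key -> names of streams that have contributed so far
    updates = []
    for stream, key, item in events:
        agg = state.setdefault(key, Customer())
        getattr(agg, stream).append(item)
        seen.setdefault(key, set()).add(stream)
        if not require_all_streams or seen[key] == set(STREAMS):
            # Copy the lists so later inputs do not mutate emitted snapshots.
            snapshot = Customer(list(agg.cart), list(agg.purchases),
                                list(agg.wish_list))
            updates.append((key, snapshot))
    return updates


# Assumed arrival order; the thread does not specify one.
events = [
    ("cart", 1, "01"),
    ("purchases", 1, "07"),
    ("purchases", 1, "08"),
    ("cart", 1, "03"),
    ("cart", 1, "04"),
    ("wish_list", 1, "11"),
]

eager = cogroup(events)                            # update on every input
gated = cogroup(events, require_all_streams=True)  # wait for all streams
```

With this event order the eager variant emits intermediate aggregates
whose purchases or wishList are still empty, while the gated variant
emits nothing until the wish-list item arrives. Both end on the same
final aggregate; the gated variant trades earlier partial results for
fewer changelog records, which is the cost Xavier points out when one
stream (click data in his example) arrives late.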

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Guozhang Wang <wa...@gmail.com>.
Kyle,

It seems you already have enough committer votes (me, Jay, Sriram, Damian).


Guozhang

On Mon, Jun 26, 2017 at 11:06 AM, Kyle Winkelman <wi...@gmail.com>
wrote:

> Bumping this so it is easy to find now that the discussions have died down.
>
> Thanks,
> Kyle
>



-- 
-- Guozhang

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Kyle Winkelman <wi...@gmail.com>.
Bumping this so it is easy to find now that the discussions have died down.

Thanks,
Kyle

On Jun 9, 2017 6:32 PM, "Sriram Subramanian" <ra...@confluent.io> wrote:

> +1
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Sriram Subramanian <ra...@confluent.io>.
+1

On Fri, Jun 9, 2017 at 2:24 PM, Jay Kreps <ja...@confluent.io> wrote:

> +1
>
> -Jay
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Jay Kreps <ja...@confluent.io>.
+1

-Jay

On Thu, Jun 8, 2017 at 11:16 AM, Guozhang Wang <wa...@gmail.com> wrote:

> I think we can continue on this voting thread.
>
> Currently we have one binding vote and 2 non-binding votes. I would like to
> call on other people, especially committers, to also take a look at this
> proposal and vote.
>
>
> Guozhang
>
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Guozhang Wang <wa...@gmail.com>.
I think we can continue on this voting thread.

Currently we have one binding vote and 2 non-binding votes. I would like to
call on other people, especially committers, to also take a look at this
proposal and vote.


Guozhang


On Wed, Jun 7, 2017 at 6:37 PM, Kyle Winkelman <wi...@gmail.com>
wrote:

> Just bringing people's attention to the vote thread for my KIP. I started
> it before another round of discussion happened. I'm not sure of the protocol,
> so someone please let me know if I am supposed to restart the vote.
> Thanks,
> Kyle
>



-- 
-- Guozhang