Posted to dev@kafka.apache.org by Kyle Winkelman <wi...@gmail.com> on 2017/05/19 12:30:55 UTC

[Vote] KIP-150 - Kafka-Streams Cogroup

Hello all,

I would like to start the vote on KIP-150.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup

Thanks,
Kyle

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Guozhang Wang <wa...@gmail.com>.
Kyle,

It seems you already have enough committer votes (me, Jay, Sriram, Damian).


Guozhang

On Mon, Jun 26, 2017 at 11:06 AM, Kyle Winkelman <wi...@gmail.com>
wrote:

> Bumping this so it is easy to find now that the discussions have died down.
>
> Thanks,
> Kyle



-- 
-- Guozhang

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Kyle Winkelman <wi...@gmail.com>.
Bumping this so it is easy to find now that the discussions have died down.

Thanks,
Kyle

On Jun 9, 2017 6:32 PM, "Sriram Subramanian" <ra...@confluent.io> wrote:

> +1

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Sriram Subramanian <ra...@confluent.io>.
+1

On Fri, Jun 9, 2017 at 2:24 PM, Jay Kreps <ja...@confluent.io> wrote:

> +1
>
> -Jay

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Jay Kreps <ja...@confluent.io>.
+1

-Jay

On Thu, Jun 8, 2017 at 11:16 AM, Guozhang Wang <wa...@gmail.com> wrote:

> I think we can continue on this voting thread.
>
> Currently we have one binding vote and two non-binding votes. I would like to
> call on other people, especially committers, to also take a look at this
> proposal and vote.
>
>
> Guozhang

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Guozhang Wang <wa...@gmail.com>.
I think we can continue on this voting thread.

Currently we have one binding vote and two non-binding votes. I would like to
call on other people, especially committers, to also take a look at this
proposal and vote.


Guozhang


On Wed, Jun 7, 2017 at 6:37 PM, Kyle Winkelman <wi...@gmail.com>
wrote:

> Just bringing people's attention to the vote thread for my KIP. I started
> it before another round of discussion happened. I'm not sure of the protocol,
> so could someone let me know if I am supposed to restart the vote?
> Thanks,
> Kyle



-- 
-- Guozhang

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Kyle Winkelman <wi...@gmail.com>.
Just bringing people's attention to the vote thread for my KIP. I started
it before another round of discussion happened. I'm not sure of the protocol,
so could someone let me know if I am supposed to restart the vote?
Thanks,
Kyle

On May 24, 2017 8:49 AM, "Bill Bejeck" <bb...@gmail.com> wrote:

> +1 for the KIP, and +1 to what Xavier said as well.
>
> On Wed, May 24, 2017 at 3:57 AM, Damian Guy <da...@gmail.com> wrote:
>
> > Also, +1 for the KIP
> >
> > On Wed, 24 May 2017 at 08:57 Damian Guy <da...@gmail.com> wrote:
> >
> > > +1 to what Xavier said
> > >
> > > On Wed, 24 May 2017 at 06:45 Xavier Léauté <xa...@confluent.io> wrote:
> > >
> > >> I don't think we should wait for entries from each stream, since that
> > >> might limit the usefulness of the cogroup operator. There are instances
> > >> where it can be useful to compute something based on data from one or
> > >> more streams, without having to wait for all the streams to produce
> > >> something for the group. In the example I gave in the discussion, it is
> > >> possible to compute impression/auction statistics without having to wait
> > >> for click data, which can typically arrive several minutes late.
> > >>
> > >> We could have a separate discussion around adding inner / outer modifiers
> > >> to each of the streams to decide which fields are optional / required
> > >> before sending updates if we think that might be useful.
> > >>
> > >>
> > >>
> > >> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
> > >>
> > >> > The proposal LGTM, +1
> > >> >
> > >> > One question I have is about when to send the record to the resulting
> > >> > KTable changelog. For example, in your code snippet in the wiki page,
> > >> > before you see the end result of
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01], Item[no:03], Item[no:04]},
> > >> >                       purchases:{Item[no:07], Item[no:08]},
> > >> >                       wishList:{Item[no:11]}
> > >> >       ]
> > >> >
> > >> >
> > >> > You will first see
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{},
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{Item[no:07],Item[no:08]},
> > >> >
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > 1L, Customer[
> > >> >
> > >> >                       cart:{Item[no:01]},
> > >> >                       purchases:{Item[no:07],Item[no:08]},
> > >> >
> > >> >                       wishList:{}
> > >> >       ]
> > >> >
> > >> > ...
> > >> >
> > >> >
> > >> > I'm wondering if it makes more sense to only start sending updates
> > >> > once the corresponding agg-key has seen at least one input from each
> > >> > of the input streams? Maybe it is out of the scope of this KIP and we
> > >> > can make it a more general discussion in a separate one.
> > >> >
> > >> >
> > >> > Guozhang
> > >> >
> > >> >
> > >> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xavier@confluent.io>
> > >> > wrote:
> > >> >
> > >> > > Hi Kyle, I left a few more comments in the discussion thread, if you
> > >> > > wouldn't mind taking a look
> > >> > >
> > >> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <winkelman.kyle@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hello all,
> > >> > > >
> > >> > > > I would like to start the vote on KIP-150.
> > >> > > >
> > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Kyle
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > -- Guozhang
> > >> >
> > >>
> > >
> >
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Bill Bejeck <bb...@gmail.com>.
+1 for the KIP and +1 to what Xavier said as well.

On Wed, May 24, 2017 at 3:57 AM, Damian Guy <da...@gmail.com> wrote:

> Also, +1 for the KIP
>
> On Wed, 24 May 2017 at 08:57 Damian Guy <da...@gmail.com> wrote:
>
> > +1 to what Xavier said
> >
> > On Wed, 24 May 2017 at 06:45 Xavier Léauté <xa...@confluent.io> wrote:
> >
> >> I don't think we should wait for entries from each stream, since that
> >> might
> >> limit the usefulness of the cogroup operator. There are instances where
> it
> >> can be useful to compute something based on data from one or more
> stream,
> >> without having to wait for all the streams to produce something for the
> >> group. In the example I gave in the discussion, it is possible to
> compute
> >> impression/auction statistics without having to wait for click data,
> which
> >> can typically arrive several minutes late.
> >>
> >> We could have a separate discussion around adding inner / outer
> modifiers
> >> to each of the streams to decide which fields are optional / required
> >> before sending updates if we think that might be useful.
> >>
> >>
> >>
> >> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com>
> wrote:
> >>
> >> > The proposal LGTM, +1
> >> >
> >> > One question I have is about when to send the record to the resulted
> >> KTable
> >> > changelog. For example in your code snippet in the wiki page, before
> you
> >> > see the end result of
> >> >
> >> > 1L, Customer[
> >> >
> >> >                       cart:{Item[no:01], Item[no:03], Item[no:04]},
> >> >                       purchases:{Item[no:07], Item[no:08]},
> >> >                       wishList:{Item[no:11]}
> >> >       ]
> >> >
> >> >
> >> > You will first see
> >> >
> >> > 1L, Customer[
> >> >
> >> >                       cart:{Item[no:01]},
> >> >                       purchases:{},
> >> >                       wishList:{}
> >> >       ]
> >> >
> >> > 1L, Customer[
> >> >
> >> >                       cart:{Item[no:01]},
> >> >                       purchases:{Item[no:07],Item[no:08]},
> >> >
> >> >                       wishList:{}
> >> >       ]
> >> >
> >> > 1L, Customer[
> >> >
> >> >                       cart:{Item[no:01]},
> >> >                       purchases:{Item[no:07],Item[no:08]},
> >> >
> >> >                       wishList:{}
> >> >       ]
> >> >
> >> > ...
> >> >
> >> >
> >> > I'm wondering if it makes more sense to only start sending the update
> if
> >> > the corresponding agg-key has seen at least one input from each of the
> >> > input stream? Maybe it is out of the scope of this KIP and we can make
> >> it a
> >> > more general discussion in a separate one.
> >> >
> >> >
> >> > Guozhang
> >> >
> >> >
> >> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xa...@confluent.io>
> >> > wrote:
> >> >
> >> > > Hi Kyle, I left a few more comments in the discussion thread, if you
> >> > > wouldn't mind taking a look
> >> > >
> >> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <
> >> winkelman.kyle@gmail.com
> >> > >
> >> > > wrote:
> >> > >
> >> > > > Hello all,
> >> > > >
> >> > > > I would like to start the vote on KIP-150.
> >> > > >
> >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+
> >> > > Kafka-Streams+Cogroup
> >> > > >
> >> > > > Thanks,
> >> > > > Kyle
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >> >
> >>
> >
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Damian Guy <da...@gmail.com>.
Also, +1 for the KIP

On Wed, 24 May 2017 at 08:57 Damian Guy <da...@gmail.com> wrote:

> +1 to what Xavier said
>
> On Wed, 24 May 2017 at 06:45 Xavier Léauté <xa...@confluent.io> wrote:
>
>> I don't think we should wait for entries from each stream, since that
>> might
>> limit the usefulness of the cogroup operator. There are instances where it
>> can be useful to compute something based on data from one or more streams,
>> without having to wait for all the streams to produce something for the
>> group. In the example I gave in the discussion, it is possible to compute
>> impression/auction statistics without having to wait for click data, which
>> can typically arrive several minutes late.
>>
>> We could have a separate discussion around adding inner / outer modifiers
>> to each of the streams to decide which fields are optional / required
>> before sending updates if we think that might be useful.
>>
>>
>>
>> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
>>
>> > The proposal LGTM, +1
>> >
>> > One question I have is about when to send the record to the resulted
>> KTable
>> > changelog. For example in your code snippet in the wiki page, before you
>> > see the end result of
>> >
>> > 1L, Customer[
>> >
>> >                       cart:{Item[no:01], Item[no:03], Item[no:04]},
>> >                       purchases:{Item[no:07], Item[no:08]},
>> >                       wishList:{Item[no:11]}
>> >       ]
>> >
>> >
>> > You will first see
>> >
>> > 1L, Customer[
>> >
>> >                       cart:{Item[no:01]},
>> >                       purchases:{},
>> >                       wishList:{}
>> >       ]
>> >
>> > 1L, Customer[
>> >
>> >                       cart:{Item[no:01]},
>> >                       purchases:{Item[no:07],Item[no:08]},
>> >
>> >                       wishList:{}
>> >       ]
>> >
>> > 1L, Customer[
>> >
>> >                       cart:{Item[no:01]},
>> >                       purchases:{Item[no:07],Item[no:08]},
>> >
>> >                       wishList:{}
>> >       ]
>> >
>> > ...
>> >
>> >
>> > I'm wondering if it makes more sense to only start sending the update if
>> > the corresponding agg-key has seen at least one input from each of the
>> > input stream? Maybe it is out of the scope of this KIP and we can make
>> it a
>> > more general discussion in a separate one.
>> >
>> >
>> > Guozhang
>> >
>> >
>> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xa...@confluent.io>
>> > wrote:
>> >
>> > > Hi Kyle, I left a few more comments in the discussion thread, if you
>> > > wouldn't mind taking a look
>> > >
>> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <
>> winkelman.kyle@gmail.com
>> > >
>> > > wrote:
>> > >
>> > > > Hello all,
>> > > >
>> > > > I would like to start the vote on KIP-150.
>> > > >
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+
>> > > Kafka-Streams+Cogroup
>> > > >
>> > > > Thanks,
>> > > > Kyle
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > -- Guozhang
>> >
>>
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Damian Guy <da...@gmail.com>.
+1 to what Xavier said

On Wed, 24 May 2017 at 06:45 Xavier Léauté <xa...@confluent.io> wrote:

> I don't think we should wait for entries from each stream, since that might
> limit the usefulness of the cogroup operator. There are instances where it
> can be useful to compute something based on data from one or more streams,
> without having to wait for all the streams to produce something for the
> group. In the example I gave in the discussion, it is possible to compute
> impression/auction statistics without having to wait for click data, which
> can typically arrive several minutes late.
>
> We could have a separate discussion around adding inner / outer modifiers
> to each of the streams to decide which fields are optional / required
> before sending updates if we think that might be useful.
>
>
>
> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:
>
> > The proposal LGTM, +1
> >
> > One question I have is about when to send the record to the resulted
> KTable
> > changelog. For example in your code snippet in the wiki page, before you
> > see the end result of
> >
> > 1L, Customer[
> >
> >                       cart:{Item[no:01], Item[no:03], Item[no:04]},
> >                       purchases:{Item[no:07], Item[no:08]},
> >                       wishList:{Item[no:11]}
> >       ]
> >
> >
> > You will first see
> >
> > 1L, Customer[
> >
> >                       cart:{Item[no:01]},
> >                       purchases:{},
> >                       wishList:{}
> >       ]
> >
> > 1L, Customer[
> >
> >                       cart:{Item[no:01]},
> >                       purchases:{Item[no:07],Item[no:08]},
> >
> >                       wishList:{}
> >       ]
> >
> > 1L, Customer[
> >
> >                       cart:{Item[no:01]},
> >                       purchases:{Item[no:07],Item[no:08]},
> >
> >                       wishList:{}
> >       ]
> >
> > ...
> >
> >
> > I'm wondering if it makes more sense to only start sending the update if
> > the corresponding agg-key has seen at least one input from each of the
> > input stream? Maybe it is out of the scope of this KIP and we can make
> it a
> > more general discussion in a separate one.
> >
> >
> > Guozhang
> >
> >
> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xa...@confluent.io>
> > wrote:
> >
> > > Hi Kyle, I left a few more comments in the discussion thread, if you
> > > wouldn't mind taking a look
> > >
> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <
> winkelman.kyle@gmail.com
> > >
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I would like to start the vote on KIP-150.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+
> > > Kafka-Streams+Cogroup
> > > >
> > > > Thanks,
> > > > Kyle
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Xavier Léauté <xa...@confluent.io>.
I don't think we should wait for entries from each stream, since that might
limit the usefulness of the cogroup operator. There are instances where it
can be useful to compute something based on data from one or more streams,
without having to wait for all the streams to produce something for the
group. In the example I gave in the discussion, it is possible to compute
impression/auction statistics without having to wait for click data, which
can typically arrive several minutes late.

We could have a separate discussion around adding inner / outer modifiers
to each of the streams to decide which fields are optional / required
before sending updates if we think that might be useful.
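
To make the inner / outer idea concrete, here is a minimal plain-Java sketch of
a per-key gate that holds back updates until every stream marked as required has
contributed at least one record. It is not part of the KIP and does not use the
Kafka Streams API; the class name RequiredStreamGateSketch, the method
shouldEmit, and the stream names are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: decides per key whether a cogroup update should be
// forwarded, given a set of "required" (inner) input streams.
final class RequiredStreamGateSketch {
    private final Set<String> required;
    // For each key, the names of the streams that have contributed so far.
    private final Map<Long, Set<String>> seen = new HashMap<>();

    RequiredStreamGateSketch(Set<String> required) {
        this.required = new HashSet<>(required);
    }

    /** Notes that `stream` produced a record for `key`; returns true once all required streams have been seen. */
    boolean shouldEmit(long key, String stream) {
        Set<String> contributed = seen.computeIfAbsent(key, k -> new HashSet<>());
        contributed.add(stream);
        return contributed.containsAll(required);
    }
}
```

With required = {impressions, auctions}, a late click record never blocks an
update, while a key that has only seen clicks stays silent. An empty required
set gives the behavior currently proposed in the KIP: every record produces an
update.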



On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wa...@gmail.com> wrote:

> The proposal LGTM, +1
>
> One question I have is about when to send the record to the resulted KTable
> changelog. For example in your code snippet in the wiki page, before you
> see the end result of
>
> 1L, Customer[
>
>                       cart:{Item[no:01], Item[no:03], Item[no:04]},
>                       purchases:{Item[no:07], Item[no:08]},
>                       wishList:{Item[no:11]}
>       ]
>
>
> You will first see
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{},
>                       wishList:{}
>       ]
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{Item[no:07],Item[no:08]},
>
>                       wishList:{}
>       ]
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{Item[no:07],Item[no:08]},
>
>                       wishList:{}
>       ]
>
> ...
>
>
> I'm wondering if it makes more sense to only start sending the update if
> the corresponding agg-key has seen at least one input from each of the
> input stream? Maybe it is out of the scope of this KIP and we can make it a
> more general discussion in a separate one.
>
>
> Guozhang
>
>
> On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xa...@confluent.io>
> wrote:
>
> > Hi Kyle, I left a few more comments in the discussion thread, if you
> > wouldn't mind taking a look
> >
> > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <winkelman.kyle@gmail.com
> >
> > wrote:
> >
> > > Hello all,
> > >
> > > I would like to start the vote on KIP-150.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+
> > Kafka-Streams+Cogroup
> > >
> > > Thanks,
> > > Kyle
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Guozhang Wang <wa...@gmail.com>.
The proposal LGTM, +1

One question I have is about when to send the record to the resulting KTable
changelog. For example in your code snippet in the wiki page, before you
see the end result of

1L, Customer[

                      cart:{Item[no:01], Item[no:03], Item[no:04]},
                      purchases:{Item[no:07], Item[no:08]},
                      wishList:{Item[no:11]}
      ]


You will first see

1L, Customer[

                      cart:{Item[no:01]},
                      purchases:{},
                      wishList:{}
      ]

1L, Customer[

                      cart:{Item[no:01]},
                      purchases:{Item[no:07],Item[no:08]},

                      wishList:{}
      ]

1L, Customer[

                      cart:{Item[no:01]},
                      purchases:{Item[no:07],Item[no:08]},

                      wishList:{}
      ]

...


I'm wondering if it makes more sense to only start sending the update if
the corresponding agg-key has seen at least one input from each of the
input streams? Maybe it is out of the scope of this KIP and we can make it a
more general discussion in a separate one.
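
As a rough illustration of the behavior in question, the following plain-Java
simulation appends the current aggregate to a changelog list on every input
record, so partially filled Customer values are visible before the final one.
It is only a sketch and does not use the Kafka Streams API; the class
CogroupEmitSketch, the method onRecord, and the string stream names are
hypothetical stand-ins for the cogrouped streams in the wiki example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of a cogroup that emits an update per input record.
final class CogroupEmitSketch {
    /** Aggregate for one key: one item list per input stream. */
    static final class Customer {
        final List<String> cart = new ArrayList<>();
        final List<String> purchases = new ArrayList<>();
        final List<String> wishList = new ArrayList<>();

        @Override
        public String toString() {
            return "Customer[cart:" + cart + ", purchases:" + purchases
                    + ", wishList:" + wishList + "]";
        }
    }

    private final Map<Long, Customer> store = new LinkedHashMap<>();
    // Every update that would be written to the result table's changelog.
    final List<String> changelog = new ArrayList<>();

    /** Folds one record into the aggregate for `key` and emits the new value immediately. */
    void onRecord(String stream, long key, String item) {
        Customer customer = store.computeIfAbsent(key, k -> new Customer());
        switch (stream) {
            case "cart":      customer.cart.add(item); break;
            case "purchases": customer.purchases.add(item); break;
            case "wishList":  customer.wishList.add(item); break;
            default: throw new IllegalArgumentException("unknown stream: " + stream);
        }
        changelog.add(key + ", " + customer);
    }

    public static void main(String[] args) {
        CogroupEmitSketch sketch = new CogroupEmitSketch();
        sketch.onRecord("cart", 1L, "01");
        sketch.onRecord("purchases", 1L, "07");
        sketch.onRecord("wishList", 1L, "11");
        // Prints three updates for key 1, each more complete than the last.
        sketch.changelog.forEach(System.out::println);
    }
}
```

In this sketch the first two changelog entries still have empty lists for the
streams that have not produced anything yet, which is the intermediate output
the question is about.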


Guozhang


On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xa...@confluent.io> wrote:

> Hi Kyle, I left a few more comments in the discussion thread, if you
> wouldn't mind taking a look
>
> On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <wi...@gmail.com>
> wrote:
>
> > Hello all,
> >
> > I would like to start the vote on KIP-150.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup
> >
> > Thanks,
> > Kyle
> >
>



-- 
-- Guozhang

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

Posted by Xavier Léauté <xa...@confluent.io>.
Hi Kyle, I left a few more comments in the discussion thread, if you
wouldn't mind taking a look

On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <wi...@gmail.com>
wrote:

> Hello all,
>
> I would like to start the vote on KIP-150.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup
>
> Thanks,
> Kyle
>