Posted to dev@kafka.apache.org by Thomas Becker <to...@Tivo.com> on 2017/05/25 13:48:43 UTC

GlobalKTable limitations

We need to do a series of joins against a KTable that we can't co-
partition with the stream, so we're looking at GlobalKTable.  But the
topic backing the table is not ideally keyed for the sort of lookups
this particular processor needs to do. Unfortunately, GlobalKTable is
very limited in that you can only build one with the exact keys/values
from the backing topic. I'd like to be able to perform various
transformations on the topic before materializing the table.  I'd
envision it looking something like the following:

builder.globalTable(keySerde, valueSerde, topicName)
    .filter((k, v) -> k.isFoo())
    .map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
    .build(tableKeySerde, tableValueSerde, storeName);

Is this something that has been considered or that others would find
useful?

--


    Tommy Becker

    Senior Software Engineer

    O +1 919.460.4747

    tivo.com


________________________________

This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments) by others is prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete this email and any attachments. No employee or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.

Re: GlobalKTable limitations

Posted by Damian Guy <da...@gmail.com>.
We did originally think of allowing `filter`, `map` etc on a GlobalKTable,
though it would have been slightly different to what you are suggesting,
i.e., we'd materialize the topic as a Store and then provide views on top
of that. They'd also be global, but not materialized as physical state
stores.
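
To make the "views on top of one store" idea concrete, here is a minimal plain-Java sketch. It is not Kafka Streams API: the class name StoreViewSketch and its methods are hypothetical, and an ordinary Map stands in for the single materialized store. The filter and the value mapping run at read time, so no second copy of the data is kept.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical model of a non-materialized view over one shared store. */
final class StoreViewSketch<K, V, R> {
    private final Map<K, V> store;          // the single materialized copy
    private final BiPredicate<K, V> keep;   // filter, evaluated on every read
    private final Function<V, R> mapValue;  // value mapping, evaluated on every read

    StoreViewSketch(Map<K, V> store, BiPredicate<K, V> keep, Function<V, R> mapValue) {
        this.store = store;
        this.keep = keep;
        this.mapValue = mapValue;
    }

    /** Reads the backing store, then applies the filter and mapping lazily. */
    Optional<R> get(K key) {
        V value = store.get(key);
        if (value == null || !keep.test(key, value)) {
            return Optional.empty();
        }
        return Optional.ofNullable(mapValue.apply(value));
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        store.put("a", 1);
        store.put("b", 20);

        StoreViewSketch<String, Integer, String> view =
            new StoreViewSketch<>(store, (k, v) -> v >= 10, v -> "v" + v);

        System.out.println(view.get("a")); // Optional.empty (filtered out)
        System.out.println(view.get("b")); // Optional[v20]

        store.put("a", 15);                // an update to the store is visible through the view
        System.out.println(view.get("a")); // Optional[v15]
    }
}
```

One limitation of this sketch: it only filters and maps values. A view that changes the key would have to scan the store on each lookup, since nothing is indexed by the derived key.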

On Fri, 26 May 2017 at 02:01 Thomas Becker <to...@tivo.com> wrote:

> Hey Eno,
> Thanks for the response. We have considered, but not yet tried that. One
> of the nice things about the GlobalKTable is that it fully "bootstraps"
> before the rest of the topology is started. But if part of the topology is
> itself generating the topic that backs the global table, it seems like that
> would effectively break. Also, doing this obviously requires
> re-materializing the data in a new topic.  To be fair, the topic we are
> building the table from has a bit of an unusual format, which is why I was
> trying to see if anyone else would think this was useful.
>
> -Tommy
>
> ________________________________________
> From: Eno Thereska [eno.thereska@gmail.com]
> Sent: Thursday, May 25, 2017 12:03 PM
> To: dev@kafka.apache.org
> Subject: Re: GlobalKTable limitations
>
> Hi Thomas,
>
> Have you considered doing the transformations on the topic, then
> outputting to another topic and then constructing the GlobalKTable from the
> latter?
>
> The GlobalKTable has the limitations you mention since it was primarily
> designed for joins only. We should consider allowing a less restrictive
> interface if it makes sense.
>
> Eno
>
> > On 25 May 2017, at 14:48, Thomas Becker <to...@Tivo.com> wrote:
> >
> > We need to do a series of joins against a KTable that we can't co-
> > partition with the stream, so we're looking at GlobalKTable.  But the
> > topic backing the table is not ideally keyed for the sort of lookups
> > this particular processor needs to do. Unfortunately, GlobalKTable is
> > very limited in that you can only build one with the exact keys/values
> > from the backing topic. I'd like to be able to perform various
> > transformations on the topic before materializing the table.  I'd
> > envision it looking something like the following:
> >
> > builder.globalTable(keySerde, valueSerde, topicName)
> >    .filter((k, v) -> k.isFoo())
> >    .map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
> >    .build(tableKeySerde, tableValueSerde, storeName);
> >
> > Is this something that has been considered or that others would find
> > useful?

RE: GlobalKTable limitations

Posted by Thomas Becker <to...@Tivo.com>.
Hey Eno,
Thanks for the response. We have considered, but not yet tried that. One of the nice things about the GlobalKTable is that it fully "bootstraps" before the rest of the topology is started. But if part of the topology is itself generating the topic that backs the global table, it seems like that would effectively break. Also, doing this obviously requires re-materializing the data in a new topic.  To be fair, the topic we are building the table from has a bit of an unusual format, which is why I was trying to see if anyone else would think this was useful.

-Tommy

________________________________________
From: Eno Thereska [eno.thereska@gmail.com]
Sent: Thursday, May 25, 2017 12:03 PM
To: dev@kafka.apache.org
Subject: Re: GlobalKTable limitations

Hi Thomas,

Have you considered doing the transformations on the topic, then outputting to another topic and then constructing the GlobalKTable from the latter?

The GlobalKTable has the limitations you mention since it was primarily designed for joins only. We should consider allowing a less restrictive interface if it makes sense.

Eno

> On 25 May 2017, at 14:48, Thomas Becker <to...@Tivo.com> wrote:
>
> We need to do a series of joins against a KTable that we can't co-
> partition with the stream, so we're looking at GlobalKTable.  But the
> topic backing the table is not ideally keyed for the sort of lookups
> this particular processor needs to do. Unfortunately, GlobalKTable is
> very limited in that you can only build one with the exact keys/values
> from the backing topic. I'd like to be able to perform various
> transformations on the topic before materializing the table.  I'd
> envision it looking something like the following:
>
> builder.globalTable(keySerde, valueSerde, topicName)
>    .filter((k, v) -> k.isFoo())
>    .map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
>    .build(tableKeySerde, tableValueSerde, storeName);
>
> Is this something that has been considered or that others would find
> useful?

Re: GlobalKTable limitations

Posted by Eno Thereska <en...@gmail.com>.
Hi Thomas,

Have you considered doing the transformations on the topic, then outputting to another topic and then constructing the GlobalKTable from the latter? 

The GlobalKTable has the limitations you mention since it was primarily designed for joins only. We should consider allowing a less restrictive interface if it makes sense.

Eno
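
The workaround above (transform, write to a second topic, build the table from that topic) can be modelled in plain Java. This is only a sketch of the data flow, not Kafka Streams code: RekeyedTopicSketch, Rec, transform, materialize and demoTable are hypothetical names, lists stand in for topics, and the "foo/" key prefix is made-up sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

/** Hypothetical model of "transform into a second topic, then build the table from it". */
final class RekeyedTopicSketch {

    /** One key/value record, standing in for a message on a topic. */
    static final class Rec<K, V> {
        final K key;
        final V value;
        Rec(K key, V value) { this.key = key; this.value = value; }
    }

    /** Step 1: filter and re-key the source records; the result models the intermediate topic. */
    static <K, V, K2, V2> List<Rec<K2, V2>> transform(
            List<Rec<K, V>> source,
            BiPredicate<K, V> keep,
            BiFunction<K, V, Rec<K2, V2>> mapper) {
        List<Rec<K2, V2>> out = new ArrayList<>();
        for (Rec<K, V> r : source) {
            if (keep.test(r.key, r.value)) {
                out.add(mapper.apply(r.key, r.value));
            }
        }
        return out;
    }

    /** Step 2: materialize a table from a topic; the latest record per key wins. */
    static <K, V> Map<K, V> materialize(List<Rec<K, V>> topic) {
        Map<K, V> table = new LinkedHashMap<>();
        for (Rec<K, V> r : topic) {
            table.put(r.key, r.value);
        }
        return table;
    }

    /** Runs both steps on a small made-up source topic. */
    static Map<String, Integer> demoTable() {
        List<Rec<String, String>> source = Arrays.asList(
            new Rec<String, String>("foo/a", "1"),
            new Rec<String, String>("other/b", "2"),
            new Rec<String, String>("foo/a", "3"),
            new Rec<String, String>("foo/c", "4"));
        List<Rec<String, Integer>> rekeyed = transform(
            source,
            (k, v) -> k.startsWith("foo/"),
            (k, v) -> new Rec<String, Integer>(k.substring("foo/".length()), Integer.valueOf(v)));
        return materialize(rekeyed);
    }

    public static void main(String[] args) {
        System.out.println(demoTable()); // {a=3, c=4}
    }
}
```

The cost of this approach shows up in the model as well: the rekeyed list is a second full copy of the retained data, which corresponds to the extra topic the workaround needs.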

> On 25 May 2017, at 14:48, Thomas Becker <to...@Tivo.com> wrote:
> 
> We need to do a series of joins against a KTable that we can't co-
> partition with the stream, so we're looking at GlobalKTable.  But the
> topic backing the table is not ideally keyed for the sort of lookups
> this particular processor needs to do. Unfortunately, GlobalKTable is
> very limited in that you can only build one with the exact keys/values
> from the backing topic. I'd like to be able to perform various
> transformations on the topic before materializing the table.  I'd
> envision it looking something like the following:
> 
> builder.globalTable(keySerde, valueSerde, topicName)
>    .filter((k, v) -> k.isFoo())
>    .map((k, v) -> new KeyValue<>(k.getBar(), v.getBaz()))
>    .build(tableKeySerde, tableValueSerde, storeName);
> 
> Is this something that has been considered or that others would find
> useful?