Posted to jira@kafka.apache.org by "Trevor Huey (JIRA)" <ji...@apache.org> on 2017/11/15 23:41:00 UTC

[jira] [Comment Edited] (KAFKA-3705) Support non-key joining in KTable

    [ https://issues.apache.org/jira/browse/KAFKA-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254129#comment-16254129 ] 

Trevor Huey edited comment on KAFKA-3705 at 11/15/17 11:40 PM:
---------------------------------------------------------------

[~jfilipiak] Please bear with me while I try to get caught up. I'm not yet familiar with the Kafka code base. I have a few questions to try to figure out how I can get involved:
1. It seems like we need to get buy-in on your KIP-213? It doesn't seem like there's been much activity on it besides yourself in a while. What's your current plan of attack for getting that approved?
2. I know you said that the most difficult part is yet to be done. Is there some code you can point me toward so I can start digging in and better understand why this is so difficult?
3. This issue has been open since May '16. How far out do you think we are from getting this implemented?


was (Author: thuey100):
[~jfilipiak] Please bear with me while I try to get caught up. I'm not yet familiar with the Kafka code base. I have a few questions to try to figure out how I can get involved:
1. It seems like we need to get buy-in on your KIP-213? It doesn't seem like there's been much activity on it besides yourself in a while. What's your current plan of attack for getting that approved?
2. I know you said that the most difficult part is yet to be done. Is there some code you can point me toward so I can start digging in and better understand why this is so difficult?
3. This issue has been open since May '16. How far out do you think we are from getting this implemented? Unfortunately, it seems like there's not a lot of momentum behind it from the Kafka team, despite this being labeled a "Major" bug.

> Support non-key joining in KTable
> ---------------------------------
>
>                 Key: KAFKA-3705
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3705
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Guozhang Wang
>              Labels: api
>
> Today in the Kafka Streams DSL, KTable joins are based only on keys. If users want to join a KTable A keyed on {{a}} with another KTable B that is keyed on {{b}} but carries a "foreign key" {{a}}, and assuming they are read from two topics which are partitioned on {{a}} and {{b}} respectively, they need to use the following pattern:
> {code}
> tableB' = tableB.groupBy(/* select on field "a" */).agg(...); // now tableB' is partitioned on "a"
> tableA.join(tableB', joiner);
> {code}
> Even if these two tables are read from two topics which are already partitioned on {{a}}, users still need to do the pre-aggregation in order to put the two joining streams on the same key. This is a programmability drawback and we should fix it.
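
For readers new to the issue, the workaround described above can be sketched in plain Java collections. This is only an illustration of the re-key-then-join idea, not Kafka Streams code: the class ForeignKeyJoinSketch, the BRow type, and both helper methods are hypothetical names invented for this sketch, and in-memory maps stand in for KTables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ForeignKeyJoinSketch {

    /** Hypothetical value type for table B: each row carries the foreign key "a". */
    public static final class BRow {
        final String a;
        final String payload;

        public BRow(String a, String payload) {
            this.a = a;
            this.payload = payload;
        }
    }

    /** Step 1: re-key table B (keyed on b) by its foreign key a, aggregating rows per a. */
    public static Map<String, List<String>> groupByForeignKey(Map<String, BRow> tableB) {
        Map<String, List<String>> regrouped = new TreeMap<>();
        for (BRow row : tableB.values()) {
            regrouped.computeIfAbsent(row.a, k -> new ArrayList<>()).add(row.payload);
        }
        return regrouped;
    }

    /** Step 2: inner join on the now-shared key a. */
    public static Map<String, String> join(Map<String, String> tableA,
                                           Map<String, List<String>> tableBByA) {
        Map<String, String> joined = new TreeMap<>();
        for (Map.Entry<String, String> entry : tableA.entrySet()) {
            List<String> matches = tableBByA.get(entry.getKey());
            if (matches != null) {
                joined.put(entry.getKey(), entry.getValue() + " -> " + matches);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Table A is keyed on a; table B is keyed on b and carries a as a field.
        Map<String, String> tableA = new TreeMap<>();
        tableA.put("a1", "A1");
        tableA.put("a3", "A3");

        Map<String, BRow> tableB = new TreeMap<>();
        tableB.put("b1", new BRow("a1", "x"));
        tableB.put("b2", new BRow("a1", "y"));
        tableB.put("b3", new BRow("a2", "z"));

        System.out.println(join(tableA, groupByForeignKey(tableB)));
    }
}
```

The extra regrouping pass in step 1 is the pre-aggregation the description objects to: it is needed only to bring both sides onto the same key before the join can run.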


