Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2021/08/09 22:35:00 UTC
[jira] [Commented] (KAFKA-7497) Kafka Streams should support self-join on streams
[ https://issues.apache.org/jira/browse/KAFKA-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396307#comment-17396307 ]
Matthias J. Sax commented on KAFKA-7497:
----------------------------------------
Given my above comment:
{quote}Also, and this seems to be the most severe issue, each record would join with itself, which is actually not desired...
{quote}
I think this is actually not correct... At least if we consider self-joins in standard SQL, a record does join with itself. We should follow the same semantics, and thus a self-join is already possible with Kafka Streams today (even if not efficiently).
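To make the SQL-style semantics concrete, here is a minimal sketch in plain Java. It does not use the Kafka Streams API: the {{Event}} class, the {{selfJoin}} method, and the {{windowMs}} parameter are hypothetical names chosen for illustration, and the join is computed over an in-memory list rather than a stream. It only shows that an inner self-join on the key pairs each record with itself as well as with the other matching records.

```java
import java.util.ArrayList;
import java.util.List;

public class SelfJoinSketch {

    // Hypothetical record type for illustration; not a Kafka Streams class.
    static final class Event {
        final String key;
        final String value;
        final long timestampMs;

        Event(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // Inner self-join with SQL-like semantics: emit every (left, right) pair
    // that has the same key and timestamps at most windowMs apart.
    // The pair (r, r) always qualifies, so each record joins with itself.
    static List<String> selfJoin(List<Event> events, long windowMs) {
        List<String> joined = new ArrayList<>();
        for (Event left : events) {
            for (Event right : events) {
                boolean sameKey = left.key.equals(right.key);
                boolean inWindow =
                        Math.abs(left.timestampMs - right.timestampMs) <= windowMs;
                if (sameKey && inWindow) {
                    joined.add(left.value + "+" + right.value);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("k", "a", 0L),
                new Event("k", "b", 5L),
                new Event("k", "c", 100L));
        // Prints [a+a, a+b, b+a, b+b, c+c]
        System.out.println(selfJoin(events, 10L));
    }
}
```

Note that "c" has no other record within the window, yet it still produces "c+c". That is the behavior the quoted comment called undesired, and it is exactly what a self-join in standard SQL would return.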
> Kafka Streams should support self-join on streams
> -------------------------------------------------
>
> Key: KAFKA-7497
> URL: https://issues.apache.org/jira/browse/KAFKA-7497
> Project: Kafka
> Issue Type: New Feature
> Components: streams
> Reporter: Robin Moffatt
> Priority: Major
> Labels: needs-kip
>
> There are valid reasons to want to join a stream to itself, but Kafka Streams does not currently support this ({{Invalid topology: Topic foo has already been registered by another source.}}). Performing the join requires creating a second stream as a clone of the first, and then joining the two. This is a clunky workaround and results in unnecessary duplication of data.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)