Posted to jira@kafka.apache.org by "John Roesler (Jira)" <ji...@apache.org> on 2021/07/08 18:04:00 UTC

[jira] [Commented] (KAFKA-8315) Historical join issues

    [ https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377546#comment-17377546 ] 

John Roesler commented on KAFKA-8315:
-------------------------------------

Hello again, all, this issue should be resolved in 3.0, via KAFKA-10091.

> Historical join issues
> ----------------------
>
>                 Key: KAFKA-8315
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8315
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Andrew
>            Assignee: John Roesler
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more apparent where one topic has records at less frequent record timestamp intervals than the other.
>  An example of the issue is provided in this repository :
> [https://github.com/the4thamigo-uk/join-example]
>  
> The only way to increase the period of historically joined records is to increase the grace period for the join windows, and this has repercussions when you extend it to a large period e.g. 2 years of minute-by-minute records.
> Related slack conversations : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]
>  
>  Research on this issue has gone through a few phases :
> 1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:
> The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems to be supported only for the group operation.
> This was considered to be a problem with the documentation, not with the API, and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2) We then found an apparent issue in the code which would affect the partition that is selected to deliver the next record to the join. This would only be a problem for data that is out-of-order, and join-example uses data that is in order of timestamp in both topics. So this fix is thought not to affect join-example.
> This was considered to be an issue and is being addressed in [https://github.com/apache/kafka/pull/6719]
> 3) Further investigation using a crafted unit test seems to show that the partition selection and ordering (PartitionGroup/RecordQueue) works correctly:
> [https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]
> 4) The current assumption is that the issue is rooted in the way records are consumed from the topics:
> We have tried setting various options to suppress reads from the source topics, but it doesn't seem to make any difference: [https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]
>  
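
As an illustration of the partition-selection idea in item 2 of the quoted description, here is a minimal sketch in plain Java. It is a simplified model and not the actual PartitionGroup/RecordQueue code: the class name, method name, and "L@"/"R@" labels are hypothetical, and it assumes both input queues are fully buffered and already ordered by timestamp. It always hands the join the record from whichever queue has the lower head timestamp.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified model of timestamp-based selection between two
// input queues. Not Kafka Streams code.
public class TimestampMergeSketch {

    // Returns the order in which records would be handed to the join,
    // labelled "L@<ts>" or "R@<ts>". The queue whose head record has the
    // lower timestamp is chosen; ties go to the left queue.
    static List<String> deliveryOrder(long[] leftTimestamps, long[] rightTimestamps) {
        Deque<Long> left = new ArrayDeque<>();
        Deque<Long> right = new ArrayDeque<>();
        for (long ts : leftTimestamps) left.add(ts);
        for (long ts : rightTimestamps) right.add(ts);

        List<String> order = new ArrayList<>();
        while (!left.isEmpty() || !right.isEmpty()) {
            // Take from the left queue if the right one is drained, or if
            // the left head is not later than the right head.
            boolean takeLeft = right.isEmpty()
                    || (!left.isEmpty() && left.peek() <= right.peek());
            if (takeLeft) {
                order.add("L@" + left.poll());
            } else {
                order.add("R@" + right.poll());
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Left topic has a record every 60 time units, right topic is offset by 30.
        System.out.println(deliveryOrder(new long[]{0, 60, 120}, new long[]{30, 90}));
    }
}
```

In this model the two inputs interleave in timestamp order only because both queues hold all of their records up front. If one queue were temporarily empty while the other still had data, the loop would drain the non-empty side first, which is the kind of situation item 4 suspects in how records are consumed from the topics.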



--
This message was sent by Atlassian Jira
(v8.3.4#803005)