Posted to dev@kafka.apache.org by "Damian Guy (JIRA)" <ji...@apache.org> on 2017/01/09 14:52:58 UTC

[jira] [Created] (KAFKA-4609) KTable/KTable join followed by groupBy and aggregate/count can result in incorrect results

Damian Guy created KAFKA-4609:
---------------------------------

             Summary: KTable/KTable join followed by groupBy and aggregate/count can result in incorrect results
                 Key: KAFKA-4609
                 URL: https://issues.apache.org/jira/browse/KAFKA-4609
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.10.1.1, 0.10.2.0
            Reporter: Damian Guy
            Assignee: Damian Guy


When caching is enabled, a KTable/KTable join can emit duplicate values. This occurs when the same key has been updated in both tables: each table is flushed independently, and each flush triggers the join, so two results are produced for the same key.
If a groupBy followed by an aggregate (or count) is then applied, these duplicates are processed as well, resulting in incorrect aggregated values. For example, a count will be double the value it should be.
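
The effect can be reproduced in a small stand-alone model. The sketch below is not Kafka Streams code and does not use its API: the class JoinDuplicateSketch, the flush and countByGroup helpers, and the sample keys and values are all hypothetical. It assumes that each table's flush forwards one join result per dirty key and that the downstream count treats every forwarded record as a new entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, stdlib-only model of the scenario (hypothetical names throughout).
class JoinDuplicateSketch {

    // One join result forwarded downstream: the original key and the
    // grouping attribute selected by the (modelled) groupBy.
    static final class JoinResult {
        final String key;
        final String group;

        JoinResult(String key, String group) {
            this.key = key;
            this.group = group;
        }
    }

    // Models the flush of ONE table's cache: each dirty key triggers the
    // join against the current state of both tables and forwards a result.
    static List<JoinResult> flush(List<String> dirtyKeys,
                                  Map<String, String> left,
                                  Map<String, String> right) {
        List<JoinResult> out = new ArrayList<>();
        for (String key : dirtyKeys) {
            String l = left.get(key);
            String r = right.get(key);
            if (l != null && r != null) {
                // Assumption: records are regrouped by the left-table value.
                out.add(new JoinResult(key, l));
            }
        }
        return out;
    }

    // Models groupBy + count where every forwarded record is counted as new.
    static Map<String, Long> countByGroup(List<JoinResult> forwarded) {
        Map<String, Long> counts = new HashMap<>();
        for (JoinResult r : forwarded) {
            counts.merge(r.group, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> left = new HashMap<>();
        Map<String, String> right = new HashMap<>();
        // The same key is updated in both tables before either cache flushes.
        left.put("k1", "A");
        right.put("k1", "x");

        List<JoinResult> forwarded = new ArrayList<>();
        forwarded.addAll(flush(Arrays.asList("k1"), left, right));  // left table flush
        forwarded.addAll(flush(Arrays.asList("k1"), left, right));  // right table flush

        // Only one key belongs to group "A", yet the count comes out as 2.
        System.out.println(countByGroup(forwarded));
    }
}
```

In this model the single key k1 is forwarded twice, once per flush, so group A is counted twice although only one key belongs to it.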


