You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Thomas Crowley (Jira)" <ji...@apache.org> on 2019/08/29 01:30:00 UTC

[jira] [Updated] (KAFKA-8846) Unexpected results joining a KStream to a KTable after repartitioning

     [ https://issues.apache.org/jira/browse/KAFKA-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Crowley updated KAFKA-8846:
----------------------------------
    Description: 
We seem to have come across a bug with Kafka Streams (or at least unexpected behavior) when joining a KStream to a KTable after re-partitioning our data (via `selectKey`)

Our use case is as follows: we want to aggregate some values and join it with the original message, so that we emit the original message with the current value of the aggregation.

Currently, without re-partitioning, we get the correct behavior as expected, but rekeying the input of the stream gives us incorrect results.

What's stranger, is that the `TestTopologyDriver` gives us the correct/expected results in our re-partitioned topology.


Apologies if Clojure is foreign to anyone, but I have an example of the problematic topology here:
https://github.com/VerrencyOpenSource/repartition-bug/blob/master/src/repartition_bug/with_repartition.clj#L27

If you have `lein` installed on your machine, I have instructions on how you can run the above topology against both the test topology driver and a Kafka cluster: 
https://github.com/VerrencyOpenSource/repartition-bug 



  was:
We seem to have come across a bug with Kafka Streams (or at least unexpected behavior) when joining a KStream to a KTable after re-partitioning our data (via `selectKey`)

Our use case is as follows: we want to aggregate some values and join them back onto the original message, so that we emit the original message, joined with the current value of the aggregation at the current point in time.

Currently, without re-partitioning, we get the correct behavior as expected, but rekeying the input of the stream gives us incorrect results.

What's stranger, is that the `TestTopologyDriver` gives us the correct/expected results in our re-partitioned topology.


Apologies if Clojure is foreign to anyone, but I have an example of the problematic topology here:
https://github.com/VerrencyOpenSource/repartition-bug/blob/master/src/repartition_bug/with_repartition.clj#L27

If you have `lein` installed on your machine, I have instructions on how you can run the above topology against both the test topology driver and a Kafka cluster: 
https://github.com/VerrencyOpenSource/repartition-bug 




> Unexpected results joining a KStream to a KTable after repartitioning 
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-8846
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8846
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.0
>            Reporter: Thomas Crowley
>            Priority: Major
>
> We seem to have come across a bug with Kafka Streams (or at least unexpected behavior) when joining a KStream to a KTable after re-partitioning our data (via `selectKey`)
> Our use case is as follows: we want to aggregate some values and join it with the original message, so that we emit the original message with the current value of the aggregation.
> Currently, without re-partitioning, we get the correct behavior as expected, but rekeying the input of the stream gives us incorrect results.
> What's stranger, is that the `TestTopologyDriver` gives us the correct/expected results in our re-partitioned topology.
> Apologies if Clojure is foreign to anyone, but I have an example of the problematic topology here:
> https://github.com/VerrencyOpenSource/repartition-bug/blob/master/src/repartition_bug/with_repartition.clj#L27
> If you have `lein` installed on your machine, I have instructions on how you can run the above topology against both the test topology driver and a Kafka cluster: 
> https://github.com/VerrencyOpenSource/repartition-bug 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)