Posted to dev@kafka.apache.org by "Matthias J. Sax (JIRA)" <ji...@apache.org> on 2017/02/01 01:16:51 UTC
[jira] [Created] (KAFKA-4718) Revisit DSL partitioning assumption for KStream source topics
Matthias J. Sax created KAFKA-4718:
--------------------------------------
Summary: Revisit DSL partitioning assumption for KStream source topics
Key: KAFKA-4718
URL: https://issues.apache.org/jira/browse/KAFKA-4718
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: Matthias J. Sax
Priority: Minor
Currently, when reading one or multiple topics via a single call to {{KStreamBuilder#stream()}}, it is assumed that the data is correctly partitioned by key.
For a single-topic {{KStream}}, this is a fair assumption. For a multi-topic {{KStream}}, however, the assumption most likely does not hold if the input topics have different numbers of partitions, because producers use hash partitioning by default. Thus, to get correct partitioning, all producers for those input topics need to use the same (or at least a compatible) custom partitioner.
Making this the default assumption seems rather risky, and we might want to revisit it, or at least update the docs with corresponding hints.
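The mismatch described above for input topics with different partition counts can be illustrated with a small, self-contained sketch. It does not use any Kafka API: {{partitionFor}} is a hypothetical stand-in that applies "non-negative hash modulo partition count" using {{String#hashCode()}}, whereas Kafka's default partitioner hashes the serialized key bytes with murmur2. The topic names and partition counts are assumptions chosen only for illustration.

```java
public class PartitionMismatchSketch {

    // Hypothetical stand-in for default hash partitioning:
    // non-negative hash of the key, modulo the topic's partition count.
    // Kafka's real default partitioner uses murmur2 over the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // True if the key lands on the same partition number in both topics.
    static boolean samePartition(String key, int partitionsA, int partitionsB) {
        return partitionFor(key, partitionsA) == partitionFor(key, partitionsB);
    }

    public static void main(String[] args) {
        int partitionsA = 3; // assumed partition count of a hypothetical "topic-a"
        int partitionsB = 4; // assumed partition count of a hypothetical "topic-b"

        for (String key : new String[] {"a", "b", "c", "d"}) {
            System.out.printf("key=%s topic-a=%d topic-b=%d same=%b%n",
                    key,
                    partitionFor(key, partitionsA),
                    partitionFor(key, partitionsB),
                    samePartition(key, partitionsA, partitionsB));
        }
    }
}
```

With equal partition counts the same key always maps to the same partition number in both topics. With different counts, some keys map to different partition numbers, so records with the same key can end up in different partitions of a merged multi-topic stream unless all producers share a compatible partitioner.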
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)