You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Francesco Frontera <fr...@radicalbit.io> on 2018/07/31 13:37:50 UTC

Re-partitioning topic with through (Kafka Streams)

Hi,

I have a question about topic repartitioning in Kafka Streams using
`through` function.

I try to explain the context Briefly:

I have single topic A with two partitions:
 A:1:9
 A:0:0

I try to create a repartitioned topic using Kafka Streams API:

builder.stream("A").map<>((key, val) => KeyValueMapper
{..selectKey(val)}).through("B", Produced.`with`(....))

the streaming job produced a topic with a single partition (B:0:9) instead
of a topic with 2 partitions, which is something I want to achieve (i.e.
retaining the upstream partitions number).

Is there a way to create an auxiliary topic with specific partitions
directly from Kafka Streams API without creating topic explicitly (similar
to join operation)?

If the join neither does it, am I then forced to create manually the
internal topics with the desired number of partitions for internal
repartitioning? Is this valid also for all the KafkaStreams operator's
output topics?


Thanks,
Francesco Frontera.

Re: Re-partitioning topic with through (Kafka Streams)

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Francesco,

Streams auto-created repartition topics's num.partitions are determined by
the num.tasks of the writing sub-topology, which is then determined by the
source topic's num.partitions in turn. There are some proposals about
extending this coupling but not yet implemented:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Repartition+Topic+Hints+in+Streams

So for your scenario, for now you'll have to manually create the "through"
topic explicitly before starting the streams app with the num.partitions
you want.


Guozhang


On Tue, Jul 31, 2018 at 6:37 AM, Francesco Frontera <
francesco.frontera@radicalbit.io> wrote:

> Hi,
>
> I have a question about topic repartitioning in Kafka Streams using
> `through` function.
>
> I try to explain the context Briefly:
>
> I have single topic A with two partitions:
>  A:1:9
>  A:0:0
>
> I try to create a repartitioned topic using Kafka Streams API:
>
> builder.stream("A").map<>((key, val) => KeyValueMapper
> {..selectKey(val)}).through("B", Produced.`with`(....))
>
> the streaming job produced a topic with a single partition (B:0:9) instead
> of a topic with 2 partitions, which is something I want to achieve (i.e.
> retaining the upstream partitions number).
>
> Is there a way to create an auxiliary topic with specific partitions
> directly from Kafka Streams API without creating topic explicitly (similar
> to join operation)?
>
> If the join neither does it, am I then forced to create manually the
> internal topics with the desired number of partitions for internal
> repartitioning? Is this valid also for all the KafkaStreams operator's
> output topics?
>
>
> Thanks,
> Francesco Frontera.
>



-- 
-- Guozhang