Posted to users@kafka.apache.org by Sachin Mittal <sj...@gmail.com> on 2017/08/21 06:32:14 UTC

Is there a way to increase the number of partitions

Hi,
I have a topic which has four partitions and data is distributed among
those based on a specified key.

If I want to increase the number of partitions to six, how can I do that
while also making sure that messages for a given key always go to one
(specific) partition only?

Will the existing messages redistribute themselves among the new partitions?

Also, say earlier messages for key A went to partition 1. Going forward, will
any new message go to the same partition where the earlier messages for that
key are?

And since increasing the partitions may move some keys to a different
partition, how do I ensure that all messages for such a key belong to a
single partition?

Thanks
Sachin

Re: Is there a way to increase the number of partitions

Posted by Svante Karlsson <sv...@csi.se>.
Short answer - you cannot. The existing data is not reprocessed, since Kafka
itself has no knowledge of how you did your partitioning.
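
To illustrate why keys can move when the partition count changes, here is a
minimal sketch of hash-based partitioning. It assumes a "hash modulo
partition count" scheme. Kafka's default Java partitioner works this way for
keyed records, but it hashes the serialized key bytes with murmur2, while the
sketch uses String.hashCode() as a stand-in so that it runs without any Kafka
dependency. The class and method names are hypothetical.

```java
import java.util.List;

public class PartitionDemo {
    // Stand-in for the producer's key hash (assumption: String.hashCode()
    // replaces the murmur2 hash of the serialized key). The mask clears the
    // sign bit so the modulo result is never negative.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Compare where each key lands with 4 partitions versus 6.
        for (String key : List.of("A", "B", "C", "D")) {
            System.out.println(key + ": " + partitionFor(key, 4)
                    + " -> " + partitionFor(key, 6));
        }
        // Prints:
        // A: 1 -> 5
        // B: 2 -> 0
        // C: 3 -> 1
        // D: 0 -> 2
    }
}
```

With this stand-in hash, key A sits on partition 1 while the topic has four
partitions and maps to partition 5 once there are six. The records already
written to partition 1 stay where they are.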

The normal workaround is to stop producers and consumers, create a new
topic with the desired number of partitions, consume the old topic from the
beginning, and write all data to the new topic. Then restart producers and
consumers against the new topic. You will most likely mess up your consumer
offsets, since offsets in the new topic do not correspond to those in the
old one.
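
The copy step of this workaround can be sketched as an in-memory simulation.
This is not Kafka client code: a "topic" here is just a list of partitions,
each holding record keys, and the hash is String.hashCode() as a stand-in for
the producer's real key hash. All names are hypothetical. The point it shows
is that once every old record is rewritten through the same partition
function with the new partition count, old and new records for a key end up
together.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TopicMigrationSketch {
    // Stand-in for the producer's key hash (assumption, not Kafka's murmur2).
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Creates an empty in-memory "topic": one list of record keys per partition.
    static List<List<String>> newTopic(int numPartitions) {
        List<List<String>> topic = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            topic.add(new ArrayList<>());
        }
        return topic;
    }

    // Appends a record to the partition chosen by the key hash.
    static void send(List<List<String>> topic, String key) {
        topic.get(partitionFor(key, topic.size())).add(key);
    }

    // Reads the old topic from the beginning and rewrites every record
    // into a fresh topic with the new partition count.
    static List<List<String>> migrate(List<List<String>> oldTopic, int newCount) {
        List<List<String>> target = newTopic(newCount);
        for (List<String> partition : oldTopic) {
            for (String key : partition) {
                send(target, key);
            }
        }
        return target;
    }

    // Returns true if no key appears in more than one partition.
    static boolean eachKeyInOnePartition(List<List<String>> topic) {
        Set<String> seen = new HashSet<>();
        for (List<String> partition : topic) {
            for (String key : new HashSet<>(partition)) {
                if (!seen.add(key)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> oldTopic = newTopic(4);
        for (String key : List.of("A", "B", "A", "C", "D", "A")) {
            send(oldTopic, key);
        }
        List<List<String>> migrated = migrate(oldTopic, 6);
        send(migrated, "A"); // a record produced after the migration
        System.out.println(eachKeyInOnePartition(migrated));
    }
}
```

In a real migration the loop in migrate() would be a consumer reading the old
topic and a producer writing to the new one, with producers and consumers of
the old topic stopped for the duration, as described above.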






Re: Is there a way to increase the number of partitions

Posted by Sachin Mittal <sj...@gmail.com>.
Yes, I understand that.
The Streams application takes care of that when I do:

input
        .map(new KeyValueMapper<K, V,  KeyValue<K, V1>>() {
            public KeyValue<K, V1> apply(K key, V value) {
                ......
                return new KeyValue<K, V1>(new_key, new_value);
            }
        }).through(k_serde, v_serde, "destination-topic");

It ensures that for a particular new_key, the new_value is sent to a
specific partition of destination-topic.
My question is: when we increase the number of partitions for
destination-topic, how can we ensure that all old values for a particular
new_key and all new values for the same new_key are logged to the same
partition?

If Streams decides to change the partition for some old key (once we have
increased the partitions), how can we ensure the old data is also present
in that same partition?

Is this handled automatically, or do we have to do some kind of topic
migration (or is this something that cannot be guaranteed)?

Thanks
Sachin



On Mon, Aug 21, 2017 at 12:15 PM, Zafar Ansari <za...@gmail.com>
wrote:

> Hi
> You can specify a partition function while producing a message to Kafka
> brokers. This function will determine which partition the message should be
> sent to.
> See
> https://edgent.apache.org/javadoc/r1.1.0/org/apache/edgent/connectors/kafka/KafkaProducer.html#publish-org.apache.edgent.topology.TStream-org.apache.edgent.function.Function-org.apache.edgent.function.Function-org.apache.edgent.function.Function-org.apache.edgent.function.Function-

Re: Is there a way to increase the number of partitions

Posted by Zafar Ansari <za...@gmail.com>.
Hi
You can specify a partition function while producing a message to Kafka
brokers. This function will determine which partition the message should be
sent to.
See
https://edgent.apache.org/javadoc/r1.1.0/org/apache/edgent/connectors/kafka/KafkaProducer.html#publish-org.apache.edgent.topology.TStream-org.apache.edgent.function.Function-org.apache.edgent.function.Function-org.apache.edgent.function.Function-org.apache.edgent.function.Function-
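
As a rough idea of what such a partition function could look like, here is a
sketch in plain Java. It is not the Edgent or Kafka API. The class name, the
"pinned" table, and the String.hashCode() fallback are all assumptions made
for illustration. Keys listed in the table always go to their recorded
partition, and all other keys fall back to hashing. With the plain Kafka Java
producer, comparable logic would go in a class implementing the Partitioner
interface, configured through the partitioner.class setting.

```java
import java.util.Map;

public class PinnedPartitioner {
    // Hypothetical table of keys that must stay on a fixed partition,
    // e.g. keys that already have data on a partition of the old layout.
    private final Map<String, Integer> pinned;

    public PinnedPartitioner(Map<String, Integer> pinned) {
        this.pinned = Map.copyOf(pinned);
    }

    // Returns the partition for a key: the pinned partition if one is
    // recorded and still valid, otherwise hash modulo partition count.
    public int partition(String key, int numPartitions) {
        Integer fixed = pinned.get(key);
        if (fixed != null && fixed >= 0 && fixed < numPartitions) {
            return fixed;
        }
        // Stand-in hash (assumption); the mask keeps the result non-negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PinnedPartitioner p = new PinnedPartitioner(Map.of("A", 1));
        System.out.println(p.partition("A", 6)); // pinned, stays on 1
        System.out.println(p.partition("B", 6)); // not pinned, hashed
    }
}
```

The obvious cost of this design is that the table of pinned keys has to be
built and kept somewhere, which is only practical when the set of existing
keys is small or known in advance.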


