Posted to users@kafka.apache.org by Dmitriy Vsekhvalnov <dv...@gmail.com> on 2017/10/04 13:31:27 UTC

Correct way to increase replication factor for kafka-streams internal topics

Hi all,

What is the correct way to increase the RF for existing internal topics that
kafka-streams creates (re-partitioning streams)?

We are increasing the RF for source topics and would like to align
kafka-streams as well. The application-side configuration is simple, but
what should we do with the existing internal topics?

Remove them and let kafka-streams re-create them? Or use the kafka-topics.sh
tool to increase the RF on all internal topics? Some other way?


Thanks in advance.

Re: Correct way to increase replication factor for kafka-streams internal topics

Posted by Dmitriy Vsekhvalnov <dv...@gmail.com>.
Thanks, Matthias!

On Thu, Oct 5, 2017 at 12:16 AM, Matthias J. Sax <ma...@confluent.io>
wrote:

> That is hard to do...
>
> Just deleting the topics might result in data loss if the application
> has not yet processed all of the data (note that repartitioning topics
> also act as a kind of buffer between subtopologies).
>
> Just manually changing the number of partitions via kafka-topics.sh will
> break partitioning (at least for some time) and thus produce incorrect
> results. There are also some other dependencies: for example, downstream
> subtopologies that have state will have changelog topics that would
> need to be "fixed", too.
>
> I guess the simplest way would be to reset your application completely
> and reprocess all data from the input topics:
>
>  -
> https://docs.confluent.io/current/streams/developer-guide.html#application-reset-tool
>  -
> https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/
>
>
> If you want to avoid any downtime, deploy the application with a new
> application.id to reprocess all data. Afterward, stop the old
> application and clean up its state.
>
>
> -Matthias

Re: Correct way to increase replication factor for kafka-streams internal topics

Posted by "Matthias J. Sax" <ma...@confluent.io>.
That is hard to do...

Just deleting the topics might result in data loss if the application
has not yet processed all of the data (note that repartitioning topics
also act as a kind of buffer between subtopologies).

Just manually changing the number of partitions via kafka-topics.sh will
break partitioning (at least for some time) and thus produce incorrect
results. There are also some other dependencies: for example, downstream
subtopologies that have state will have changelog topics that would
need to be "fixed", too.
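
To illustrate which topics are involved: Kafka Streams names its internal
topics with the application.id as a prefix and a "-repartition" or
"-changelog" suffix. The following is only a rough Java sketch that picks
such names out of a list of topic names. The class name, method name, and
sample topic names are made up for illustration, and the prefix match is a
heuristic (it would also match another application whose id begins with the
same prefix).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class InternalTopics {

    // Returns, in sorted order, the topic names that look like Kafka Streams
    // internal topics of the given application.id:
    //   <application.id>-...-repartition  or  <application.id>-...-changelog
    // Heuristic only: it relies purely on the naming convention.
    static List<String> internalTopicsOf(String applicationId, List<String> allTopics) {
        String prefix = applicationId + "-";
        return allTopics.stream()
                .filter(t -> t.startsWith(prefix))
                .filter(t -> t.endsWith("-repartition") || t.endsWith("-changelog"))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical topic names, as they might come from a topic listing.
        List<String> topics = Arrays.asList(
                "orders",
                "my-app-counts-store-changelog",
                "my-app-by-user-repartition",
                "other-app-x-changelog");
        System.out.println(internalTopicsOf("my-app", topics));
    }
}
```

Both kinds of topic would have to be handled together, which is part of why
changing them by hand is error-prone.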

I guess the simplest way would be to reset your application completely
and reprocess all data from the input topics:

 -
https://docs.confluent.io/current/streams/developer-guide.html#application-reset-tool
 -
https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/


If you want to avoid any downtime, deploy the application with a new
application.id to reprocess all data. Afterward, stop the old
application and clean up its state.
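
A minimal sketch of that approach in Java, assuming the application is
configured through a java.util.Properties object: copy the old settings,
set a new application.id, and set replication.factor, which Kafka Streams
applies when it creates its internal topics. The class name, method name,
and example values are hypothetical.

```java
import java.util.Properties;

public class NewAppConfig {

    // Builds the configuration for the re-deployed application: same settings
    // as before, but with a new application.id (so fresh internal topics are
    // created) and the desired replication.factor for those topics.
    static Properties withNewIdAndRf(Properties old, String newApplicationId, int replicationFactor) {
        Properties p = new Properties();
        p.putAll(old); // copy, so the old configuration is left untouched
        p.setProperty("application.id", newApplicationId);
        p.setProperty("replication.factor", Integer.toString(replicationFactor));
        return p;
    }

    public static void main(String[] args) {
        Properties old = new Properties();
        old.setProperty("application.id", "my-app");            // hypothetical id
        old.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker

        Properties updated = withNewIdAndRf(old, "my-app-v2", 3);
        System.out.println(updated.getProperty("application.id"));
        System.out.println(updated.getProperty("replication.factor"));
    }
}
```

The resulting Properties would then be passed to the KafkaStreams
constructor in the usual way; the old application's internal topics and
local state still need to be cleaned up separately.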


-Matthias
