Posted to dev@kafka.apache.org by "Navinder Brar (JIRA)" <ji...@apache.org> on 2018/03/13 05:45:00 UTC

[jira] [Created] (KAFKA-6643) Warm up new replicas from scratch when changelog topic has retention time

Navinder Brar created KAFKA-6643:
------------------------------------

             Summary: Warm up new replicas from scratch when changelog topic has retention time
                 Key: KAFKA-6643
                 URL: https://issues.apache.org/jira/browse/KAFKA-6643
             Project: Kafka
          Issue Type: New Feature
          Components: streams
            Reporter: Navinder Brar


Currently, Kafka Streams uses changelog topics (internal Kafka topics that hold all the data for a store) to build the state of replicas. So even with the number of standby replicas set to 1, persistent state stores have additional availability, because the changelog topics are themselves replicated according to the broker replication policy. But that also means we are using at least 4 times the space (1 master store, 1 replica store, 1 changelog, 1 changelog replica).
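The space accounting above can be written out as a small calculation. This is only an illustrative sketch: the function names are made up for this example, it assumes every copy (store, standby, and each changelog replica) holds the full data set, and the 250 TB figure is just a sample value, not a measurement.

```python
def total_copies(standby_replicas: int, changelog_replication_factor: int) -> int:
    """Count full copies of the state: the active (master) store, each standby
    store, and each broker-side replica of the changelog topic.
    Assumption: the changelog retains all the data for the store."""
    active_store = 1
    return active_store + standby_replicas + changelog_replication_factor


def total_footprint_tb(store_tb: int, standby_replicas: int,
                       changelog_replication_factor: int) -> int:
    """Total space used across the application instances and the brokers."""
    return store_tb * total_copies(standby_replicas, changelog_replication_factor)


# 1 master store + 1 replica store + 1 changelog + 1 changelog replica
copies = total_copies(standby_replicas=1, changelog_replication_factor=2)
footprint = total_footprint_tb(250, standby_replicas=1, changelog_replication_factor=2)
print(copies, footprint)
```

With one standby and a changelog replication factor of 2, half of the copies live on the brokers, which is the part this issue wants to shrink.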

Now, if we have a year's data in persistent stores (RocksDB), we don't want the changelog topics to hold a year's data as well, since that puts an unnecessary burden on the brokers in terms of space. If we have to scale our Kafka Streams application (holding 200-300 TB of data), we have to scale the Kafka brokers as well. We want to reduce this dependency and find ways to use the changelog topic purely as a queue, holding just 2 or 3 days of data, and warm up the replicas from scratch in some other way.
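For illustration, the "changelog as a queue" half of this request might correspond to topic-level settings like the ones below. This is a sketch under assumptions, not a proposed implementation: `retention.ms` and `cleanup.policy` are standard Kafka topic configs, but the helper name is hypothetical, changelog topics are typically log-compacted rather than time-limited, and how a new replica would be warmed up once older records have expired is exactly the open question of this issue.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def changelog_queue_config(retention_days: int) -> dict:
    """Build topic-level overrides that keep only recent changelog records.
    Hypothetical helper: values are strings, as topic configs usually are."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        # Time-based deletion instead of keeping the latest value per key forever.
        "cleanup.policy": "delete",
        "retention.ms": str(retention_days * MS_PER_DAY),
    }


config = changelog_queue_config(3)
print(config)
```

With settings like these, a replica starting from scratch could no longer rebuild its full state from the changelog alone, so some other source (for example, copying state from an existing store) would be needed.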


