Posted to jira@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2017/11/29 18:59:00 UTC

[jira] [Resolved] (KAFKA-6038) Repartition topics could be much more transient

     [ https://issues.apache.org/jira/browse/KAFKA-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang resolved KAFKA-6038.
----------------------------------
    Resolution: Duplicate

> Repartition topics could be much more transient
> -----------------------------------------------
>
>                 Key: KAFKA-6038
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6038
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>              Labels: optimization
>
> Unlike changelog topics, repartition topics could be short-lived rather than eating up storage space on Kafka brokers. Today users have different ways to configure them with short retention, such as enforcing a retention of 30 minutes with small log segment sizes, or using AppendTime for repartition topics. All of these are cumbersome, and Streams should just do this automatically.
> One way to do it is to use the “purgeData” admin API (KIP-107): after the offsets of the input topics are committed, if the input topics are actually repartition topics, we would purge the data up to the committed offsets immediately. One tricky thing to consider, though, is that upon (re-)starting the application, if the repartition topics are used for restoring state, we need to re-fill these topics in the right way for restoration purposes, and there might be some devils in the implementation details.
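
A minimal sketch of the selection step described above, assuming repartition topics can be recognized by the "-repartition" suffix that Streams gives its internal repartition topics. The class and method names here (RepartitionPurgeSketch, purgeTargets) are hypothetical and not part of Kafka; the sketch only computes which committed offsets would be candidates for purging and does not talk to a broker. In a real client the resulting offsets would presumably be passed to the KIP-107 admin call (exposed in the Java client as AdminClient.deleteRecords), which is omitted here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Kafka Streams.
class RepartitionPurgeSketch {

    // Assumption: internal repartition topics are identified by this name suffix.
    static final String REPARTITION_SUFFIX = "-repartition";

    static boolean isRepartitionTopic(String topic) {
        return topic.endsWith(REPARTITION_SUFFIX);
    }

    /**
     * Takes committed offsets (topic -> partition -> committed offset) and
     * returns only the entries for repartition topics. Each returned offset
     * is the point before which records have been consumed and committed,
     * so they would no longer be needed by this application.
     */
    static Map<String, Map<Integer, Long>> purgeTargets(
            Map<String, Map<Integer, Long>> committed) {
        Map<String, Map<Integer, Long>> targets = new LinkedHashMap<>();
        for (Map.Entry<String, Map<Integer, Long>> entry : committed.entrySet()) {
            if (isRepartitionTopic(entry.getKey())) {
                // Copy so callers cannot mutate the input through the result.
                targets.put(entry.getKey(), new LinkedHashMap<>(entry.getValue()));
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Long>> committed = new LinkedHashMap<>();

        // A user-facing source topic: must never be purged by Streams.
        Map<Integer, Long> source = new LinkedHashMap<>();
        source.put(0, 100L);
        committed.put("orders", source);

        // An internal repartition topic: safe to purge up to the committed offsets.
        Map<Integer, Long> repartition = new LinkedHashMap<>();
        repartition.put(0, 42L);
        repartition.put(1, 17L);
        committed.put("my-app-counts-repartition", repartition);

        System.out.println(purgeTargets(committed));
    }
}
```

Filtering by topic name keeps user-owned source topics untouched, which matters because purging is destructive. The restart and restoration concern raised in the description is not addressed by this sketch.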



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)