You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2020/02/03 22:06:26 UTC

[jira] [Created] (KAFKA-9502) Use explicit "purge data" for windowed changelog topics instead of "delete,compact" topic config

Matthias J. Sax created KAFKA-9502:
--------------------------------------

             Summary: Use explicit "purge data" for windowed changelog topics instead of "delete,compact" topic config
                 Key: KAFKA-9502
                 URL: https://issues.apache.org/jira/browse/KAFKA-9502
             Project: Kafka
          Issue Type: Bug
          Components: streams
            Reporter: Matthias J. Sax


Via https://issues.apache.org/jira/browse/KAFKA-4015, we introduced topic cleanup policy "compact,delete" to allow windowed changelog topic in Kafka Stream to purge old windows eventually. (Note, that the key space for windowed changelog topics is unbounded and "compact" policy itself would imply unbounded grows of the topic).

To guard against clock drift we also added config `windowstore.changelog.additional.retention.ms` via https://issues.apache.org/jira/browse/KAFKA-3595. This config also prevents changelog truncation if the application is offline. Note the broker retention time is a mix of event- and wallclock-time.

Later we improved retention time handling for local state stores client side (https://issues.apache.org/jira/browse/KAFKA-6978 and https://issues.apache.org/jira/browse/KAFKA-7072).

In some other work, we switched to use "purge data" calls (https://issues.apache.org/jira/browse/KAFKA-6150) and set the retention time to infinite (https://issues.apache.org/jira/browse/KAFKA-6535), to avoid data loss for repartition topics

To improve Kafka Streams further, we should consider to combine the lessons from above and apply them to changelog topics, too:

Because state store retention is purely event-time based in Kafka Streams, there is a gap between store retention and changelog retention. This could lead to data loss if an application is offline for a long time and loses its client side state store, because the broker may roll the changelog forward and eventually delete old segments based on its wallclock time progress.

Hence, we should change the changelog topic configs back to "compact" only (or if necessary keep "compact,delete" but set an infinite retention time) and use explicit "purge data" calls to delete data in the changelog topics only when the state store also deletes its local data. This will keep the local state store and the changelog in-sync, protect against data loss, and make the config `windowstore.changelog.additional.retention.ms` unnecessary (we should do a KIP for this ticket to deprecate and remove the config) what helps to reduce broker sides storage as no unnecessary changelog data will be kept.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)