Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2021/12/02 22:02:00 UTC
[jira] [Created] (KAFKA-13499) Avoid restoring outdated records
Matthias J. Sax created KAFKA-13499:
---------------------------------------
Summary: Avoid restoring outdated records
Key: KAFKA-13499
URL: https://issues.apache.org/jira/browse/KAFKA-13499
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: Matthias J. Sax
Kafka Streams has the config `windowstore.changelog.additional.retention.ms` to allow for an increased retention time on windowed changelog topics.
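For context, a minimal sketch of how this config might be set. Only the config key comes from the description above; the class name, the helper method, and the 6h value are hypothetical and chosen purely for illustration.

```java
import java.time.Duration;
import java.util.Properties;

final class AdditionalRetentionConfigSketch {

    // Hypothetical helper: builds a Properties object carrying only the
    // additional-retention setting. A real application would also set
    // the usual required Streams configs.
    static Properties withAdditionalRetention(Duration additionalRetention) {
        Properties props = new Properties();
        // Config key as named in this ticket; the value is in milliseconds.
        props.setProperty(
            "windowstore.changelog.additional.retention.ms",
            Long.toString(additionalRetention.toMillis()));
        return props;
    }

    public static void main(String[] args) {
        // Assumed value for illustration: keep 6h beyond the store's own retention.
        Properties props = withAdditionalRetention(Duration.ofHours(6));
        System.out.println(props);
    }
}
```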
While an increased retention time can be useful, it can also lead to unnecessary restore cost, especially for stream-stream joins. Assume a stream-stream join with a 1h window size and a 1h grace period. In this case, we only need to restore 2h of data. If we lag, `windowstore.changelog.additional.retention.ms` helps to prevent the broker from truncating data too early. However, if we don't lag and we need to restore, we still restore everything from the changelog, including records that are already outdated.
Instead of doing a seek-to-beginning, we could use the timestamp index to seek to the first offset older than the 2h "window" of data that we need to restore, to avoid unnecessary work.
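The idea can be sketched in plain Java, without any Kafka dependency. This is an illustration under stated assumptions, not the proposed implementation: the class and method names are hypothetical, the `TreeMap` is an in-memory stand-in for the broker's timestamp index, and the lookup uses "earliest offset with a timestamp at or after the cutoff" semantics, which is how `KafkaConsumer#offsetsForTimes` resolves timestamps. The sketch also assumes that timestamps grow with offsets and that the current time is a reasonable proxy for where the 2h window ends.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class RestoreSeekSketch {

    // Oldest timestamp still needed for the join: now - (window size + grace period).
    static long restoreStartTimestamp(long nowMs, Duration windowSize, Duration gracePeriod) {
        return nowMs - windowSize.plus(gracePeriod).toMillis();
    }

    // Stand-in for a timestamp index lookup (timestamp -> offset):
    // returns the earliest offset whose timestamp is >= targetMs,
    // or empty if every indexed record is older than the target.
    static Optional<Long> firstOffsetAtOrAfter(TreeMap<Long, Long> timestampIndex, long targetMs) {
        Map.Entry<Long, Long> entry = timestampIndex.ceilingEntry(targetMs);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }

    public static void main(String[] args) {
        long hour = Duration.ofHours(1).toMillis();

        // Hypothetical changelog: one indexed record per hour, 100 offsets apart.
        TreeMap<Long, Long> index = new TreeMap<>();
        for (long h = 0; h <= 4; h++) {
            index.put(h * hour, h * 100);
        }

        // "Now" is 4h30m; with a 1h window and 1h grace the cutoff is 2h30m.
        long now = 4 * hour + hour / 2;
        long cutoff = restoreStartTimestamp(now, Duration.ofHours(1), Duration.ofHours(1));

        // The first record at or after 2h30m is the 3h record at offset 300,
        // so restoration could start there instead of at offset 0.
        System.out.println(firstOffsetAtOrAfter(index, cutoff).orElse(0L)); // prints 300
    }
}
```

With a real consumer, the analogous steps would presumably be a call to `offsetsForTimes` for the changelog partition followed by `seek` to the returned offset, falling back to the beginning when no offset is returned.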
--
This message was sent by Atlassian Jira
(v8.20.1#820001)