Posted to users@kafka.apache.org by Marina Popova <pp...@protonmail.com.INVALID> on 2021/05/20 18:10:39 UTC

Weird behavior of topic retention - some are cleaned up too often, some are not at all

Hi, I have posted this question on SO:
https://stackoverflow.com/questions/67625641/kafka-segments-are-deleted-too-often-or-not-at-all
but wanted to re-post here as well in case someone spots the issue right away ....

Thank you for your help!

>>>>>
We have two topics on our Kafka cluster that exhibit weird (wrong) behavior related to retention configuration.

One topic, **tracking.ap.client.traffic**, has retention set explicitly to "retention.ms=14400000" (4 hrs) - but it is not cleaned up, keeps growing in size, and caused 2 out of 3 Kafka brokers to run out of disk space. (Details about config and logs below.)

Second topic, **tracking.ap.client.traffic.keyed**, is created in KSQL as a stream topic:
```
CREATE STREAM AP_CLIENT_TRAFFIC_KEYED
WITH (KAFKA_TOPIC='tracking.ap.client.traffic.keyed', TIMESTAMP='ACTIVITY_DATE', VALUE_FORMAT='JSON', PARTITIONS=6)
AS SELECT
...

```
its retention is set to the default broker value, which is 720 hrs :

```
cat /etc/kafka/server.properties | grep retention
log.retention.hours=720
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# to the retention policies
log.retention.check.interval.ms=300000
```

This topic, though, gets cleaned up every 5 min or so - according to the logs....
The log entry says the segment is marked for deletion "due to retention time 2592000000ms breach (kafka.log.Log)" - but how can that be true if this is happening every 5 min??
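
(Just to double-check my own unit conversions on those values:)

```
# 2592000000 ms -> hours: matches the log.retention.hours=720 default (30 days)
echo $((2592000000 / 1000 / 60 / 60))   # 720
# 14400000 ms -> hours: the explicit retention.ms on the first topic
echo $((14400000 / 1000 / 60 / 60))     # 4
# 300000 ms -> minutes: how often the retention check runs
echo $((300000 / 1000 / 60))            # 5
```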

No size-based retention is set for any of the topics.

Two questions:
1. why is the first topic not being cleaned up?
2. why is the second topic being cleaned up so often?

Below are the details about logs and full config of both topics:

Log entries for the **tracking.ap.client.traffic.keyed**-2 topic/partition show that this partition is getting cleaned too often:

```
[2021-05-19 21:35:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11755700 (kafka.log.Log)
[2021-05-19 21:36:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11753910 (kafka.log.Log)
[2021-05-19 21:36:05,825] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.log.deleted. (kafka.log.LogSegment)
[2021-05-19 21:36:05,827] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.index.deleted. (kafka.log.LogSegment)
[2021-05-19 21:36:05,829] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.timeindex.deleted. (kafka.log.LogSegment)
[2021-05-19 21:40:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach (kafka.log.Log)
[2021-05-19 21:40:05,843] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11757417 (kafka.log.ProducerStateManager)
[2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11757417 in 7 ms. (kafka.log.Log)
[2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11755700, size 936249] for deletion. (kafka.log.Log)
[2021-05-19 21:40:05,847] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11757417 (kafka.log.Log)
[2021-05-19 21:41:05,848] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11755700 (kafka.log.Log)
[2021-05-19 21:41:05,850] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.log.deleted. (kafka.log.LogSegment)
[2021-05-19 21:41:05,852] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.index.deleted. (kafka.log.LogSegment)
[2021-05-19 21:41:05,853] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.timeindex.deleted. (kafka.log.LogSegment)
[2021-05-19 21:45:05,834] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11757417] due to retention time 2592000000ms breach (kafka.log.Log)
[2021-05-19 21:45:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11759201 (kafka.log.ProducerStateManager)
[2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11759201 in 6 ms. (kafka.log.Log)
[2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11757417, size 977444] for deletion. (kafka.log.Log)
[2021-05-19 21:45:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11759201 (kafka.log.Log)
[2021-05-19 21:46:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11757417 (kafka.log.Log)
[2021-05-19 21:46:05,844] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.log.deleted. (kafka.log.LogSegment)
[2021-05-19 21:46:05,845] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.index.deleted. (kafka.log.LogSegment)
[2021-05-19 21:46:05,847] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.timeindex.deleted. (kafka.log.LogSegment)
[2021-05-19 21:50:05,832] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11759201] due to retention time 2592000000ms breach (kafka.log.Log)
[2021-05-19 21:50:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11760980 (kafka.log.ProducerStateManager)
[2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11760980 in 7 ms. (kafka.log.Log)
[2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11759201, size 991447] for deletion. (kafka.log.Log)
[2021-05-19 21:50:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11760980 (kafka.log.Log)
[2021-05-19 21:51:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11759201 (kafka.log.Log)
[2021-05-19 21:51:05,846] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.log.deleted. (kafka.log.LogSegment)
[2021-05-19 21:51:05,848] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.index.deleted. (kafka.log.LogSegment)
[2021-05-19 21:51:05,849] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.timeindex.deleted. (kafka.log.LogSegment)
[2021-05-19 21:55:05,828] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11760980] due to retention time 2592000000ms breach (kafka.log.Log)
[2021-05-19 21:55:05,833] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11762941 (kafka.log.ProducerStateManager)
[2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11762941 in 7 ms. (kafka.log.Log)
[2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11760980, size 1079204] for deletion. (kafka.log.Log)
[2021-05-19 21:55:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11762941 (kafka.log.Log)
[2021-05-19 21:56:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11760980 (kafka.log.Log)
[2021-05-19 21:56:05,840] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.log.deleted. (kafka.log.LogSegment)
[2021-05-19 21:56:05,842] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.index.deleted. (kafka.log.LogSegment)
[2021-05-19 21:56:05,843] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.timeindex.deleted. (kafka.log.LogSegment)
```

There are **no such entries** for log cleaning for the first topic, tracking.ap.client.traffic, which suggests it is never even found to be breaching retention??
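
(For reference, this is how I have been looking for such entries - the server.log path below is just an assumption for our install, adjust it for yours:)

```
# Any retention-related activity for tracking.ap.client.traffic partitions?
grep -E "tracking\.ap\.client\.traffic-[0-9]+.*(deletable segments|Rolled new log segment|Deleting segment)" \
    /var/log/kafka/server.log
```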

Full config for the **tracking.ap.client.traffic.keyed** topic (the one that is cleaned up too often):

```
mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic.keyed --describe --all

All configs for topic tracking.ap.client.traffic.keyed are:
compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
leader.replication.throttled.replicas= sensitive=false synonyms={}
message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
segment.jitter.ms=0 sensitive=false synonyms={}
cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
flush.ms=9223372036854775807 sensitive=false synonyms={}
follower.replication.throttled.replicas= sensitive=false synonyms={}
segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
retention.ms=2592000000 sensitive=false synonyms={}
flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
segment.ms=604800000 sensitive=false synonyms={}
message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
```

Full config for the **tracking.ap.client.traffic** topic (the one not being cleaned up at all):
```
mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic --describe --all

All configs for topic tracking.ap.client.traffic are:
compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
leader.replication.throttled.replicas= sensitive=false synonyms={}
message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
segment.jitter.ms=0 sensitive=false synonyms={}
cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
flush.ms=9223372036854775807 sensitive=false synonyms={}
follower.replication.throttled.replicas= sensitive=false synonyms={}
segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
segment.ms=604800000 sensitive=false synonyms={}
message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
```

Here is example directory content from the log dirs of both topics:

ls -la /apps/kafka-data/**tracking.ap.client.traffic**-*
```
/apps/kafka-data/tracking.ap.client.traffic-1:
total 412640
drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
-rw-r--r-- 1 cp-kafka confluent 420599482 May 19 22:48 00000000000000000000.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
-rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint

/apps/kafka-data/tracking.ap.client.traffic-11:
total 412844
drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
-rw-r--r-- 1 cp-kafka confluent 420788365 May 19 22:48 00000000000000000000.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
-rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint

/apps/kafka-data/tracking.ap.client.traffic-13:
total 412660
drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
-rw-r--r-- 1 cp-kafka confluent 420606307 May 19 22:48 00000000000000000000.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
-rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint

/apps/kafka-data/tracking.ap.client.traffic-14:
total 412920
drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
-rw-r--r-- 1 cp-kafka confluent 420862000 May 19 22:48 00000000000000000000.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
-rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint
```

ls -la /apps/kafka-data/**tracking.ap.client.traffic.keyed***
```
/apps/kafka-data/tracking.ap.client.traffic.keyed-1:
total 288
drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000004255431.index
-rw-r--r-- 1 cp-kafka confluent 254355 May 19 22:48 00000000000004255431.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000004255431.timeindex
-rw-r--r-- 1 cp-kafka confluent 14 May 19 22:47 leader-epoch-checkpoint

/apps/kafka-data/tracking.ap.client.traffic.keyed-2:
total 816
drwxr-xr-x 2 cp-kafka confluent 178 May 19 22:46 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000011782064.index
-rw-r--r-- 1 cp-kafka confluent 787175 May 19 22:48 00000000000011782064.log
-rw-r--r-- 1 cp-kafka confluent 10 May 19 22:45 00000000000011782064.snapshot
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000011782064.timeindex
-rw-r--r-- 1 cp-kafka confluent 15 May 19 22:45 leader-epoch-checkpoint

/apps/kafka-data/tracking.ap.client.traffic.keyed-3:
total 1812
drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
-rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000028393394.index
-rw-r--r-- 1 cp-kafka confluent 1814528 May 19 22:48 00000000000028393394.log
-rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000028393394.timeindex
-rw-r--r-- 1 cp-kafka confluent 15 May 19 22:47 leader-epoch-checkpoint
```

Thank you!

Sent with [ProtonMail](https://protonmail.com) Secure Email.

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Marina Popova <pp...@protonmail.com.INVALID>.
For those who are interested - I've posted a detailed explanation of the reasons for these issues here: https://stackoverflow.com/questions/67625641/kafka-segments-are-deleted-too-often-or-not-at-all/67827549#67827549

In brief - for the topic that was not cleaned up for a long time, it was an issue with a low-volume topic and segment rotation (the active segment never rolled, so there was never a closed segment old enough to delete); for the topic that was being cleaned up too often, it was an issue with wrong event timestamps (epoch seconds instead of milliseconds, so the records looked ancient to the broker).
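
For anyone running into the same thing, the shape of the fix for the first issue looks roughly like this - just a sketch with an illustrative segment.ms value, not necessarily the exact change we made (the second issue is addressed by making sure the field used as the record timestamp - ACTIVITY_DATE in our case - is in milliseconds):

# Sketch: make the low-volume topic roll its active segment more often, so closed
# segments can actually age out under retention.ms=14400000 (4 hrs).
# The 1-hour segment.ms below is illustrative, not a recommendation.
kafka-configs --bootstrap-server mybroker1:9092 --alter \
  --entity-type topics --entity-name tracking.ap.client.traffic \
  --add-config segment.ms=3600000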

thank you, Matthias, for the tips!
Marina



Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, May 26, 2021 12:09 PM, Marina Popova <pp...@protonmail.com.INVALID> wrote:

> Thank you, Matthias, for the ideas to verify next!
>
> Here is what I see:
>
> Topic 1 - that is not being cleaned up for 3 days already, but has retention set to 4 hrs: (I've truncated the payload but left the important details):
>
> 11:32:11 confluent@qv-ckafkama7 [~] $ /usr/bin/kafka-dump-log --print-data-log --files /apps/kafka-data/tracking.ap.client.traffic-9/00000000000000744551.log | tail -5
>
> offset: 1294064 position: 319988493 CreateTime: 1622043140910 isvalid: true keysize: -1 valuesize: 633 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] payload: {..."LOGSTASH_ID":"1f7bd3fe-123b-40c2-970e-c85356469eda","ACTIVITY_DATE":1622043139,"@timestamp":"2021-05-26T15:32:19.000Z"}
> offset: 1294065 position: 319989196 CreateTime: 1622043140911 isvalid: true keysize: -1 valuesize: 466 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] payload: {... "LOGSTASH_ID":"a6f94ea3-a915-4e59-88ad-b4843e3e71b4","ACTIVITY_DATE":1622043139,"@timestamp":"2021-05-26T15:32:19.000Z"}
> 11:32:29 confluent@qv-ckafkama7 [~] $ date
> Wed May 26 11:32:47 EDT 2021
> 11:32:47 confluent@qv-ckafkama7 [~] $
>
> CreateTime: 1622043140911 corresponds to:
> GMT: Wednesday, May 26, 2021 3:32:20.911 PM
> Your time zone: Wednesday, May 26, 2021 11:32:20.911 AM GMT-04:00 DST
>
> output of 'date' shows the broker time: Wed May 26 11:32:47 EDT 2021
>
> Topic 2 - created by KSQL based on Topic 1 - that is being cleaned up every 5 min, even though the retention = 720 hrs, and the segments (as seen in the log dir) are in the KB range, not even close to the 1G default segment size:
>
> 11:32:47 confluent@qv-ckafkama7 [~] $ /usr/bin/kafka-dump-log --print-data-log --files /apps/kafka-data/tracking.ap.client.traffic.keyed-1/00000000000004735501.log | tail -2
> offset: 4736628 position: 676217 CreateTime: 1622043813 isvalid: true keysize: 8 valuesize: 506 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] key: 17606499 payload: {"ACTIVITY_DATE":1622043813001, ...}
> offset: 4736629 position: 676801 CreateTime: 1622043814 isvalid: true keysize: 7 valuesize: 425 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] key: 1229867 payload: {"ACTIVITY_DATE":1622043814001, ...}
> 11:43:36 confluent@qv-ckafkama7 [~] $ date
> Wed May 26 11:43:40 EDT 2021
> 11:43:40 confluent@qv-ckafkama7 [~] $
>
> So, here - I do see "CreateTime: 1622043814" - in a different format .... not milliseconds, but seconds, it looks like - could this be considered a "very old timestamp", and is that why the logs are being cleaned up every 5 min??
> I believe this could be because the stream (and topic) is created from Topic 1, using ACTIVITY_DATE as the timestamp, which is in seconds, not ms ...
>
> Although, at this time I am more concerned with the Topic 1 problem - as the data keeps growing and growing and eventually causes out of diskspace failures ....
>
> Thank you!!!
> Marina
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, May 26, 2021 3:23 AM, Matthias J. Sax mjsax@apache.org wrote:
>
> > Kafka's retention policy is a little bit tricky, because it determines
> > if data is subject to deletion by comparing the embedded record
> > timestamps to the broker wall-clock time.
> > Can you verify your record timestamps as well as broker clocks?
> > If you write "future data", ie, data with timestamps larger than broker
> > wall-clock time, it might stay in the topic longer than retention time.
> > If you write "very old data", ie, data with timestamps smaller than
> > broker wall-clock minus retention time, data might get purged aggressively.
> > -Matthias
> > On 5/25/21 5:07 PM, Marina Popova wrote:
> >
> > > Thank you, Ran,
> > > yes, I understand about the segment size impact - however, if you look at the configuration of those topics (see below) - you will see they have exactly the same config, with the same segment.bytes setting (to 1G, default), and all other settings are also the same - except for the retention:
> > > 1 topic has retention 4 hrs:
> > > segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> > > retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
> > > 2nd topic has retention 720 hrs (30 days), default:
> > > segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> > > retention.ms=2592000000 sensitive=false synonyms={}
> > > yet, 1st topic is not being cleaned up, but the 2-nd topic is being cleaned up every 5 min, due to "
> > > Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach "
> > > And I can see that the size of the segments never reaches 1G for the 2-nd topic either ...
> > > thank you,
> > > Marina
> > > Sent with ProtonMail Secure Email.
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On Tuesday, May 25, 2021 4:11 PM, Ran Lupovich ranlupovich@gmail.com wrote:
> > >
> > > > Sorry, I did not see all the info at first. What do you mean by the topic getting cleaned? You have a setting to check retention every 5 minutes; the data that is getting "cleaned" is the older data, which is 30 days old... am I missing something?
> > > > On Tue, May 25, 2021, 23:04, Ran Lupovich ranlupovich@gmail.com wrote:
> > > >
> > > > > By the segment size setting, data is "deleted" after 1 gigabyte is full, per partition; you need to remember that retention is applied once segments are closed, per partition
> > > > > On Tue, May 25, 2021, 22:59, Ran Lupovich ranlupovich@gmail.com wrote:
> > > > >
> > > > > > Have you checked the segment size? Did you describe the topic configuration? Maybe you created it with some settings you don't remember
> > > > > > On Tue, May 25, 2021, 19:51, Marina Popova ppine7sub@protonmail.com.invalid wrote:
> > > > > >
> > > > > > > Any idea what is wrong here? I have restarted Kafka brokers a few times, and all other Confluent services like KSQL - but I see exactly the same behavior - one topic gets its logs cleaned up every 5 minutes, while the other one - does not get cleaned up at all ....
> > > > > > > Is there anything else I could check, apart from what I already did - see the post below - to troubleshoot this?
> > > > > > > thank you!
> > > > > > > Marina
> > > > > > > Sent with ProtonMail Secure Email.



Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Marina Popova <pp...@protonmail.com.INVALID>.
Thank you, Matthias, for the ideas to verify next!

Here is what I see:

Topic 1 - that is not being cleaned up for 3 days already, but has retention set to 4 hrs: (I've truncated the payload but left the important details):

11:32:11 confluent@qv-ckafkama7 [~] $ /usr/bin/kafka-dump-log --print-data-log --files /apps/kafka-data/tracking.ap.client.traffic-9/00000000000000744551.log | tail -5

offset: 1294064 position: 319988493 CreateTime: 1622043140910 isvalid: true keysize: -1 valuesize: 633 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] payload: {..."LOGSTASH_ID":"1f7bd3fe-123b-40c2-970e-c85356469eda","ACTIVITY_DATE":1622043139,"@timestamp":"2021-05-26T15:32:19.000Z"}
offset: 1294065 position: 319989196 CreateTime: 1622043140911 isvalid: true keysize: -1 valuesize: 466 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] payload: {... "LOGSTASH_ID":"a6f94ea3-a915-4e59-88ad-b4843e3e71b4","ACTIVITY_DATE":1622043139,"@timestamp":"2021-05-26T15:32:19.000Z"}
11:32:29 confluent@qv-ckafkama7 [~] $ date
Wed May 26 11:32:47 EDT 2021
11:32:47 confluent@qv-ckafkama7 [~] $


CreateTime: 1622043140911 corresponds to:
GMT: Wednesday, May 26, 2021 3:32:20.911 PM
Your time zone: Wednesday, May 26, 2021 11:32:20.911 AM GMT-04:00 DST

output of 'date' shows the broker time: Wed May 26 11:32:47 EDT 2021


Topic 2 - created by KSQL based on Topic 1 - that is being cleaned up every 5 min, even though the retention = 720 hrs, and the segments (as seen in the log dir) are in the KB range, not even close to the 1G default segment size:

11:32:47 confluent@qv-ckafkama7 [~] $ /usr/bin/kafka-dump-log --print-data-log --files /apps/kafka-data/tracking.ap.client.traffic.keyed-1/00000000000004735501.log | tail -2
offset: 4736628 position: 676217 CreateTime: 1622043813 isvalid: true keysize: 8 valuesize: 506 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] key: 17606499 payload: {"ACTIVITY_DATE":1622043813001, ...}
offset: 4736629 position: 676801 CreateTime: 1622043814 isvalid: true keysize: 7 valuesize: 425 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [] key: 1229867 payload: {"ACTIVITY_DATE":1622043814001, ...}
11:43:36 confluent@qv-ckafkama7 [~] $ date
Wed May 26 11:43:40 EDT 2021
11:43:40 confluent@qv-ckafkama7 [~] $

So, here - I do see "CreateTime: 1622043814" - in a different format .... not milliseconds, but seconds, it looks like - could this be considered a "very old timestamp", and is that why the logs are being cleaned up every 5 min??
I believe this could be because the stream (and topic) is created from Topic 1, using ACTIVITY_DATE as the timestamp, which is in seconds, not ms ...
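
To sanity-check that suspicion, here is the same value interpreted both ways (GNU date on the broker):

# As epoch *seconds*, the CreateTime value is "now":
date -u -d @1622043814
# Wed May 26 15:43:34 UTC 2021
# But the broker reads CreateTime as epoch *milliseconds*, and 1622043814 ms
# is a moment in January 1970 - far older than the 30-day (2592000000 ms) retention:
date -u -d @$((1622043814 / 1000))
# Mon Jan 19 18:34:03 UTC 1970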


Although, at this time I am more concerned with the Topic 1 problem - as the data keeps growing and growing and eventually causes out of diskspace failures ....


Thank you!!!
Marina

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, May 26, 2021 3:23 AM, Matthias J. Sax <mj...@apache.org> wrote:

> Kafka's retention policy is a little bit tricky, because it determines
> if data is subject to deletion by comparing the embedded record
> timestamps to the broker wall-clock time.
>
> Can you verify your record timestamps as well as broker clocks?
>
> If you write "future data", ie, data with timestamps larger than broker
> wall-clock time, it might stay in the topic longer than retention time.
>
> If you write "very old data", ie, data with timestamps smaller than
> broker wall-clock minus retention time, data might get purged aggressively.
>
> -Matthias
>
> On 5/25/21 5:07 PM, Marina Popova wrote:
>
> > Thank you, Ran,
> > yes, I understand about the segment size impact - however, if you look at the configuration of those topics (see below) - you will see they have exactly the same config, with the same segment.bytes setting (to 1G, default), and all other settings are also the same - except for the retention:
> > 1 topic has retention 4 hrs:
> > segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> > retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
> > 2nd topic has retention 720 hrs (30 days), default:
> > segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> > retention.ms=2592000000 sensitive=false synonyms={}
> > yet, 1st topic is not being cleaned up, but the 2-nd topic is being cleaned up every 5 min, due to "
> > Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach "
> > And I can see that the size of the segments never reaches 1G for the 2-nd topic either ...
> > thank you,
> > Marina
> > Sent with ProtonMail Secure Email.
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Tuesday, May 25, 2021 4:11 PM, Ran Lupovich ranlupovich@gmail.com wrote:
> >
> > > Sorry, I did not see all the info at first. What do you mean by the topic getting cleaned? You have a setting to check retention every 5 minutes; the data that is getting "cleaned" is the older data, which is 30 days old... am I missing something?
> > > On Tue, May 25, 2021, 23:04, Ran Lupovich ranlupovich@gmail.com wrote:
> > >
> > > > By the segment size setting, data is "deleted" after 1 gigabyte is full, per partition; you need to remember that retention is applied once segments are closed, per partition
> > > > On Tue, May 25, 2021, 22:59, Ran Lupovich ranlupovich@gmail.com wrote:
> > > >
> > > > > Have you checked the segment size? Did you describe the topic configuration? Maybe you created it with some settings you don't remember
> > > > > On Tue, May 25, 2021, 19:51, Marina Popova ppine7sub@protonmail.com.invalid wrote:
> > > > >
> > > > > > Any idea what is wrong here? I have restarted Kafka brokers a few times, and all other Confluent services like KSQL - but I see exactly the same behavior - one topic gets its logs cleaned up every 5 minutes, while the other one - does not get cleaned up at all ....
> > > > > > Is there anything else I could check, apart from what I already did - see the post below - to troubleshoot this?
> > > > > > thank you!
> > > > > > Marina
> > > > > > Sent with ProtonMail Secure Email.



Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by "Matthias J. Sax" <mj...@apache.org>.
Kafka's retention policy is a little bit tricky, because it determines
if data is subject to deletion by comparing the embedded record
timestamps to the broker wall-clock time.

Can you verify your record timestamps as well as broker clocks?

If you write "future data", ie, data with timestamps larger than broker
wall-clock time, it might stay in the topic longer than retention time.

If you write "very old data", ie, data with timestamps smaller than
broker wall-clock minus retention time, data might get purged aggressively.
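One quick way to check that (just a sketch, reusing the broker address and a segment path already shown in this thread; kafka-dump-log needs to run on a broker host where the data directory is mounted):

# dump a segment's batch metadata and look at the embedded CreateTime values
./bin/kafka-dump-log --files /apps/kafka-data/tracking.ap.client.traffic-1/00000000000000000000.log --print-data-log | head -20

# or consume a few records and print their timestamps (epoch millis),
# then compare them to the broker's current wall-clock time
date -u
./bin/kafka-console-consumer --bootstrap-server mybroker1:9092 --topic tracking.ap.client.traffic --from-beginning --max-messages 5 --property print.timestamp=true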


-Matthias

On 5/25/21 5:07 PM, Marina Popova wrote:
> Thank you, Ran,
> yes, I understand the segment size impact - however, if you look at the configuration of those topics (see below), you will see they have exactly the same config, with the same segment.bytes setting (1G, the default), and all other settings are also the same - except for the retention:
> The 1st topic has a retention of 4 hrs:
> 
> segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
> 
> 2nd topic has retention 720 hrs (30 days), default:
> segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
> retention.ms=2592000000 sensitive=false synonyms={}
> 
> yet, the 1st topic is not being cleaned up, but the 2nd topic is being cleaned up every 5 min, due to "Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach"
> 
> And I can see that the size of the segments never reaches 1G for the 2nd topic either ...
> 
> thank you,
> 
> Marina
> 
> Sent with [ProtonMail](https://protonmail.com) Secure Email.
> 

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Marina Popova <pp...@protonmail.com.INVALID>.
Thank you, Ran,
yes, I understand the segment size impact - however, if you look at the configuration of those topics (see below), you will see they have exactly the same config, with the same segment.bytes setting (1G, the default), and all other settings are also the same - except for the retention:
The 1st topic has a retention of 4 hrs:

segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}

2nd topic has retention 720 hrs (30 days), default:
segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
retention.ms=2592000000 sensitive=false synonyms={}

yet, the 1st topic is not being cleaned up, but the 2nd topic is being cleaned up every 5 min, due to "Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach"

And I can see that the size of the segments never reaches 1G for the 2nd topic either ...
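One thing I could try, to test Ran's point that retention is only applied once a segment is closed (just a sketch, reusing the same kafka-configs tool as in the output above; segment.ms=3600000, i.e. 1 hour, is an arbitrary test value):

# temporarily force segments of the first topic to roll every hour
./bin/kafka-configs --bootstrap-server mybroker1:9092 --alter --entity-type topics --entity-name tracking.ap.client.traffic --add-config segment.ms=3600000

# confirm the dynamic override is in place
./bin/kafka-configs --bootstrap-server mybroker1:9092 --describe --entity-type topics --entity-name tracking.ap.client.traffic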

thank you,

Marina

Sent with [ProtonMail](https://protonmail.com) Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, May 25, 2021 4:11 PM, Ran Lupovich <ra...@gmail.com> wrote:

> Sorry, I did not see all the info at first. What do you mean by the topic getting cleaned? You have a setting to check retention every 5 minutes; the data that is getting "cleaned" is the older data, which is 30 days old... am I missing something?
>
> On Tue, May 25, 2021, 23:04 Ran Lupovich <ra...@gmail.com> wrote:
>
>> By the segment size you "delete" after 1 gigabyte is full, per partition; you need to remember that retention is done when segments are closed, per partition
>>
>> On Tue, May 25, 2021, 22:59 Ran Lupovich <ra...@gmail.com> wrote:
>>
>>> Have you checked the segment size? Did you describe the topic configuration? Maybe you created it with some settings you don't remember
>>>
>>> On Tue, May 25, 2021, 19:51 Marina Popova <pp...@protonmail.com.invalid> wrote:
>>>
>>>> Any idea what is wrong here? I have restarted Kafka brokers a few times, and all other Confluent services like KSQL - but I see exactly the same behavior: one topic gets its logs cleaned up every 5 minutes, while the other one does not get cleaned up at all ....
>>>>
>>>> Is there anything else I could check, apart from what I already did - see the post below - to troubleshoot this?
>>>>
>>>> thank you!
>>>> Marina
>>>>
>>>> Sent with ProtonMail Secure Email.
>>>>
>>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>>> On Thursday, May 20, 2021 2:10 PM, Marina Popova <pp...@protonmail.com.INVALID> wrote:
>>>>
>>>>> Hi, I have posted this question on SO:
>>>>> https://stackoverflow.com/questions/67625641/kafka-segments-are-deleted-too-often-or-not-at-all
>>>>> but wanted to re-post here as well in case someone spots the issue right away ....
>>>>>
>>>>> Thank you for your help!
>>>>>
>>>>> > > > > >
>>>>>
>>>>> We have two topics on our Kafka cluster that exhibit weird (wrong) behavior related to retention configuration.
>>>>>
>>>>> One topic, tracking.ap.client.traffic, has retention set explicitly to "retention.ms=14400000" (4 hrs) - but it is not cleaned up , grows in size, and caused 2 out of 3 kafka brokers to run out of disk space. (details about config and log below)
>>>>>
>>>>> Second topic, tracking.ap.client.traffic.keyed, is created in KSQL as a stream topic:
>>>>>
>>>>> CREATE STREAM AP_CLIENT_TRAFFIC_KEYED
>>>>> WITH (KAFKA_TOPIC='tracking.ap.client.traffic.keyed', TIMESTAMP='ACTIVITY_DATE', VALUE_FORMAT='JSON', PARTITIONS=6)
>>>>> AS SELECT
>>>>> ...
>>>>>
>>>>>
>>>>>
>>>>> its retention is set to the default broker value, which is 720 hrs :
>>>>>
>>>>> cat /etc/kafka/server.properties | grep retention
>>>>> log.retention.hours=720
>>>>> # A size-based retention policy for logs. Segments are pruned from the log unless the remaining
>>>>> # segments drop below log.retention.bytes. Functions independently of log.retention.hours.
>>>>> #log.retention.bytes=1073741824
>>>>> # to the retention policies
>>>>> log.retention.check.interval.ms=300000
>>>>>
>>>>>
>>>>> This topic, though, gets cleaned up every 5 min or so - according to the logs....
>>>>> The log entry says the segment is marked for deletion "due to retention time 2592000000ms breach (kafka.log.Log)" - but how can that be true if this is happening every 5 min??
>>>>>
>>>>> No size-based retention is set for any of the topics.
>>>>>
>>>>> Two questions:
>>>>>
>>>>> 1. why is the first topic not being cleaned up?
>>>>> 2. why is the second topic being cleaned up so often?
>>>>>
>>>>> Below are the details about logs and full config of both topics:
>>>>>
>>>>> log entries for tracking.ap.client.traffic.keyed-2 topic/partition - show that this partition is getting cleaned too often:
>>>>>
>>>>>
>>>>> [2021-05-19 21:35:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11755700 (kafka.log.Log)
>>>>> [2021-05-19 21:36:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11753910 (kafka.log.Log)
>>>>> [2021-05-19 21:36:05,825] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.log.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:36:05,827] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.index.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:36:05,829] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.timeindex.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:40:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach (kafka.log.Log)
>>>>> [2021-05-19 21:40:05,843] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11757417 (kafka.log.ProducerStateManager)
>>>>> [2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11757417 in 7 ms. (kafka.log.Log)
>>>>> [2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11755700, size 936249] for deletion. (kafka.log.Log)
>>>>> [2021-05-19 21:40:05,847] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11757417 (kafka.log.Log)
>>>>> [2021-05-19 21:41:05,848] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11755700 (kafka.log.Log)
>>>>> [2021-05-19 21:41:05,850] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.log.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:41:05,852] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.index.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:41:05,853] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.timeindex.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:45:05,834] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11757417] due to retention time 2592000000ms breach (kafka.log.Log)
>>>>> [2021-05-19 21:45:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11759201 (kafka.log.ProducerStateManager)
>>>>> [2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11759201 in 6 ms. (kafka.log.Log)
>>>>> [2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11757417, size 977444] for deletion. (kafka.log.Log)
>>>>> [2021-05-19 21:45:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11759201 (kafka.log.Log)
>>>>> [2021-05-19 21:46:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11757417 (kafka.log.Log)
>>>>> [2021-05-19 21:46:05,844] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.log.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:46:05,845] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.index.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:46:05,847] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.timeindex.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:50:05,832] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11759201] due to retention time 2592000000ms breach (kafka.log.Log)
>>>>> [2021-05-19 21:50:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11760980 (kafka.log.ProducerStateManager)
>>>>> [2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11760980 in 7 ms. (kafka.log.Log)
>>>>> [2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11759201, size 991447] for deletion. (kafka.log.Log)
>>>>> [2021-05-19 21:50:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11760980 (kafka.log.Log)
>>>>> [2021-05-19 21:51:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11759201 (kafka.log.Log)
>>>>> [2021-05-19 21:51:05,846] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.log.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:51:05,848] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.index.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:51:05,849] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.timeindex.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:55:05,828] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11760980] due to retention time 2592000000ms breach (kafka.log.Log)
>>>>> [2021-05-19 21:55:05,833] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11762941 (kafka.log.ProducerStateManager)
>>>>> [2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11762941 in 7 ms. (kafka.log.Log)
>>>>> [2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11760980, size 1079204] for deletion. (kafka.log.Log)
>>>>> [2021-05-19 21:55:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11762941 (kafka.log.Log)
>>>>> [2021-05-19 21:56:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11760980 (kafka.log.Log)
>>>>> [2021-05-19 21:56:05,840] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.log.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:56:05,842] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.index.deleted. (kafka.log.LogSegment)
>>>>> [2021-05-19 21:56:05,843] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.timeindex.deleted. (kafka.log.LogSegment)
>>>>>
>>>>>
>>>>> There are no such entries for log cleaning for the first topic, tracking.ap.client.traffic, which indicates it is never even checked for hitting retention??
>>>>>
>>>>> Full config for tracking.ap.client.traffic.keyed topic: (that is cleaned up too often):
>>>>>
>>>>> mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic.keyed --describe --all
>>>>>
>>>>> All configs for topic tracking.ap.client.traffic.keyed are:
>>>>> compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
>>>>> leader.replication.throttled.replicas= sensitive=false synonyms={}
>>>>> message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
>>>>> min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
>>>>> segment.jitter.ms=0 sensitive=false synonyms={}
>>>>> cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
>>>>> flush.ms=9223372036854775807 sensitive=false synonyms={}
>>>>> follower.replication.throttled.replicas= sensitive=false synonyms={}
>>>>> segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
>>>>> retention.ms=2592000000 sensitive=false synonyms={}
>>>>> flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
>>>>> message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
>>>>> file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
>>>>> max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
>>>>> min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
>>>>> message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
>>>>> preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
>>>>> min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
>>>>> index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
>>>>> unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
>>>>> retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
>>>>> delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
>>>>> segment.ms=604800000 sensitive=false synonyms={}
>>>>> message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
>>>>> segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
>>>>>
>>>>>
>>>>> Full config for the tracking.ap.client.traffic topic (the one not being cleaned up at all):
>>>>>
>>>>> mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic --describe --all
>>>>>
>>>>> All configs for topic tracking.ap.client.traffic are:
>>>>> compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
>>>>> leader.replication.throttled.replicas= sensitive=false synonyms={}
>>>>> message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
>>>>> min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
>>>>> segment.jitter.ms=0 sensitive=false synonyms={}
>>>>> cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
>>>>> flush.ms=9223372036854775807 sensitive=false synonyms={}
>>>>> follower.replication.throttled.replicas= sensitive=false synonyms={}
>>>>> segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
>>>>> retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
>>>>> flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
>>>>> message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
>>>>> file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
>>>>> max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
>>>>> min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
>>>>> message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
>>>>> preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
>>>>> min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
>>>>> index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
>>>>> unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
>>>>> retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
>>>>> delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
>>>>> segment.ms=604800000 sensitive=false synonyms={}
>>>>> message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
>>>>> segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
>>>>>
>>>>>
>>>>> Here are example directory listings of the logs for both topics:
>>>>>
>>>>> ls -la /apps/kafka-data/tracking.ap.client.traffic-*
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic-1:
>>>>> total 412640
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 420599482 May 19 22:48 00000000000000000000.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic-11:
>>>>> total 412844
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 420788365 May 19 22:48 00000000000000000000.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic-13:
>>>>> total 412660
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 420606307 May 19 22:48 00000000000000000000.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic-14:
>>>>> total 412920
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 420862000 May 19 22:48 00000000000000000000.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint
>>>>>
>>>>>
>>>>> ls -la /apps/kafka-data/tracking.ap.client.traffic.keyed*
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic.keyed-1:
>>>>> total 288
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000004255431.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 254355 May 19 22:48 00000000000004255431.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000004255431.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 14 May 19 22:47 leader-epoch-checkpoint
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic.keyed-2:
>>>>> total 816
>>>>> drwxr-xr-x 2 cp-kafka confluent 178 May 19 22:46 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000011782064.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 787175 May 19 22:48 00000000000011782064.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10 May 19 22:45 00000000000011782064.snapshot
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000011782064.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:45 leader-epoch-checkpoint
>>>>>
>>>>> /apps/kafka-data/tracking.ap.client.traffic.keyed-3:
>>>>> total 1812
>>>>> drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
>>>>> drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000028393394.index
>>>>> -rw-r--r-- 1 cp-kafka confluent 1814528 May 19 22:48 00000000000028393394.log
>>>>> -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000028393394.timeindex
>>>>> -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:47 leader-epoch-checkpoint
>>>>>
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Sent with ProtonMail Secure Email.

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Ran Lupovich <ra...@gmail.com>.
Sorry, I did not see all the info at first. What do you mean by the topic
getting cleaned? You have a setting to check retention every 5 minutes, and
the data that is getting "cleaned" is the older data, which is 30 days
old... am I missing something?
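
For reference, converting the millisecond values from the configs shown
earlier in the thread:

    2592000000 ms = 30 days    (retention.ms on tracking.ap.client.traffic.keyed,
                                i.e. the broker default log.retention.hours=720)
    14400000 ms   = 4 hours    (retention.ms on tracking.ap.client.traffic)
    300000 ms     = 5 minutes  (log.retention.check.interval.ms, how often the check runs)
    604800000 ms  = 7 days     (segment.ms on both topics)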

On Tue, May 25, 2021, 23:04 Ran Lupovich <ranlupovich@gmail.com> wrote:

> Going by the segment size, data is "deleted" only after 1 gigabyte is full,
> per partition - you need to remember that retention is applied once segments
> are closed, per partition.
>
> On Tue, May 25, 2021, 22:59 Ran Lupovich <ranlupovich@gmail.com> wrote:
>
>> Have you checked the segment size? Did you describe the topic
>> configuration? Maybe you created it with some settings you don't remember.
>>
>> On Tue, May 25, 2021, 19:51 Marina Popova <pp...@protonmail.com.invalid> wrote:
>>
>>>
>>> Any idea what is wrong here? I have restarted Kafka brokers a few times,
>>> and all other Confluent services like KSQL - but I see exactly the same
>>> behavior - one topic gets its logs cleaned up every 5 minutes, while the
>>> other one - does not get cleaned up at all ....
>>>
>>>
>>> Is there anything else I could check, apart from what I already did -
>>> see the post below - to troubleshoot this?
>>>
>>>
>>> thank you!
>>> Marina
>>>
>>> Sent with ProtonMail Secure Email.

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Ran Lupovich <ra...@gmail.com>.
Going by the segment size, data is "deleted" only after 1 gigabyte is full,
per partition - you need to remember that retention is applied once segments
are closed, per partition.
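
For example (just a sketch, not something verified on this cluster - the
exact value is up to you), you could lower segment.ms on the topic so that
segments close more often and the 4-hour retention.ms can actually delete
them:

    # hypothetical example: roll a new segment roughly every hour so that
    # closed segments become eligible for deletion under retention.ms=14400000 (4 hours)
    ./bin/kafka-configs --bootstrap-server mybroker1:9092 \
      --entity-type topics --entity-name tracking.ap.client.traffic \
      --alter --add-config segment.ms=3600000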

On Tue, May 25, 2021, 22:59 Ran Lupovich <ranlupovich@gmail.com> wrote:

> Have you checked the segment size? Did you describe the topic
> configuration? Maybe you created it with some settings you don't remember.
>
> On Tue, May 25, 2021, 19:51 Marina Popova <pp...@protonmail.com.invalid> wrote:
>
>>
>> Any idea what is wrong here? I have restarted Kafka brokers a few times,
>> and all other Confluent services like KSQL - but I see exactly the same
>> behavior - one topic gets its logs cleaned up every 5 minutes, while the
>> other one - does not get cleaned up at all ....
>>
>>
>> Is there anything else I could check, apart from what I already did - see
>> the post below - to troubleshoot this?
>>
>>
>> thank you!
>> Marina
>>
>> Sent with ProtonMail Secure Email.

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Ran Lupovich <ra...@gmail.com>.
Have you checked the segment size? Did you describe the topic
configuration? Maybe you created it with some settings you don't remember.
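
For example (a sketch - kafka-dump-log ships alongside kafka-configs in the
same Kafka/Confluent distribution, and the paths below are taken from the
directory listings in the original post):

    # per-partition size on disk for the topic that is not being cleaned up
    du -sh /apps/kafka-data/tracking.ap.client.traffic-*

    # dump the start of one segment to see the record timestamps; time-based
    # retention is evaluated against a segment's largest timestamp
    ./bin/kafka-dump-log --files /apps/kafka-data/tracking.ap.client.traffic-1/00000000000000000000.log \
      --print-data-log | head -n 20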

On Tue, May 25, 2021, 19:51 Marina Popova <pp...@protonmail.com.invalid> wrote:

>
> Any idea what is wrong here? I have restarted Kafka brokers a few times,
> and all other Confluent services like KSQL - but I see exactly the same
> behavior - one topic gets its logs cleaned up every 5 minutes, while the
> other one - does not get cleaned up at all ....
>
>
> Is there anything else I could check, apart from what I already did - see
> the post below - to troubleshoot this?
>
>
> thank you!
> Marina
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Thursday, May 20, 2021 2:10 PM, Marina Popova <pp...@protonmail.com.INVALID>
> wrote:
>
> > Hi, I have posted this question on SO:
> >
> https://stackoverflow.com/questions/67625641/kafka-segments-are-deleted-too-often-or-not-at-all
> > but wanted to re-post here as well in case someone spots the issue right
> away ....
> >
> > Thank you for your help!
> >
> > > > > > >
> >
> > We have two topics on our Kafka cluster that exhibit weird (wrong)
> behavior related to retention configuration.
> >
> > One topic, tracking.ap.client.traffic, has retention set explicitly to "
> retention.ms=14400000" (4 hrs) - but it is not cleaned up , grows in
> size, and caused 2 out of 3 kafka brokers to run out of disk space.
> (details about config and log below)
> >
> > Second topic, tracking.ap.client.traffic.keyed, is created in KSQL as a
> stream topic:
> >
> >     CREATE STREAM AP_CLIENT_TRAFFIC_KEYED
> >     WITH (KAFKA_TOPIC='tracking.ap.client.traffic.keyed',
> TIMESTAMP='ACTIVITY_DATE', VALUE_FORMAT='JSON', PARTITIONS=6)
> >     AS SELECT
> >     ...
> >
> >
> >
> > its retention is set to the default broker value, which is 720 hrs :
> >
> >     cat /etc/kafka/server.properties | grep retention
> >     log.retention.hours=720
> >     # A size-based retention policy for logs. Segments are pruned from
> the log unless the remaining
> >     # segments drop below log.retention.bytes. Functions independently
> of log.retention.hours.
> >     #log.retention.bytes=1073741824
> >     # to the retention policies
> >     log.retention.check.interval.ms=300000
> >
> >
> > This topic, though, gets cleaned up every 5 min or so - according to the
> logs....
> > The log entry says the segment is marked for deletion "due to retention
> time 2592000000ms breach (kafka.log.Log)" - but how can that be true if
> this is happening every 5 min??
> >
> > No size-based retention is set for any of the topics.
> >
> > Two questions:
> >
> > 1.  why is the first topic not being cleaned p?
> > 2.  why is the second topic being cleaned up so often?
> >
> >     Below are the details about logs and full config of both topics:
> >
> >     log entries for tracking.ap.client.traffic.keyed-2 topic/partition -
> show that this partition is getting cleaned too often:
> >
> >
> >     [2021-05-19 21:35:05,822] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Incrementing log start offset to 11755700 (kafka.log.Log)
> >     [2021-05-19 21:36:05,822] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Deleting segment 11753910 (kafka.log.Log)
> >     [2021-05-19 21:36:05,825] INFO Deleted log
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.log.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:36:05,827] INFO Deleted offset index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.index.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:36:05,829] INFO Deleted time index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.timeindex.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:40:05,838] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found
> deletable segments with base offsets [11755700] due to retention time
> 2592000000ms breach (kafka.log.Log)
> >     [2021-05-19 21:40:05,843] INFO [ProducerStateManager
> partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at
> offset 11757417 (kafka.log.ProducerStateManager)
> >     [2021-05-19 21:40:05,845] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled
> new log segment at offset 11757417 in 7 ms. (kafka.log.Log)
> >     [2021-05-19 21:40:05,845] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Scheduling log segment [baseOffset 11755700, size 936249] for deletion.
> (kafka.log.Log)
> >     [2021-05-19 21:40:05,847] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Incrementing log start offset to 11757417 (kafka.log.Log)
> >     [2021-05-19 21:41:05,848] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Deleting segment 11755700 (kafka.log.Log)
> >     [2021-05-19 21:41:05,850] INFO Deleted log
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.log.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:41:05,852] INFO Deleted offset index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.index.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:41:05,853] INFO Deleted time index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.timeindex.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:45:05,834] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found
> deletable segments with base offsets [11757417] due to retention time
> 2592000000ms breach (kafka.log.Log)
> >     [2021-05-19 21:45:05,837] INFO [ProducerStateManager
> partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at
> offset 11759201 (kafka.log.ProducerStateManager)
> >     [2021-05-19 21:45:05,839] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled
> new log segment at offset 11759201 in 6 ms. (kafka.log.Log)
> >     [2021-05-19 21:45:05,839] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Scheduling log segment [baseOffset 11757417, size 977444] for deletion.
> (kafka.log.Log)
> >     [2021-05-19 21:45:05,842] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Incrementing log start offset to 11759201 (kafka.log.Log)
> >     [2021-05-19 21:46:05,842] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Deleting segment 11757417 (kafka.log.Log)
> >     [2021-05-19 21:46:05,844] INFO Deleted log
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.log.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:46:05,845] INFO Deleted offset index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.index.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:46:05,847] INFO Deleted time index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.timeindex.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:50:05,832] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found
> deletable segments with base offsets [11759201] due to retention time
> 2592000000ms breach (kafka.log.Log)
> >     [2021-05-19 21:50:05,837] INFO [ProducerStateManager
> partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at
> offset 11760980 (kafka.log.ProducerStateManager)
> >     [2021-05-19 21:50:05,839] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled
> new log segment at offset 11760980 in 7 ms. (kafka.log.Log)
> >     [2021-05-19 21:50:05,839] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Scheduling log segment [baseOffset 11759201, size 991447] for deletion.
> (kafka.log.Log)
> >     [2021-05-19 21:50:05,844] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Incrementing log start offset to 11760980 (kafka.log.Log)
> >     [2021-05-19 21:51:05,844] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Deleting segment 11759201 (kafka.log.Log)
> >     [2021-05-19 21:51:05,846] INFO Deleted log
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.log.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:51:05,848] INFO Deleted offset index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.index.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:51:05,849] INFO Deleted time index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.timeindex.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:55:05,828] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found
> deletable segments with base offsets [11760980] due to retention time
> 2592000000ms breach (kafka.log.Log)
> >     [2021-05-19 21:55:05,833] INFO [ProducerStateManager
> partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at
> offset 11762941 (kafka.log.ProducerStateManager)
> >     [2021-05-19 21:55:05,835] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled
> new log segment at offset 11762941 in 7 ms. (kafka.log.Log)
> >     [2021-05-19 21:55:05,835] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Scheduling log segment [baseOffset 11760980, size 1079204] for deletion.
> (kafka.log.Log)
> >     [2021-05-19 21:55:05,838] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Incrementing log start offset to 11762941 (kafka.log.Log)
> >     [2021-05-19 21:56:05,838] INFO [Log
> partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data]
> Deleting segment 11760980 (kafka.log.Log)
> >     [2021-05-19 21:56:05,840] INFO Deleted log
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.log.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:56:05,842] INFO Deleted offset index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.index.deleted.
> (kafka.log.LogSegment)
> >     [2021-05-19 21:56:05,843] INFO Deleted time index
> /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.timeindex.deleted.
> (kafka.log.LogSegment)
> >
> >
> > There are no such entries for log cleaning for the first topic,
> tracking.ap.client.traffic, which indicates it is never even checked for
> hitting retention??
> >
> > Full config for tracking.ap.client.traffic.keyed topic: (that is cleaned
> up too often):
> >
> >     mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs
> --bootstrap-server mybroker1:9092 --entity-type topics --entity-name
> tracking.ap.client.traffic.keyed --describe --all
> >
> >     All configs for topic tracking.ap.client.traffic.keyed are:
> >     compression.type=producer sensitive=false
> synonyms={DEFAULT_CONFIG:compression.type=producer}
> >     leader.replication.throttled.replicas= sensitive=false synonyms={}
> >     message.downconversion.enable=true sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
> >     min.insync.replicas=2 sensitive=false
> synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2,
> DEFAULT_CONFIG:min.insync.replicas=1}
> >     segment.jitter.ms=0 sensitive=false synonyms={}
> >     cleanup.policy=delete sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
> >     flush.ms=9223372036854775807 sensitive=false synonyms={}
> >     follower.replication.throttled.replicas= sensitive=false synonyms={}
> >     segment.bytes=1073741824 sensitive=false
> synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824,
> DEFAULT_CONFIG:log.segment.bytes=1073741824}
> >     retention.ms=2592000000 sensitive=false synonyms={}
> >     flush.messages=9223372036854775807 sensitive=false
> synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
> >     message.format.version=2.1-IV2 sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
> >     file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:
> log.segment.delete.delay.ms=60000}
> >     max.message.bytes=1000012 sensitive=false
> synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
> >     min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:
> log.cleaner.min.compaction.lag.ms=0}
> >     message.timestamp.type=CreateTime sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
> >     preallocate=false sensitive=false
> synonyms={DEFAULT_CONFIG:log.preallocate=false}
> >     min.cleanable.dirty.ratio=0.5 sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
> >     index.interval.bytes=4096 sensitive=false
> synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
> >     unclean.leader.election.enable=false sensitive=false
> synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false,
> DEFAULT_CONFIG:unclean.leader.election.enable=false}
> >     retention.bytes=-1 sensitive=false
> synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
> >     delete.retention.ms=86400000 sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
> >     segment.ms=604800000 sensitive=false synonyms={}
> >     message.timestamp.difference.max.ms=9223372036854775807
> sensitive=false synonyms={DEFAULT_CONFIG:
> log.message.timestamp.difference.max.ms=9223372036854775807}
> >     segment.index.bytes=10485760 sensitive=false
> synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
> >
> >
> > Full config for tracking.ap.client.traffic topic: (not being cleaned up
> at all):
> >
> >     mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs
> --bootstrap-server mybroker1:9092 --entity-type topics --entity-name
> tracking.ap.client.traffic --describe --all
> >
> >     All configs for topic tracking.ap.client.traffic are:
> >     compression.type=producer sensitive=false
> synonyms={DEFAULT_CONFIG:compression.type=producer}
> >     leader.replication.throttled.replicas= sensitive=false synonyms={}
> >     message.downconversion.enable=true sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
> >     min.insync.replicas=2 sensitive=false
> synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2,
> DEFAULT_CONFIG:min.insync.replicas=1}
> >     segment.jitter.ms=0 sensitive=false synonyms={}
> >     cleanup.policy=delete sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
> >     flush.ms=9223372036854775807 sensitive=false synonyms={}
> >     follower.replication.throttled.replicas= sensitive=false synonyms={}
> >     segment.bytes=1073741824 sensitive=false
> synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824,
> DEFAULT_CONFIG:log.segment.bytes=1073741824}
> >     retention.ms=14400000 sensitive=false
> synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
> >     flush.messages=9223372036854775807 sensitive=false
> synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
> >     message.format.version=2.1-IV2 sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
> >     file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:
> log.segment.delete.delay.ms=60000}
> >     max.message.bytes=1000012 sensitive=false
> synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
> >     min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:
> log.cleaner.min.compaction.lag.ms=0}
> >     message.timestamp.type=CreateTime sensitive=false
> synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
> >     preallocate=false sensitive=false
> synonyms={DEFAULT_CONFIG:log.preallocate=false}
> >     min.cleanable.dirty.ratio=0.5 sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
> >     index.interval.bytes=4096 sensitive=false
> synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
> >     unclean.leader.election.enable=false sensitive=false
> synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false,
> DEFAULT_CONFIG:unclean.leader.election.enable=false}
> >     retention.bytes=-1 sensitive=false
> synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
> >     delete.retention.ms=86400000 sensitive=false
> synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
> >     segment.ms=604800000 sensitive=false synonyms={}
> >     message.timestamp.difference.max.ms=9223372036854775807
> sensitive=false synonyms={DEFAULT_CONFIG:
> log.message.timestamp.difference.max.ms=9223372036854775807}
> >     segment.index.bytes=10485760 sensitive=false
> synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
> >
> >
> > Here is an example directory content of the logs for both topics:
> >
> > ls -la /apps/kafka-data/tracking.ap.client.traffic-*
> >
> >     /apps/kafka-data/tracking.ap.client.traffic-1:
> >     total 412640
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000000000000.index
> >     -rw-r--r-- 1 cp-kafka confluent 420599482 May 19 22:48
> 00000000000000000000.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000000000000.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01
> leader-epoch-checkpoint
> >
> >     /apps/kafka-data/tracking.ap.client.traffic-11:
> >     total 412844
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000000000000.index
> >     -rw-r--r-- 1 cp-kafka confluent 420788365 May 19 22:48
> 00000000000000000000.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000000000000.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25
> leader-epoch-checkpoint
> >
> >     /apps/kafka-data/tracking.ap.client.traffic-13:
> >     total 412660
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000000000000.index
> >     -rw-r--r-- 1 cp-kafka confluent 420606307 May 19 22:48
> 00000000000000000000.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000000000000.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01
> leader-epoch-checkpoint
> >
> >     /apps/kafka-data/tracking.ap.client.traffic-14:
> >     total 412920
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000000000000.index
> >     -rw-r--r-- 1 cp-kafka confluent 420862000 May 19 22:48
> 00000000000000000000.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000000000000.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25
> leader-epoch-checkpoint
> >
> >
> > ls -la /apps/kafka-data/tracking.ap.client.traffic.keyed*
> >
> >     /apps/kafka-data/tracking.ap.client.traffic.keyed-1:
> >     total 288
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000004255431.index
> >     -rw-r--r-- 1 cp-kafka confluent 254355 May 19 22:48
> 00000000000004255431.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000004255431.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 14 May 19 22:47
> leader-epoch-checkpoint
> >
> >     /apps/kafka-data/tracking.ap.client.traffic.keyed-2:
> >     total 816
> >     drwxr-xr-x 2 cp-kafka confluent 178 May 19 22:46 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000011782064.index
> >     -rw-r--r-- 1 cp-kafka confluent 787175 May 19 22:48
> 00000000000011782064.log
> >     -rw-r--r-- 1 cp-kafka confluent 10 May 19 22:45
> 00000000000011782064.snapshot
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000011782064.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:45
> leader-epoch-checkpoint
> >
> >     /apps/kafka-data/tracking.ap.client.traffic.keyed-3:
> >     total 1812
> >     drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
> >     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
> >     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48
> 00000000000028393394.index
> >     -rw-r--r-- 1 cp-kafka confluent 1814528 May 19 22:48
> 00000000000028393394.log
> >     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48
> 00000000000028393394.timeindex
> >     -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:47
> leader-epoch-checkpoint
> >
> >
> > Thank you!
> >
> > Sent with ProtonMail Secure Email.
>
>
>

Re: Weird behavior of topic retention - some are cleaned up too often, some are not at all

Posted by Marina Popova <pp...@protonmail.com.INVALID>.
Any idea what is wrong here? I have restarted Kafka brokers a few times, and all other Confluent services like KSQL - but I see exactly the same behavior - one topic gets its logs cleaned up every 5 minutes, while the other one - does not get cleaned up at all ....


Is there anything else I could check, apart from what I already did - see the post below - to troubleshoot this?
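
One thing I have not tried yet is dumping the actual record timestamps from a
segment - as far as I understand, time-based retention compares retention.ms
against the largest record timestamp in each segment, so unexpected timestamps
in the data could matter here. Something like this, I believe (kafka-dump-log
ships with the broker; the path is copied from my data directory listing below):

    # print batch metadata (including CreateTime) for one segment of the
    # topic that never gets cleaned up
    kafka-dump-log --print-data-log \
      --files /apps/kafka-data/tracking.ap.client.traffic-1/00000000000000000000.log \
      | head -n 20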


thank you!
Marina

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, May 20, 2021 2:10 PM, Marina Popova <pp...@protonmail.com.INVALID> wrote:

> Hi, I have posted this question on SO:
> https://stackoverflow.com/questions/67625641/kafka-segments-are-deleted-too-often-or-not-at-all
> but wanted to re-post here as well in case someone spots the issue right away ....
>
> Thank you for your help!
>
> > > > > >
>
> We have two topics on our Kafka cluster that exhibit weird (wrong) behavior related to retention configuration.
>
> One topic, tracking.ap.client.traffic, has retention set explicitly to "retention.ms=14400000" (4 hrs) - but it is not cleaned up , grows in size, and caused 2 out of 3 kafka brokers to run out of disk space. (details about config and log below)
>
> Second topic, tracking.ap.client.traffic.keyed, is created in KSQL as a stream topic:
>
>     CREATE STREAM AP_CLIENT_TRAFFIC_KEYED
>     WITH (KAFKA_TOPIC='tracking.ap.client.traffic.keyed', TIMESTAMP='ACTIVITY_DATE', VALUE_FORMAT='JSON', PARTITIONS=6)
>     AS SELECT
>     ...
>
>
>
> its retention is set to the default broker value, which is 720 hrs :
>
>     cat /etc/kafka/server.properties | grep retention
>     log.retention.hours=720
>     # A size-based retention policy for logs. Segments are pruned from the log unless the remaining
>     # segments drop below log.retention.bytes. Functions independently of log.retention.hours.
>     #log.retention.bytes=1073741824
>     # to the retention policies
>     log.retention.check.interval.ms=300000
>
>
> This topic, though, gets cleaned up every 5 min or so - according to the logs....
> The log entry says the segment is marked for deletion "due to retention time 2592000000ms breach (kafka.log.Log)" - but how can that be true if this is happening every 5 min??
>
> No size-based retention is set for any of the topics.
>
> Two questions:
>
> 1.  why is the first topic not being cleaned up?
> 2.  why is the second topic being cleaned up so often?
>
>     Below are the details about logs and full config of both topics:
>
>     log entries for tracking.ap.client.traffic.keyed-2 topic/partition - show that this partition is getting cleaned too often:
>
>
>     [2021-05-19 21:35:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11755700 (kafka.log.Log)
>     [2021-05-19 21:36:05,822] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11753910 (kafka.log.Log)
>     [2021-05-19 21:36:05,825] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.log.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:36:05,827] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.index.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:36:05,829] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011753910.timeindex.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:40:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11755700] due to retention time 2592000000ms breach (kafka.log.Log)
>     [2021-05-19 21:40:05,843] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11757417 (kafka.log.ProducerStateManager)
>     [2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11757417 in 7 ms. (kafka.log.Log)
>     [2021-05-19 21:40:05,845] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11755700, size 936249] for deletion. (kafka.log.Log)
>     [2021-05-19 21:40:05,847] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11757417 (kafka.log.Log)
>     [2021-05-19 21:41:05,848] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11755700 (kafka.log.Log)
>     [2021-05-19 21:41:05,850] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.log.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:41:05,852] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.index.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:41:05,853] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011755700.timeindex.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:45:05,834] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11757417] due to retention time 2592000000ms breach (kafka.log.Log)
>     [2021-05-19 21:45:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11759201 (kafka.log.ProducerStateManager)
>     [2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11759201 in 6 ms. (kafka.log.Log)
>     [2021-05-19 21:45:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11757417, size 977444] for deletion. (kafka.log.Log)
>     [2021-05-19 21:45:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11759201 (kafka.log.Log)
>     [2021-05-19 21:46:05,842] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11757417 (kafka.log.Log)
>     [2021-05-19 21:46:05,844] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.log.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:46:05,845] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.index.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:46:05,847] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011757417.timeindex.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:50:05,832] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11759201] due to retention time 2592000000ms breach (kafka.log.Log)
>     [2021-05-19 21:50:05,837] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11760980 (kafka.log.ProducerStateManager)
>     [2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11760980 in 7 ms. (kafka.log.Log)
>     [2021-05-19 21:50:05,839] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11759201, size 991447] for deletion. (kafka.log.Log)
>     [2021-05-19 21:50:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11760980 (kafka.log.Log)
>     [2021-05-19 21:51:05,844] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11759201 (kafka.log.Log)
>     [2021-05-19 21:51:05,846] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.log.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:51:05,848] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.index.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:51:05,849] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011759201.timeindex.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:55:05,828] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Found deletable segments with base offsets [11760980] due to retention time 2592000000ms breach (kafka.log.Log)
>     [2021-05-19 21:55:05,833] INFO [ProducerStateManager partition=tracking.ap.client.traffic.keyed-2] Writing producer snapshot at offset 11762941 (kafka.log.ProducerStateManager)
>     [2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Rolled new log segment at offset 11762941 in 7 ms. (kafka.log.Log)
>     [2021-05-19 21:55:05,835] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Scheduling log segment [baseOffset 11760980, size 1079204] for deletion. (kafka.log.Log)
>     [2021-05-19 21:55:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Incrementing log start offset to 11762941 (kafka.log.Log)
>     [2021-05-19 21:56:05,838] INFO [Log partition=tracking.ap.client.traffic.keyed-2, dir=/apps/kafka-data] Deleting segment 11760980 (kafka.log.Log)
>     [2021-05-19 21:56:05,840] INFO Deleted log /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.log.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:56:05,842] INFO Deleted offset index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.index.deleted. (kafka.log.LogSegment)
>     [2021-05-19 21:56:05,843] INFO Deleted time index /apps/kafka-data/tracking.ap.client.traffic.keyed-2/00000000000011760980.timeindex.deleted. (kafka.log.LogSegment)
>
>
> There are no such entries for log cleaning for the first topic, tracking.ap.client.traffic, which indicates it is never even checked for hitting retention??
>
> Full config for tracking.ap.client.traffic.keyed topic: (that is cleaned up too often):
>
>     mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic.keyed --describe --all
>
>     All configs for topic tracking.ap.client.traffic.keyed are:
>     compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
>     leader.replication.throttled.replicas= sensitive=false synonyms={}
>     message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
>     min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
>     segment.jitter.ms=0 sensitive=false synonyms={}
>     cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
>     flush.ms=9223372036854775807 sensitive=false synonyms={}
>     follower.replication.throttled.replicas= sensitive=false synonyms={}
>     segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
>     retention.ms=2592000000 sensitive=false synonyms={}
>     flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
>     message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
>     file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
>     max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
>     min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
>     message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
>     preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
>     min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
>     index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
>     unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
>     retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
>     delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
>     segment.ms=604800000 sensitive=false synonyms={}
>     message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
>     segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
>
>
> Full config for tracking.ap.client.traffic topic: (not being cleaned up at all):
>
>     mac-lt-mpopova:confluent-6.1.0 mpopova$ ./bin/kafka-configs --bootstrap-server mybroker1:9092 --entity-type topics --entity-name tracking.ap.client.traffic --describe --all
>
>     All configs for topic tracking.ap.client.traffic are:
>     compression.type=producer sensitive=false synonyms={DEFAULT_CONFIG:compression.type=producer}
>     leader.replication.throttled.replicas= sensitive=false synonyms={}
>     message.downconversion.enable=true sensitive=false synonyms={DEFAULT_CONFIG:log.message.downconversion.enable=true}
>     min.insync.replicas=2 sensitive=false synonyms={STATIC_BROKER_CONFIG:min.insync.replicas=2, DEFAULT_CONFIG:min.insync.replicas=1}
>     segment.jitter.ms=0 sensitive=false synonyms={}
>     cleanup.policy=delete sensitive=false synonyms={DEFAULT_CONFIG:log.cleanup.policy=delete}
>     flush.ms=9223372036854775807 sensitive=false synonyms={}
>     follower.replication.throttled.replicas= sensitive=false synonyms={}
>     segment.bytes=1073741824 sensitive=false synonyms={STATIC_BROKER_CONFIG:log.segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
>     retention.ms=14400000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=14400000}
>     flush.messages=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.flush.interval.messages=9223372036854775807}
>     message.format.version=2.1-IV2 sensitive=false synonyms={DEFAULT_CONFIG:log.message.format.version=2.1-IV2}
>     file.delete.delay.ms=60000 sensitive=false synonyms={DEFAULT_CONFIG:log.segment.delete.delay.ms=60000}
>     max.message.bytes=1000012 sensitive=false synonyms={DEFAULT_CONFIG:message.max.bytes=1000012}
>     min.compaction.lag.ms=0 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.compaction.lag.ms=0}
>     message.timestamp.type=CreateTime sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.type=CreateTime}
>     preallocate=false sensitive=false synonyms={DEFAULT_CONFIG:log.preallocate=false}
>     min.cleanable.dirty.ratio=0.5 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.min.cleanable.ratio=0.5}
>     index.interval.bytes=4096 sensitive=false synonyms={DEFAULT_CONFIG:log.index.interval.bytes=4096}
>     unclean.leader.election.enable=false sensitive=false synonyms={STATIC_BROKER_CONFIG:unclean.leader.election.enable=false, DEFAULT_CONFIG:unclean.leader.election.enable=false}
>     retention.bytes=-1 sensitive=false synonyms={DEFAULT_CONFIG:log.retention.bytes=-1}
>     delete.retention.ms=86400000 sensitive=false synonyms={DEFAULT_CONFIG:log.cleaner.delete.retention.ms=86400000}
>     segment.ms=604800000 sensitive=false synonyms={}
>     message.timestamp.difference.max.ms=9223372036854775807 sensitive=false synonyms={DEFAULT_CONFIG:log.message.timestamp.difference.max.ms=9223372036854775807}
>     segment.index.bytes=10485760 sensitive=false synonyms={DEFAULT_CONFIG:log.index.size.max.bytes=10485760}
>
>
> Here is an example directory content of the logs for both topics:
>
> ls -la /apps/kafka-data/tracking.ap.client.traffic-*
>
>     /apps/kafka-data/tracking.ap.client.traffic-1:
>     total 412640
>     drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>     -rw-r--r-- 1 cp-kafka confluent 420599482 May 19 22:48 00000000000000000000.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint
>
>     /apps/kafka-data/tracking.ap.client.traffic-11:
>     total 412844
>     drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>     -rw-r--r-- 1 cp-kafka confluent 420788365 May 19 22:48 00000000000000000000.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint
>
>     /apps/kafka-data/tracking.ap.client.traffic-13:
>     total 412660
>     drwxr-xr-x 2 cp-kafka confluent 141 May 17 15:01 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>     -rw-r--r-- 1 cp-kafka confluent 420606307 May 19 22:48 00000000000000000000.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 8 May 17 15:01 leader-epoch-checkpoint
>
>     /apps/kafka-data/tracking.ap.client.traffic-14:
>     total 412920
>     drwxr-xr-x 2 cp-kafka confluent 141 May 17 11:25 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000000000000.index
>     -rw-r--r-- 1 cp-kafka confluent 420862000 May 19 22:48 00000000000000000000.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000000000000.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 8 May 17 11:25 leader-epoch-checkpoint
>
>
> ls -la /apps/kafka-data/tracking.ap.client.traffic.keyed*
>
>     /apps/kafka-data/tracking.ap.client.traffic.keyed-1:
>     total 288
>     drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000004255431.index
>     -rw-r--r-- 1 cp-kafka confluent 254355 May 19 22:48 00000000000004255431.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000004255431.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 14 May 19 22:47 leader-epoch-checkpoint
>
>     /apps/kafka-data/tracking.ap.client.traffic.keyed-2:
>     total 816
>     drwxr-xr-x 2 cp-kafka confluent 178 May 19 22:46 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000011782064.index
>     -rw-r--r-- 1 cp-kafka confluent 787175 May 19 22:48 00000000000011782064.log
>     -rw-r--r-- 1 cp-kafka confluent 10 May 19 22:45 00000000000011782064.snapshot
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000011782064.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:45 leader-epoch-checkpoint
>
>     /apps/kafka-data/tracking.ap.client.traffic.keyed-3:
>     total 1812
>     drwxr-xr-x 2 cp-kafka confluent 141 May 19 22:47 .
>     drwxr-xr-x 290 cp-kafka confluent 20480 May 19 22:48 ..
>     -rw-r--r-- 1 cp-kafka confluent 10485760 May 19 22:48 00000000000028393394.index
>     -rw-r--r-- 1 cp-kafka confluent 1814528 May 19 22:48 00000000000028393394.log
>     -rw-r--r-- 1 cp-kafka confluent 10485756 May 19 22:48 00000000000028393394.timeindex
>     -rw-r--r-- 1 cp-kafka confluent 15 May 19 22:47 leader-epoch-checkpoint
>
>
> Thank you!
>
> Sent with ProtonMail Secure Email.