You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Tomislav Rajakovic (JIRA)" <ji...@apache.org> on 2019/08/07 13:17:00 UTC

[jira] [Created] (KAFKA-8764) LogCleanerManager endless loop while compacting/cleaning segments

Tomislav Rajakovic created KAFKA-8764:
-----------------------------------------

             Summary: LogCleanerManager endless loop while compacting/cleaning segments
                 Key: KAFKA-8764
                 URL: https://issues.apache.org/jira/browse/KAFKA-8764
             Project: Kafka
          Issue Type: Bug
          Components: log cleaner
    Affects Versions: 2.2.1
         Environment: docker base image: openjdk:8-jre-alpine base image, kafka from http://ftp.carnet.hr/misc/apache/kafka/2.2.1/kafka_2.12-2.2.1.tgz
            Reporter: Tomislav Rajakovic
         Attachments: log-cleaner-bug-reproduction.zip

{{LogCleanerManager stuck in endless loop while clearing segments for one partition resulting with many log outputs and heavy disk read/writes/IOPS.}}

 

Issue appeared on follower brokers, and it happens on every (new) broker if partition assignment is changed.

 

Original issue setup:
 * kafka_2.12-2.2.1 deployed as statefulset on kubernetes, 5 brokers
 * log directory is (AWS) EBS mounted PV, gp2 (ssd) kind of 750GiB
 * 5 zookeepers
 * topic created with config:
 ** name = "backup_br_domain_squad"
partitions = 36
replication_factor = 3

config = {
 "cleanup.policy" = "compact"
 "min.compaction.lag.ms" = "86400000"
 "min.cleanable.dirty.ratio" = "0.3"
}

 

 

Log excerpt:

{{[2019-08-07 12:10:53,895] INFO [Log partition=backup_br_domain_squad-14, dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
{{[2019-08-07 12:10:53,895] INFO Deleted log /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.log.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:53,896] INFO Deleted offset index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.index.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:53,896] INFO Deleted time index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:53,964] INFO [Log partition=backup_br_domain_squad-14, dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
{{[2019-08-07 12:10:53,964] INFO Deleted log /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.log.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:53,964] INFO Deleted offset index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.index.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:53,964] INFO Deleted time index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,031] INFO [Log partition=backup_br_domain_squad-14, dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
{{[2019-08-07 12:10:54,032] INFO Deleted log /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.log.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,032] INFO Deleted offset index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.index.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,032] INFO Deleted time index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,101] INFO [Log partition=backup_br_domain_squad-14, dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
{{[2019-08-07 12:10:54,101] INFO Deleted log /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.log.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,101] INFO Deleted offset index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.index.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,101] INFO Deleted time index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,173] INFO [Log partition=backup_br_domain_squad-14, dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
{{[2019-08-07 12:10:54,173] INFO Deleted log /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.log.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,173] INFO Deleted offset index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.index.deleted. (kafka.log.LogSegment)}}
{{[2019-08-07 12:10:54,173] INFO Deleted time index /var/lib/kafka/data/topics/backup_br_domain_squad-14/00000000000000000000.timeindex.deleted. (kafka.log.LogSegment)}}

 

And such log keeps repeating forever.

 

 

I've been able to extract segment files from (running) leader broker, and replicated same behaviour locally.
h1.  
h1. *Reproduction setup:*
 * start single broker kafka_2.12-2.2.1
 * create topic
 ** {{./bin/kafka-topics.sh --bootstrap-server *__BOOTSTRAP_SERVER__*:9092 --create --topic backup_br_domain_squad --partitions 1 --replication-factor 1 --config cleanup.policy=compact --config min.compaction.lag.ms=86400000 --config min.cleanable.dirty.ratio=0.3}}
 * stop broker
 * copy segment files (attachment content, under segments) to backup_br_domain_squad-0 log folder
 * rerun broker again

 

After rerun, fun starts with endless repeating logging outputs:

{{[2019-08-07 13:05:37,610] DEBUG Checking if log segment may be cleaned: log='backup_br_domain_squad-0' segment.baseOffset=0 segment.largestTimestamp=1563203503987; now - compactionLag=1565096737609; is uncleanable=false (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,610] DEBUG Finding range of cleanable offsets for log=backup_br_domain_squad-0 topicPartition=backup_br_domain_squad-0. Last clean offset=Some(0) now=1565183137609 => firstDirtyOffset=0 firstUncleanableOffset=233 activeSegment.baseOffset=233 (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,613] INFO Cleaner 0: Beginning cleaning of log backup_br_domain_squad-0. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,613] INFO Cleaner 0: Building offset map for backup_br_domain_squad-0... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,628] INFO Cleaner 0: Building offset map for log backup_br_domain_squad-0 for 1 segments in offset range [0, 233). (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,643] INFO Cleaner 0: Offset map for log backup_br_domain_squad-0 complete. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,644] INFO Cleaner 0: Cleaning log backup_br_domain_squad-0 (cleaning prior to Wed Aug 07 13:05:22 GMT 2019, discarding tombstones prior to Thu Jan 01 00:00:00 GMT 1970)... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,645] DEBUG Loaded index file /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.index.cleaned with maxEntries = 1310720, maxIndexSize = 10485760, entries = 0, lastOffset = 0, file position = 0 (kafka.log.OffsetIndex)}}
{{[2019-08-07 13:05:37,646] DEBUG Loaded index file /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.timeindex.cleaned with maxEntries = 873813, maxIndexSize = 10485760, entries = 0, lastOffset = TimestampOffset(-1,0), file position = 0 (kafka.log.TimeIndex)}}
{{[2019-08-07 13:05:37,647] INFO Cleaner 0: Cleaning segment 0 in log backup_br_domain_squad-0 (largest timestamp Mon Jul 15 15:11:43 GMT 2019) into 0, retaining deletes. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,656] DEBUG Resized /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.index.cleaned to 32, position is 32 and limit is 32 (kafka.log.OffsetIndex)}}
{{[2019-08-07 13:05:37,657] DEBUG Resized /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.timeindex.cleaned to 36, position is 36 and limit is 36 (kafka.log.TimeIndex)}}
{{[2019-08-07 13:05:37,669] INFO Cleaner 0: Swapping in cleaned segment LogSegment(baseOffset=0, size=1044021) for segment(s) List(LogSegment(baseOffset=0, size=1044021)) in log Log(/runtime/logs/kafka_1/backup_br_domain_squad-0) (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,672] INFO [kafka-log-cleaner-thread-0]:}}
{{ Log cleaner thread 0 cleaned log backup_br_domain_squad-0 (dirty section = [0, 0])}}
{{ 1.0 MB of log processed in 0.1 seconds (17.2 MB/sec).}}
{{ Indexed 1.0 MB in 0.0 seconds (33.2 Mb/sec, 51.7% of total time)}}
{{ Buffer utilization: 0.0%}}
{{ Cleaned 1.0 MB in 0.0 seconds (35.6 Mb/sec, 48.3% of total time)}}
{{ Start size: 1.0 MB (231 messages)}}
{{ End size: 1.0 MB (231 messages)}}
{{ 0.0% size reduction (0.0% fewer messages)}}
{{ (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,676] DEBUG Checking if log segment may be cleaned: log='backup_br_domain_squad-0' segment.baseOffset=0 segment.largestTimestamp=1563203503987; now - compactionLag=1565096737676; is uncleanable=false (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,677] DEBUG Finding range of cleanable offsets for log=backup_br_domain_squad-0 topicPartition=backup_br_domain_squad-0. Last clean offset=Some(232) now=1565183137676 => firstDirtyOffset=232 firstUncleanableOffset=233 activeSegment.baseOffset=233 (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,677] INFO Cleaner 0: Beginning cleaning of log backup_br_domain_squad-0. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,678] INFO Cleaner 0: Building offset map for backup_br_domain_squad-0... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,680] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,9082,ListenerName(INSIDE_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,680] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,9082,ListenerName(INSIDE_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,681] INFO Awaiting socket connections on s0.0.0.0:9092. (kafka.network.Acceptor)}}
{{[2019-08-07 13:05:37,681] INFO Awaiting socket connections on s0.0.0.0:9092. (kafka.network.Acceptor)}}
{{[2019-08-07 13:05:37,700] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,9092,ListenerName(OUTSIDE_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,700] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,9092,ListenerName(OUTSIDE_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,701] INFO Awaiting socket connections on s0.0.0.0:19092. (kafka.network.Acceptor)}}
{{[2019-08-07 13:05:37,701] INFO Cleaner 0: Building offset map for log backup_br_domain_squad-0 for 1 segments in offset range [232, 233). (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,701] INFO Awaiting socket connections on s0.0.0.0:19092. (kafka.network.Acceptor)}}
{{[2019-08-07 13:05:37,702] INFO Cleaner 0: Offset map for log backup_br_domain_squad-0 complete. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,702] INFO Cleaner 0: Cleaning log backup_br_domain_squad-0 (cleaning prior to Wed Aug 07 13:05:22 GMT 2019, discarding tombstones prior to Tue Aug 06 13:05:22 GMT 2019)... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,703] INFO [kafka-log-cleaner-thread-0]:}}
{{ Log cleaner thread 0 cleaned log backup_br_domain_squad-0 (dirty section = [232, 232])}}
{{ 0.0 MB of log processed in 0.0 seconds (0.0 MB/sec).}}
{{ Indexed 0.0 MB in 0.0 seconds (0.3 Mb/sec, 96.2% of total time)}}
{{ Buffer utilization: 0.0%}}
{{ Cleaned 0.0 MB in 0.0 seconds (0.0 Mb/sec, 3.8% of total time)}}
{{ Start size: 0.0 MB (0 messages)}}
{{ End size: 0.0 MB (0 messages)}}
{{ NaN% size reduction (NaN% fewer messages)}}
{{ (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,707] DEBUG Checking if log segment may be cleaned: log='backup_br_domain_squad-0' segment.baseOffset=0 segment.largestTimestamp=1563203503987; now - compactionLag=1565096737707; is uncleanable=false (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,708] DEBUG Finding range of cleanable offsets for log=backup_br_domain_squad-0 topicPartition=backup_br_domain_squad-0. Last clean offset=Some(0) now=1565183137707 => firstDirtyOffset=0 firstUncleanableOffset=233 activeSegment.baseOffset=233 (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,708] INFO Cleaner 0: Beginning cleaning of log backup_br_domain_squad-0. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,708] INFO Cleaner 0: Building offset map for backup_br_domain_squad-0... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,719] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,19092,ListenerName(MAC_OS_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,719] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(null,19092,ListenerName(MAC_OS_DOCKER),PLAINTEXT) (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,721] INFO Cleaner 0: Building offset map for log backup_br_domain_squad-0 for 1 segments in offset range [0, 233). (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,722] INFO [SocketServer brokerId=1001] Started 3 acceptor threads for data-plane (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,722] INFO [SocketServer brokerId=1001] Started 3 acceptor threads for data-plane (kafka.network.SocketServer)}}
{{[2019-08-07 13:05:37,730] INFO Cleaner 0: Offset map for log backup_br_domain_squad-0 complete. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,730] INFO Cleaner 0: Cleaning log backup_br_domain_squad-0 (cleaning prior to Wed Aug 07 13:05:22 GMT 2019, discarding tombstones prior to Thu Jan 01 00:00:00 GMT 1970)... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,732] DEBUG Loaded index file /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.index.cleaned with maxEntries = 1310720, maxIndexSize = 10485760, entries = 0, lastOffset = 0, file position = 0 (kafka.log.OffsetIndex)}}
{{[2019-08-07 13:05:37,733] DEBUG Loaded index file /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.timeindex.cleaned with maxEntries = 873813, maxIndexSize = 10485760, entries = 0, lastOffset = TimestampOffset(-1,0), file position = 0 (kafka.log.TimeIndex)}}
{{[2019-08-07 13:05:37,733] INFO Cleaner 0: Cleaning segment 0 in log backup_br_domain_squad-0 (largest timestamp Mon Jul 15 15:11:43 GMT 2019) into 0, retaining deletes. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,739] DEBUG Resized /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.index.cleaned to 32, position is 32 and limit is 32 (kafka.log.OffsetIndex)}}
{{[2019-08-07 13:05:37,739] DEBUG Resized /runtime/logs/kafka_1/backup_br_domain_squad-0/00000000000000000000.timeindex.cleaned to 36, position is 36 and limit is 36 (kafka.log.TimeIndex)}}
{{[2019-08-07 13:05:37,764] INFO Cleaner 0: Swapping in cleaned segment LogSegment(baseOffset=0, size=1044021) for segment(s) List(LogSegment(baseOffset=0, size=1044021)) in log Log(/runtime/logs/kafka_1/backup_br_domain_squad-0) (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,767] INFO [kafka-log-cleaner-thread-0]:}}
{{ Log cleaner thread 0 cleaned log backup_br_domain_squad-0 (dirty section = [0, 0])}}
{{ 1.0 MB of log processed in 0.1 seconds (17.2 MB/sec).}}
{{ Indexed 1.0 MB in 0.0 seconds (45.3 Mb/sec, 37.9% of total time)}}
{{ Buffer utilization: 0.0%}}
{{ Cleaned 1.0 MB in 0.0 seconds (27.7 Mb/sec, 62.1% of total time)}}
{{ Start size: 1.0 MB (231 messages)}}
{{ End size: 1.0 MB (231 messages)}}
{{ 0.0% size reduction (0.0% fewer messages)}}
{{ (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,772] DEBUG Checking if log segment may be cleaned: log='backup_br_domain_squad-0' segment.baseOffset=0 segment.largestTimestamp=1563203503987; now - compactionLag=1565096737771; is uncleanable=false (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,772] DEBUG Finding range of cleanable offsets for log=backup_br_domain_squad-0 topicPartition=backup_br_domain_squad-0. Last clean offset=Some(232) now=1565183137771 => firstDirtyOffset=232 firstUncleanableOffset=233 activeSegment.baseOffset=233 (kafka.log.LogCleanerManager$)}}
{{[2019-08-07 13:05:37,773] INFO Cleaner 0: Beginning cleaning of log backup_br_domain_squad-0. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,774] INFO Cleaner 0: Building offset map for backup_br_domain_squad-0... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,795] INFO [ExpirationReaper-1001-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,795] INFO [ExpirationReaper-1001-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,800] INFO [ExpirationReaper-1001-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,800] INFO [ExpirationReaper-1001-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,803] INFO [ExpirationReaper-1001-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,803] INFO [ExpirationReaper-1001-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,805] INFO [ExpirationReaper-1001-ElectPreferredLeader]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,805] INFO [ExpirationReaper-1001-ElectPreferredLeader]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)}}
{{[2019-08-07 13:05:37,806] INFO Cleaner 0: Building offset map for log backup_br_domain_squad-0 for 1 segments in offset range [232, 233). (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,807] INFO Cleaner 0: Offset map for log backup_br_domain_squad-0 complete. (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,808] INFO Cleaner 0: Cleaning log backup_br_domain_squad-0 (cleaning prior to Wed Aug 07 13:05:22 GMT 2019, discarding tombstones prior to Tue Aug 06 13:05:22 GMT 2019)... (kafka.log.LogCleaner)}}
{{[2019-08-07 13:05:37,809] INFO [kafka-log-cleaner-thread-0]:}}
{{ Log cleaner thread 0 cleaned log backup_br_domain_squad-0 (dirty section = [232, 232])}}
{{ 0.0 MB of log processed in 0.0 seconds (0.0 MB/sec).}}
{{ Indexed 0.0 MB in 0.0 seconds (0.2 Mb/sec, 97.1% of total time)}}
{{ Buffer utilization: 0.0%}}
{{ Cleaned 0.0 MB in 0.0 seconds (0.0 Mb/sec, 2.9% of total time)}}
{{ Start size: 0.0 MB (0 messages)}}
{{ End size: 0.0 MB (0 messages)}}
{{ NaN% size reduction (NaN% fewer messages)}}
{{ (kafka.log.LogCleaner)}}

 

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)