Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2023/02/24 20:02:00 UTC

[jira] [Resolved] (KAFKA-10352) Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)

     [ https://issues.apache.org/jira/browse/KAFKA-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax resolved KAFKA-10352.
-------------------------------------
    Resolution: Fixed

> Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10352
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10352
>             Project: Kafka
>          Issue Type: Bug
>          Components: log cleaner
>            Reporter: Seongbae Chang
>            Priority: Critical
>
> One of my Kafka brokers (3 in total, version 2.5.0) shut down suddenly. Then the other brokers also shut down for similar reasons.
>  
> The main cause of this problem is '*Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)*
> *java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint*'
>  
> I don't know why this error occurs or how to fix it. Please give me some answers or comments about it. Thank you.
> I have attached the contents of the log files kafkaServer.out and log-cleaner.log.
>  
> *kafkaServer.out*
> {code:java}
> [2020-07-30 19:49:05,992] INFO [GroupMetadataManager brokerId=3] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2020-07-30 19:56:48,080] ERROR Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)
> java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint
>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>     at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>     at java.nio.file.Files.newByteChannel(Files.java:361)
>     at java.nio.file.Files.newByteChannel(Files.java:407)
>     at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
>     at java.nio.file.Files.newInputStream(Files.java:152)
>     at java.nio.file.Files.newBufferedReader(Files.java:2784)
>     at java.nio.file.Files.newBufferedReader(Files.java:2816)
>     at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:87)
>     at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
>     at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
>     at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$2(LogCleanerManager.scala:134)
>     at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:583)
>     at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:597)
>     at scala.collection.mutable.ListBuffer.addAll(ListBuffer.scala:118)
>     at scala.collection.mutable.ListBuffer$.from(ListBuffer.scala:38)
>     at scala.collection.immutable.List$.from(List.scala:617)
>     at scala.collection.immutable.List$.from(List.scala:611)
>     at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
>     at scala.collection.immutable.Iterable$.from(Iterable.scala:35)
>     at scala.collection.immutable.Iterable$.from(Iterable.scala:32)
>     at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
>     at scala.collection.IterableOps.flatMap(Iterable.scala:674)
>     at scala.collection.IterableOps.flatMap$(Iterable.scala:674)
>     at scala.collection.AbstractIterable.flatMap(Iterable.scala:921)
>     at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$1(LogCleanerManager.scala:132)
>     at kafka.log.LogCleanerManager.allCleanerCheckpoints(LogCleanerManager.scala:140)
>     at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$1(LogCleanerManager.scala:171)
>     at kafka.log.LogCleanerManager.grabFilthiestCompactedLog(LogCleanerManager.scala:168)
>     at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:327)
>     at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:314)
>     at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:303)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
> [2020-07-30 19:56:48,083] WARN [ReplicaManager broker=3] Stopping serving replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
> [2020-07-30 19:56:48,086] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions HashSet(__consumer_offsets-8, sbchang.test.partition-0, __consumer_offsets-47, sbchang.test.partition-2, sbchang.test.header-2, configtest-0, __ispossible-0, __consumer_offsets-32, __consumer_offsets-35, temp-iot-0, __consumer_offsets-41, __consumer_offsets-23, test-security-sasl-plain-001-0, __consumer_offsets-38, __consumer_offsets-17, test-security-ssl-001-0, sbchang.test.header-1, __consumer_offsets-11, __consumer_offsets-2, __consumer_offsets-14, resource-v1-CloudIoTCore-Rule-0, __consumer_offsets-20, __consumer_offsets-44, app001-transform-my001-0, sbchang.test.header-0, __consumer_offsets-5, __consumer_offsets-26, __consumer_offsets-29, sbchang.test.partition-1) (kafka.server.ReplicaFetcherManager)
> [2020-07-30 19:56:48,086] INFO [ReplicaAlterLogDirsManager on broker 3] Removed fetcher for partitions HashSet(__consumer_offsets-8, sbchang.test.partition-0, __consumer_offsets-47, sbchang.test.partition-2, sbchang.test.header-2, configtest-0, __ispossible-0, __consumer_offsets-32, __consumer_offsets-35, temp-iot-0, __consumer_offsets-41, __consumer_offsets-23, test-security-sasl-plain-001-0, __consumer_offsets-38, __consumer_offsets-17, test-security-ssl-001-0, sbchang.test.header-1, __consumer_offsets-11, __consumer_offsets-2, __consumer_offsets-14, resource-v1-CloudIoTCore-Rule-0, __consumer_offsets-20, __consumer_offsets-44, app001-transform-my001-0, sbchang.test.header-0, __consumer_offsets-5, __consumer_offsets-26, __consumer_offsets-29, sbchang.test.partition-1) (kafka.server.ReplicaAlterLogDirsManager)
> [2020-07-30 19:56:48,096] WARN [ReplicaManager broker=3] Broker 3 stopped fetcher for partitions __consumer_offsets-8,sbchang.test.partition-0,__consumer_offsets-47,sbchang.test.partition-2,sbchang.test.header-2,configtest-0,__ispossible-0,__consumer_offsets-32,__consumer_offsets-35,temp-iot-0,__consumer_offsets-41,__consumer_offsets-23,test-security-sasl-plain-001-0,__consumer_offsets-38,__consumer_offsets-17,test-security-ssl-001-0,sbchang.test.header-1,__consumer_offsets-11,__consumer_offsets-2,__consumer_offsets-14,resource-v1-CloudIoTCore-Rule-0,__consumer_offsets-20,__consumer_offsets-44,app001-transform-my001-0,sbchang.test.header-0,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,sbchang.test.partition-1 and stopped moving logs for partitions  because they are in the failed log directory /tmp/kafka-logs. (kafka.server.ReplicaManager)
> [2020-07-30 19:56:48,096] WARN Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
> [2020-07-30 19:56:48,098] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
> {code}
>  
> *log-cleaner.log*
> {code:java}
> [2020-07-30 19:56:48,083] ERROR Failed to access checkpoint file cleaner-offset-checkpoint in dir /tmp/kafka-logs (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.KafkaStorageException: Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint
> Caused by: java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint
>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>     at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>     at java.nio.file.Files.newByteChannel(Files.java:361)
>     at java.nio.file.Files.newByteChannel(Files.java:407)
>     at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
>     at java.nio.file.Files.newInputStream(Files.java:152)
>     at java.nio.file.Files.newBufferedReader(Files.java:2784)
>     at java.nio.file.Files.newBufferedReader(Files.java:2816)
>     at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:87)
>     at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
>     at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
>     at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$2(LogCleanerManager.scala:134)
>     at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:583)
>     at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:597)
>     at scala.collection.mutable.ListBuffer.addAll(ListBuffer.scala:118)
>     at scala.collection.mutable.ListBuffer$.from(ListBuffer.scala:38)
>     at scala.collection.immutable.List$.from(List.scala:617)
>     at scala.collection.immutable.List$.from(List.scala:611)
>     at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
>     at scala.collection.immutable.Iterable$.from(Iterable.scala:35)
>     at scala.collection.immutable.Iterable$.from(Iterable.scala:32)
>     at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
>     at scala.collection.IterableOps.flatMap(Iterable.scala:674)
>     at scala.collection.IterableOps.flatMap$(Iterable.scala:674)
>     at scala.collection.AbstractIterable.flatMap(Iterable.scala:921)
>     at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$1(LogCleanerManager.scala:132)
>     at kafka.log.LogCleanerManager.allCleanerCheckpoints(LogCleanerManager.scala:140)
>     at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$1(LogCleanerManager.scala:171)
>     at kafka.log.LogCleanerManager.grabFilthiestCompactedLog(LogCleanerManager.scala:168)
>     at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:327)
>     at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:314)
>     at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:303)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
> {code}
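
A common reason this checkpoint file goes missing is that the broker's data directory sits under /tmp, which many Linux distributions clean up periodically (systemd-tmpfiles or tmpwatch); once a file in the log directory cannot be read, the directory is marked failed and, as the logs above show, the broker shuts down because all of its log dirs have failed. A minimal sketch of the usual remedy, assuming a persistent location such as /var/lib/kafka/data is available (the path is only illustrative, not taken from this issue):

{code}
# server.properties (sketch): keep Kafka data out of /tmp so OS tmp-cleanup jobs
# cannot delete files such as cleaner-offset-checkpoint out from under the broker.
# /var/lib/kafka/data is an example path; any persistent directory works.
log.dirs=/var/lib/kafka/data
{code}

The broker only reads log.dirs at startup, so this change requires a restart, and any data still under /tmp/kafka-logs typically has to be copied to the new location or allowed to re-replicate from the other brokers before the old directory is removed.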



--
This message was sent by Atlassian Jira
(v8.20.10#820010)