Posted to dev@kafka.apache.org by "Seongbae Chang (Jira)" <ji...@apache.org> on 2020/08/04 06:36:00 UTC

[jira] [Created] (KAFKA-10352) Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)

Seongbae Chang created KAFKA-10352:
--------------------------------------

             Summary: Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)
                 Key: KAFKA-10352
                 URL: https://issues.apache.org/jira/browse/KAFKA-10352
             Project: Kafka
          Issue Type: Bug
          Components: log cleaner
            Reporter: Seongbae Chang


One of my Kafka brokers (3 in total, version 2.5.0) shut down suddenly. The other brokers then also shut down for similar reasons.

 

The main cause of this problem is '*Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)*
*java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint*'

 

I don't know why this error occurs or how to solve it. Any answers or comments would be appreciated. Thank you.

I have attached the contents of the log files kafkaServer.out and log-cleaner.log below.

 

*kafkaServer.out*
{code:java}
[2020-07-30 19:49:05,992] INFO [GroupMetadataManager brokerId=3] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2020-07-30 19:56:48,080] ERROR Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at java.nio.file.Files.newBufferedReader(Files.java:2816)
    at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:87)
    at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
    at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
    at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$2(LogCleanerManager.scala:134)
    at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:583)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:597)
    at scala.collection.mutable.ListBuffer.addAll(ListBuffer.scala:118)
    at scala.collection.mutable.ListBuffer$.from(ListBuffer.scala:38)
    at scala.collection.immutable.List$.from(List.scala:617)
    at scala.collection.immutable.List$.from(List.scala:611)
    at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
    at scala.collection.immutable.Iterable$.from(Iterable.scala:35)
    at scala.collection.immutable.Iterable$.from(Iterable.scala:32)
    at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
    at scala.collection.IterableOps.flatMap(Iterable.scala:674)
    at scala.collection.IterableOps.flatMap$(Iterable.scala:674)
    at scala.collection.AbstractIterable.flatMap(Iterable.scala:921)
    at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$1(LogCleanerManager.scala:132)
    at kafka.log.LogCleanerManager.allCleanerCheckpoints(LogCleanerManager.scala:140)
    at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$1(LogCleanerManager.scala:171)
    at kafka.log.LogCleanerManager.grabFilthiestCompactedLog(LogCleanerManager.scala:168)
    at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:327)
    at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:314)
    at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:303)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
[2020-07-30 19:56:48,083] WARN [ReplicaManager broker=3] Stopping serving replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
[2020-07-30 19:56:48,086] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions HashSet(__consumer_offsets-8, sbchang.test.partition-0, __consumer_offsets-47, sbchang.test.partition-2, sbchang.test.header-2, configtest-0, __ispossible-0, __consumer_offsets-32, __consumer_offsets-35, temp-iot-0, __consumer_offsets-41, __consumer_offsets-23, test-security-sasl-plain-001-0, __consumer_offsets-38, __consumer_offsets-17, test-security-ssl-001-0, sbchang.test.header-1, __consumer_offsets-11, __consumer_offsets-2, __consumer_offsets-14, resource-v1-CloudIoTCore-Rule-0, __consumer_offsets-20, __consumer_offsets-44, app001-transform-my001-0, sbchang.test.header-0, __consumer_offsets-5, __consumer_offsets-26, __consumer_offsets-29, sbchang.test.partition-1) (kafka.server.ReplicaFetcherManager)
[2020-07-30 19:56:48,086] INFO [ReplicaAlterLogDirsManager on broker 3] Removed fetcher for partitions HashSet(__consumer_offsets-8, sbchang.test.partition-0, __consumer_offsets-47, sbchang.test.partition-2, sbchang.test.header-2, configtest-0, __ispossible-0, __consumer_offsets-32, __consumer_offsets-35, temp-iot-0, __consumer_offsets-41, __consumer_offsets-23, test-security-sasl-plain-001-0, __consumer_offsets-38, __consumer_offsets-17, test-security-ssl-001-0, sbchang.test.header-1, __consumer_offsets-11, __consumer_offsets-2, __consumer_offsets-14, resource-v1-CloudIoTCore-Rule-0, __consumer_offsets-20, __consumer_offsets-44, app001-transform-my001-0, sbchang.test.header-0, __consumer_offsets-5, __consumer_offsets-26, __consumer_offsets-29, sbchang.test.partition-1) (kafka.server.ReplicaAlterLogDirsManager)
[2020-07-30 19:56:48,096] WARN [ReplicaManager broker=3] Broker 3 stopped fetcher for partitions __consumer_offsets-8,sbchang.test.partition-0,__consumer_offsets-47,sbchang.test.partition-2,sbchang.test.header-2,configtest-0,__ispossible-0,__consumer_offsets-32,__consumer_offsets-35,temp-iot-0,__consumer_offsets-41,__consumer_offsets-23,test-security-sasl-plain-001-0,__consumer_offsets-38,__consumer_offsets-17,test-security-ssl-001-0,sbchang.test.header-1,__consumer_offsets-11,__consumer_offsets-2,__consumer_offsets-14,resource-v1-CloudIoTCore-Rule-0,__consumer_offsets-20,__consumer_offsets-44,app001-transform-my001-0,sbchang.test.header-0,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,sbchang.test.partition-1 and stopped moving logs for partitions  because they are in the failed log directory /tmp/kafka-logs. (kafka.server.ReplicaManager)
[2020-07-30 19:56:48,096] WARN Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
[2020-07-30 19:56:48,098] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
{code}
 

*log-cleaner.log*
{code:java}
[2020-07-30 19:56:48,083] ERROR Failed to access checkpoint file cleaner-offset-checkpoint in dir /tmp/kafka-logs (kafka.log.LogCleaner)
org.apache.kafka.common.errors.KafkaStorageException: Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint
Caused by: java.nio.file.NoSuchFileException: /tmp/kafka-logs/cleaner-offset-checkpoint
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at java.nio.file.Files.newBufferedReader(Files.java:2816)
    at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:87)
    at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
    at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
    at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$2(LogCleanerManager.scala:134)
    at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:583)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:597)
    at scala.collection.mutable.ListBuffer.addAll(ListBuffer.scala:118)
    at scala.collection.mutable.ListBuffer$.from(ListBuffer.scala:38)
    at scala.collection.immutable.List$.from(List.scala:617)
    at scala.collection.immutable.List$.from(List.scala:611)
    at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
    at scala.collection.immutable.Iterable$.from(Iterable.scala:35)
    at scala.collection.immutable.Iterable$.from(Iterable.scala:32)
    at scala.collection.IterableFactory$Delegate.from(Factory.scala:288)
    at scala.collection.IterableOps.flatMap(Iterable.scala:674)
    at scala.collection.IterableOps.flatMap$(Iterable.scala:674)
    at scala.collection.AbstractIterable.flatMap(Iterable.scala:921)
    at kafka.log.LogCleanerManager.$anonfun$allCleanerCheckpoints$1(LogCleanerManager.scala:132)
    at kafka.log.LogCleanerManager.allCleanerCheckpoints(LogCleanerManager.scala:140)
    at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$1(LogCleanerManager.scala:171)
    at kafka.log.LogCleanerManager.grabFilthiestCompactedLog(LogCleanerManager.scala:168)
    at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:327)
    at kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:314)
    at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:303)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
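
For anyone who wants to check the same condition on their own broker host, here is a minimal sketch. It is not Kafka code: the class name CheckpointReadSketch and the method describeRead are hypothetical. It only opens the path with the same Files.newBufferedReader call that appears in the stack traces above, and reports whether the file is missing at that moment. It says nothing about why the file disappeared.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

class CheckpointReadSketch {
    // Hypothetical helper (not part of Kafka): opens the file the same way the
    // stack trace does and returns a short description of the outcome.
    static String describeRead(Path checkpoint) {
        try (BufferedReader reader = Files.newBufferedReader(checkpoint)) {
            // Only the first line is read; the file format is not interpreted here.
            return "readable, first line: " + reader.readLine();
        } catch (NoSuchFileException e) {
            // Same exception type as in kafkaServer.out and log-cleaner.log.
            return "missing: " + e.getMessage();
        } catch (IOException e) {
            return "unreadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Defaults to the path from this report; pass another path as the first argument.
        Path checkpoint = Paths.get(
                args.length > 0 ? args[0] : "/tmp/kafka-logs/cleaner-offset-checkpoint");
        System.out.println(describeRead(checkpoint));
    }
}
```

Running it against the log directory while the broker is up should show whether cleaner-offset-checkpoint is present at that moment.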



--
This message was sent by Atlassian Jira
(v8.3.4#803005)