Posted to jira@kafka.apache.org by "Kiran (Jira)" <ji...@apache.org> on 2021/07/07 05:37:00 UTC

[jira] [Commented] (KAFKA-9877) ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)

    [ https://issues.apache.org/jira/browse/KAFKA-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376281#comment-17376281 ] 

Kiran commented on KAFKA-9877:
------------------------------

I am seeing the same issue in Kafka 1.0 as well.

 

org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for testtopic-0 in dir /kafka-logs
Caused by: java.io.IOException: Delete of log 00000000000000000000.log.deleted failed.
 at kafka.log.LogSegment.delete(LogSegment.scala:496)
 at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
 at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
 at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
 at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
 at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
 at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
 at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
 at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

 

I have compaction enabled for the topic, with the config below:

segment.ms=100ms

delete.retention.ms=100ms

 

Also, I see a lot of the errors below.

ERROR Error while processing data for partition testtopic1-18 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.KafkaStorageException: Replica 3 is in an offline log directory for partition testtopic-10

 

> ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
> ------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9877
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9877
>             Project: Kafka
>          Issue Type: Bug
>          Components: log cleaner
>    Affects Versions: 2.1.1
>         Environment: Redhat
>            Reporter: Hawking Du
>            Priority: Major
>         Attachments: server-125.log
>
>
> This problem has been confusing me for a long time.
> The Kafka server often stops unexpectedly, apparently because of the log cleaning process. Here are some logs from the server. Can anyone give me some ideas for fixing it?
> {code:java}
> [2020-04-04 02:07:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2020-04-04 02:07:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2020-04-04 02:17:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2020-04-04 02:27:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2020-04-04 02:30:22,272] INFO [ProducerStateManager partition=__consumer_offsets-35] Writing producer snapshot at offset 741037 (kafka.log.ProducerStateManager)
> [2020-04-04 02:30:22,274] INFO [Log partition=__consumer_offsets-35, dir=/tmp/kafka-logs] Rolled new log segment at offset 741037 in 3 ms. (kafka.log.Log)
> [2020-04-04 02:30:26,289] ERROR Failed to clean up log for __consumer_offsets-35 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel)
> java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log
>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>     at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
>     at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
>     at java.nio.file.Files.move(Files.java:1395)
>     at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:815)
>     at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:224)
>     at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:508)
>     at kafka.log.Log.asyncDeleteSegment(Log.scala:1962)
>     at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:2025)
>     at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:2020)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at kafka.log.Log.replaceSegments(Log.scala:2020)
>     at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:602)
>     at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:528)
>     at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:527)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at kafka.log.Cleaner.doClean(LogCleaner.scala:527)
>     at kafka.log.Cleaner.clean(LogCleaner.scala:501)
>     at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:359)
>     at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:328)
>     at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:307)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
>     Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log -> /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log.deleted
>         at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>         at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>         at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
>         at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
>         at java.nio.file.Files.move(Files.java:1395)
>         at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:812)
>         ... 17 more
> [2020-04-04 02:30:26,296] INFO [ReplicaManager broker=5] Stopping serving replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
> [2020-04-04 02:30:26,302] INFO [ReplicaFetcherManager on broker 5] Removed fetcher for partitions Set(fitment-deduct-0, __consumer_offsets-22, __consumer_offsets-30, __consumer_offsets-4, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-46, __consumer_offsets-35, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, test-0, __consumer_offsets-31, __consumer_offsets-42, __consumer_offsets-3, __consumer_offsets-18, __consumer_offsets-15, __consumer_offsets-24, ajhz-log-0, __consumer_offsets-38, __consumer_offsets-19, __consumer_offsets-11, bpinfo-sync-0, spinfo-sync-backup-0, __consumer_offsets-2, __consumer_offsets-43, __consumer_offsets-6, __consumer_offsets-14, __consumer_offsets-44, __consumer_offsets-39, __consumer_offsets-26, __consumer_offsets-29, __consumer_offsets-34, __consumer_offsets-10, video-log-0) (kafka.server.ReplicaFetcherManager)
> [2020-04-04 02:30:26,303] INFO [ReplicaAlterLogDirsManager on broker 5] Removed fetcher for partitions Set(fitment-deduct-0, __consumer_offsets-22, __consumer_offsets-30, __consumer_offsets-4, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-46, __consumer_offsets-35, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, test-0, __consumer_offsets-31, __consumer_offsets-42, __consumer_offsets-3, __consumer_offsets-18, __consumer_offsets-15, __consumer_offsets-24, ajhz-log-0, __consumer_offsets-38, __consumer_offsets-19, __consumer_offsets-11, bpinfo-sync-0, spinfo-sync-backup-0, __consumer_offsets-2, __consumer_offsets-43, __consumer_offsets-6, __consumer_offsets-14, __consumer_offsets-44, __consumer_offsets-39, __consumer_offsets-26, __consumer_offsets-29, __consumer_offsets-34, __consumer_offsets-10, video-log-0) (kafka.server.ReplicaAlterLogDirsManager)
> [2020-04-04 02:30:26,330] INFO [ReplicaManager broker=5] Broker 5 stopped fetcher for partitions fitment-deduct-0,__consumer_offsets-22,__consumer_offsets-30,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-46,__consumer_offsets-35,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,test-0,__consumer_offsets-31,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,__consumer_offsets-15,__consumer_offsets-24,ajhz-log-0,__consumer_offsets-38,__consumer_offsets-19,__consumer_offsets-11,bpinfo-sync-0,spinfo-sync-backup-0,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,__consumer_offsets-44,__consumer_offsets-39,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,video-log-0 and stopped moving logs for partitions  because they are in the failed log directory /tmp/kafka-logs. (kafka.server.ReplicaManager)
> [2020-04-04 02:30:26,330] INFO Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
> [2020-04-04 02:30:26,347] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
> {code}


