Posted to jira@kafka.apache.org by "Wenbing Shen (Jira)" <ji...@apache.org> on 2020/12/18 05:58:00 UTC
[jira] [Comment Edited] (KAFKA-9458) Kafka crashed in windows environment
[ https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251530#comment-17251530 ]
Wenbing Shen edited comment on KAFKA-9458 at 12/18/20, 5:57 AM:
----------------------------------------------------------------
The current patch is insufficient. When a topic is deleted or a partition is migrated, the service is still suspended or the disk still goes offline. I have attached the following patch file, which worked in my own testing. My Kafka version is 2.0.0.
[^kafka_windows_crash_by_delete_topic_and_Partition_migration]
was (Author: wenbing.shen):
The current patch is insufficient. When a topic is deleted or a partition is migrated, the service is still suspended or the disk still goes offline. I have attached the following patch file, which worked in my own testing.
[^kafka_windows_crash_by_delete_topic_and_Partition_migration]
> Kafka crashed in windows environment
> ------------------------------------
>
> Key: KAFKA-9458
> URL: https://issues.apache.org/jira/browse/KAFKA-9458
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 2.4.0
> Environment: Windows Server 2019
> Reporter: hirik
> Priority: Critical
> Labels: windows
> Attachments: Windows_crash_fix.patch, kafka_windows_crash_by_delete_topic_and_Partition_migration, logs.zip
>
>
> Hi,
> While I was trying to validate the Kafka retention policy, the Kafka server crashed with the exception trace below.
> [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
> [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.
> at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
> at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
> at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
> at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
> at java.base/java.nio.file.Files.move(Files.java:1425)
> at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
> at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
> at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
> at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
> at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
> at scala.collection.immutable.List.foreach(List.scala:305)
> at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
> at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
> at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
> at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
> at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
> at kafka.log.Log.deleteSegments(Log.scala:1691)
> at kafka.log.Log.deleteOldSegments(Log.scala:1686)
> at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
> at kafka.log.Log.deleteOldSegments(Log.scala:1753)
> at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
> at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
> at scala.collection.immutable.List.foreach(List.scala:305)
> at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
> at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
> at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:830)
> Suppressed: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.
> at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
> at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
> at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
> at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
> at java.base/java.nio.file.Files.move(Files.java:1425)
> at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
> ... 27 more
> [2020-01-21 17:10:40,495] INFO [ReplicaManager broker=0] Stopping serving replicas in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.server.ReplicaManager)
> [2020-01-21 17:10:40,495] ERROR Uncaught exception in scheduled task 'kafka-log-retention' (kafka.utils.KafkaScheduler)
> org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for test1-3 in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka
> Caused by: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.
> at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
> at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
> at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
> at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
> at java.base/java.nio.file.Files.move(Files.java:1425)
> at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
> at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
> at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
> at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
> at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
> at scala.collection.immutable.List.foreach(List.scala:305)
> at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
> at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
> at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
> at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
> at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
> at kafka.log.Log.deleteSegments(Log.scala:1691)
> at kafka.log.Log.deleteOldSegments(Log.scala:1686)
> at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
> at kafka.log.Log.deleteOldSegments(Log.scala:1753)
> at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
> at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
> at scala.collection.immutable.List.foreach(List.scala:305)
> at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
> at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
> at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:830)
> Suppressed: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.
> at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
> at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
> at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
> at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
> at java.base/java.nio.file.Files.move(Files.java:1425)
> at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
> ... 27 more
> [2020-01-21 17:10:40,505] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, test1-1, test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) (kafka.server.ReplicaFetcherManager)
> [2020-01-21 17:10:40,507] INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, test1-1, test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) (kafka.server.ReplicaAlterLogDirsManager)
> [2020-01-21 17:10:40,522] INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions test1-3,test1-7,test-0,test1-0,test1-1,test1-5,test1-2,test1-8,test1-4,test1-9,test1-6 and stopped moving logs for partitions because they are in the failed log directory C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka. (kafka.server.ReplicaManager)
> [2020-01-21 17:10:40,523] INFO Stopping serving logs in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.log.LogManager)
> [2020-01-21 17:10:40,526] ERROR Shutdown broker because all log dirs in C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka have failed (kafka.log.LogManager)
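
For readers following the trace above: the failure surfaces in Utils.atomicMoveWithFallback, where a primary FileSystemException carries a second one as "Suppressed". The following is a minimal sketch of an "atomic move with fallback" pattern that would produce a trace of that shape. It is an illustration only, not Kafka's source and not the attached patch. The class name MoveWithFallbackSketch and the method moveWithFallback are hypothetical. The trace itself only shows a Windows sharing violation on the .timeindex file; that the file is still open or memory-mapped by the broker is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration class; not part of Kafka.
public class MoveWithFallbackSketch {

    // Try an atomic rename first; if that fails, retry as a plain move that
    // replaces the target. If both attempts fail, the second exception is
    // thrown and the first one is attached to it as a suppressed exception.
    public static void moveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException atomicFailure) {
            try {
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException fallbackFailure) {
                fallbackFailure.addSuppressed(atomicFailure);
                throw fallbackFailure;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Mimic the rename step from the trace: X.timeindex -> X.timeindex.deleted
        Path dir = Files.createTempDirectory("segment-sketch");
        Path index = Files.createFile(dir.resolve("00000000000000000000.timeindex"));
        Path deleted = dir.resolve("00000000000000000000.timeindex.deleted");

        moveWithFallback(index, deleted);
        System.out.println("renamed: " + Files.exists(deleted));
    }
}
```

In this sketch the rename succeeds because nothing else holds the file. When both attempts fail, as in the report, the caller sees a single exception with one suppressed entry. The log above shows Kafka wrapping that in a KafkaStorageException and marking the log directory as failed.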
--
This message was sent by Atlassian Jira
(v8.3.4#803005)