Posted to dev@kafka.apache.org by "Harald Kirsch (JIRA)" <ji...@apache.org> on 2016/12/07 10:33:58 UTC

[jira] [Updated] (KAFKA-4502) Exception during startup, append offset to swap file

     [ https://issues.apache.org/jira/browse/KAFKA-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harald Kirsch updated KAFKA-4502:
---------------------------------
    Labels:   (was: log)

> Exception during startup, append offset to swap file
> ----------------------------------------------------
>
>                 Key: KAFKA-4502
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4502
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.2.0
>         Environment: Windows Server
>            Reporter: Harald Kirsch
>
> During startup, the Kafka server throws the exception shown below; some preceding log context is included.
> We are running the so-called SiphonRelease (https://github.com/Microsoft/Kafka/tree/SiphonRelease, https://issues.apache.org/jira/browse/KAFKA-1194?focusedCommentId=15702991), which tries to work around the LogCleaner's failure to rename and delete segments that are still memory-mapped by the broker.
> The trouble seems to be as follows: because the LogCleaner still crashes occasionally in the SiphonRelease, we have a monitoring script that detects the crash and restarts the Windows service (Apache procrun based) that runs Kafka. My hunch is that the combination of service restart and procrun does not let Kafka shut down cleanly, because on the next startup we get tons of messages like:
> {noformat}
> [2016-12-05 23:30:20,704] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (d:\Search\kafka\fileshare-0\00000000000000084814.index) has non-zero size but the last offset is 84814 which is no larger than the base offset 84814.}. deleting d:\Search\kafka\fileshare-0\00000000000000084814.timeindex, d:\Search\kafka\fileshare-0\00000000000000084814.index and rebuilding index... (kafka.log.Log)
> {noformat}
> Kafka seems able to repair these index files on its own, but my hunch is that a leftover .swap file then breaks startup as follows:
> {noformat}
> [2016-12-05 23:32:34,676] INFO Found log file d:\Search\kafka\windream-4\00000000000000000000.log.swap from interrupted swap operation, repairing. (kafka.log.Log)
> [2016-12-05 23:32:34,957] ERROR There was an error in one of the threads during logs loading: kafka.common.InvalidOffsetException: Attempt to append an offset (110460) to position 182 no larger than the last offset appended (110735) to d:\Search\kafka\windream-4\00000000000000000000.index.swap. (kafka.log.LogManager)
> [2016-12-05 23:32:34,957] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> kafka.common.InvalidOffsetException: Attempt to append an offset (110460) to position 182 no larger than the last offset appended (110735) to d:\Search\kafka\windream-4\00000000000000000000.index.swap.
> 	at kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:132)
> 	at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:122)
> 	at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:122)
> 	at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> 	at kafka.log.OffsetIndex.append(OffsetIndex.scala:122)
> 	at kafka.log.LogSegment.recover(LogSegment.scala:224)
> 	at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:248)
> 	at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:232)
> 	at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
> 	at kafka.log.Log.loadSegments(Log.scala:232)
> 	at kafka.log.Log.<init>(Log.scala:108)
> 	at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:151)
> 	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:58)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
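
The InvalidOffsetException in the trace reflects an ordering rule: every offset appended to an index must be larger than the last offset already appended. The following is a minimal Python sketch of that check, using a toy in-memory index rather than Kafka's actual code. SketchOffsetIndex and InvalidOffsetError are hypothetical names, the file positions 4096 and 8192 are made up, and reading the "position" in the log message as an entry slot is an assumption. Only the two offsets are taken from the log.

```python
class InvalidOffsetError(Exception):
    """Hypothetical stand-in for kafka.common.InvalidOffsetException."""


class SketchOffsetIndex:
    """Toy in-memory model of an offset index; not Kafka's implementation."""

    def __init__(self, base_offset):
        self.base_offset = base_offset
        self.entries = []  # (offset, file_position) pairs in append order

    @property
    def last_offset(self):
        # With no entries yet, fall back to the segment's base offset.
        return self.entries[-1][0] if self.entries else self.base_offset

    def append(self, offset, file_position):
        # The first entry is always accepted; every later entry must have
        # a strictly larger offset than the previous one.
        if self.entries and offset <= self.last_offset:
            slot = len(self.entries)  # assumed meaning of "position"
            raise InvalidOffsetError(
                f"Attempt to append an offset ({offset}) to position {slot} "
                f"no larger than the last offset appended ({self.last_offset})."
            )
        self.entries.append((offset, file_position))


# Replay the two offsets from the log: 110735 is already in the index
# when the recovery pass tries to append 110460.
index = SketchOffsetIndex(base_offset=0)
index.append(110735, 4096)  # file position is illustrative only
try:
    index.append(110460, 8192)
except InvalidOffsetError as err:
    print(err)
```

If the hunch in the report is right, the .swap segment contains records whose offsets go backwards relative to what recovery has already indexed, which a check of this kind would reject. The sketch does not show how the segment got into that state.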



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)