Posted to dev@kafka.apache.org by "David Lao (JIRA)" <ji...@apache.org> on 2013/11/26 20:02:42 UTC

[jira] [Created] (KAFKA-1145) Broker fail to sync after restart

David Lao created KAFKA-1145:
--------------------------------

             Summary: Broker fail to sync after restart
                 Key: KAFKA-1145
                 URL: https://issues.apache.org/jira/browse/KAFKA-1145
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8
            Reporter: David Lao
            Assignee: Neha Narkhede
            Priority: Critical


I'm hitting an issue where a freshly rejoined broker is stuck in a replication loop due to an error getting the offset.

The sequence of events is as follows:
1) broker-0 and broker-3 hold the logs for partition-1. broker-0 was the partition leader. broker-0 went down due to a machine failure (i.e. loss of the log data drive)
2) broker-3 became the leader for partition-1
3) broker-0 rejoined after the log drive was replaced

Exception observed on broker-0 upon rejoining:

kafka.common.KafkaStorageException: Deleting log segment 0 failed.
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:613)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:608)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:33)
        at kafka.log.Log.deleteSegments(Log.scala:608)
        at kafka.log.Log.truncateAndStartWithNewOffset(Log.scala:667)
        at kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:97)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:142)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:109)
        at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply$mcV$sp(AbstractFetcherThread.scala:109)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
        at kafka.utils.Utils$.inLock(Utils.scala:565)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:108)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:86)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

Logs are attached.



--
This message was sent by Atlassian JIRA
(v6.1#6144)