You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2013/01/11 19:20:12 UTC

[jira] [Resolved] (KAFKA-669) Irrecoverable error on leader while rolling to a new segment

     [ https://issues.apache.org/jira/browse/KAFKA-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede resolved KAFKA-669.
---------------------------------

    Resolution: Implemented
      Assignee: Neha Narkhede
    
> Irrecoverable error on leader while rolling to a new segment
> ------------------------------------------------------------
>
>                 Key: KAFKA-669
>                 URL: https://issues.apache.org/jira/browse/KAFKA-669
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>
> I setup a test cluster of 5 brokers and setup migration tool to send data for ~1000 partitions to this cluster. I see  a bunch of FailedToSendMessageExceptions on the producer. After tracing some of these failed requests, I found that the broker sends a ProducerResponseStatus(-1, -1) to the producer. The impact is the producer marks the messages for those partitions as failed, retries and eventually gives up. This seems like a bug in the truncation logic. I traced the state changes topic SpamReportEvent and seems like first it was the follower, so it truncated its log. Then it became the leader and got into an irrecoverable error state due to log rolling. 
> [2012-12-12 01:47:01,951] INFO [Log Manager on Broker 3] Loading log 'SpamReportEvent-0' (kafka.log.LogManager)
> [2012-12-12 01:47:01,951] INFO Creating log segment /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.log (kafka.log.FileMessageSet)
> [2012-12-12 01:47:01,951] INFO Created index file /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.index with maxEntries = 1310720, maxIndexSize = 10485760, entries = 1310720, lastOffset = 0 (kafka.log.OffsetIndex)
> [2012-12-12 01:47:02,949] INFO Replica Manager on Broker 3: Starting the follower state transition to follow leader 4 for topic SpamReportEvent partition 0 (kafka.server.ReplicaManager)
> [2012-12-12 01:47:02,950] INFO [Kafka Log on Broker 3], Truncated log segment /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.log to target offset 0 (kafka.log.Log)
> [2012-12-12 01:47:34,745] INFO Replica Manager on Broker 3: Becoming Leader for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:47:34,745] INFO Replica Manager on Broker 3: Completed the leader state transition for topic SpamReportEvent partition 0 (kafka.server.ReplicaManager)
> [2012-12-12 01:47:36,071] INFO Replica Manager on Broker 3: Becoming Leader for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:47:36,071] INFO Replica Manager on Broker 3: Completed the leader state transition for topic SpamReportEvent partition 0 (kafka.server.ReplicaManager)
> [2012-12-12 01:48:10,545] INFO Replica Manager on Broker 3: Becoming Leader for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:48:10,545] INFO Replica Manager on Broker 3: Completed the leader state transition for topic SpamReportEvent partition 0 (kafka.server.ReplicaManager)
> [2012-12-12 02:02:08,148] ERROR [KafkaApi-3] Error processing ProducerRequest on SpamReportEvent:0 (kafka.server.Kafka
> Apis)
> [2012-12-12 02:06:06,081] ERROR [KafkaApi-3] Error processing ProducerRequest on SpamReportEvent:0 (kafka.server.KafkaApis)
> kafka.common.KafkaException: Trying to roll a new log segment for topic partition SpamReportEvent-0 with start offset 0 while it already exists
>         at kafka.log.Log.rollToOffset(Log.scala:465)
>         at kafka.log.Log.roll(Log.scala:443)
>         at kafka.log.Log.maybeRoll(Log.scala:431)
>         at kafka.log.Log.append(Log.scala:258)
>         at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:246)
>         at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:239)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>         at scala.collection.immutable.Map$Map1.foreach(Map.scala:105)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
>         at scala.collection.immutable.Map$Map1.map(Map.scala:93)
>         at kafka.server.KafkaApis.appendToLocalLog(KafkaApis.scala:239)
>         at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:179)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:56)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:41)
>         at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira