Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2013/04/11 00:41:15 UTC

[jira] [Commented] (KAFKA-860) Replica fetcher thread errors out and dies during rolling bounce of cluster

    [ https://issues.apache.org/jira/browse/KAFKA-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628376#comment-13628376 ] 

Neha Narkhede commented on KAFKA-860:
-------------------------------------

This is caused by a race condition between the old leader's local append to the log and the new follower's log truncation. Specifically, the following sequence triggers the bug:

1. The current leader receives a produce request.
2. The broker receives a leader and ISR request that makes it a follower.
3. The broker starts the become-follower transition and truncates its log.
4. The broker's produce path, not knowing that the broker is no longer the leader, continues with the produce request and appends some data to the log.
5. The become-follower transition starts a fetcher with the old log end offset.

At step 5, the fetcher runs into the offset mismatch error shown in the issue description.
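
The interleaving in steps 1-5 can be replayed deterministically with a toy model. This is only a rough sketch in Python, not Kafka code: PartitionLog, process_partition_data, simulate_race and the truncation point of 6 are hypothetical names and values made up for illustration. It also assumes that the fetcher's start offset is captured right after truncation and before the stale append lands. The batch of 8 pending messages was picked to match the gap between the two offsets in the stack trace.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy in-memory log (hypothetical); a message's list index is its offset."""
    messages: list = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        return len(self.messages)

    def append(self, batch):
        self.messages.extend(batch)

    def truncate_to(self, offset):
        del self.messages[offset:]


def process_partition_data(log, fetch_offset, batch):
    """Follower-side check, modelled loosely on the error text in the trace."""
    if fetch_offset != log.log_end_offset:
        raise RuntimeError(
            f"Offset mismatch: fetched offset = {fetch_offset}, "
            f"log end offset = {log.log_end_offset}."
        )
    log.append(batch)


def simulate_race():
    log = PartitionLog(messages=[f"m{i}" for i in range(10)])
    truncation_point = 6  # assumed value, chosen only for the example

    # Step 1: the leader receives a produce request (not appended yet).
    pending = [f"p{i}" for i in range(8)]

    # Steps 2-3: the leader and ISR request arrives; become-follower truncates.
    log.truncate_to(truncation_point)
    fetch_offset = log.log_end_offset  # assumption: captured here, before step 4

    # Step 4: the stale produce path still appends to the truncated log.
    log.append(pending)

    # Step 5: the fetcher starts from the now outdated log end offset.
    try:
        process_partition_data(log, fetch_offset, ["from-new-leader"])
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(simulate_race())
```

If the append in step 4 is skipped, the fetched offset and the log end offset agree and the check passes, which is why the error only shows up when the produce request overlaps the leadership change.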
                
> Replica fetcher thread errors out and dies during rolling bounce of cluster
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-860
>                 URL: https://issues.apache.org/jira/browse/KAFKA-860
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: kafka-0.8, p1
>
> 2013/04/10 20:04:32.071 ERROR [ReplicaFetcherThread] [ReplicaFetcherThread-0-272] [kafka] [] [ReplicaFetcherThread-0-272], Error due to 
> kafka.common.KafkaException: error processing data for topic PageViewEvent partititon 3 offset 2482625623
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$4.apply(AbstractFetcherThread.scala:135)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$4.apply(AbstractFetcherThread.scala:113)
>         at scala.collection.immutable.Map$Map1.foreach(Map.scala:105)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:113)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:89)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> Caused by: java.lang.RuntimeException: Offset mismatch: fetched offset = 2482625623, log end offset = 2482625631.
>         at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:49)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$4.apply(AbstractFetcherThread.scala:132)
>         ... 5 more
> This causes the replica fetcher thread to shut down.
