Posted to dev@kafka.apache.org by "Dan Swanson (JIRA)" <ji...@apache.org> on 2013/07/16 17:14:48 UTC

[jira] [Created] (KAFKA-975) Leader not local for partition when partition is leader

Dan Swanson created KAFKA-975:
---------------------------------

             Summary: Leader not local for partition when partition is leader
                 Key: KAFKA-975
                 URL: https://issues.apache.org/jira/browse/KAFKA-975
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8
         Environment: centos 6.4
            Reporter: Dan Swanson
            Assignee: Neha Narkhede


I have a two-server Kafka cluster (dev003 and dev004). I am following the example from this URL, but using two servers with a single Kafka instance each instead of one server with two instances.

http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/

I am using the following trunk version:

commit c27c768463a5dc6be113f2e5b3e00bf8d9d9d602
Author: David Arthur <mu...@gmail.com>
Date:   Thu Jul 11 15:34:57 2013 -0700

    KAFKA-852, remove clientId from Offset{Fetch,Commit}Response. Reviewed by Jay.

------

[2013-07-16 10:56:50,279] INFO [Kafka Server 3], started (kafka.server.KafkaServer)

------

dan@linux-rr29:~/git-data/kafka-current-src> bin/kafka-topics.sh --zookeeper dev003:2181 --create --topic dadj1 --partitions 1 --replication-factor 2 2>/dev/null
Created topic "dadj1".
dan@linux-rr29:~/git-data/kafka-current-src>

-------


[2013-07-16 10:56:57,946] INFO [Replica Manager on Broker 3]: Handling LeaderAndIsr request Name:LeaderAndIsrRequest;Version:0;Controller:4;ControllerEpoch:19;CorrelationId:12;ClientId:id_4-host_dev004-port_9092;PartitionState:(dadj1,0) -> (LeaderAndIsrInfo:(Leader:3,ISR:3,4,LeaderEpoch:0,ControllerEpoch:19),ReplicationFactor:2),AllReplicas:3,4);Leaders:id:3,host:dev003,port:9092 (kafka.server.ReplicaManager)
[2013-07-16 10:56:57,959] INFO [ReplicaFetcherManager on broker 3] Removing fetcher for partition [dadj1,0] (kafka.server.ReplicaFetcherManager)
[2013-07-16 10:57:21,196] WARN [KafkaApi-3] Produce request with correlation id 2 from client  on partition [dadj1,0] failed due to Leader not local for partition [dadj1,0] on broker 3 (kafka.server.KafkaApis)

-----

dan@linux-rr29:~/git-data/kafka-current-src> bin/kafka-topics.sh --zookeeper dev003:2181 --describe --topic dadj1 2>/dev/null
dadj1
	configs: 
	partitions: 1
		topic: dadj1	partition: 0	leader: 3	replicas: 3,4	isr: 3,4
dan@linux-rr29:~/git-data/kafka-current-src>

The dev003 logs show that the server was elected leader and has the correct id of 3, and ZooKeeper shows dev003 as the leader, but when I try to produce to the topic I get a failure because the server thinks it is not the leader. This occurs regardless of which server (dev003 or dev004) ends up as the leader.
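
To make the contradiction explicit, the following is a minimal sketch (not part of Kafka) that compares the partition line from the --describe output with the WARN line from the broker log. The function names and regular expressions are my own assumptions, written against the two lines quoted above; other Kafka versions may format these lines differently.

```python
import re


def parse_describe_line(line):
    """Turn a 'key: value' partition line from --describe output into a dict."""
    return dict(re.findall(r"(\w+):\s*(\S+)", line))


def parse_not_local_warning(line):
    """Extract topic, partition and broker id from a 'Leader not local' WARN line."""
    m = re.search(
        r"Leader not local for partition \[([^,\]]+),(\d+)\] on broker (\d+)", line
    )
    if m is None:
        return None
    return {"topic": m.group(1), "partition": int(m.group(2)), "broker": int(m.group(3))}


def broker_rejects_own_leadership(describe_line, warn_line):
    """True if the broker that logged the warning is the leader listed for that partition."""
    meta = parse_describe_line(describe_line)
    warn = parse_not_local_warning(warn_line)
    if warn is None or "leader" not in meta:
        return False
    return (
        meta.get("topic") == warn["topic"]
        and int(meta.get("partition", -1)) == warn["partition"]
        and int(meta["leader"]) == warn["broker"]
    )


# The two lines quoted in this report.
DESCRIBE = "topic: dadj1\tpartition: 0\tleader: 3\treplicas: 3,4\tisr: 3,4"
WARN = (
    "[2013-07-16 10:57:21,196] WARN [KafkaApi-3] Produce request with correlation id 2 "
    "from client  on partition [dadj1,0] failed due to Leader not local for partition "
    "[dadj1,0] on broker 3 (kafka.server.KafkaApis)"
)

if __name__ == "__main__":
    print(broker_rejects_own_leadership(DESCRIBE, WARN))
```

For the lines in this report the check is true: broker 3 is listed as leader of [dadj1,0] and is also the broker reporting that the leader is not local.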

Here is my config, which is the same on both servers except for the broker id and host name:

[root@dev003 kafka-current-src]# grep -v -e '^#' -e '^$' config/server.properties 
broker.id=3
port=9092
host.name=dev003
num.network.threads=2
 
num.io.threads=2
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=/opt/kafka/data/8.0/
num.partitions=1
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
zookeeper.connect=10.200.8.61:2181,10.200.8.62:2181,10.200.8.63:2181
zookeeper.connection.timeout.ms=1000000
kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false
[root@dev003 kafka-current-src]#
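
The claim that the two configs differ only in broker id and host name can be checked mechanically. This is a rough sketch under stated assumptions: it handles only simple key=value lines (no ':' separators or line continuations), and the dev004 values shown are inferred from the controller's ClientId in the log above (id_4-host_dev004-port_9092), not copied from the actual file. Only a few of the settings are reproduced here for brevity.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def differing_keys(a_text, b_text):
    """Return the sorted keys whose values differ, or that exist in only one config."""
    a, b = parse_properties(a_text), parse_properties(b_text)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# Excerpt of the dev003 config from this report.
DEV003 = """\
broker.id=3
port=9092
host.name=dev003
log.dir=/opt/kafka/data/8.0/
zookeeper.connect=10.200.8.61:2181,10.200.8.62:2181,10.200.8.63:2181
"""

# Assumed dev004 counterpart (hypothetical reconstruction, see note above).
DEV004 = """\
broker.id=4
port=9092
host.name=dev004
log.dir=/opt/kafka/data/8.0/
zookeeper.connect=10.200.8.61:2181,10.200.8.62:2181,10.200.8.63:2181
"""

if __name__ == "__main__":
    print(differing_keys(DEV003, DEV004))  # ['broker.id', 'host.name']
```

In practice the two strings would be read from each server's config/server.properties; any key other than broker.id and host.name in the result would point to an unintended difference between the brokers.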


