Posted to jira@kafka.apache.org by "Klearchos Chaloulos (JIRA)" <ji...@apache.org> on 2017/07/06 13:17:00 UTC

[jira] [Created] (KAFKA-5564) Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'

Klearchos Chaloulos created KAFKA-5564:
------------------------------------------

             Summary: Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'
                 Key: KAFKA-5564
                 URL: https://issues.apache.org/jira/browse/KAFKA-5564
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.9.0.1
            Reporter: Klearchos Chaloulos


Hello,

*Short version*
We have seen sporadic occurrences of the following issue: topics whose leader is a specific broker fail to be created properly, and it is impossible to produce to them or consume from them.
The following log line appears on the broker that is the leader of the faulty topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
{noformat}
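
The produce/consume failure can be reproduced with the console clients; the broker and ZooKeeper hostnames below are placeholders for our actual hosts:
{noformat}
# Hostnames are placeholders; neither command makes progress for a faulty topic
bin/kafka-console-producer.sh --broker-list kafka1:9092 --topic topic2
bin/kafka-console-consumer.sh --zookeeper zookeeper:2181/kafka --topic topic2 --from-beginning
{noformat}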

*Detailed version*:
Our setup consists of three brokers with ids 1, 2, 3. Broker 2 is the controller. We create 7 topics called topic1, topic2, topic3, topic4, topic5, topic6, topic7.
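
Each topic has one partition and a replication factor of 3. A minimal sketch of the creation command, assuming the same ZooKeeper connect string as in the --describe command further below (our actual deployment scripts may differ):
{noformat}
# Sketch of how each topic is created; repeated for topic1 .. topic7
bin/kafka-topics.sh --create --zookeeper zookeeper:2181/kafka \
    --topic topic1 --partitions 1 --replication-factor 3
{noformat}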

Sometimes (sporadically) some of the topics are faulty. In the particular example described here the faulty topics are topic6, topic4, topic2 and topic3. The faulty topics all have the same leader, broker 3.

If we run kafka-topics.sh --describe on the topics, we see that for the topics whose leader is not broker 3, the ISR list shows that broker 3 is not in sync:
{noformat}
 bin/kafka-topics.sh --describe --zookeeper zookeeper:2181/kafka
Topic:topic6	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic6	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
Topic:topic5	PartitionCount:1	ReplicationFactor:3	Configs:retention.ms=300000
	Topic: topic5	Partition: 0	Leader: 2	Replicas: 2,3,1	Isr: 2,1
Topic:topic7	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic7	Partition: 0	Leader: 1	Replicas: 1,3,2	Isr: 1,2
Topic:topic4	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic4	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
Topic:topic1	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic1	Partition: 0	Leader: 2	Replicas: 2,1,3	Isr: 2,1
Topic:topic2	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic2	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
Topic:topic3	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: topic3	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
{noformat}
For the faulty topics, on the other hand, all replicas are reported as in sync.

Also, the topic directories under the log.dir folder were not created on the faulty broker 3.
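
This can be checked directly in the broker's data directory; the path below is a placeholder for the configured log.dir value:
{noformat}
# /var/lib/kafka is a placeholder for the configured log.dir
ls /var/lib/kafka
# On broker 3 the directories topic2-0, topic3-0, topic4-0 and topic6-0
# are missing, while the directories for the healthy topics are present
{noformat}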

We see the following log line on broker 3, the leader of the faulty topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
{noformat}
The above message is logged continuously.

The other two brokers, which host the replicas, log the following error:
{noformat}
ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker 3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition
{noformat}
Again, the above message is logged continuously.
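
Both messages recur indefinitely; their volume can be gauged by grepping the server logs (the log file location below is installation dependent and is an assumption):
{noformat}
# Log path is a placeholder; count occurrences on the faulty broker 3
grep -c "While recording the replica LEO" /opt/kafka/logs/server.log
# And on the replica brokers 1 and 2
grep -c "UnknownTopicOrPartitionException" /opt/kafka/logs/server.log
{noformat}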

The issue described above occurs immediately after the deployment of the Kafka cluster.
A restart of the faulty broker (3 in this case) fixes the problem, and the faulty topics then work normally.
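
The restart workaround, sketched with the standard Kafka scripts (installation paths are assumptions):
{noformat}
# Run on the faulty broker; paths are assumptions
bin/kafka-server-stop.sh
bin/kafka-server-start.sh -daemon config/server.properties
{noformat}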

I have also attached the broker configuration we use.

Do you have any idea what might cause this issue?

Best regards,

Klearchos




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)