Posted to users@kafka.apache.org by Anatoly Deyneka <ad...@gmail.com> on 2014/04/02 14:53:42 UTC

kafka availability

Hi,

I am running availability tests on the following Kafka setup:
- 3 nodes (2f+1) for ZooKeeper (zookeeper1, zookeeper2, zookeeper3)
- 2 nodes (f+1) for Kafka brokers (id:0,host:kafka.broker1,port:9092;
id:1,host:kafka.broker2,port:9092)
I use the Java producer and the console consumer.

Test steps:
1. create topic test.app4

bin/kafka-topics.sh --zookeeper zookeeper1 --create --replication-factor 2
--partitions 1 --topic test.app4

bin/kafka-topics.sh --zookeeper zookeeper1 --topic test.app4 --describe
Topic:test.app4    PartitionCount:1    ReplicationFactor:2    Configs:
    Topic: test.app4    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1

2. shut down kafka.broker2 (the leader for the topic); kafka.broker1 stays alive
--> the consumer fails in an infinite loop:

WARN Fetching topic metadata with correlation id 13 for topics
[Set(test.app4)] from broker [id:0,host:kafka.broker1,port:9092] failed
(kafka.client.ClientUtils$)
java.net.ConnectException: Connection refused
...
WARN
[console-consumer-95116_ad-laptop-1396433768258-bb700f4c-leader-finder-thread],
Failed to find leader for Set([test.app4,0])
(kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
kafka.common.KafkaException: fetching topic metadata for topics
[Set(test.app4)] from broker
[ArrayBuffer(id:0,host:kafka.broker1,port:9092)] failed
...
Caused by: java.net.ConnectException: Connection refused
...
ERROR Producer connection to kafka.broker1:9092 unsuccessful
(kafka.producer.SyncProducer)
java.net.ConnectException: Connection refused
...

bin/kafka-topics.sh --zookeeper zookeeper1 --topic test.app4 --describe
Topic:test.app4    PartitionCount:1    ReplicationFactor:2    Configs:
    Topic: test.app4    Partition: 0    Leader: 0    Replicas: 1,0    Isr: 0

The Java producer errors:
INFO  [kafka.producer.async.DefaultEventHandler] Back off for 100 ms before
retrying send. Remaining retries = 2
INFO  [kafka.client.ClientUtils$] Fetching metadata from broker
id:0,host:kafka.broker1,port:9092 with correlation id 236 for 1 topic(s)
Set(test.app4)
ERROR [kafka.producer.SyncProducer] Producer connection to
kafka.broker1:9092 unsuccessful
java.net.ConnectException: Connection refused
    ...
WARN  [kafka.client.ClientUtils$] Fetching topic metadata with correlation
id 236 for topics [Set(test.app4)] from broker
[id:0,host:kafka.broker1,port:9092] failed
java.net.ConnectException: Connection refused
    ...
INFO  [kafka.client.ClientUtils$] Fetching metadata from broker
id:1,host:kafka.broker2,port:9092 with correlation id 236 for 1 topic(s)
Set(test.app4)
ERROR [kafka.producer.SyncProducer] Producer connection to
kafka.broker2:9092 unsuccessful
java.net.ConnectException: Connection refused
   ...
WARN  [kafka.client.ClientUtils$] Fetching topic metadata with correlation
id 236 for topics [Set(test.app4)] from broker
[id:1,host:kafka.broker2,port:9092] failed
java.net.ConnectException: Connection refused
    ...
ERROR [kafka.utils.Utils$] fetching topic metadata for topics
[Set(test.app4)] from broker
[ArrayBuffer(id:0,host:kafka.broker1,port:9092,
id:1,host:kafka.broker2,port:9092)] failed
kafka.common.KafkaException: fetching topic metadata for topics
[Set(test.app4)] from broker
[ArrayBuffer(id:0,host:kafka.broker1,port:9092,
id:1,host:kafka.broker2,port:9092)] failed
    ...
Caused by: java.net.ConnectException: Connection refused
    ...
2014-04-02 13:31:12,419 DEBUG [kafka.producer.BrokerPartitionInfo] Getting
broker partition info for topic test.app4
2014-04-02 13:31:12,419 DEBUG [kafka.producer.BrokerPartitionInfo]
Partition [test.app4,0] has leader 1
2014-04-02 13:31:12,419 DEBUG [kafka.producer.async.DefaultEventHandler]
Broker partitions registered for topic: test.app4 are 0
2014-04-02 13:31:12,419 DEBUG [kafka.producer.async.DefaultEventHandler]
Sending 1 messages with no compression to [test.app4,0]
2014-04-02 13:31:12,420 DEBUG [kafka.producer.async.DefaultEventHandler]
Producer sending messages with correlation id 238 for topics [test.app4,0]
to broker 1 on kafka.broker2:9092
2014-04-02 13:31:12,423 ERROR [kafka.producer.SyncProducer] Producer
connection to kafka.broker2:9092 unsuccessful
java.net.ConnectException: Connection refused
    ...
WARN  [kafka.producer.async.DefaultEventHandler] Failed to send producer
request with correlation id 238 to broker 1 with data for partitions
[test.app4,0]
java.net.ConnectException: Connection refused
    ...
INFO  [kafka.producer.async.DefaultEventHandler] Back off for 100 ms before
retrying send. Remaining retries = 1

3. start up kafka.broker2
--> it fails in an infinite loop too:

INFO Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)
WARN [ReplicaFetcherThread-0-0], Error in fetch Name: FetchRequest;
Version: 0; CorrelationId: 183; ClientId: ReplicaFetcherThread-0-0;
ReplicaId: 1; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo:
[test.app4,0] -> PartitionFetchInfo(2,1048576)
(kafka.server.ReplicaFetcherThread)
java.net.ConnectException: Connection refused
    ...
[2014-04-02 12:56:33,816] INFO Reconnect due to socket error: null
(kafka.consumer.SimpleConsumer)

5. shut down kafka.broker1
--> kafka.broker2 completes its initialization. The system works fine.

6. start up kafka.broker1
--> The system works fine.


Please advise how to achieve high availability and what is wrong in this
case.

Regards,
Anatoly

Re: kafka availability

Posted by Jun Rao <ju...@gmail.com>.
bin/kafka-topics.sh --zookeeper zookeeper1 --topic test.app4 --describe
Topic:test.app4    PartitionCount:1    ReplicationFactor:2    Configs:
    Topic: test.app4    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1

Was broker 0 up when you ran the above? Normally, when both brokers are up,
the ISR should include both.

Thanks,

Jun
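
The check described in this reply (the Isr list should contain every broker
in the Replicas list when all brokers are up) can be automated before running
a failover test. The following is only a minimal sketch in Python: it assumes
the `--describe` output has the same layout as the output pasted in this
thread, and the function names parse_describe and under_replicated are
hypothetical helpers, not part of Kafka.

```python
import re

# Matches a partition line as printed in this thread, e.g.
#   Topic: test.app4    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1
# The summary line (PartitionCount, ReplicationFactor) does not match.
PARTITION_LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)"
    r"\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_describe(output):
    """Return one dict per partition line found in the describe output."""
    partitions = []
    for line in output.splitlines():
        m = PARTITION_LINE.search(line)
        if m:
            partitions.append({
                "topic": m.group(1),
                "partition": int(m.group(2)),
                "leader": int(m.group(3)),
                "replicas": [int(x) for x in m.group(4).split(",")],
                "isr": [int(x) for x in m.group(5).split(",") if x],
            })
    return partitions


def under_replicated(partitions):
    """Return the partitions whose ISR does not contain every replica."""
    return [p for p in partitions if set(p["isr"]) != set(p["replicas"])]


# Output copied from step 1 of the original message.
SAMPLE = """Topic:test.app4    PartitionCount:1    ReplicationFactor:2    Configs:
    Topic: test.app4    Partition: 0    Leader: 1    Replicas: 1,0    Isr: 1
"""

for p in under_replicated(parse_describe(SAMPLE)):
    missing = sorted(set(p["replicas"]) - set(p["isr"]))
    print("%s-%d: brokers missing from ISR: %s"
          % (p["topic"], p["partition"], missing))
```

For the step 1 output this reports broker 0 as missing from the ISR, which is
the condition asked about above.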

