Posted to users@kafka.apache.org by s....@fluent-software.de on 2020/07/06 14:37:18 UTC

Problem with replication?!

Hi there, 

 

I have a problem with my Kafka brokers. It may be a firewall issue, but I am not sure. I have three brokers on three different servers (each with its own IP address), and ZooKeeper runs on the first server:

 

Server1:9092 (zookeeper:2182)

Server2:9092

Server3:9092

 

I have a topic with a replication factor of three. When I try to publish new messages to that topic, I get the following error:

 

Confluent.Kafka.ProduceException`2[System.Byte[],System.String]: Local: Message timed out

   at Confluent.Kafka.Producer`2.ProduceAsync(TopicPartition topicPartition, Message`2 message, CancellationToken cancellationToken)

   at FluentSoftware.EventSourcing.KafkaProducerEventPublisher.PublishEvents[TEvent](String topic, IEnumerable`1 eventsToApply)

   at FluentSoftware.EventSourcing.CommandBus.SendAsync[TCommand](TCommand command)

   at Heliprinter.Mediator.Communication.OrderService.Controllers.OrderController.Post(PostOrderViewModel newOrder) in C:\agent\_work\15\s\Source\Services\Heliprinter.Mediator.Communication.OrderService\Controllers\OrderController.cs:line 55

 

and the REST API responds with a gateway timeout. Could anyone tell me:

* Could this error be a replication issue?
* How can I debug this issue?
* Where can I find replication details in the logs?

 

Kind regards, 

Sebastian. 
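
On the question of where replication details can be seen: besides the broker logs, the `kafka-topics.sh` tool that ships with Kafka can describe a topic and print the leader, replica list and in-sync replicas (ISR) for each partition. The snippet below is only a sketch. The function name `describe_topic` and the topic name `orders` are made up for illustration, and the default script directory is assumed from the path used elsewhere in this thread.

```shell
# Hypothetical wrapper around the stock kafka-topics.sh script.
# KAFKA_BIN may be set to override the assumed install directory.
describe_topic() {
  bootstrap="$1"   # broker address, e.g. Server1:9092
  topic="$2"       # topic to inspect (name is an assumption)
  "${KAFKA_BIN:-/home/kafka/kafka/bin}/kafka-topics.sh" \
    --describe --bootstrap-server "$bootstrap" --topic "$topic"
}

# Example call (needs a reachable broker):
# describe_topic Server1:9092 orders
```

For a healthy topic with a replication factor of three, the `Isr` column should list all three broker ids. If the command itself times out, the problem lies in reaching the brokers rather than in replication.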

 


AW: AW: Problem with replication?!

Posted by Sebastian Schabbach <in...@fluent-software.de>.
Hi, 

Here is what the config looks like:

############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9092
#       advertised.listeners=193.135.9.23:9092



Connecting via localhost was just a test; I need to connect via IP address and port only. Could you tell me what the correct config would look like?

Kind regards, 
Sebastian
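
A note on the value format: Kafka listener entries take the form `NAME://host:port`. The commented-out `advertised.listeners` value above has no such prefix, so it would most likely fail to parse if it were simply uncommented; it would need something like `PLAINTEXT://` in front. The helper below is a rough sketch, not part of Kafka. The name `is_valid_listener` is invented, and it only checks the shape of a single entry (no comma-separated lists, no IPv6 literals).

```shell
# Hypothetical shape check for one listener entry: NAME://host:port
# The host part may be empty, as in listeners=PLAINTEXT://:9092.
is_valid_listener() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*://[^:/,]*:[0-9]+$'
}

# Example calls:
# is_valid_listener "PLAINTEXT://193.135.9.23:9092" && echo "looks fine"
# is_valid_listener "193.135.9.23:9092" || echo "missing NAME:// prefix"
```

If `advertised.listeners` is left unset, the broker advertises whatever `listeners` resolves to, which with an empty host is the machine's canonical host name. Clients must be able to resolve and reach that name.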


-----Original Message-----
From: Ricardo Ferreira [mailto:riferrei@riferrei.com] 
Sent: Wednesday, 8 July 2020 16:20
To: s.schabbach@fluent-software.de; users@kafka.apache.org
Subject: Re: AW: Problem with replication?!

Hi Sebastian,

Something you can investigate here is what value has been set for the configuration property `advertised.listeners`. Your client is trying to establish a connection on port 9092 using the `127.0.0.1` interface. Check whether this is a valid listener for the Kafka broker.

Thanks,

-- Ricardo



Re: AW: Problem with replication?!

Posted by Ricardo Ferreira <ri...@riferrei.com>.
Hi Sebastian,

Something you can investigate here is what value has been set for the 
configuration property `advertised.listeners`. Your client is trying to 
establish a connection on port 9092 using the `127.0.0.1` 
interface. Check whether this is a valid listener for the Kafka broker.

Thanks,

-- Ricardo
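
One way to carry out this check is to print only the active (uncommented) listener settings from the broker's properties file. This is a minimal sketch: the function name `show_listeners` is invented, and the file location `/home/kafka/kafka/config/server.properties` is an assumption based on the install path mentioned in this thread.

```shell
# Hypothetical helper: print the uncommented listeners and
# advertised.listeners lines from a broker properties file.
show_listeners() {
  grep -E '^[[:space:]]*(listeners|advertised\.listeners)[[:space:]]*=' "$1"
}

# Example call (path is an assumption):
# show_listeners /home/kafka/kafka/config/server.properties
```

A line that starts with `#` is not printed, so a missing `advertised.listeners` entry in the output means the broker falls back to the value derived from `listeners`.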


AW: Problem with replication?!

Posted by s....@fluent-software.de.
Hi, 

 

All right, I am running Kafka on a Debian Linux operating system; no Docker images are involved. I can rule out any problems with my producer application, because the following command fails too:

 

/home/kafka/kafka/bin/kafka-topics.sh --list --bootstrap-server 127.0.0.1:9092

Error while executing topic command : org.apache.kafka.common.errors.TimeoutException: Call(callName=listTopics, deadlineMs=1594206614222) timed out at 1594206614223 after 1 attempt(s)

[2020-07-08 13:10:14,227] ERROR java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listTopics, deadlineMs=1594206614222) timed out at 1594206614223 after 1 attempt(s)

        at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)

        at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)

        at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)

        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)

        at kafka.admin.TopicCommand$AdminClientTopicService.getTopics(TopicCommand.scala:333)

        at kafka.admin.TopicCommand$AdminClientTopicService.listTopics(TopicCommand.scala:252)

        at kafka.admin.TopicCommand$.main(TopicCommand.scala:66)

        at kafka.admin.TopicCommand.main(TopicCommand.scala)

Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listTopics, deadlineMs=1594206614222) timed out at 1594206614223 after 1 attempt(s)

Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.

(kafka.admin.TopicCommand$)

 

This means that there is a connection problem on the client side, which supports what you said. I deactivated the firewall (ufw in this case), but that did not help.

Could you please suggest any options for further investigation?

 

Is there any logging that could help?

 

In the Kafka log I found the following exception:

 

[2020-07-08 13:09:33,066] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 0ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:09:39,072] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:09:45,075] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:09:51,079] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 0ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:09:57,085] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:03,088] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:09,092] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:14,239] DEBUG [SocketServer brokerId=1] Connection with /127.0.0.1 disconnected (org.apache.kafka.common.network.Selector)

java.io.EOFException

        at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:97)

        at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:448)

        at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:398)

        at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:678)

        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:580)

        at org.apache.kafka.common.network.Selector.poll(Selector.java:485)

        at kafka.network.Processor.poll(SocketServer.scala:861)

        at kafka.network.Processor.run(SocketServer.scala:760)

        at java.lang.Thread.run(Thread.java:748)

[2020-07-08 13:10:15,096] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:21,099] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:27,104] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:33,108] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:39,112] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:45,115] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:51,120] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:10:57,124] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:03,131] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:09,136] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:15,140] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:21,144] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:27,148] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:33,152] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:39,155] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:45,162] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:51,168] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:11:57,172] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:03,175] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:09,181] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:15,187] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:21,193] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:27,199] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 0ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:33,204] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:39,207] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:45,212] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:51,216] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:12:57,220] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:03,226] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:09,231] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:15,236] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:21,240] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:27,246] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

[2020-07-08 13:13:33,251] DEBUG Got ping response for sessionid: 0x100003f85c80023 after 1ms (org.apache.zookeeper.ClientCnxn)

 

 

Kind regards, 

Sebastian

 

From: Ricardo Ferreira <ri...@riferrei.com>
Sent: Tuesday, July 7, 2020 17:58
To: users@kafka.apache.org; s.schabbach@fluent-software.de
Subject: Re: Problem with replication?!

 

Given the stack trace you've shared below, I can tell that this is not a replication issue. Rather, your producer is unable to write records into the partitions because the brokers that host them are unavailable. I know that they are indeed running, so "unavailable" here means that, from the network perspective, your client app cannot establish a TCP connection with them.

The firewall is something you should look into, as well as SELinux and Docker networking (if this is running on Docker).

Thanks,

-- Ricardo

On 7/6/20 10:37 AM, s.schabbach@fluent-software.de wrote:

Hi there,

I have a problem with my Kafka brokers, maybe a firewall issue, but I don't know. I have three brokers on three different servers (each with its own IP), and ZooKeeper runs on the first server:

Server1:9092 (zookeeper:2182)
Server2:9092
Server3:9092

I also have a topic with a replication factor of three. When I try to publish new messages to that topic, I get the following error:

Confluent.Kafka.ProduceException`2[System.Byte[],System.String]: Local: Message timed out
   at Confluent.Kafka.Producer`2.ProduceAsync(TopicPartition topicPartition, Message`2 message, CancellationToken cancellationToken)
   at FluentSoftware.EventSourcing.KafkaProducerEventPublisher.PublishEvents[TEvent](String topic, IEnumerable`1 eventsToApply)
   at FluentSoftware.EventSourcing.CommandBus.SendAsync[TCommand](TCommand command)
   at Heliprinter.Mediator.Communication.OrderService.Controllers.OrderController.Post(PostOrderViewModel newOrder) in C:\agent\_work\15\s\Source\Services\Heliprinter.Mediator.Communication.OrderService\Controllers\OrderController.cs:line 55

and the REST API responds with a Gateway Timeout. Could anyone tell me:

* Could this error be a replication issue?
* How can I debug this issue?
* Where can I find replication details in the logs?

Kind regards,

Sebastian.


Re: Problem with replication?!

Posted by Ricardo Ferreira <ri...@riferrei.com>.
Given the stack trace you've shared below, I can tell that this *is not a
replication issue*. Rather, your producer is unable to write records into
the partitions because the brokers that host them are unavailable. I know
that they are indeed running, so "unavailable" here means that, from the
network perspective, your client app cannot establish a TCP connection
with them.

The firewall is something you should look into, as well as SELinux and
Docker networking (if this is running on Docker).
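
A quick way to test the TCP part of this from the client machine is a small
reachability check. The sketch below is only an illustration: it assumes the
three brokers from the original mail listen on port 9092, and the host names
in BROKERS are placeholders to be replaced with the real IPs. The helper names
can_connect and check_brokers are made up for this example and are not part
of any Kafka client.

```python
import socket

# Placeholder broker addresses (assumption); replace with the real host names or IPs.
BROKERS = [("Server1", 9092), ("Server2", 9092), ("Server3", 9092)]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_brokers(brokers, timeout=3.0):
    """Map each "host:port" string to whether it accepted a TCP connection."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in brokers}


if __name__ == "__main__":
    for address, ok in check_brokers(BROKERS).items():
        print(f"{address}: {'reachable' if ok else 'NOT reachable'}")
```

Run it on the machine that hosts the producer, not on the brokers themselves.
A reachable port is necessary but may not be sufficient: after the initial
connection, Kafka clients connect to whatever addresses the brokers advertise,
so those addresses need to pass the same check.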

Thanks,

-- Ricardo

On 7/6/20 10:37 AM, s.schabbach@fluent-software.de wrote:
> Hi there,
>
> I have a problem with my Kafka brokers, maybe a firewall issue, but I don't know. I have three brokers on three different servers (each with its own IP), and ZooKeeper runs on the first server:
>
> Server1:9092 (zookeeper:2182)
> Server2:9092
> Server3:9092
>
> I also have a topic with a replication factor of three. When I try to publish new messages to that topic, I get the following error:
>
> Confluent.Kafka.ProduceException`2[System.Byte[],System.String]: Local: Message timed out
>     at Confluent.Kafka.Producer`2.ProduceAsync(TopicPartition topicPartition, Message`2 message, CancellationToken cancellationToken)
>     at FluentSoftware.EventSourcing.KafkaProducerEventPublisher.PublishEvents[TEvent](String topic, IEnumerable`1 eventsToApply)
>     at FluentSoftware.EventSourcing.CommandBus.SendAsync[TCommand](TCommand command)
>     at Heliprinter.Mediator.Communication.OrderService.Controllers.OrderController.Post(PostOrderViewModel newOrder) in C:\agent\_work\15\s\Source\Services\Heliprinter.Mediator.Communication.OrderService\Controllers\OrderController.cs:line 55
>
> and the REST API responds with a Gateway Timeout. Could anyone tell me:
>
> * Could this error be a replication issue?
> * How can I debug this issue?
> * Where can I find replication details in the logs?
>
> Kind regards,
>
> Sebastian.
>