Posted to users@kafka.apache.org by Chieh-Chun Chang <sa...@gmail.com> on 2016/12/14 01:58:14 UTC

Weird message regarding kafka Cluster

Hello Sir:

My name is Chieh-Chun Chang, and we have a problem with our Kafka production
cluster.


Kafka client version
kafka-clients-0.9.0-kafka-2.0.0.jar
Kafka version
kafka_2.10-0.9.0-kafka-2.0.0.jar

Our Kafka broker cluster is experiencing under-replicated partitions, and I
found the following log message:
2016-12-13 07:20:24,393 DEBUG org.apache.kafka.common.network.Selector: Connection with /10.1.205.75 disconnected
java.io.EOFException
    at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
    at kafka.network.Processor.poll(SocketServer.scala:472)
    at kafka.network.Processor.run(SocketServer.scala:412)
    at java.lang.Thread.run(Thread.java:745)
And this is source code



The IP address (10.1.205.75) varies: it might be a random host machine, a YARN
cluster machine, or one of the producer machines. Based on the timestamps,
these disconnects seem to be related to the under-replicated partitions problem.

This is my speculation:
From my research, it seems that 0.9.0 does not keep socket connections alive,
so this might happen.

However, once it happens, it seems to delay the YARN cluster's execution
time, so I am wondering whether this is expected behavior.

Would you mind commenting on why this might happen?

Thank you very much,
Chieh-Chun Chang