Posted to dev@kafka.apache.org by "Ankur C (JIRA)" <ji...@apache.org> on 2016/12/28 18:22:58 UTC
[jira] [Created] (KAFKA-4573) Producer sporadic timeout
Ankur C created KAFKA-4573:
------------------------------
Summary: Producer sporadic timeout
Key: KAFKA-4573
URL: https://issues.apache.org/jira/browse/KAFKA-4573
Project: Kafka
Issue Type: Bug
Reporter: Ankur C
We had a production outage caused by sporadic Kafka producer timeouts. About 1 to 2% of messages kept timing out.
Kafka version - 0.9.0.1
Number of Kafka brokers - 5
Replication factor for each topic - 3
Number of topics - ~30
Number of partitions - ~300
We had Kafka 0.9.0.1 running in our 5-broker cluster for 1 month without any issues. However, on Dec 23rd we saw sporadic Kafka producer timeouts.
The issue began around 6:51am and continued until we bounced the Kafka brokers.
6:51am Under-replication started on a small number of topics
6:53am All under-replication recovered
11:00am We restarted all Kafka producer writer apps, but this didn't solve the sporadic Kafka producer timeout issue
12:01pm We restarted all Kafka brokers, after which the issue was resolved
Kafka metrics and Kafka logs don't show any major issue. There were no offline partitions during the outage and the controller count was exactly 1.
The only exception we saw on the Kafka brokers was the following one in controller.log. It was present on all brokers, 0 to 4.
java.io.IOException: Connection to 2 was disconnected before the response was read
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
    at scala.Option.foreach(Option.scala:236)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
    at kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
    at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
    at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
    at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:180)
    at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)