Posted to users@kafka.apache.org by wanghai <wh...@outlook.com> on 2016/04/21 09:41:38 UTC
kafka 2.8.0-0.8.1.1 too many close_wait
Hello,
After the Kafka cluster has been running for a while, it gets stuck: consumers can't read messages from the cluster.
The Kafka cluster has 5 brokers, with IDs 0, 131, 132, 133, and 134. The Kafka version is 2.8.0-0.8.1.1 (that is, Kafka 0.8.1.1 built for Scala 2.8.0).
Broker 132 has a large number of TCP connections in CLOSE_WAIT, while the other brokers have none. The count keeps growing until it reaches the "unix max open files" limit, and then the broker is killed with a "too many open files" error.
My "unix max open files" limit is 60000, which I think should be enough.
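It may be worth confirming that the running broker process really has the 60000 limit, since a limit set in one shell does not necessarily apply to a process started elsewhere. The following is a minimal sketch in Python, assuming a Linux host that exposes `/proc/<pid>/limits`; the function name `parse_max_open_files` and the sample text are illustrative and not taken from the affected broker.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text."""
    def to_int(value):
        # The kernel prints "unlimited" when no numeric limit applies.
        return None if value == "unlimited" else int(value)

    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: "Max", "open", "files", soft, hard, units
            return to_int(fields[3]), to_int(fields[4])
    return None


# Illustrative sample, not real data from the broker.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            60000                60000                files\n"
)
print(parse_max_open_files(sample))
```

On the broker host, the same function could be fed the contents of `/proc/17193/limits`, using the Java PID that appears in the netstat output.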
tcp     70      0 192.168.10.132:9092    192.168.10.131:34266   CLOSE_WAIT  17193/java
tcp     70      0 192.168.10.132:9092    192.168.10.134:58585   CLOSE_WAIT  17193/java
tcp     70      0 192.168.10.132:9092    192.168.10.134:56025   CLOSE_WAIT  17193/java
tcp     70      0 192.168.10.132:9092    192.168.10.131:50139   CLOSE_WAIT  17193/java
tcp     62      0 192.168.10.132:9092    192.168.10.131:49371   CLOSE_WAIT  17193/java
tcp    253      0 192.168.10.132:9092    192.168.10.130:50909   CLOSE_WAIT  17193/java
tcp     62      0 192.168.10.132:9092    192.168.10.134:50905   CLOSE_WAIT  17193/java
tcp     70      0 192.168.10.132:9092    192.168.10.134:50393   CLOSE_WAIT  17193/java
tcp     72      0 192.168.10.132:9092    192.168.10.130:47837   CLOSE_WAIT  17193/java
tcp     70      0 192.168.10.132:9092    192.168.10.134:47321   CLOSE_WAIT  17193/java
tcp      1      0 192.168.10.132:9092    192.168.10.134:46809   CLOSE_WAIT  17193/java
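To see which peers the CLOSE_WAIT sockets come from, a listing like the one above can be summarized per remote host. This is a minimal sketch in Python, assuming the column order shown in this message (protocol, Recv-Q, Send-Q, local address, remote address, state, PID/program); the function name `count_close_wait` is illustrative, and the sample contains two lines from the listing plus one made-up ESTABLISHED line for contrast.

```python
from collections import Counter


def count_close_wait(netstat_text):
    """Count CLOSE_WAIT sockets per remote host in netstat-style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, remote, state, pid/program
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "CLOSE_WAIT":
            # Strip the port so connections are grouped by host.
            remote_host = fields[4].rsplit(":", 1)[0]
            counts[remote_host] += 1
    return counts


sample = """\
tcp 70 0 192.168.10.132:9092 192.168.10.131:34266 CLOSE_WAIT 17193/java
tcp 70 0 192.168.10.132:9092 192.168.10.134:58585 CLOSE_WAIT 17193/java
tcp 0 0 192.168.10.132:9092 192.168.10.133:40000 ESTABLISHED 17193/java
"""
for host, n in count_close_wait(sample).most_common():
    print(host, n)
```

Running this periodically against fresh netstat output would show whether the growth is tied to one peer or spread across all of them.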
The logs from broker 132:
[2016-04-20 01:09:48,736] INFO Closing socket connection to /192.168.10.130. (kafka.network.Processor)
[2016-04-20 01:09:49,332] INFO Closing socket connection to /192.168.10.130. (kafka.network.Processor)
[2016-04-20 01:09:51,523] ERROR Closing socket for /192.168.10.133 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at kafka.utils.Utils$.read(Utils.scala:375)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Processor.read(SocketServer.scala:347)
    at kafka.network.Processor.run(SocketServer.scala:245)
    at java.lang.Thread.run(Thread.java:619)
[2016-04-20 01:09:54,023] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:56,285] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:56,968] ERROR Closing socket for /192.168.10.133 because of error (kafka.network.Processor)
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:67)
    at kafka.network.MultiSend.writeTo(Transmission.scala:102)
    at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:124)
    at kafka.network.MultiSend.writeTo(Transmission.scala:102)
    at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:219)
    at kafka.network.Processor.write(SocketServer.scala:375)
    at kafka.network.Processor.run(SocketServer.scala:247)
    at java.lang.Thread.run(Thread.java:619)
[2016-04-20 01:09:56,971] INFO Closing socket connection to /192.168.10.130. (kafka.network.Processor)
[2016-04-20 01:09:57,328] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:57,682] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:57,683] ERROR Closing socket for /192.168.10.131 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at kafka.utils.Utils$.read(Utils.scala:375)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Processor.read(SocketServer.scala:347)
    at kafka.network.Processor.run(SocketServer.scala:245)
    at java.lang.Thread.run(Thread.java:619)
[2016-04-20 01:09:57,748] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:57,921] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:58,099] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:58,116] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:58,163] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:58,442] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:58,541] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:58,542] INFO Closing socket connection to /192.168.10.130. (kafka.network.Processor)
[2016-04-20 01:09:58,740] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:58,740] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:58,915] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:58,915] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:58,916] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:58,980] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:58,980] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:58,980] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:59,115] ERROR Closing socket for /192.168.10.133 because of error (kafka.network.Processor)
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:67)
    at kafka.network.MultiSend.writeTo(Transmission.scala:102)
    at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:124)
    at kafka.network.MultiSend.writeTo(Transmission.scala:102)
    at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:219)
    at kafka.network.Processor.write(SocketServer.scala:375)
    at kafka.network.Processor.run(SocketServer.scala:247)
    at java.lang.Thread.run(Thread.java:619)
[2016-04-20 01:09:59,115] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:59,115] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:09:59,329] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:59,329] INFO Closing socket connection to /192.168.10.134. (kafka.network.Processor)
[2016-04-20 01:09:59,329] INFO Closing socket connection to /192.168.10.133. (kafka.network.Processor)
[2016-04-20 01:09:59,332] INFO Closing socket connection to /192.168.10.131. (kafka.network.Processor)
[2016-04-20 01:13:43,821] INFO Partition [realtime_hardware,6] on broker 132: Shrinking ISR for partition [realtime_hardware,6] from 132,134,131 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,822] INFO Partition [realtime_hardware_meta,9] on broker 132: Shrinking ISR for partition [realtime_hardware_meta,9] from 132,133,131 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,823] INFO Partition [realtime_expansion,5] on broker 132: Shrinking ISR for partition [realtime_expansion,5] from 132,133 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,824] INFO Partition [realtime_capacity,11] on broker 132: Shrinking ISR for partition [realtime_capacity,11] from 132,134,131 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,825] INFO Partition [nginx_log,14] on broker 132: Shrinking ISR for partition [nginx_log,14] from 132,133,131 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,825] INFO Partition [nginx_log,8] on broker 132: Shrinking ISR for partition [nginx_log,8] from 132,133,131 to 132 (kafka.cluster.Partition)
[2016-04-20 01:13:43,826] INFO Partition [realtime_heartbeat,12] on broker 132: Shrinking ISR for partition [realtime_heartbeat,12] from 132,134,131 to 132 (kafka.cluster.Partition)
So I removed broker 132 and restarted the Kafka cluster. After 24 hours the problem appeared again, this time on broker 131.
I don't know what to do. Please help me.
Best wishes!
Re: kafka 2.8.0-0.8.1.1 too many close_wait
Posted by Manikumar Reddy <ma...@gmail.com>.
We fixed similar issues in the 0.8.2.0 release. You should consider
moving to the latest release.