Posted to users@kafka.apache.org by Michael Sparr <mi...@goomzee.com> on 2016/09/12 16:33:51 UTC

Too many open files

5-node Kafka cluster on bare metal: Ubuntu 14.04.x LTS boxes with 64GB RAM, 8 cores, and 960GB SSDs. A single node in the cluster is filling its logs with the following:

[2016-09-12 09:34:49,522] ERROR Error while accepting connection (kafka.network.Acceptor)
java.io.IOException: Too many open files
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
	at kafka.network.Acceptor.accept(SocketServer.scala:323)
	at kafka.network.Acceptor.run(SocketServer.scala:268)
	at java.lang.Thread.run(Thread.java:745)

No other node in the cluster has this issue. A separate application server runs consumers/producers using librdkafka + the Confluent Kafka Python library, with a few million messages published to fewer than 100 topics.

For days now the /var/log/kafka/kafka.server.log.N files have been filling up with this message and consuming all disk space, but only on this single node in the cluster. I have soft and hard file-descriptor limits at 65,535 for all users, and ulimit -n confirms 65535.
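
For what it's worth, the limit the running broker itself sees can be checked directly, since a service started at boot does not always pick up the same limits as a login shell. A quick sketch (not something I have run against this setup; <broker-pid> is a placeholder for the broker's actual PID):

from pathlib import Path

pid = "<broker-pid>"  # placeholder; substitute the broker's actual PID

# "Max open files" here is the limit the process runs under, which can
# differ from what ulimit -n reports in an interactive shell.
for line in Path(f"/proc/{pid}/limits").read_text().splitlines():
    if "Max open files" in line:
        print(line)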

Is there a librdkafka setting I should add to the Python producer clients to close socket connections even sooner and avoid this, or is something else going on?
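
For context, this is roughly the pattern I'm aiming for in the clients (a simplified, hypothetical sketch rather than the actual application code, with placeholder broker addresses and tuning values): one long-lived Producer per process, reused for every message, so each client should hold only a single connection per broker.

from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'broker1:9092,broker2:9092',  # placeholder broker list
    'socket.keepalive.enable': True,  # librdkafka: enable TCP keep-alives on broker sockets
    'socket.timeout.ms': 60000,       # librdkafka: network request timeout
})

def send(topic, payload):
    producer.produce(topic, payload)
    producer.poll(0)  # serve delivery callbacks without blocking

# On shutdown, wait for any outstanding messages to be delivered.
producer.flush()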

Should I file this as an issue in a GitHub repo, and if so, which project?


Thanks!


Re: Too many open files

Posted by Jaikiran Pai <ja...@gmail.com>.
What does the output of:

lsof -p <broker-pid>

show on that specific node?
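
If lsof is awkward to parse, a rough equivalent is to group the broker's descriptors by what they point at, which quickly shows whether the growth is client sockets or log segment files. Just a sketch, assuming you can read /proc for that PID (as the kafka user or root); <broker-pid> is again a placeholder:

import os
from collections import Counter

pid = "<broker-pid>"  # placeholder for the broker's PID
kinds = Counter()
for fd in os.listdir(f"/proc/{pid}/fd"):
    try:
        target = os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        continue  # descriptor closed while we were iterating
    if target.startswith("socket:"):
        kinds["socket"] += 1
    elif target.endswith((".log", ".index")):
        kinds["segment/index file"] += 1
    else:
        kinds["other"] += 1
print(kinds)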

-Jaikiran

On Monday 12 September 2016 10:03 PM, Michael Sparr wrote:
> 5-node Kafka cluster on bare metal: Ubuntu 14.04.x LTS boxes with 64GB RAM, 8 cores, and 960GB SSDs. A single node in the cluster is filling its logs with the following:
>
> [2016-09-12 09:34:49,522] ERROR Error while accepting connection (kafka.network.Acceptor)
> java.io.IOException: Too many open files
> 	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> 	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
> 	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
> 	at kafka.network.Acceptor.accept(SocketServer.scala:323)
> 	at kafka.network.Acceptor.run(SocketServer.scala:268)
> 	at java.lang.Thread.run(Thread.java:745)
>
> No other node in the cluster has this issue. A separate application server runs consumers/producers using librdkafka + the Confluent Kafka Python library, with a few million messages published to fewer than 100 topics.
>
> For days now the /var/log/kafka/kafka.server.log.N files have been filling up with this message and consuming all disk space, but only on this single node in the cluster. I have soft and hard file-descriptor limits at 65,535 for all users, and ulimit -n confirms 65535.
>
> Is there a librdkafka setting I should add to the Python producer clients to close socket connections even sooner and avoid this, or is something else going on?
>
> Should I file this as an issue in a GitHub repo, and if so, which project?
>
>
> Thanks!
>
>

