Posted to users@activemq.apache.org by Jerry Cwiklik <cw...@us.ibm.com> on 2013/08/19 21:51:00 UTC
Broker leaks FDs - Too many open files
Our production broker (v.5.6.0) keeps dying while in heavy use. The broker
log is filled with:
2013-07-28 00:04:08,264 [teTimeout=45000] ERROR TransportConnector
- Could not accept connection : java.net.SocketException: Too many open
files
2013-07-28 00:04:08,264 [teTimeout=45000] ERROR TransportConnector
- Could not accept connection : java.net.SocketException: Too many open
files
2013-07-28 00:04:08,264 [teTimeout=45000] ERROR TransportConnector
- Could not accept connection : java.net.SocketException: Too many open
files
This is logged at such a rapid rate that the logs roll and hide the initial
error/warning. We capture the open FD count of the broker's process and
notice that when the broker starts to croak, the count just explodes. Here is
part of the open FD log. The first column shows the broker's open FDs, and
one line is logged every 60 secs.
1284 12569 160194 -- normal count
1294 12669 161438
1305 12779 162812
1318 12909 164426
1328 13009 165658 --------- FD explosion
1393 13659 173816
1528 15009 190748
1611 15839 201152
1701 16739 212419
1951 19239 243520
2310 22830 290374
2667 26399 332362
3013 29859 375262
3369 33422 422638
3729 37019 464017
4111 40841 515342
4484 44570 561933
4870 48432 609992
5249 52219 652157
5634 56071 705356
6019 59919 747457
6484 64571 811476
6892 68652 862375
7307 72802 914122
7727 77002 966555
8129 81022 1016717
8336 83090 1042601
8336 83090 1042584
8336 83090 1042583
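A log like the one above could be produced and checked with a small script. The following is only a sketch, not the tooling actually used here: it assumes a Linux /proc filesystem, and the function names and the threshold of 50 FDs per interval are made up for illustration.

```python
import os


def count_open_fds(pid):
    """Count open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def first_spike(samples, max_delta):
    """Return the index of the first sample that grew by more than
    max_delta over the previous sample, or None if there is none."""
    for i in range(1, len(samples)):
        if samples[i] - samples[i - 1] > max_delta:
            return i
    return None


# First-column values from the log above, up to the start of the explosion.
samples = [1284, 1294, 1305, 1318, 1328, 1393, 1528, 1611]
spike = first_spike(samples, max_delta=50)  # hypothetical threshold
```

In a monitoring loop, count_open_fds(broker_pid) would be called once per interval and its result appended to samples; with the values above, the first flagged sample is the jump from 1328 to 1393.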
It normally shows ~1300 FDs and this is more or less constant over time, but
eventually it rapidly increases to 8336 and the broker becomes unusable. The
ulimit is set to 4094. netstat shows a ton of sockets in CLOSE_WAIT,
suggesting that the broker is not closing its side of the socket.
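To put a number on the CLOSE_WAIT observation, the netstat output can be tallied by state. This is a rough sketch that assumes `netstat -tan` style output with the state as the last field of each tcp line; the sample lines (addresses, and port 61616 as the connector port) are invented for illustration and are not from our broker.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally TCP connection states from `netstat -tan` style output.

    Assumes the state is the last whitespace-separated field on each
    line that starts with "tcp".
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Made-up sample lines for illustration only (not taken from the thread).
sample = """\
tcp        0      0 10.0.0.5:61616     10.0.0.9:51234     ESTABLISHED
tcp        1      0 10.0.0.5:61616     10.0.0.9:51240     CLOSE_WAIT
tcp        1      0 10.0.0.5:61616     10.0.0.7:40022     CLOSE_WAIT
"""
states = count_tcp_states(sample)
```

Logging states["CLOSE_WAIT"] next to the FD count would show whether the two grow together.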
I found a related open issue:
https://issues.apache.org/jira/browse/AMQ-4531?page=com.atlassian.jirafisheyeplugin:fisheye-issuepanel
This Jira states that the problem surfaces in 5.8.0 and when
maximumConnections is set. We don't use this setting and we run an older
version of AMQ. Any ideas how to deal with this? Would closeAsync=false have
any effect?
JC
--
View this message in context: http://activemq.2283324.n4.nabble.com/Broker-leaks-FDs-Too-many-open-files-tp4670496.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Broker leaks FDs - Too many open files
Posted by DV <vi...@gmail.com>.
We've had a similar issue in the past, which went away with the following
changes:
* adding closeAsync=false to transportConnectors
* using nio instead of tcp in transportConnectors
* setting ulimits to unlimited for activemq user
* fine-tuning kahaDB settings (enableIndexWriteAsync="true"
enableJournalDiskSyncs="false" journalMaxFileLength="256mb")
* fine-tuning topic and queue settings (destinationPolicy)
* enabling producer flow control
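As a rough illustration of the first two changes, the connector URI might be assembled like this. It is a sketch only: the host, port and helper name are placeholders, the soWriteTimeout value is the one quoted in this thread, and the option names should be checked against the documentation for your broker version.

```python
def connector_uri(scheme, host, port, options):
    """Build a transport connector URI with "transport."-prefixed options."""
    query = "&".join(f"transport.{key}={value}" for key, value in options.items())
    return f"{scheme}://{host}:{port}?{query}"


# Host and port are placeholders; soWriteTimeout is the value from the thread.
uri = connector_uri("nio", "0.0.0.0", 61616,
                    {"closeAsync": "false", "soWriteTimeout": 45000})
```

The resulting string would go into the uri attribute of a transportConnector in activemq.xml, where each & has to be written as &amp;.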
However, all this fine-tuning can only do so much, so ultimately we had to:
* reduce broker usage by splitting datasets onto multiple brokers
* optimize consumers to reduce the length of time a message spends on the
broker
The fewer messages the broker has to hold on to, the less likely you'll run
into some sort of limit.
On Tue, Aug 20, 2013 at 9:04 AM, Jerry Cwiklik <cw...@us.ibm.com> wrote:
> Thanks, Paul. We are running on Linux (SLES). All clients use openwire. The
> broker is configured
> with producerFlowControl="false", optimizedDispatch="true" for both queues
> and topics.
>
> The OpenWire connector is configured with transport.soWriteTimeout=45000. We
> don't use persistence for messaging. The broker's JVM is given 8 GB.
>
> JC
--
Best regards, Dmitriy V.
Re: Broker leaks FDs - Too many open files
Posted by Pietro Romanazzi <p....@innovapuglia.it>.
Set nofile in /etc/security/limits.conf.
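To see which limit a process actually runs with, something like the following can help. It is a sketch under assumptions: it reports the limits of the process that runs it (not of the broker), and the headroom helper is a made-up name.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def headroom(open_fds, soft_limit):
    """Fraction of the soft limit still unused (negative if exceeded)."""
    if soft_limit == resource.RLIM_INFINITY:
        return 1.0  # unlimited: treat as full headroom
    return 1.0 - open_fds / soft_limit


soft, hard = nofile_limits()
```

With the figures from this thread, headroom(1300, 4094) is roughly 0.68 in normal operation. Limits set in limits.conf apply to new sessions, so the broker has to be restarted from a fresh login to pick them up.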
regards,
Re: Broker leaks FDs - Too many open files
Posted by Jerry Cwiklik <cw...@us.ibm.com>.
Thanks, Paul. We are running on Linux (SLES). All clients use OpenWire. The
broker is configured
with producerFlowControl="false", optimizedDispatch="true" for both queues
and topics.
The OpenWire connector is configured with transport.soWriteTimeout=45000. We
don't use persistence for messaging. The broker's JVM is given 8 GB.
JC
Re: Broker leaks FDs - Too many open files
Posted by Paul Gale <pa...@gmail.com>.
On Mon, Aug 19, 2013 at 4:57 PM, Jerry Cwiklik <cw...@us.ibm.com> wrote:
> What are the consequences of using closeAsync="false"?
Setting closeAsync to false means that the socket close call is blocking and
is not handled in a separate thread.
This is preferable and common in web applications where STOMP clients, say,
open a connection in the context of a web request, send a message, then
close the connection. Depending on the number of web requests being handled,
this translates into a lot of connect/disconnect traffic for the broker.
It can take up to 30 seconds (on Linux systems, unless otherwise configured)
for the descriptor of a 'closed' socket to be made available for a new
socket. If the close is asynchronous then, as you've seen, socket open
requests are made at a faster rate than closed sockets can be recycled.
Making the close operation synchronous forces the client to block until it
completes, which throttles the rate of open requests and keeps the
descriptor count manageable.
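The difference between the two modes can be sketched as follows. This is a simplified Python analogy and not ActiveMQ's code: the function names are made up, and a socket pair stands in for broker connections.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def close_sync(sock):
    """Close in the calling thread; returns only after close() has run."""
    sock.close()


def close_async(sock, executor):
    """Hand the close to a worker thread and return a Future immediately."""
    return executor.submit(sock.close)


a, b = socket.socketpair()
close_sync(a)                       # descriptor released before we continue
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = close_async(b, pool)  # caller is free to accept new sockets
    pending.result()                # waited on here only to end the demo cleanly
```

In the asynchronous case nothing stops the caller from accepting more connections while closes are still queued, which is how opens can outpace closes under heavy connect/disconnect traffic.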
What OS are you running your broker on? Please give more detail about your
clients, e.g., are they STOMP based, etc.? If they're STOMP based you might
want to consider configuring the STOMP transport connector to run over
NIO for greater efficiency.
Thanks,
Paul
Re: Broker leaks FDs - Too many open files
Posted by Jerry Cwiklik <cw...@us.ibm.com>.
Christian, thanks. More questions:
What would be your theory why we see the "explosion" of open FDs? This is
triggered by some event. Any clues as to what that might be?
Also, isn't it a bug that the broker just goes berserk logging the same thing
over and over? Our broker's logs are filled with the error below. I suspect
that perhaps Camel or Spring, which are used on the client side, are in some
recovery mode trying to establish a connection to the broker, which keeps
failing.
2013-07-28 00:04:08,264 [teTimeout=45000] ERROR TransportConnector
- Could not accept connection : java.net.SocketException: Too many open
files
..
What are the consequences of using closeAsync="false"?
JC
Re: Broker leaks FDs - Too many open files
Posted by Christian Posta <ch...@gmail.com>.
Yah, give that a try (as seen here
https://issues.apache.org/jira/browse/AMQ-1739).
Could also have a look at
http://activemq.apache.org/maven/apidocs/org/apache/activemq/transport/WriteTimeoutFilter.html
per this jira
https://issues.apache.org/jira/browse/AMQ-1993
--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta