Posted to issues@activemq.apache.org by "Tomas Pavelka (JIRA)" <ji...@apache.org> on 2018/04/17 13:46:00 UTC

[jira] [Commented] (AMQ-6937) Recycling TCP/IP stack on z/OS causes an infinite error loop in transport server

    [ https://issues.apache.org/jira/browse/AMQ-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440868#comment-16440868 ] 

Tomas Pavelka commented on AMQ-6937:
------------------------------------

I have discovered that the CPU spin loop can also happen on Linux: whenever the process runs out of file descriptors, accept fails repeatedly and the loop spins the CPU at 100%. I have attached a patch that slows down exception handling in such cases.
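
For illustration only, here is a minimal sketch of the general idea of slowing down a failing accept loop. It is not the attached patch and not ActiveMQ code: the class name ThrottledAcceptLoop, the delay constants, and the handle method are hypothetical, and only plain java.net calls are used. After each consecutive accept failure the loop sleeps for an exponentially growing, capped delay, and the counter resets on the next successful accept.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration; not taken from ActiveMQ or from the attached patch.
class ThrottledAcceptLoop {
    // Assumed values, chosen only for the example.
    static final long INITIAL_DELAY_MS = 100;
    static final long MAX_DELAY_MS = 5_000;

    /** Delay to wait after the given number of consecutive accept failures. */
    static long backoffMillis(int consecutiveFailures) {
        if (consecutiveFailures <= 0) {
            return 0;
        }
        // Cap the shift so the multiplication cannot overflow.
        int shift = Math.min(consecutiveFailures - 1, 20);
        return Math.min(MAX_DELAY_MS, INITIAL_DELAY_MS << shift);
    }

    /** Accepts connections until the server socket is closed. */
    static void acceptLoop(ServerSocket server) throws InterruptedException {
        int failures = 0;
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();
                failures = 0; // a successful accept resets the backoff
                handle(client);
            } catch (SocketTimeoutException e) {
                // Only possible if SO_TIMEOUT is set; not an error, just retry.
            } catch (IOException e) {
                if (server.isClosed()) {
                    break; // normal shutdown
                }
                failures++;
                long delay = backoffMillis(failures);
                System.err.println("accept failed (" + failures + " in a row), retrying in "
                        + delay + " ms: " + e);
                Thread.sleep(delay); // keeps the loop from spinning the CPU
            }
        }
    }

    /** Placeholder for real connection handling. */
    static void handle(Socket client) {
        try {
            client.close();
        } catch (IOException ignored) {
            // nothing useful to do in this sketch
        }
    }
}
```

With these assumed constants, a persistent failure is retried after 100 ms, 200 ms, 400 ms and so on, up to at most once every 5 seconds, which also limits how fast the log grows.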

> Recycling TCP/IP stack on z/OS causes an infinite error loop in transport server
> --------------------------------------------------------------------------------
>
>                 Key: AMQ-6937
>                 URL: https://issues.apache.org/jira/browse/AMQ-6937
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.15.3
>            Reporter: Tomas Pavelka
>            Priority: Major
>         Attachments: AMQ-6937-cpu-spin-prevention.patch
>
>
> The ActiveMQ transport servers (e.g. TcpTransportServer) run the socket accept (java.net.ServerSocket#accept) in an infinite loop. The accept call can fail repeatedly with an exception, spinning the CPU at full speed and quickly filling up the logs.
> Here is an example of an exception that gets repeated indefinitely:
> java.net.SocketException: EDC5122I Input/output error. (Accept failed)
>     at java.net.ServerSocket.implAccept(ServerSocket.java:623)
>     at java.net.ServerSocket.accept(ServerSocket.java:582)
>     at org.apache.activemq.transport.tcp.TcpTransportServer.run(TcpTransportServer.java:351)
>     at java.lang.Thread.run(Thread.java:785)
> This is a common problem on z/OS because the pattern of running accept in a loop is used in many open source projects. For example, here is the same issue in Derby:
> https://issues.apache.org/jira/browse/DERBY-5347
> And here in Jetty:
> [https://github.com/eclipse/jetty.project/issues/283]
> Whenever the problem appears, the socket becomes unusable. Would it be possible for ActiveMQ to allow plugging in a custom org.apache.activemq.transport.TransportAcceptListener that would detect the problem and re-bind the socket?
>  
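
To make the re-bind idea from the description concrete, here is a rough sketch using only plain java.net classes. It does not use or assume any ActiveMQ API: the class ListenerRebind and its methods are hypothetical, and how such logic would be wired into a transport server is left open. The sketch closes the unusable listening socket and opens a fresh one on the same port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; not part of ActiveMQ.
class ListenerRebind {

    /** Opens a listening socket on the given port (0 picks a free port). */
    static ServerSocket open(int port) {
        ServerSocket socket = null;
        try {
            socket = new ServerSocket();   // unbound, so options can be set first
            socket.setReuseAddress(true);  // allow a quick re-bind to the same port
            socket.bind(new InetSocketAddress(port));
            return socket;
        } catch (IOException e) {
            if (socket != null) {
                try {
                    socket.close();        // do not leak the descriptor on failure
                } catch (IOException ignored) {
                    // the original failure is the one worth reporting
                }
            }
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the broken listener and returns a new one bound to the same port. */
    static ServerSocket rebind(ServerSocket broken) {
        int port = broken.getLocalPort();  // remember the port before closing
        try {
            broken.close();
        } catch (IOException ignored) {
            // the socket is already unusable; continue with the re-bind
        }
        return open(port);
    }
}
```

A caller would decide when to re-bind, for example after several consecutive accept failures, and would need to handle the case where the re-bind itself fails because the TCP/IP stack is still down.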



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)