Posted to dev@qpid.apache.org by "Alex Rudyy (JIRA)" <ji...@apache.org> on 2019/01/31 15:34:00 UTC

[jira] [Created] (QPID-8276) [Broker-J] Broker can leak closed NonBlockingConnection objects and eventually run out of heap memory

Alex Rudyy created QPID-8276:
--------------------------------

             Summary: [Broker-J] Broker can leak closed NonBlockingConnection objects and eventually run out of heap memory
                 Key: QPID-8276
                 URL: https://issues.apache.org/jira/browse/QPID-8276
             Project: Qpid
          Issue Type: Improvement
          Components: Broker-J
    Affects Versions: qpid-java-broker-7.0.6, qpid-java-broker-7.0.5, qpid-java-broker-7.0.4, qpid-java-broker-7.1.0, qpid-java-6.1.7, qpid-java-broker-7.0.1, qpid-java-broker-7.0.0, qpid-java-broker-7.0.2, qpid-java-broker-7.0.3
            Reporter: Alex Rudyy
             Fix For: qpid-java-broker-7.0.7, qpid-java-broker-7.1.1


The Qpid Broker-J can leak closed NonBlockingConnection objects.

Heap dump analysis of an impacted broker instance revealed that the leaked {{NonBlockingConnection}} objects accumulate in {{SelectorThread.SelectionTask#_unscheduledConnections}} belonging to the AMQP port IO pool. They have no ticker set and no state changed flag set ({{NonBlockingConnection#isStateChanged() == false}}). As a result, the {{NonBlockingConnection}} objects are not removed from {{SelectorThread.SelectionTask#_unscheduledConnections}} on invocation of {{SelectorThread.SelectionTask#processUnscheduledConnections()}}, which is called from {{SelectorThread.SelectionTask#performSelect()}}.
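
The retention behaviour described above can be illustrated with a small, self-contained model. This is only a sketch and not the Broker-J source: {{ModelConnection}} and {{UnscheduledConnectionsModel}} are hypothetical stand-ins, and the removal rule (a connection leaves the collection only when a ticker is due or its state changed flag is set) is an assumption drawn from the heap dump observations.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for NonBlockingConnection; only the fields
// relevant to the observed leak are modelled.
final class ModelConnection {
    final String name;
    final boolean closed;
    final boolean stateChanged;
    final boolean tickerDue;

    ModelConnection(String name, boolean closed, boolean stateChanged, boolean tickerDue) {
        this.name = name;
        this.closed = closed;
        this.stateChanged = stateChanged;
        this.tickerDue = tickerDue;
    }
}

// Hypothetical model of the collection of unscheduled connections.
final class UnscheduledConnectionsModel {
    private final Set<ModelConnection> unscheduled = new LinkedHashSet<>();

    void add(ModelConnection connection) {
        unscheduled.add(connection);
    }

    int size() {
        return unscheduled.size();
    }

    // Assumed rule: a connection is handed back for scheduling (and removed)
    // only if a ticker is due or its state changed flag is set.
    List<ModelConnection> processUnscheduledConnections() {
        List<ModelConnection> toBeScheduled = new ArrayList<>();
        Iterator<ModelConnection> iterator = unscheduled.iterator();
        while (iterator.hasNext()) {
            ModelConnection connection = iterator.next();
            if (connection.tickerDue || connection.stateChanged) {
                toBeScheduled.add(connection);
                iterator.remove();
            }
        }
        return toBeScheduled;
    }

    public static void main(String[] args) {
        UnscheduledConnectionsModel model = new UnscheduledConnectionsModel();
        // Closed connection with no ticker due and no state change: stays forever.
        model.add(new ModelConnection("leaked", true, false, false));
        // Connection with a pending state change: gets scheduled and removed.
        model.add(new ModelConnection("active", false, true, false));

        List<ModelConnection> scheduled = model.processUnscheduledConnections();
        System.out.println("scheduled=" + scheduled.size() + " retained=" + model.size());
    }
}
```

In this model, repeated calls to {{processUnscheduledConnections()}} never release the closed connection, which matches the accumulation seen in the heap dump.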

The {{NonBlockingConnection}} and the underlying model object are in the closed state.
 It seems that the leaked {{NonBlockingConnection}} was closed as part of the invocation of {{NonBlockingConnection#doWork()}}. The connection was unregistered from the {{VirtualHost}} IO pool and re-registered with the port IO pool as part of the invocation of {{NetworkConnectionScheduler#processConnection}}. At first, it was stored in the collection {{SelectorThread.SelectionTask#_unregisteredConnections}}. Later on, it was moved from {{SelectorThread.SelectionTask#_unregisteredConnections}} to {{SelectorThread.SelectionTask#_unscheduledConnections}} as part of the invocation of {{SelectorThread.SelectionTask#reregisterUnregisteredConnections}} and got stuck there afterwards.

TLS transport was used by the leaked connection, but I think that a connection using plain transport can be leaked as well.

I suspect that the connections were leaked as a result of the following scenario:
 * An invocation of {{SocketChannel#read(java.nio.ByteBuffer[])}} returned {{-1}} in {{NonBlockingConnection#readFromNetwork}}.
 * The flag {{NonBlockingConnection#_closed}} was set to {{true}}. The method {{ProtocolEngine#notifyWork()}} was not invoked to set the {{state changed}} flag to {{true}}.
 * The execution of {{NonBlockingConnection#doWork()}} ended up in connection shutdown (due to {{NonBlockingConnection#_closed}} being set), followed by re-scheduling of the connection on the port IO scheduler. The latter resulted in the connection being put into {{SelectorThread.SelectionTask#_unscheduledConnections}} as described above.
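
The suspected scenario can be reduced to the following sketch. It is a simplified model under stated assumptions and not the actual {{NonBlockingConnection}} code: the class {{EofCloseModel}} and the method {{canLeaveUnscheduled}} are hypothetical, and the flag handling simply follows the steps listed above.

```java
// Hypothetical model of the connection-side flags in the suspected scenario.
final class EofCloseModel {
    private boolean closed;
    private boolean stateChanged;

    // Models the read path: a read result of -1 (end of stream) marks the
    // connection as closed. Per the suspected scenario, notifyWork() is
    // NOT invoked on this path, so the state changed flag stays false.
    void readFromNetwork(long readResult) {
        if (readResult == -1) {
            closed = true;
        }
    }

    // Models the effect of ProtocolEngine#notifyWork() as described in the
    // report: it sets the state changed flag.
    void notifyWork() {
        stateChanged = true;
    }

    boolean isClosed() {
        return closed;
    }

    boolean isStateChanged() {
        return stateChanged;
    }

    // Assumed removal rule for the unscheduled collection: a connection
    // leaves it only when a ticker is due or its state has changed.
    boolean canLeaveUnscheduled(boolean tickerDue) {
        return tickerDue || stateChanged;
    }

    public static void main(String[] args) {
        EofCloseModel connection = new EofCloseModel();
        connection.readFromNetwork(-1); // peer closed the socket
        System.out.println("closed=" + connection.isClosed()
                + " stateChanged=" + connection.isStateChanged()
                + " canLeaveUnscheduled=" + connection.canLeaveUnscheduled(false));
    }
}
```

With no ticker due and the state changed flag unset, the modelled connection is closed but can never leave the unscheduled collection, which is the combination observed in the heap dump.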

It seems that frequently opening and closing connections with a connection life span of {{>10s}} (required for tickers to be removed) can end up in connections being leaked as described in the scenario above. It looks like connections which are closed in an orderly manner, or closed as a result of an {{IOException}} being thrown from a socket read/write operation, are not affected by the defect.

The impacted broker instance can eventually crash with an out-of-memory error. Broker memory monitoring and periodic broker restarts can mitigate the issue.
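
As one possible form of memory monitoring, the sketch below reads heap usage through the standard {{java.lang.management}} API. The class name {{HeapUsageCheck}} and the 80% threshold are hypothetical choices for illustration. The check reports on the JVM it runs in, so monitoring a broker this way assumes the code runs inside the broker JVM or that equivalent values are read over remote JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: flags when heap usage crosses a chosen fraction of the maximum.
final class HeapUsageCheck {
    // Hypothetical alert threshold (80% of the maximum heap).
    static final double ALERT_FRACTION = 0.8;

    static boolean isAboveThreshold(long usedBytes, long maxBytes, double fraction) {
        if (maxBytes <= 0) {
            // MemoryUsage#getMax() returns -1 when the maximum is undefined.
            return false;
        }
        return (double) usedBytes / maxBytes >= fraction;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        boolean alert = isAboveThreshold(heap.getUsed(), heap.getMax(), ALERT_FRACTION);
        System.out.println("heap used=" + heap.getUsed()
                + " max=" + heap.getMax() + " alert=" + alert);
    }
}
```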


