You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Alex Rudyy (Jira)" <ji...@apache.org> on 2021/05/25 16:12:00 UTC

[jira] [Updated] (QPID-8514) High CPU usage and stucked connections

     [ https://issues.apache.org/jira/browse/QPID-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Rudyy updated QPID-8514:
-----------------------------
    Fix Version/s: qpid-java-broker-8.0.5

> High CPU usage and stucked connections
> --------------------------------------
>
>                 Key: QPID-8514
>                 URL: https://issues.apache.org/jira/browse/QPID-8514
>             Project: Qpid
>          Issue Type: Bug
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-8.0.4
>            Reporter: Daniil Kirilyuk
>            Assignee: Alex Rudyy
>            Priority: Minor
>             Fix For: qpid-java-broker-8.0.5
>
>         Attachments: QPID-8514 - broker-log.zip, RingPolicyConcurrencyTest.java, threaddump.zip
>
>
> There were around 600 connections in state CLOSE__WAIT: 
> {noformat}
> [prod01@fdmchd log]$ netstat -Ainet | grep 22101 | grep TIME_WAIT | wc -l
> 38
> [prod01@fdmchd log]$ netstat -Ainet | grep 22101 | grep CLOSE_WAIT | wc -l
> 625
> [prod01@fdmchd log]$ netstat -Ainet | grep 22101 | grep -v CLOSE_WAIT | grep -V TIME_WAIT | wc -l
> 7
> {noformat}
> CPU usage raises up to 90% and higher.
> Connections couldn't be closed by webgui. Broker log contains message below, but connections were not closed.
> {noformat}
> 2021-03-26 15:12:28,474 INFO  [Broker-Config] (q.m.c.model_delete) - [mng:3hpU+miq(admin@/127.0.0.1:62743)] [con:751(TEST@/127.0.0.1:40422/default)] CON-1007 : Connection close initiated by operator
> {noformat}
>  Thread dumps shows around 49 threads with same stacktrace related to ring queue
> {noformat}
> java.lang.Thread.State: RUNNABLE
> 	at org.apache.qpid.server.queue.QueueEntryImpl.addStateChangeListener(QueueEntryImpl.java:674)
> 	at org.apache.qpid.server.queue.QueueEntryImpl.acquireOrSteal(QueueEntryImpl.java:316)
> 	at org.apache.qpid.server.queue.AbstractQueue.deleteEntry(AbstractQueue.java:1913)
> 	at org.apache.qpid.server.queue.AbstractQueue.deleteEntry(AbstractQueue.java:1863)
> 	at org.apache.qpid.server.queue.RingOverflowPolicyHandler$Handler.checkOverflow(RingOverflowPolicyHandler.java:102)
> 	at org.apache.qpid.server.queue.RingOverflowPolicyHandler$Handler.access$000(RingOverflowPolicyHandler.java:44)
> 	at org.apache.qpid.server.queue.RingOverflowPolicyHandler.checkOverflow(RingOverflowPolicyHandler.java:41)
> 	at org.apache.qpid.server.queue.AbstractQueue.doEnqueue(AbstractQueue.java:1382)
> 	at org.apache.qpid.server.queue.AbstractQueue.enqueue(AbstractQueue.java:1290)
> 	at org.apache.qpid.server.message.RoutingResult$1.postCommit(RoutingResult.java:135)
> 	at org.apache.qpid.server.txn.AsyncAutoCommitTransaction$3.postCommit(AsyncAutoCommitTransaction.java:309)
> 	at org.apache.qpid.server.txn.AsyncCommand.complete(AsyncCommand.java:84)
> 	at org.apache.qpid.server.protocol.v1_0.StandardReceivingLinkEndpoint.receiveComplete(StandardReceivingLinkEndpoint.java:588)
> 	at org.apache.qpid.server.protocol.v1_0.Session_1_0.lambda$receivedComplete$5(Session_1_0.java:1343)
> 	at org.apache.qpid.server.protocol.v1_0.Session_1_0$$Lambda$142/959267941.accept(Unknown Source)
> 	at java.lang.Iterable.forEach(Iterable.java:75)
> 	at org.apache.qpid.server.protocol.v1_0.Session_1_0.receivedComplete(Session_1_0.java:1343)
> 	at org.apache.qpid.server.protocol.v1_0.AMQPConnection_1_0Impl.lambda$receivedComplete$11(AMQPConnection_1_0Impl.java:1349)
> 	at org.apache.qpid.server.protocol.v1_0.AMQPConnection_1_0Impl$$Lambda$141/1774823837.run(Unknown Source)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at org.apache.qpid.server.protocol.v1_0.AMQPConnection_1_0Impl.receivedComplete(AMQPConnection_1_0Impl.java:1347)
> 	at org.apache.qpid.server.protocol.v1_0.AMQPConnection_1_0Impl.onReceive(AMQPConnection_1_0Impl.java:1328)
> 	at org.apache.qpid.server.transport.AbstractAMQPConnection.lambda$received$2(AbstractAMQPConnection.java:576)
> 	at org.apache.qpid.server.transport.AbstractAMQPConnection$$Lambda$119/6380061.run(Unknown Source)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at org.apache.qpid.server.transport.AbstractAMQPConnection.received(AbstractAMQPConnection.java:571)
> 	at org.apache.qpid.server.transport.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:135)
> 	at org.apache.qpid.server.transport.NonBlockingConnection.processAmqpData(NonBlockingConnection.java:611)
> 	at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:58)
> 	at org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:496)
> 	at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:270)
> 	at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:134)
> 	at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:575)
> 	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:366)
> 	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:97)
> 	at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:533)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464)
> 	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory$$Lambda$10/1487500813.run(Unknown Source)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Also ring queue behaves strangely: queue "{{DeadLetterQueue"}} has set maximum message depth to {{1}}, but in there were 55 message with {{Acquired}} state.
>  First message in {{DeadLetterQueue}} - {{2021-03-25, 16:50:59.381 (00:00 UTC)}}.
>  Last message in {{DeadLetterQueue}} - {{2021-03-25, 16:50:59.683 (00:00 UTC)}}.
> These times correlates to timed out drain request at 16:52:00,410, because default drain request is set 60 seconds:
> {noformat}
> ^[[0m^[[0m2021-03-25 16:52:00,408 INFO  [org.apache.qpid.jms.JmsSession] (QpidJMS Connection Executor: ID:986cd915-d072-4711-bd71-50a74aab4be8:1) A JMS MessageConsumer has been closed: JmsConsumerInfo: { ID:986cd915-d072-4711-bd71-50a74aab4be8:1:1:1, destination = broadcast.XX.PRISMA_LIVERiskLimitTrigger }
> ^[[0m^[[33m2021-03-25 16:52:00,410 WARN  [org.apache.camel.component.jms.DefaultJmsMessageListenerContainer] (Camel (camel-1) thread #3 - JmsConsumer[broadcast.XX.PRISMA_LIVERiskLimitTrigger]) Setup of JMS message listener invoker failed for destination 'broadcast.XX.PRISMA_LIVERiskLimitTrigger' - trying to recover. Cause: Remote did not respond to a drain request in time
> ^[[0m^[[33m2021-03-25 16:52:01,414 WARN  [org.apache.camel.component.jms.DefaultJmsMessageListenerContainer] (Camel (camel-1) thread #6 - JmsConsumer[broadcast.XX.PRISMA_LIVERiskLimitTrigger]) Setup of JMS message listener invoker failed for destination 'broadcast.XX.PRISMA_LIVERiskLimitTrigger'
> {noformat}
> After the problem was triggered broker started to reject client connections:
> {noformat}
>  2021-03-25 17:55:43,716 INFO  [Broker-Config] (q.m.c.open) - [con:901(/127.0.0.1:56012)] CON-1001 : Open : Destination : amqp(127.0.0.1:22101) : Protocol Version : 1.0
> 2021-03-25 17:55:43,745 INFO  [IO-/127.0.0.1:56012] (q.m.c.open) - [con:901(ARP@/127.0.0.1:56012/default)] CON-1001 : Open : Destination : amqp(172.16.75.197:22101) : Protocol Version : 1.0 : Client ID : ID:af50d087-0610-406a-9792-ffa5bf6b68b4:5 : Client Version : 0.51.0 : Client Product : QpidJMS
> 2021-03-25 17:55:43,747 INFO  [IO-/127.0.0.1:56012] (q.m.a.denied) - [con:901(ARP@/127.0.0.1:56012/default)] ACL-1002 : Denied : Access Virtualhost NAME=default,VIRTUALHOST_NAME=default
> 2021-03-25 17:55:43,755 INFO  [Broker-Config] (q.m.c.dropped_connection) - [con:901(ARP@/127.0.0.1:56012/default)] CON-1004 : Connection dropped
> {noformat}
> Issue seems to be quite rare: same test ran in a loop over several days and this issue was not triggered.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org