Posted to issues@activemq.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/11 07:12:00 UTC
[jira] [Work logged] (AMQ-9107) Closing many consumers causes CPU to spike to 100%

     [ https://issues.apache.org/jira/browse/AMQ-9107?focusedWorklogId=815473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-815473 ]

ASF GitHub Bot logged work on AMQ-9107:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Oct/22 07:11
            Start Date: 11/Oct/22 07:11
    Worklog Time Spent: 10m 
      Work Description: lucastetreault opened a new pull request, #908:
URL: https://github.com/apache/activemq/pull/908

   Profiling the sample code attached to [AMQ-9107](https://issues.apache.org/jira/browse/AMQ-9107) identified ManagedRegionBroker.removeConsumer as the bottleneck. The existing implementation loops over all subscriptions to find the one belonging to the consumer being closed, so each close is O(n) and closing all n consumers is O(n^2) overall. When n is large enough, this becomes a serious performance issue. With 188,000 consumers, we observe the CPU at 100% for ~40 minutes while all the connections are closed: 
   
   <img width="1217" alt="image" src="https://user-images.githubusercontent.com/7095337/195011857-a6971abb-b73c-41fd-bd88-9ab376388949.png">
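
To make the quadratic cost concrete, here is a minimal sketch that counts how many entries a linear scan inspects when every consumer is closed, next to a keyed lookup. The class `SubscriptionLookupSketch`, the type `Sub`, and the `consumer-N` ids are hypothetical stand-ins, not ActiveMQ code, and the keyed map is only one common way to avoid the scan; it is not necessarily what the PR does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT ActiveMQ classes.
final class SubscriptionLookupSketch {

    /** Minimal stand-in for a broker-side subscription, identified by its consumer id. */
    static final class Sub {
        final String consumerId;

        Sub(String consumerId) {
            this.consumerId = consumerId;
        }
    }

    /** Linear scan: removes the matching subscription and returns how many entries were inspected. */
    static long removeByScan(List<Sub> subs, String consumerId) {
        long inspected = 0;
        for (int i = 0; i < subs.size(); i++) {
            inspected++;
            if (subs.get(i).consumerId.equals(consumerId)) {
                subs.remove(i);
                break;
            }
        }
        return inspected;
    }

    /** Worst case for the scan: always close the consumer stored last. Returns total entries inspected. */
    static long closeAllByScan(int n) {
        List<Sub> subs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            subs.add(new Sub("consumer-" + i));
        }
        long total = 0;
        for (int i = n - 1; i >= 0; i--) {
            total += removeByScan(subs, "consumer-" + i);
        }
        return total; // n + (n - 1) + ... + 1 = n * (n + 1) / 2
    }

    /** Keyed alternative: one hash lookup per close. Returns how many subscriptions were removed. */
    static int closeAllByKey(int n) {
        Map<String, Sub> subsByConsumerId = new HashMap<>();
        for (int i = 0; i < n; i++) {
            subsByConsumerId.put("consumer-" + i, new Sub("consumer-" + i));
        }
        int removed = 0;
        for (int i = 0; i < n; i++) {
            if (subsByConsumerId.remove("consumer-" + i) != null) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        int n = 1_000;
        System.out.println("scan inspected: " + closeAllByScan(n));
        System.out.println("keyed removals: " + closeAllByKey(n));
    }
}
```

With n = 1,000 the scan inspects 500,500 entries in this worst-case ordering, while the keyed version performs 1,000 lookups. By the same formula, n = 188,000 would mean roughly 1.8 × 10^10 inspections in the worst case; the real broker's ordering and per-entry cost may differ.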
   
   
   With this PR, the same test case produces a CPU spike of one minute or less, similar to the time it took to create the consumers: 
   
   <img width="968" alt="image" src="https://user-images.githubusercontent.com/7095337/195017869-c17c8b4a-fabc-4c2c-a909-6073955613a1.png">
   
   I ran the full suite of tests and everything is passing.
   
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 815473)
    Remaining Estimate: 0h
            Time Spent: 10m

> Closing many consumers causes CPU to spike to 100%
> --------------------------------------------------
>
>                 Key: AMQ-9107
>                 URL: https://issues.apache.org/jira/browse/AMQ-9107
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.17.1, 5.16.5
>            Reporter: Lucas Tétreault
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>         Attachments: example.zip, image-2022-10-07-00-12-39-657.png, image-2022-10-07-00-17-30-657.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When there are many consumers (~188k) on a queue, closing them is extremely expensive and causes the CPU to spike to 100% while the consumers are closed. Tested on an Amazon MQ mq.m5.large instance (2 vCPU, 8 GB memory).
> I have attached a minimal reproduction of the issue that does the following: 
> 1/ Open 100 connections.
> 2/ Create consumers as fast as we can on all of those connections until we hit at least 188k consumers.
> 3/ Sleep for 5 minutes so we can observe the CPU come back down after opening all those connections.
> 4/ Start closing consumers as fast as we can.
> 5/ After all consumers are closed, sleep for 5 minutes to observe the CPU come back down after closing all the connections.
>  
> In this example, 5 minutes was apparently not enough time for the CPU to come back down, and the consumer and connection counts appear to hit 0 at the same time: 
> !image-2022-10-07-00-12-39-657.png|width=757,height=353!
>  
> In a previous test, with more time sleeping after closing all the consumers, we can see the CPU come back down before we close the connections. 
> !image-2022-10-07-00-17-30-657.png|width=764,height=348!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)