Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2021/08/23 14:47:03 UTC

[GitHub] [rocketmq-exporter] humkum opened a new pull request #65: fix rocketmq-exporter causes OOM Error

humkum opened a new pull request #65:
URL: https://github.com/apache/rocketmq-exporter/pull/65


   ## What is the purpose of the change
   
   #64 
   
   ## Brief changelog
   Analysis of the heap dump file shows two causes of the OOM:
   1. The HashMap used to collect metrics lives for the entire lifetime of the program, so it can never be garbage-collected.
   2. The program never deletes out-of-date metrics, so the amount of data in the HashMap only grows.
   
   To solve this problem, we replace ConcurrentHashMap with the Google Guava Cache data structure, because a Cache can be configured to delete out-of-date entries automatically. To cover the general case, we make the expiration time configurable.
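
As a rough illustration of the expire-after-write behaviour described above (this is not the code from this PR), the sketch below uses only the JDK. The class name `ExpiringMetricStore`, the injected clock, and the metric key are assumptions made for the example; Guava's `CacheBuilder.newBuilder().expireAfterWrite(...)` offers this kind of time-based eviction ready-made.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical metric store: an entry expires once ttlMillis have passed since its last write. */
class ExpiringMetricStore<K> {
    private static final class Entry {
        final double value;
        final long writtenAtMillis;

        Entry(double value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so tests can control time

    ExpiringMetricStore(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores or refreshes a metric; refreshing restarts its expiry timer. */
    void put(K key, double value) {
        entries.put(key, new Entry(value, clock.getAsLong()));
    }

    /** Returns the metric value, or null if it is absent or has expired. */
    Double get(K key) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (isExpired(e)) {
            entries.remove(key, e); // remove only if not refreshed concurrently
            return null;
        }
        return e.value;
    }

    /** Drops every expired entry, then reports how many remain. */
    int size() {
        entries.entrySet().removeIf(en -> isExpired(en.getValue()));
        return entries.size();
    }

    private boolean isExpired(Entry e) {
        return clock.getAsLong() - e.writtenAtMillis >= ttlMillis;
    }
}

class ExpiringMetricsDemo {
    public static void main(String[] args) {
        long[] now = {0L}; // fake clock in milliseconds
        ExpiringMetricStore<String> store =
                new ExpiringMetricStore<>(120_000L, () -> now[0]);

        store.put("consumer_h", 42.0);
        now[0] = 60_000L;
        System.out.println(store.size()); // prints 1: still within 120 s

        now[0] = 120_000L;
        System.out.println(store.size()); // prints 0: expired and evicted
    }
}
```

A metric that is still being reported keeps getting rewritten and therefore stays alive, while a metric belonging to a deleted consumer group stops being written and disappears once the expiration time passes.
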
   ## Verifying this change
   
   1. We compared the heap dump files of the original program and the optimized program, as follows:
   ![image](https://user-images.githubusercontent.com/50660789/130462193-d6edc0ff-7bd5-4236-84e3-a7366f7a9736.png)
   ![6dc891ef-f573-4216-9f6e-ff2888f14614](https://user-images.githubusercontent.com/50660789/130462367-52ff2732-1c2d-41ec-bc5f-fbcee8eb2229.png)
   The original program accumulates a large number of HashMap instances, which eventually causes an OOM.
   ![image (1)](https://user-images.githubusercontent.com/50660789/130462699-942d3d0a-5a7e-467b-837f-0d533bb07239.png)
   ![image (2)](https://user-images.githubusercontent.com/50660789/130462731-0dd56c1c-8114-469c-8359-b9965953a64e.png)
   These two screenshots show a large number of out-of-date runtime metrics that are never deleted.
   2. We also compared how the total number of metrics changed after both programs had been online for 66 hours.
   ![f30ee75b-7f74-4de0-9abb-e2671120c4c4](https://user-images.githubusercontent.com/50660789/130464264-12893708-e91d-4121-91a5-060c6d2d8509.png)
   The number of metrics in the original program keeps growing over time, while the optimized program maintains a more stable number of metrics.
   3. We created a new consumer group, ran a send-and-consume test, and then deleted the group. The original program never removes the metrics of the deleted group, whereas the optimized program, which stores metrics in a Cache, does.
   Create a new consumer group named "consumer_h".
   ![image (3)](https://user-images.githubusercontent.com/50660789/130465888-139c0b36-51bd-4986-ae7b-bf38fa6af9a3.png)
   Start a consumer instance.
   ![image (4)](https://user-images.githubusercontent.com/50660789/130466066-61c58e9b-10e1-4828-a576-e375eafa2259.png)
   Start another instance for consumer group "consumer_h".
   ![image (5)](https://user-images.githubusercontent.com/50660789/130466471-19121859-c506-4bfa-8fa7-83806f563829.png)
   Start a third instance for consumer group "consumer_h".
   ![image (6)](https://user-images.githubusercontent.com/50660789/130466600-3041d94c-a3a5-4ed3-95e4-04a670ff1c38.png)
   Shut down the first instance of consumer group "consumer_h", and wait for 120s (the expiration time we configured).
   ![image (7)](https://user-images.githubusercontent.com/50660789/130467194-70508f23-2ece-410d-a620-5dceb706a284.png)
   Delete the consumer group "consumer_h", and wait for 120s.
   ![image (8)](https://user-images.githubusercontent.com/50660789/130467329-7a96ea29-135a-465a-9381-c12f49742d03.png)
   The optimized program successfully deletes the metrics of the deleted consumer group after they expire.
   
   Follow this checklist to help us incorporate your contribution quickly and easily. Notice, `it would be helpful if you could finish the following 5 checklist items (the last one is not necessary) before requesting the community to review your PR`.
   
   - [x] Make sure there is a [Github issue](https://github.com/apache/rocketmq/issues) filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue. 
   - [x] Format the pull request title like `[ISSUE #123] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
   - [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
   - [x] Write the necessary unit tests (over 80% coverage) to verify your logic; prefer mocks where cross-module dependencies exist. If a new feature or significant change is committed, please remember to add an integration test in the [test module](https://github.com/apache/rocketmq/tree/master/test).
   - [x] Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure unit tests pass. Run `mvn clean test-compile failsafe:integration-test` to make sure integration tests pass.
   - [ ] If this contribution is large, please file an [Apache Individual Contributor License Agreement](http://www.apache.org/licenses/#clas).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq-exporter] maixiaohai commented on pull request #65: fix rocketmq-exporter causes OOM Error

Posted by GitBox <gi...@apache.org>.
maixiaohai commented on pull request #65:
URL: https://github.com/apache/rocketmq-exporter/pull/65#issuecomment-920553097


   merged



[GitHub] [rocketmq-exporter] maixiaohai merged pull request #65: fix rocketmq-exporter causes OOM Error

Posted by GitBox <gi...@apache.org>.
maixiaohai merged pull request #65:
URL: https://github.com/apache/rocketmq-exporter/pull/65


   

