Posted to issues@activemq.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/02/17 20:38:00 UTC

[jira] [Commented] (ARTEMIS-3647) rolledbackMessageRefs can grow until OOM with OpenWire clients

    [ https://issues.apache.org/jira/browse/ARTEMIS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494212#comment-17494212 ] 

ASF subversion and git services commented on ARTEMIS-3647:
----------------------------------------------------------

Commit 8a9f326b25e5feaedcb1393f8d2a07a9d9432c48 in activemq-artemis's branch refs/heads/main from AntonRoskvist
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=8a9f326 ]

ARTEMIS-3647 - OpenWire, remove rolledbackMessageRef on Ack
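
The commit title suggests the fix drops a message's entry from the consumer's rolled-back set when the corresponding acknowledgement is processed. A minimal sketch of that bookkeeping, with illustrative class, field and method names rather than the actual Artemis code:

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; not the actual OpenWire consumer implementation.
class RolledBackTrackingSketch {

    // References that were rolled back and are pending redelivery.
    private final Set<Long> rolledbackMessageRefs = ConcurrentHashMap.newKeySet();

    void onRollback(long messageId) {
        // Remember the reference so redelivery can be handled.
        rolledbackMessageRefs.add(messageId);
    }

    void onAck(long messageId) {
        // Without this removal, the set only ever grows for a long-lived
        // consumer whose messages are retried before finally succeeding,
        // which matches the heap growth described in this issue.
        rolledbackMessageRefs.remove(messageId);
    }
}
{code}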


> rolledbackMessageRefs can grow until OOM with OpenWire clients
> --------------------------------------------------------------
>
>                 Key: ARTEMIS-3647
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3647
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>            Reporter: Anton Roskvist
>            Priority: Major
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In my use case I have quite a few long-lived OpenWire consumers. I noticed that heap usage increases over time. Looking through a heap dump, I found that the memory is held in "rolledbackMessageRefs", in one case as much as 1.6GB of data with 0 messages on the queue.
> Disconnecting and then reconnecting the consumer released the memory.
> The clients run Spring with transactions. The affected clients have a minor issue receiving messages, so some messages are retried a couple of times before being processed successfully.
> I suspect that entries in "rolledbackMessageRefs" are, for some reason, not cleared along with the message reference once it is finally processed.
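> The client-side pattern is roughly the following, shown here as a minimal plain-JMS sketch rather than the actual Spring setup; the broker URL, queue name and 30% failure rate are only for illustration:
> {code:java}
> import javax.jms.Connection;
> import javax.jms.Message;
> import javax.jms.MessageConsumer;
> import javax.jms.Session;
> import org.apache.activemq.ActiveMQConnectionFactory;
>
> // Long-lived transacted OpenWire consumer that rolls back a fraction of
> // deliveries, so those messages are redelivered and succeed on a later attempt.
> public class TransactedConsumerSketch {
>     public static void main(String[] args) throws Exception {
>         ActiveMQConnectionFactory factory =
>                 new ActiveMQConnectionFactory("tcp://localhost:61616");
>         Connection connection = factory.createConnection("USER", "PASSWORD");
>         connection.start();
>         Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
>         MessageConsumer consumer = session.createConsumer(session.createQueue("TEST"));
>         while (true) {
>             Message message = consumer.receive(1000);
>             if (message == null) {
>                 continue;
>             }
>             if (Math.random() < 0.3) {
>                 // Simulate a transient processing failure.
>                 session.rollback();
>             } else {
>                 session.commit();
>             }
>         }
>     }
> }
> {code}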
> I had not initially found a way to reproduce this, since it only builds up over several days.
> UPDATE: I can easily reproduce it by setting up a standalone Artemis broker with out-of-the-box configuration and using these tools: [https://github.com/erik-wramner/JmsTools] (AmqJmsConsumer and optionally AmqJmsProducer)
> 1. Start the broker
> 2. Send 100k messages to "queue://TEST"
> {code:java}
> # java -jar JmsTools/shaded-jars/AmqJmsProducer.jar -url "tcp://localhost:61616" -user USER -pw PASSWORD -queue TEST -count 100000{code}
> 3. Receive one more message than produced and roll back 30% of them (an unrealistic ratio, but it means this can be done in minutes instead of days; receiving one extra message ensures the consumer stays live)
> {code:java}
> # java -jar JmsTools/shaded-jars/AmqJmsConsumer.jar -url "tcp://localhost:61616?jms.prefetchPolicy.all=100&jms.nonBlockingRedelivery=true" -user USER -pw PASSWORD -queue TEST -count 100001 -rollback 30{code}
> 4. Wait until no more messages are left on "queue://TEST" (a few might end up on the DLQ, but that's okay)
> 5. Get a heap dump with the consumer still connected
> {code:java}
> # jmap -dump:format=b,file=dump.hprof Artemis_PID{code}
> 6. Running "Leak suspects" with MAT will show a (relatively) large amount of memory held by "rolledbackMessageRefs" for the consumer connected to queue://TEST
> The consumer is run with "jms.nonBlockingRedelivery=true" to speed things up, though it should not be strictly needed.
> As an added bonus, this also shows that the prefetch limit "jms.prefetchPolicy.all=100" is not respected while messages are in the redelivery process, which can easily be seen in the console's "Attributes" section for the queue. This is also true for the default prefetch value of 1000.
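> These two options can also be set programmatically on the connection factory instead of via the broker URL; a minimal sketch, assuming the ActiveMQ 5.x OpenWire client used by the tools above:
> {code:java}
> import org.apache.activemq.ActiveMQConnectionFactory;
>
> public class FactoryConfigSketch {
>     public static ActiveMQConnectionFactory configure() {
>         ActiveMQConnectionFactory factory =
>                 new ActiveMQConnectionFactory("tcp://localhost:61616");
>         // Equivalent to jms.prefetchPolicy.all=100 on the broker URL.
>         factory.getPrefetchPolicy().setAll(100);
>         // Equivalent to jms.nonBlockingRedelivery=true on the broker URL.
>         factory.setNonBlockingRedelivery(true);
>         return factory;
>     }
> }
> {code}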
> Br,
> Anton



--
This message was sent by Atlassian Jira
(v8.20.1#820001)