Posted to issues@activemq.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/11/14 11:29:00 UTC

[jira] [Commented] (ARTEMIS-2174) Broker reconnect to another with scale down policy cause OOM

    [ https://issues.apache.org/jira/browse/ARTEMIS-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686389#comment-16686389 ] 

ASF GitHub Bot commented on ARTEMIS-2174:
-----------------------------------------

GitHub user gaohoward opened a pull request:

    https://github.com/apache/activemq-artemis/pull/2430

    ARTEMIS-2174 Broker reconnect cause OOM with scale down

    When a node tries to reconnect to another node in a scale down cluster,
    the reconnect request is denied by the other node and the node keeps
    retrying, which causes tasks in the ordered executor to accumulate and
    eventually leads to an OOM.
    
    The fix is to change ActiveMQPacketHandler#handleCheckForFailover
    to allow the reconnect if the scale down node is the node itself.
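
A minimal sketch of the decision described above, for illustration only. It is not the code from the pull request: the class FailoverCheckSketch, the method allowReconnect and all of its parameters are hypothetical, and the scale down policy is reduced to a single boolean. It assumes that "the scale down node is the node itself" means the node ID carried by the reconnect check equals the ID of the broker handling that check.

```java
final class FailoverCheckSketch {

    /**
     * Hypothetical stand-in for the reconnect decision.
     *
     * @param requestedNodeId      node ID carried by the reconnect check (assumed, may be null)
     * @param localNodeId          ID of the broker handling the check (assumed)
     * @param scaleDownPolicyDenies true if the scale down policy alone would deny the request (assumed)
     */
    static boolean allowReconnect(String requestedNodeId,
                                  String localNodeId,
                                  boolean scaleDownPolicyDenies) {
        // Described fix: a request that targets this very node is always allowed.
        if (requestedNodeId != null && requestedNodeId.equals(localNodeId)) {
            return true;
        }
        // Otherwise fall back to whatever the scale down policy decides.
        return !scaleDownPolicyDenies;
    }

    public static void main(String[] args) {
        // Same node: allowed even though the policy alone would deny it.
        System.out.println(allowReconnect("node2", "node2", true));
        // Different node: the policy decision stands.
        System.out.println(allowReconnect("node2", "node1", true));
    }
}
```

With this shape, a denied request is limited to the case where the policy objects and the request is for a different node, so the restarted node is no longer rejected on every retry.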

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaohoward/activemq-artemis e_2174

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/2430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2430
    
----
commit 2a108ac8817cb6c2ca3b29092108a3e35e7f5690
Author: Howard Gao <ho...@...>
Date:   2018-11-14T11:21:48Z

    ARTEMIS-2174 Broker reconnect cause OOM with scale down
    
    When a node tries to reconnect to another node in a scale down cluster,
    the reconnect request is denied by the other node and the node keeps
    retrying, which causes tasks in the ordered executor to accumulate and
    eventually leads to an OOM.
    
    The fix is to change ActiveMQPacketHandler#handleCheckForFailover
    to allow the reconnect if the scale down node is the node itself.

----


> Broker reconnect to another with scale down policy cause OOM
> ------------------------------------------------------------
>
>                 Key: ARTEMIS-2174
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2174
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.3
>            Reporter: Howard Gao
>            Assignee: Howard Gao
>            Priority: Major
>             Fix For: 2.6.4
>
>
> When a node tries to reconnect to another node in a scale down cluster, the reconnect request is denied by the other node and the node keeps retrying, which causes tasks in the ordered executor to accumulate and eventually leads to an OOM.
> To reproduce:
>  # Start a 2-node cluster (node1 and node2) configured in scale down mode.
>  # Stop node2 and restart it.
>  # node1 will try to reconnect to node2 repeatedly and never succeed.
>  # Inspect the connecting ClientSessionFactory (for example, by adding logging): its thread pool (closeExecutor, an OrderedExecutor) keeps adding tasks to its queue.
> Over time the queue keeps growing and will exhaust the heap memory.
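
The growth pattern reported above can be illustrated with a small standalone sketch. It does not use or reproduce any Artemis code: RetryBacklogSketch and backlogAfter are hypothetical names, and a plain java.util.concurrent.ThreadPoolExecutor with one worker and an unbounded queue stands in for the ordered executor. The worker is deliberately blocked so that every further "retry" task can only wait in the queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RetryBacklogSketch {

    /** Submits the given number of tasks while the only worker is blocked; returns the queue size. */
    static int backlogAfter(int attempts) {
        CountDownLatch release = new CountDownLatch(1);
        // Unbounded queue: nothing limits how many pending tasks can pile up.
        LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, queue);
        try {
            for (int i = 0; i < attempts; i++) {
                // Each simulated retry schedules one more task that cannot finish yet.
                executor.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The first task is handed straight to the worker thread; the rest wait here.
            return queue.size();
        } finally {
            // Unblock the worker and let the executor drain and stop.
            release.countDown();
            executor.shutdown();
            try {
                executor.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(backlogAfter(10));
        System.out.println(backlogAfter(1000));
    }
}
```

The backlog grows linearly with the number of attempts, and each queued task holds heap memory until it runs. With retries that never succeed and no bound on the queue, the heap is eventually exhausted.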



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)