Posted to issues@activemq.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/27 08:57:00 UTC

[jira] [Commented] (ARTEMIS-1484) Live's topology update may be ignored

    [ https://issues.apache.org/jira/browse/ARTEMIS-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221966#comment-16221966 ] 

ASF GitHub Bot commented on ARTEMIS-1484:
-----------------------------------------

GitHub user dudaerich opened a pull request:

    https://github.com/apache/activemq-artemis/pull/1617

    ARTEMIS-1484 Live's topology update may be ignored

    If the current node has no connector to the Live, it is better
    to update it from an older message than to have none.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dudaerich/activemq-artemis ARTEMIS-1484

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/1617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1617
    
----
commit 84fb07be53b28dd17688bd1a7f81b51bdc4570b9
Author: Erich Duda <du...@gmail.com>
Date:   2017-10-26T13:36:24Z

    ARTEMIS-1484 Live's topology update may be ignored
    
    If the current node has no connector to the Live, it is better
    to update it from an older message than to have none.

----


> Live's topology update may be ignored
> -------------------------------------
>
>                 Key: ARTEMIS-1484
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1484
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.3.0
>            Reporter: Erich Duda
>
> In tests based on {{MultiServerTestBase}}, the {{waitForTopology}} check sometimes fails after all servers have started, with the following error:
> {code}
> Timed out waiting for cluster topology of live=5,backup=5 (received live=4, backup=5) topology = topology on Topology@5884a914[owner=ClusterConnectionImpl@405215542[nodeUUID=bbbae377-ba40-11e7-aff3-fa163e312a80, connector=TransportConfiguration(name=bbbabc66-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0, address=cluster-queues, server=ActiveMQServerImpl::BridgeFailoverTest/Live(0)]]:
>  bbd79349-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd79349-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd79348-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=1, b=TransportConfiguration(name=bbd79353-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=6], backupGroupName=null, scaleDownGroupName=null]
>  bbd7935b-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7935b-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd7935a-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=2, b=TransportConfiguration(name=bbd7ba75-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=7], backupGroupName=null, scaleDownGroupName=null]
>  bbbae377-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbbae377-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbbabc66-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0, b=TransportConfiguration(name=bbd76c31-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=5], backupGroupName=null, scaleDownGroupName=null]
>  bbd7ba8f-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7ba8f-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=null, b=TransportConfiguration(name=bbd7ba99-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=9], backupGroupName=null, scaleDownGroupName=null]
>  bbd7ba7d-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7ba7d-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd7ba7c-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=3, b=TransportConfiguration(name=bbd7ba87-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=8], backupGroupName=null, scaleDownGroupName=null]
>  nodes=9 members=5)
> {code}
> I dug into this and found that in some cases the Live's topology update message has an older event ID than the Backup's update message and is also received later. In these cases the Live's message is ignored because it does not satisfy the event ID condition shown in the code snippet below.
> I think that if the current node has no connector to the Live, it shouldn't ignore a topology update from the Live, even if that update is older than the current record.
> {code:java}
> public boolean updateMember(final long uniqueEventID, final String nodeId, final TopologyMemberImpl memberInput) {
>    if (uniqueEventID > currentMember.getUniqueEventID()) {
>             ...
>    }
>    /*
>     * always add the backup, better to try to reconnect to something that's not there than to
>     * not know about it at all
>     */
>    if (currentMember.getBackup() == null && memberInput.getBackup() != null) {
>       currentMember.setBackup(memberInput.getBackup());
>    }
> }
> {code}
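
To make the ordering problem concrete, here is a minimal sketch of the scenario described above. It is not the Artemis code: TopologySketch, Member, replay and the keepOlderLive flag are hypothetical names, connectors are reduced to plain strings, and the elided body of the event ID branch is assumed to simply take over whatever the newer message carries. The only parts taken from the issue are the event ID comparison, the "always add the backup" rule, and the proposed idea of treating a missing Live connector the same way.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the update logic; not the Artemis classes. */
class TopologySketch {

   /** Stand-in for a topology member: connectors are plain strings here. */
   static final class Member {
      String live;
      String backup;
      long uniqueEventID;

      Member(String live, String backup, long uniqueEventID) {
         this.live = live;
         this.backup = backup;
         this.uniqueEventID = uniqueEventID;
      }
   }

   private final Map<String, Member> members = new HashMap<>();
   private final boolean keepOlderLive;

   TopologySketch(boolean keepOlderLive) {
      this.keepOlderLive = keepOlderLive;
   }

   boolean updateMember(long uniqueEventID, String nodeId, Member input) {
      Member current = members.get(nodeId);
      if (current == null) {
         members.put(nodeId, new Member(input.live, input.backup, uniqueEventID));
         return true;
      }
      if (uniqueEventID > current.uniqueEventID) {
         // Assumed behaviour of the elided branch: a newer event wins.
         if (input.live != null) {
            current.live = input.live;
         }
         if (input.backup != null) {
            current.backup = input.backup;
         }
         current.uniqueEventID = uniqueEventID;
         return true;
      }
      boolean changed = false;
      // Rule quoted in the issue: always add a backup we do not know yet.
      if (current.backup == null && input.backup != null) {
         current.backup = input.backup;
         changed = true;
      }
      // Proposed idea: do the same for a Live connector we do not know yet.
      if (keepOlderLive && current.live == null && input.live != null) {
         current.live = input.live;
         changed = true;
      }
      return changed;
   }

   /** Replays the problematic order: Backup (event 2) first, then Live (event 1). */
   static Member replay(boolean keepOlderLive) {
      TopologySketch topology = new TopologySketch(keepOlderLive);
      topology.updateMember(2L, "node-4", new Member(null, "backup-9", 2L));
      topology.updateMember(1L, "node-4", new Member("live-4", null, 1L));
      return topology.members.get("node-4");
   }

   public static void main(String[] args) {
      System.out.println("without the extra rule: live=" + replay(false).live);
      System.out.println("with the extra rule:    live=" + replay(true).live);
   }
}
```

Without the extra rule the member ends up with a backup connector only, which corresponds to the Pair[a=null, b=...] entry in the log above. With it, the older Live message still fills in the missing connector, while a newer event can overwrite it later.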


