Posted to issues@activemq.apache.org by "Ian Cox (Jira)" <ji...@apache.org> on 2023/02/12 12:15:00 UTC
[jira] [Commented] (ARTEMIS-4166) 3x2 live backup pairs, stop and start a live node taking traffic does not restore

    [ https://issues.apache.org/jira/browse/ARTEMIS-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687564#comment-17687564 ] 

Ian Cox commented on ARTEMIS-4166:
----------------------------------

Found the issue: it came down to how the connectors and the cluster connection were configured. I added connectors for the local network and set the cluster config to use them (see below). Clients can still connect through the ingress addresses, but the nodes now talk to each other over the local addresses. This seems to remove the issue completely: failing over from live to backup and failing back again works every time.

 

      <connectors>
          <connector name="live1-connector">tcp://192.168.0.10:61616</connector>
          <connector name="live2-connector">tcp://192.168.0.11:61617</connector>
          <connector name="live3-connector">tcp://192.168.0.12:61618</connector>
          <connector name="back1-connector">tcp://192.168.0.13:61619</connector>
          <connector name="back2-connector">tcp://192.168.0.10:61620</connector>
          <connector name="back3-connector">tcp://192.168.0.11:61621</connector>

          <connector name="live1-connector">tcp://live1:61616</connector>
          <connector name="live2-connector">tcp://live2:61616</connector>
          <connector name="live3-connector">tcp://live3:61616</connector>
          <connector name="back1-connector">tcp://back1:61616</connector>
          <connector name="back2-connector">tcp://back2:61616</connector>
          <connector name="back3-connector">tcp://back3:61616</connector>
      </connectors>
      
      <cluster-user>my-cluster-user</cluster-user>
      <cluster-password>my-cluster-password</cluster-password>
      <cluster-connections>
          <cluster-connection name="my-cluster">
              <connector-ref>live3-connector</connector-ref>
              <static-connectors>
                  <connector-ref>back3-connector</connector-ref>
                  <connector-ref>live1-connector</connector-ref>
                  <connector-ref>back1-connector</connector-ref>
                  <connector-ref>live2-connector</connector-ref>
                  <connector-ref>back2-connector</connector-ref>
              </static-connectors>
          </cluster-connection>
      </cluster-connections>
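
For comparison, the cluster connection on the matching backup node (back3) would presumably mirror the one above: it announces its own connector and lists the other five as static connectors. This is only a sketch, assuming the same connector names are defined on that node; it is not copied from the attached broker.xml files.

      <cluster-connections>
          <cluster-connection name="my-cluster">
              <!-- connector this node advertises to the rest of the cluster -->
              <connector-ref>back3-connector</connector-ref>
              <static-connectors>
                  <!-- every other node in the 3x2 layout -->
                  <connector-ref>live3-connector</connector-ref>
                  <connector-ref>live1-connector</connector-ref>
                  <connector-ref>back1-connector</connector-ref>
                  <connector-ref>live2-connector</connector-ref>
                  <connector-ref>back2-connector</connector-ref>
              </static-connectors>
          </cluster-connection>
      </cluster-connections>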

> 3x2 live backup pairs, stop and start a live node taking traffic does not restore
> ---------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-4166
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4166
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.27.1
>         Environment: 4 node RPI 4 docker with NFS V4 volumes
> 6 node 2.27.1 artemis allocated across the 4 nodes
> Attached log from stopped live node, compose file and broker.xml from each node.
>            Reporter: Ian Cox
>            Priority: Minor
>         Attachments: artemis.zip
>
>
> I have a 6 node cluster running on a 4 node docker swarm (compose file attached). When running a test app that publishes to 1 queue, which is replicated to 3 other queues that are consumed, stopping any live node taking traffic triggers the backup for that node to start and take traffic. Restarting the live node terminates the connections to the backup node, but the live node is never fully activated and cannot be accessed via app or console.
> Restarting the application and the stack allows the 'trapped' items to appear on the queues, where they get consumed, presumably because they are stored on the NFS volume associated with the broker.


