Posted to users@activemq.apache.org by Neha Sareen <ne...@oracle.com> on 2018/07/14 00:30:56 UTC

Artemis Failover tests

Hi,

 

- We are setting up a cluster of 6 brokers using Artemis 2.4.0.

- The cluster has 3 groups.

- Each group is a pair of one master broker and one slave broker.

- The HA uses replication.

- Each master broker configuration has the flag 'check-for-live-server' set to true.

- Each slave broker configuration has the flag 'allow-failback' set to true.

- We use static connectors for cluster topology discovery.

- Each broker's static connector list includes the connectors to the other 5 servers in the cluster.

- Each broker declares its acceptor.

- Each broker exports its own connector information via the 'connector-ref' configuration element.

- The acceptor and connector URLs for each broker use the same host and port.

 

We have a standalone test application that creates producers and consumers to send and receive messages, respectively.

 

We are trying to execute an automatic failover test case with the following steps:

First, create a separate connection and session for both the producer and the consumer, each connected to a master broker.

Then send and consume a handful of messages using this initial connection.

Then gracefully shut down the master broker.

The test then continues, with the producer sending more messages and the consumer consuming them.

 

However, I see the following exception after shutting down the master broker:

javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 71

 

The URL used for our tests is built as follows:

"tcp://localhost:" + masterBrokerPort + "?ha=true&retryInterval=1000&retryIntervalMultiplier=1.0&reconnectAttempts=-1&clientFailureCheckPeriod=20000"
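
For reference, the same URL can be assembled one parameter at a time, which makes each failover setting easier to read and change between test runs. This is only a sketch: the class name FailoverUrl and the port 61616 are placeholders that do not come from our test code, and the comments reflect my understanding of the Artemis client parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the actual test application.
final class FailoverUrl {

    // Builds the client URL for the given master broker port.
    static String build(int masterBrokerPort) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ha", "true");                      // ask the client to use HA failover
        params.put("retryInterval", "1000");           // milliseconds between reconnect attempts
        params.put("retryIntervalMultiplier", "1.0");  // no growth of the interval between attempts
        params.put("reconnectAttempts", "-1");         // -1: keep retrying without limit
        params.put("clientFailureCheckPeriod", "20000"); // milliseconds

        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "tcp://localhost:" + masterBrokerPort + "?" + query;
    }

    public static void main(String[] args) {
        // 61616 is an assumed placeholder port, not one taken from our setup.
        System.out.println(build(61616));
    }
}
```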

 

Also, this is what I see in the slave broker logs (slave broker running on port 61476):

17:07:16,889 INFO  [org.apache.activemq.artemis.core.server] AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.4.0 [null] started, waiting live to fail before it gets active

17:07:25,366 INFO  [org.apache.activemq.artemis.core.server] AMQ221024: Backup server ActiveMQServerImpl::serverUUID=fa7192a0-816f-11e8-a66a-08002737e2ae is synchronized with live-server.

17:07:25,449 INFO  [org.apache.activemq.artemis.core.server] AMQ221031: backup announced

17:10:10,237 INFO  [org.apache.activemq.artemis.core.server] AMQ221066: Initiating quorum vote: LiveFailoverQuorumVote

17:10:10,238 INFO  [org.apache.activemq.artemis.core.server] AMQ221067: Waiting 30 seconds for quorum vote results.

17:10:10,322 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure has been detected: AMQ119015: The connection was disconnected because of server shutdown [code=DISCONNECTED]

17:10:10,331 INFO  [org.apache.activemq.artemis.core.server] AMQ221060: Sending quorum vote request to localhost/127.0.0.1:61456: ServerConnectVote [nodeId=fa7192a0-816f-11e8-a66a-08002737e2ae, vote=false]

17:10:10,333 INFO  [org.apache.activemq.artemis.core.server] AMQ221061: Received quorum vote response from localhost/127.0.0.1:61456: ServerConnectVote [nodeId=fa7192a0-816f-11e8-a66a-08002737e2ae, vote=true]

17:10:10,351 INFO  [org.apache.activemq.artemis.core.server] AMQ221060: Sending quorum vote request to localhost/127.0.0.1:61466: ServerConnectVote [nodeId=fa7192a0-816f-11e8-a66a-08002737e2ae, vote=false]

17:10:10,353 INFO  [org.apache.activemq.artemis.core.server] AMQ221061: Received quorum vote response from localhost/127.0.0.1:61466: ServerConnectVote [nodeId=fa7192a0-816f-11e8-a66a-08002737e2ae, vote=true]

17:10:10,354 INFO  [org.apache.activemq.artemis.core.server] AMQ221068: Received all quorum votes.

17:10:10,438 INFO  [org.apache.activemq.artemis.core.server] AMQ221071: Failing over based on quorum vote results.

17:10:10,499 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure has been detected: AMQ119015: The connection was disconnected because of server shutdown [code=DISCONNECTED]

17:10:10,765 INFO  [org.apache.activemq.artemis.core.server] AMQ221037: ActiveMQServerImpl::serverUUID=fa7192a0-816f-11e8-a66a-08002737e2ae to become 'live'

17:10:10,798 WARN  [org.apache.activemq.artemis.core.client] AMQ212004: Failed to connect to server.

17:10:11,380 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying queue DLQ on address DLQ

17:10:11,381 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying queue ExpiryQueue on address ExpiryQueue

17:10:11,381 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying queue exampleQueue on address exampleQueue

17:10:11,405 INFO  [org.apache.activemq.artemis.core.server] AMQ221007: Server is now live

17:10:11,437 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:61476 for protocols [CORE,MQTT,AMQP,STOMP,HORNETQ,OPENWIRE]
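
Reading the timestamps above, the backup appears to have gone from starting the quorum vote to opening its acceptor in a little over a second, which is far less than the 30,000 ms the client waited before timing out. The sketch below just does that subtraction on two timestamps copied from the log. The class and method names are made up for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for comparing timestamps from the broker log.
final class FailoverTiming {

    // Matches the time-of-day prefix used in the log lines, e.g. 17:10:10,237.
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    // Returns the elapsed milliseconds between two log timestamps on the same day.
    static long millisBetween(String start, String end) {
        LocalTime from = LocalTime.parse(start, LOG_TIME);
        LocalTime to = LocalTime.parse(end, LOG_TIME);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        // AMQ221066 (vote initiated) to AMQ221020 (acceptor started).
        long failoverMs = millisBetween("17:10:10,237", "17:10:11,437");
        System.out.println("Vote start to acceptor start: " + failoverMs + " ms"); // 1200 ms
    }
}
```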

 

 

Can someone let us know what the issue is here and how we can fix it?

 

Thanks

Neha

 

 

 

Re: Artemis Failover tests

Posted by udayansahu <ud...@oracle.com>.
Thanks, it solved the problem...

-- Udayan Sahu




Re: Artemis Failover tests

Posted by Clebert Suconic <cl...@gmail.com>.
Can you try with 2.6.2? There were a few fixes in voting.

Clebert Suconic