Posted to dev@activemq.apache.org by Jeff Mesnil <jm...@gmail.com> on 2015/09/15 14:33:11 UTC

Cluster view of Artemis?

Hi, I'm writing some failover tests for Artemis (using either shared
store or replication).

I have 2 nodes, a master and a backup one.

In the replication cases, I start my nodes and want to wait for the
cluster to be formed.

In the logs, I can see that the cluster is formed, as the backup has this log:

"11:36:39,768 INFO  [org.apache.activemq.artemis.core.server]
(Thread-1 (ActiveMQ-client-netty-threads-1591310642)) AMQ221024:
Backup server ActiveMQServerImpl::serverUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145
is synchronized with live-server."

However, if I look at ClusterConnectionControl#getNodes() on either
node, it returns an empty map. I was expecting to have both nodes
returned, or maybe the other node at the end of the cluster
connection. Returning an empty map sounds suspicious.

If I now call ClusterConnectionControl#getTopology() on the master
node, it returns:

"topology on Topology@1aaa7bfa[owner=ClusterConnectionImpl@764259437[nodeUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145,
connector=TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?host=localhost&http-upgrade-endpoint=http-acceptor&httpUpgradeEnabled=true&port=8080,
address=jms, server=ActiveMQServerImpl::serverUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145]]:
        e7bd42ca-5b8b-11e5-9d19-796a17bef145 => TopologyMember[id =
e7bd42ca-5b8b-11e5-9d19-796a17bef145,
connector=Pair[a=TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?host=localhost&http-upgrade-endpoint=http-acceptor&httpUpgradeEnabled=true&port=8080,
b=TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?httpUpgradeEnabled=true&port=8180&host=localhost&http-upgrade-endpoint=http-acceptor],
backupGroupName=null, scaleDownGroupName=null]
        nodes=2 members=1"

The string is a bit opaque but I can at least see that there are 2
nodes in the cluster. Both nodes form a pair for the single member
identified by e7bd42ca-5b8b-11e5-9d19-796a17bef145.

What's the correct way to know the number of nodes in the cluster?
Finally, is there a way to query the live server to know if it has one
or many backups ready to failover? Relying on the topology string
seems quite fragile...
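
For the node count specifically, the only numeric information in the
dump above is its trailing "nodes=2 members=1" line. Below is a minimal
Java sketch that extracts those two counters. The TopologySummary class
and its method names are hypothetical, the format is assumed from this
single dump, and it is still string scraping, so it inherits the
fragility mentioned above rather than removing it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads the "nodes=N members=M" counters from a topology dump. */
final class TopologySummary {
    // Assumed format, taken from the dump quoted in this mail.
    private static final Pattern COUNTS = Pattern.compile("nodes=(\\d+)\\s+members=(\\d+)");

    final int nodes;
    final int members;

    private TopologySummary(int nodes, int members) {
        this.nodes = nodes;
        this.members = members;
    }

    /** Returns the last counters found, or null when the dump has none. */
    static TopologySummary parse(String topologyDump) {
        Matcher m = COUNTS.matcher(topologyDump);
        TopologySummary last = null;
        while (m.find()) {
            last = new TopologySummary(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
        }
        return last;
    }

    /** True when exactly one member backed by two nodes is visible (one live/backup pair). */
    boolean isLiveBackupPair() {
        return members == 1 && nodes == 2;
    }

    public static void main(String[] args) {
        // Abbreviated stand-in for the string returned by getTopology().
        String dump = "topology on Topology@1aaa7bfa[owner=(abbreviated)]:\n"
                + "\te7bd42ca-5b8b-11e5-9d19-796a17bef145 => TopologyMember[(abbreviated)]\n"
                + "\tnodes=2 members=1";
        TopologySummary summary = parse(dump);
        System.out.println(summary.nodes + " nodes, " + summary.members + " members");
    }
}
```

A null result is kept distinct from "0 nodes" so that a caller can tell
"no counters in the string" apart from "an empty topology".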

thanks,
jeff

-- 
Jeff Mesnil
jmesnil@gmail.com
http://jmesnil.net/weblog/

Re: Cluster view of Artemis?

Posted by Jeff Mesnil <jm...@gmail.com>.
I want to do that from my test clients. As I wrote in my previous
mail, I tried using ClusterConnectionControl for that but I can't make
sense of the information returned by getTopology() and getNodes().

We just had another intermittent failure[1]. The test[2] uses a cluster
of 2 replicated nodes (both are live nodes).
In the logs, I have:


18:28:39,833 INFO  [org.apache.activemq.artemis.core.server]
(ServerService Thread Pool -- 70) AMQ221007: Server is now live
18:28:39,834 INFO  [org.apache.activemq.artemis.core.server]
(ServerService Thread Pool -- 70) AMQ221001: Apache ActiveMQ Artemis
Message Broker version 1.1.0-wildfly-6
[nodeID=f94b61cc-6146-11e5-937b-a3e9631a50ac]
...
18:28:45,794 INFO  [org.apache.activemq.artemis.core.server]
(ServerService Thread Pool -- 71) AMQ221007: Server is now live
18:28:45,794 INFO  [org.apache.activemq.artemis.core.server]
(ServerService Thread Pool -- 71) AMQ221001: Apache ActiveMQ Artemis
Message Broker version 1.1.0-wildfly-6
[nodeID=fe8b1275-6146-11e5-b3c8-596354b7fc76]
...
18:28:46,367 INFO  [org.apache.activemq.artemis.core.server] (Thread-4
(ActiveMQ-server-ActiveMQServerImpl::serverUUID=fe8b1275-6146-11e5-b3c8-596354b7fc76-7203298))
AMQ221027: Bridge ClusterConnectionBridge@7edf20
[name=sf.my-cluster.f94b61cc-6146-11e5-937b-a3e9631a50ac,
queue=QueueImpl[name=sf.my-cluster.f94b61cc-6146-11e5-937b-a3e9631a50ac,
postOffice=PostOfficeImpl
[server=ActiveMQServerImpl::serverUUID=fe8b1275-6146-11e5-b3c8-596354b7fc76]]@1a3e7c0
targetConnector=ServerLocatorImpl
(identity=(Cluster-connection-bridge::ClusterConnectionBridge@7edf20
[name=sf.my-cluster.f94b61cc-6146-11e5-937b-a3e9631a50ac,
queue=QueueImpl[name=sf.my-cluster.f94b61cc-6146-11e5-937b-a3e9631a50ac,
postOffice=PostOfficeImpl
[server=ActiveMQServerImpl::serverUUID=fe8b1275-6146-11e5-b3c8-596354b7fc76]]@1a3e7c0
targetConnector=ServerLocatorImpl
[initialConnectors=[TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?httpUpgradeEnabled=true&port=8080&host=localhost&http-upgrade-endpoint=http-acceptor],
discoveryGroupConfiguration=null]]::ClusterConnectionImpl@17494988[nodeUUID=fe8b1275-6146-11e5-b3c8-596354b7fc76,
connector=TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?host=localhost&http-upgrade-endpoint=http-acceptor&httpUpgradeEnabled=true&port=8180,
address=jms, server=ActiveMQServerImpl::serverUUID=fe8b1275-6146-11e5-b3c8-596354b7fc76]))
[initialConnectors=[TransportConfiguration(name=http-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?httpUpgradeEnabled=true&port=8080&host=localhost&http-upgrade-endpoint=http-acceptor],
discoveryGroupConfiguration=null]] is connected


This AMQ221027 entry is a bit hard to read but it basically says that I have
a connection from server f94b61cc to server fe8b1275.
However, I don't have a corresponding log entry to show that the bridge is
also connected from server fe8b1275 to server f94b61cc.

When the test passes, I do have a 2nd AMQ221027 in the opposite direction.
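
A rough Java sketch of checking this from captured log output is shown
below. It looks for AMQ221027 "is connected" entries and records, for
each one, which server logged it and which peer node the bridge name
refers to. The BridgeLogCheck class is hypothetical, each entry is
assumed to have been joined back into a single string, and the
"sf.<cluster name>.<node id>" naming is assumed from the one entry
quoted above, so this is only an illustration and not a supported way
to query the cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: checks that both servers logged a connected cluster bridge. */
final class BridgeLogCheck {
    private static final String UUID =
            "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
    // First serverUUID in the entry: the server whose thread wrote the log line.
    private static final Pattern LOGGED_BY = Pattern.compile("serverUUID=(" + UUID + ")");
    // Node id embedded in the bridge name (assumed naming, see the lead-in).
    private static final Pattern PEER = Pattern.compile("name=sf\\.[^,\\s]*?\\.(" + UUID + ")");

    /** Returns one "loggingServer->peer" string per AMQ221027 "is connected" entry. */
    static List<String> connectedBridges(List<String> entries) {
        List<String> result = new ArrayList<>();
        for (String entry : entries) {
            if (!entry.contains("AMQ221027") || !entry.contains("is connected")) {
                continue;
            }
            Matcher by = LOGGED_BY.matcher(entry);
            Matcher peer = PEER.matcher(entry);
            if (by.find() && peer.find()) {
                result.add(by.group(1) + "->" + peer.group(1));
            }
        }
        return result;
    }

    /** True only when a connected bridge was logged by each node about the other. */
    static boolean bothSidesConnected(List<String> entries, String nodeA, String nodeB) {
        List<String> bridges = connectedBridges(entries);
        return bridges.contains(nodeA + "->" + nodeB) && bridges.contains(nodeB + "->" + nodeA);
    }
}
```

A test could call bothSidesConnected(...) with the two node IDs from the
AMQ221001 lines and fail with a clear message when only one side of the
cluster connection was logged.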

[1] http://brontes.lab.eng.brq.redhat.com/viewLog.html?buildId=72399&tab=buildResultsDiv&buildTypeId=WildFlyCore_PullRequest_WildFlyCoreFullIntegration
[2] https://github.com/wildfly/wildfly/blob/master/testsuite/integration/clustering/src/test/java/org/jboss/as/test/clustering/messaging/ClusteredMessagingTestCase.java

On Wed, Sep 23, 2015 at 4:44 PM, Clebert Suconic
<cl...@gmail.com> wrote:
> In Artemis, we have some tests that we validate through the topology and
> by asserting the server's bindings. Can you do the same in your tests, or
> won't you have such APIs available?
>
> On Wed, Sep 23, 2015 at 10:00 AM, Jeff Mesnil <jm...@gmail.com> wrote:
>
>> Hi,
>>
>> On Tue, Sep 15, 2015 at 5:35 PM, Jeff Mesnil <jm...@gmail.com> wrote:
>> > On Tue, Sep 15, 2015 at 3:11 PM, Clebert Suconic
>> > <cl...@gmail.com> wrote:
>> >> The ClusterConnectionControl will get the topology from the
>> >> ClusterConnection that will get activated only after the server is
>> >> activated (start method called).
>> >>
>> >> As a result this method is only available at the running server.
>> >
>> > I get an empty node map when I call it on the live server after starting
>> > both servers.
>> > Why is there a discrepancy between the number of nodes in
>> > #getTopology() and #getNodes()?
>>
>> We have some clustering tests for our app server using Artemis.
>> We noticed frequent failures in these tests because the cluster of
>> Artemis nodes is not formed in a timely fashion.
>> We use JGroups replication for our cluster communication. Usually, the
>> tests fail because the cluster is not formed before we start testing
>> things.
>>
>> What is the correct way to ascertain that a cluster is formed?
>> Ideally I want to know how many nodes are in the cluster and how many
>> of them are live.
>>
>> Please note that increasing a timeout after servers are started and
>> before tests are exercised is not enough. There are some cases where
>> the cluster is never formed at all (when playing with failover and
>> failback). But to test this use case, I first must be able to check
>> the cluster topology in a reliable way.
>>
>> thanks,
>> jeff
>>
>> --
>> Jeff Mesnil
>> jmesnil@gmail.com
>> http://jmesnil.net/weblog/
>>
>
>
>
> --
> Clebert Suconic



-- 
Jeff Mesnil
jmesnil@gmail.com
http://jmesnil.net/weblog/

Re: Cluster view of Artemis?

Posted by Clebert Suconic <cl...@gmail.com>.
In Artemis, we have some tests that we validate through the topology and
by asserting the server's bindings. Can you do the same in your tests, or
won't you have such APIs available?

On Wed, Sep 23, 2015 at 10:00 AM, Jeff Mesnil <jm...@gmail.com> wrote:

> Hi,
>
> On Tue, Sep 15, 2015 at 5:35 PM, Jeff Mesnil <jm...@gmail.com> wrote:
> > On Tue, Sep 15, 2015 at 3:11 PM, Clebert Suconic
> > <cl...@gmail.com> wrote:
> >> The ClusterConnectionControl will get the topology from the
> >> ClusterConnection that will get activated only after the server is
> >> activated (start method called).
> >>
> >> As a result this method is only available at the running server.
> >
> > I get an empty node map when I call it on the live server after starting
> > both servers.
> > Why is there a discrepancy between the number of nodes in
> > #getTopology() and #getNodes()?
>
> We have some clustering tests for our app server using Artemis.
> We noticed frequent failures in these tests because the cluster of
> Artemis nodes is not formed in a timely fashion.
> We use JGroups replication for our cluster communication. Usually, the
> tests fail because the cluster is not formed before we start testing
> things.
>
> What is the correct way to ascertain that a cluster is formed?
> Ideally I want to know how many nodes are in the cluster and how many
> of them are live.
>
> Please note that increasing a timeout after servers are started and
> before tests are exercised is not enough. There are some cases where
> the cluster is never formed at all (when playing with failover and
> failback). But to test this use case, I first must be able to check
> the cluster topology in a reliable way.
>
> thanks,
> jeff
>
> --
> Jeff Mesnil
> jmesnil@gmail.com
> http://jmesnil.net/weblog/
>



-- 
Clebert Suconic

Re: Cluster view of Artemis?

Posted by Jeff Mesnil <jm...@gmail.com>.
Hi,

On Tue, Sep 15, 2015 at 5:35 PM, Jeff Mesnil <jm...@gmail.com> wrote:
> On Tue, Sep 15, 2015 at 3:11 PM, Clebert Suconic
> <cl...@gmail.com> wrote:
>> The ClusterConnectionControl will get the topology from the
>> ClusterConnection that will get activated only after the server is
>> activated (start method called).
>>
>> As a result this method is only available at the running server.
>
> I get an empty node map when I call it on the live server after starting
> both servers.
> Why is there a discrepancy between the number of nodes in
> #getTopology() and #getNodes()?

We have some clustering tests for our app server using Artemis.
We noticed frequent failures in these tests because the cluster of
Artemis nodes is not formed in a timely fashion.
We use JGroups replication for our cluster communication. Usually, the
tests fail because the cluster is not formed before we start testing
things.

What is the correct way to ascertain that a cluster is formed?
Ideally I want to know how many nodes are in the cluster and how many
of them are live.

Please note that increasing a timeout after servers are started and
before tests are exercised is not enough. There are some cases where
the cluster is never formed at all (when playing with failover and
failback). But to test this use case, I first must be able to check
the cluster topology in a reliable way.
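
To make the intent concrete, here is a minimal Java sketch of the kind
of wait I have in mind: poll a readiness check until it passes or a
deadline expires, instead of sleeping for a fixed time. The ClusterWait
class and waitFor method are hypothetical names, and the check itself is
left to the caller as a BooleanSupplier because finding a reliable check
is exactly the open question here. No Artemis API is called in this
sketch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: polls a condition until it holds or a deadline passes. */
final class ClusterWait {

    /**
     * Returns true as soon as the condition holds, false if the timeout expires
     * first or the waiting thread is interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A false return value lets the test fail immediately with a "cluster was
never formed" message, which covers the case where waiting longer would
not help.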

thanks,
jeff

-- 
Jeff Mesnil
jmesnil@gmail.com
http://jmesnil.net/weblog/

Re: Cluster view of Artemis?

Posted by Jeff Mesnil <jm...@gmail.com>.
On Tue, Sep 15, 2015 at 3:11 PM, Clebert Suconic
<cl...@gmail.com> wrote:
> The ClusterConnectionControl will get the topology from the
> ClusterConnection that will get activated only after the server is
> activated (start method called).
>
> As a result this method is only available at the running server.

I get an empty node map when I call it on the live server after starting
both servers.
Why is there a discrepancy between the number of nodes in
#getTopology() and #getNodes()?

-- 
Jeff Mesnil
jmesnil@gmail.com
http://jmesnil.net/weblog/

Re: Cluster view of Artemis?

Posted by Clebert Suconic <cl...@gmail.com>.
The ClusterConnectionControl will get the topology from the
ClusterConnection that will get activated only after the server is
activated (start method called).

As a result this method is only available at the running server.

I'm not sure we could modify this semantic easily, but it would be a
JIRA/feature. Maybe a backupController could be added, returning the live
node or any other controls available for the backup node?

On Tue, Sep 15, 2015 at 8:33 AM, Jeff Mesnil <jm...@gmail.com> wrote:

> Hi, I'm writing some failover tests for Artemis (using either shared
> store or replication).
>
> I have 2 nodes, a master and a backup one.
>
> In the replication cases, I start my nodes and want to wait for the
> cluster to be formed.
>
> In the logs, I can see that the cluster is formed, as the backup has this
> log:
>
> "11:36:39,768 INFO  [org.apache.activemq.artemis.core.server]
> (Thread-1 (ActiveMQ-client-netty-threads-1591310642)) AMQ221024:
> Backup server
> ActiveMQServerImpl::serverUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145
> is synchronized with live-server."
>
> However, if I look at ClusterConnectionControl#getNodes() on either
> node, it returns an empty map. I was expecting to have both nodes
> returned, or maybe the other node at the end of the cluster
> connection. Returning an empty map sounds suspicious.
>
> If I now call ClusterConnectionControl#getTopology() on the master
> node, it returns:
>
> "topology on Topology@1aaa7bfa[owner=ClusterConnectionImpl@764259437
> [nodeUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145,
> connector=TransportConfiguration(name=http-connector,
>
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>
> ?host=localhost&http-upgrade-endpoint=http-acceptor&httpUpgradeEnabled=true&port=8080,
> address=jms,
> server=ActiveMQServerImpl::serverUUID=e7bd42ca-5b8b-11e5-9d19-796a17bef145]]:
>         e7bd42ca-5b8b-11e5-9d19-796a17bef145 => TopologyMember[id =
> e7bd42ca-5b8b-11e5-9d19-796a17bef145,
> connector=Pair[a=TransportConfiguration(name=http-connector,
>
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>
> ?host=localhost&http-upgrade-endpoint=http-acceptor&httpUpgradeEnabled=true&port=8080,
> b=TransportConfiguration(name=http-connector,
>
> factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
>
> ?httpUpgradeEnabled=true&port=8180&host=localhost&http-upgrade-endpoint=http-acceptor],
> backupGroupName=null, scaleDownGroupName=null]
>         nodes=2 members=1"
>
> The string is a bit opaque but I can at least see that there are 2
> nodes in the cluster. Both nodes form a pair for the single member
> identified by e7bd42ca-5b8b-11e5-9d19-796a17bef145.
>
> What's the correct way to know the number of nodes in the cluster?
> Finally, is there a way to query the live server to know if it has one
> or many backups ready to failover? Relying on the topology string
> seems quite fragile...
>
> thanks,
> jeff
>
> --
> Jeff Mesnil
> jmesnil@gmail.com
> http://jmesnil.net/weblog/
>



-- 
Clebert Suconic