
Posted to users@activemq.apache.org by "Stefaniuk, Marcin " <ma...@credit-suisse.com> on 2018/07/04 14:19:05 UTC

[ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

I'm struggling to create the set-up described in the subject on ActiveMQ Artemis 2.5.0. My key configuration looks as follows (for the first node of three):

<acceptors>
    <acceptor name="node-1-universal-plain">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300</acceptor>
</acceptors>

<connectors>
    <connector name="node-1-connector">tcp://localhost:61616</connector>
    <connector name="node-2-connector">tcp://localhost:62616</connector>
    <connector name="node-3-connector">tcp://localhost:63616</connector>
</connectors>

<cluster-connections>
    <cluster-connection name="showcase-cluster">
        <connector-ref>node-1-connector</connector-ref>
        <retry-interval>500</retry-interval>
        <use-duplicate-detection>true</use-duplicate-detection>
        <message-load-balancing>ON_DEMAND</message-load-balancing>
        <max-hops>1</max-hops>
        <static-connectors>
            <connector-ref>node-2-connector</connector-ref>
            <connector-ref>node-3-connector</connector-ref>
        </static-connectors>
    </cluster-connection>
</cluster-connections>

<ha-policy>
    <replication>
        <colocated>
            <backup-port-offset>7</backup-port-offset>
            <request-backup>true</request-backup>
            <max-backups>2</max-backups>
            <backup-request-retries>-1</backup-request-retries>
            <backup-request-retry-interval>2000</backup-request-retry-interval>
            <master />
            <slave />
        </colocated>
    </replication>
</ha-policy>

The remaining nodes have a similar configuration, with cluster connections and acceptors adjusted accordingly. I'm also deploying it on three separate hosts (each different from localhost). Importantly, I have no discovery groups (there is no possibility to use UDP).

My test connects to the cluster using ActiveMQConnectionFactory and the URI "(tcp://node-1:61616,tcp://node-2:62616)?ha=true&reconnectAttempts=-1" (leaving the third node to be obtained directly from the cluster). One thread produces messages and a second consumes them, each over a separate connection. The test works fine (unsurprisingly), even when the producer is connected to different nodes of the cluster. But when one node is stopped, the producer or consumer connected to that node is affected: no send or receive is performed, although some messages are buffered on the client side and flushed when the node is available again. I would expect the connection to switch automatically to another node, but that is not happening here. I tried this previously without HA, with the same result.
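For readers following along, here is a minimal sketch of how the client URI quoted above could be assembled from a list of host:port pairs. The class name FailoverUri and its build method are hypothetical helpers for illustration only, not part of the Artemis client API; the host names, ports, and the ha and reconnectAttempts parameters are taken from the URI in this message.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the Artemis client API.
final class FailoverUri {

    // Joins host:port pairs into the "(tcp://a,tcp://b)?ha=true&reconnectAttempts=-1" form
    // used in the message above.
    static String build(List<String> hostPorts) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one host:port is required");
        }
        String servers = hostPorts.stream()
                .map(hostPort -> "tcp://" + hostPort)
                .collect(Collectors.joining(",", "(", ")"));
        return servers + "?ha=true&reconnectAttempts=-1";
    }

    public static void main(String[] args) {
        String uri = build(List.of("node-1:61616", "node-2:62616"));
        System.out.println(uri);
        // With the Artemis JMS client on the classpath, this string would be passed to
        // new org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory(uri).
    }
}
```

Keeping the node list in one place makes it easy to add the third node to the initial list when checking whether the failover behaviour depends on which nodes the client knows about up front.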

Could you help me determine what I'm doing wrong?

Kind regards
Marcin Stefaniuk
CREDIT SUISSE (POLAND) SP. Z O.O
Solution Architect | Messaging Engineering Warsaw, MITM 47
Atrium 2 | 00-849 Warsaw | Poland
marcin.stefaniuk@credit-suisse.com | www.credit-suisse.com

=============================================================================== 
Please access the attached hyperlink for an important electronic communications disclaimer: 
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 
===============================================================================

RE: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

Posted by "Stefaniuk, Marcin " <ma...@credit-suisse.com>.

By "background verification" I mean the producer and consumer running in separate threads. The point I want to make is that ServerUtil.killServer() (heavily used in the examples) does not actually stop a broker on a Windows machine.

My claim is based on an experiment in which I killed all 3 nodes of the cluster, yet both the producer and the consumer connected to one of the cluster nodes kept working. 3 - 3 should be 0, but on Windows it is at least one ;-). I ran the same code on Linux, and there both the producer and the consumer throw an error (as expected).
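One way to double-check whether a node really went away after a kill is to probe its acceptor port. The following is only a sketch under assumptions: the class name PortProbe and the one-second timeout are made up for illustration, and an open port only shows that something is still listening there, not why the broker survived.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; not part of Artemis or its examples.
final class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is accepting on that port.
            return false;
        }
    }

    public static void main(String[] args) {
        // 61616 is the acceptor port of the first node in the configuration above.
        System.out.println("61616 accepting: " + isAccepting("localhost", 61616, 1000));
    }
}
```

Running the probe right after each kill, on both Windows and Linux, would show whether the acceptor is still listening on the platform where the clients keep working.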

Regarding my reproducer, I'm almost there. I currently have a version for internal use, and I still need to extract something free of company-specific details.

Kind regards
Marcin



Re: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

Posted by Justin Bertram <jb...@apache.org>.

What do you mean by "background verification"?

Also, what exactly fails immediately when you use 'artemis.cmd stop'?

One of the main goals of having a reproducible test-case is to be able to
share that test-case with others so they can reproduce what you are
seeing.  Reproducibility is one of the foundations of scientific
investigation.  Practically speaking, it cuts through a lot of unnecessary
back and forth since describing the issue with words alone is extremely
time-consuming and error-prone.  Do you have any kind of reproducer that
you could share?


Justin


RE: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

Posted by "Stefaniuk, Marcin " <ma...@credit-suisse.com>.

I have tried to recreate my case using the provided tooling (ServerUtil) for orchestrated server start and stop, but I got worrying results on a Windows machine.

I start three nodes of the cluster and background threads that produce and consume (for one minute). After 10 seconds the first node is killed, 10 seconds later the second node is killed, 10 seconds after that the third node is killed and... background verification is still working! It fails immediately when I use artemis.cmd stop from the command line.

It works on Linux.
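The kill sequence described above can be sketched as a small scheduler. This is an illustration under assumptions, not the actual test: the class name StaggeredStops is hypothetical, and the stop actions are placeholders for whatever really stops a node (for example a call into ServerUtil).

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical test harness; the stop actions are supplied by the caller.
final class StaggeredStops {

    // Runs stopActions.get(i) after (i + 1) * interval and returns once all have run.
    static void run(List<Runnable> stopActions, long interval, TimeUnit unit)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(stopActions.size());
        try {
            for (int i = 0; i < stopActions.size(); i++) {
                Runnable action = stopActions.get(i);
                scheduler.schedule(() -> {
                    try {
                        action.run();
                    } finally {
                        // Count down even if a stop action throws, so run() cannot hang.
                        done.countDown();
                    }
                }, (i + 1) * interval, unit);
            }
            done.await();
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

Calling StaggeredStops.run(stops, 10, TimeUnit.SECONDS) with three stop actions would mirror the 10, 20 and 30 second marks from the description, while the producer and consumer threads keep running in the background.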

Kind regards
Marcin



Re: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

Posted by Justin Bertram <jb...@apache.org>.

I haven't had any time to look into this in depth.  Would you be able to
work up a reproducer?  I think you could easily modify one of the HA
examples shipped with the broker to reproduce your use-case.  You might
even try simplifying it a bit to just 2 nodes.  Simpler is always better
for reproducers as it narrows down the investigation.  Once you get a
reproducer you can slap it into a GitHub repo somewhere.


Justin
