Posted to users@activemq.apache.org by "Steigerwald, Aaron" <as...@brandesassociates.com.INVALID> on 2022/05/26 03:34:10 UTC

Cross data center HA cluster

Hello,

Is anyone aware of a production deployment of an Artemis "cross data center" HA cluster? For example, a cluster spread across 3 data centers. Each data center contains a master/slave pair.

I would like to know what kinds of issues others have overcome with such a configuration. I understand there are many configuration and operational variables. Any info would be helpful.

Note that we are considering asynchronously mirroring each master/slave pair's queues to a dedicated asynchronous target node. The asynchronous target node would exist in a different data center and would not service any other connections. A custom plugin would automatically scale down the messages into a live cluster node if the connections to the master/slave mirror sources were disconnected for a period of time.
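For reference, this kind of asynchronous mirroring is configured in Artemis via an AMQP broker connection in broker.xml. A minimal sketch, assuming a hypothetical target host `dr-target` and connection name `dr-mirror` (values are illustrative, not a tested configuration):

```xml
<!-- broker.xml on each live broker: mirror traffic to the dedicated DR node -->
<broker-connections>
   <amqp-connection uri="tcp://dr-target:61616" name="dr-mirror"
                    retry-interval="1000" reconnect-attempts="-1">
      <!-- forward sends, acknowledgements, and queue lifecycle events -->
      <mirror message-acknowledgements="true"
              queue-creation="true"
              queue-removal="true"/>
   </amqp-connection>
</broker-connections>
```

Mirroring only keeps the target's journal in sync; the scale-down behavior described above would still be custom plugin code.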

Thank you,
Aaron Steigerwald

Re: [EXTERNAL]:Re: Cross data center HA cluster

Posted by Iliya Grushevskiy <il...@gmail.com>.
Hi Aaron.

Sorry, there was some misleading information in my previous message.
I have reviewed my test, and it does contain a network failure.
If I turn off the network failure, everything works as expected.

So as I understand it, if you consider your LAN a reliable network, there should not be any message loss on send.

Regards
Iliya Grushevskiy




> On 26 May 2022, at 14:07, Steigerwald, Aaron <as...@brandesassociates.com.INVALID> wrote:
> 
> Hello Iliya,
> 
> Thank you very much for your response; it's very helpful.
> 
> Regarding "message loss on send on network failure between data centers"- the example architecture I described does not have master/slave HA pairs in separate data centers. Do you think the message loss you described has anything to do with the master/slave pairs being clustered across data centers? I ask because the HA replication takes place between the master/slave pairs on a LAN.
> 
> Thanks again,
> 
> Aaron Steigerwald
> 
> -----Original Message-----
> From: Iliya Grushevskiy <il...@gmail.com> 
> Sent: Thursday, May 26, 2022 4:10 AM
> To: users@activemq.apache.org
> Subject: [EXTERNAL]:Re: Cross data center HA cluster
> 
> 
> 
> Hi, Aaron
> 
> We are currently testing similar deployment and have encountered several issues:
> 
> - message loss on send on network failure between data centers
>  I think this is due to the fact that HA replication is asynchronous and the replica server may not catch up with the primary.
> 
> - message loss or duplication (depending on error handling strategy) on the consumer on network failure between data centers
>  I think this was caused by two factors: the duplicate id cache is consistent only within an HA pair, and message redistribution was on.
>  Switching off redistribution (or, as an option, increasing the delay) should fix this issue.
> 
> - message duplication on the mirrored server
>  This is addressed in pull request: https://github.com/apache/activemq-artemis/pull/4066
> 
> Regards
> Iliya Grushevskiy
> 
> 
>> On 26 May 2022, at 07:46, Justin Bertram <jb...@apache.org> wrote:
>> 
>> I'm not aware of such a production deployment and I would be surprised 
>> if there was one given that clustering was designed for local area 
>> networks with low latency which typically isn't what is found between data centers.
>> 
>> I recommend you pursue your mirroring approach as that is what 
>> mirroring was designed for (i.e. cross data-center disaster-recovery use-cases).
>> 
>> 
>> Justin
>> 
>> On Wed, May 25, 2022 at 10:36 PM Steigerwald, Aaron 
>> <as...@brandesassociates.com.invalid> wrote:
>> 
>>> Hello,
>>> 
>>> Is anyone aware of a production deployment of an Artemis "cross data 
>>> center" HA cluster? For example, a cluster spread across 3 data centers.
>>> Each data center contains a master/slave pair.
>>> 
>>> I would like to know what kind of issues anyone has overcome with 
>>> such a configuration. I understand there are many configuration and 
>>> operational variables. Any info would be helpful.
>>> 
>>> Note that we are considering asynchronously mirroring each 
>>> master/slave pair's queues to a dedicated asynchronous target node. 
>>> The asynchronous target node would exist in a different data center 
>>> and would not service any other connections. A custom plugin would 
>>> automatically scale down the messages into a live cluster node if the 
>>> connections to the master/slave mirror sources were disconnected for a period of time.
>>> 
>>> Thank you,
>>> Aaron Steigerwald
>>> 
> 


Re: [EXTERNAL]:Re: Cross data center HA cluster

Posted by Илья Грушевский <il...@gmail.com>.
I have a simple test with a single HA pair in which I just kill the master while sending messages in a different thread. (It is almost identical to the replicated-transaction-failover example, except for the different-thread part.)
And I encounter the same message loss as in the network failure scenario.
I suspect there could be some misconfiguration on the client side in my test.

Example of the message flow in my test:

- send A1
- commit
- send A2 (will not be replicated and will be lost, replica can’t keep up with master)
- commit
- send A3 
- commit (failed, failover to replica)
- resend A3 
- commit (handle duplicate id)

I would expect synchronous replication within the HA pair, but again I’m not sure that the client configuration is correct and that my test is relevant.

Regards
Iliya Grushevskiy 
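For anyone reproducing a test like this: send reliability during failover depends heavily on the client connection settings. A hedged sketch of a core/JMS connection URI (host name is a placeholder; whether these settings close the loss window in this scenario is exactly the open question above):

```
tcp://primary:61616?ha=true&reconnectAttempts=-1&retryInterval=500&blockOnDurableSend=true
```

`blockOnDurableSend=true` makes each durable send wait for the broker's response instead of streaming asynchronously, and `ha=true` lets the client fail over to the backup's connector.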



RE: [EXTERNAL]:Re: Cross data center HA cluster

Posted by "Steigerwald, Aaron" <as...@brandesassociates.com.INVALID>.
Hello Iliya,

Thank you very much for your response; it's very helpful.

Regarding "message loss on send on network failure between data centers"- the example architecture I described does not have master/slave HA pairs in separate data centers. Do you think the message loss you described has anything to do with the master/slave pairs being clustered across data centers? I ask because the HA replication takes place between the master/slave pairs on a LAN.

Thanks again,

Aaron Steigerwald



RE: [EXTERNAL]:Re: Cross data center HA cluster

Posted by "Steigerwald, Aaron" <as...@brandesassociates.com.INVALID>.
Hello Justin,

Thank you for your response and sorry for the delay in this follow-up. Regarding your response:

> I'm not aware of such a production deployment and I would be surprised if there was one given that clustering was designed for local area networks with low latency which typically isn't what is found between data centers.

to my initial scenario:

>> Is anyone aware of a production deployment of an Artemis "cross data center" HA cluster? For example, a cluster spread across 3 data centers. Each data center contains a master/slave pair.

What are the risks and/or technical downsides of clustering across data centers? For example, let's say each master/slave pair's leader election is solved within each data center using Zookeeper. What bad things can happen when clustering live nodes between data centers? I imagine performance can suffer depending on what load balancing scheme is used, like ON_DEMAND or OFF_WITH_REDISTRIBUTION, with STRICT being the worst. Anything else?
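For context, the load-balancing policy mentioned above is set per cluster connection, and redistribution is tuned per address. A hedged broker.xml sketch (connector names and the match pattern are placeholders):

```xml
<cluster-connections>
   <cluster-connection name="cross-dc-cluster">
      <connector-ref>local-connector</connector-ref>
      <!-- ON_DEMAND, STRICT, OFF, or OFF_WITH_REDISTRIBUTION -->
      <message-load-balancing>OFF_WITH_REDISTRIBUTION</message-load-balancing>
      <static-connectors>
         <connector-ref>dc2-connector</connector-ref>
         <connector-ref>dc3-connector</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>

<address-settings>
   <address-setting match="#">
      <!-- -1 disables redistribution; a positive value is a delay in ms -->
      <redistribution-delay>60000</redistribution-delay>
   </address-setting>
</address-settings>
```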

I would like clients to use a cluster-aware Artemis driver to take advantage of all the Artemis cluster capabilities. If each data center has its own cluster separate from other data centers, I think my only driver choice is to use classic ActiveMQ drivers with the failover transport, correct? Otherwise clients will have to use custom code to failover and balance connections to the different data centers' brokers.
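For the classic-client option mentioned above, the 5.x failover transport would look roughly like this (broker host names are placeholders):

```
failover:(tcp://dc1-broker:61616,tcp://dc2-broker:61616,tcp://dc3-broker:61616)?randomize=false
```

`randomize=false` pins clients to the first reachable broker in list order rather than picking one at random.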

Thanks again,
Aaron Steigerwald

-----Original Message-----
From: Iliya Grushevskiy <il...@gmail.com> 
Sent: Friday, May 27, 2022 2:27 AM
To: users@activemq.apache.org
Subject: [EXTERNAL]:Re: Cross data center HA cluster



Suppose we have a cluster.
And we have an HA pair of primary (P) and backup (B) nodes in this cluster.

1. Client connects to P and starts sending messages
2. Network between P and B fails
3. Client continues sending messages to P
4. Quorum vote completes. B becomes active and P stops

All messages between 2 and 4 will be lost.

Regards
Iliya Grushevskiy




> On 27 May 2022, at 02:36, Justin Bertram <jb...@apache.org> wrote:
>
>> If you expect any connection loss between HA pair while having client
> able to connect to both, you may encounter message loss, and 
> replication is not a right solution.
>
> Can you elaborate on where the message loss would be in your example here?
> Please be precise about the details as they are especially important 
> in situations like this.
>
>
> Justin
>
> On Thu, May 26, 2022 at 6:12 PM Илья Грушевский <il...@gmail.com> wrote:
>
>> Justin, thanks for replying.
>> Now I’ve got the point of such behavior.
>>
>> Yes, that is exactly what I meant.
>> If you expect any connection loss between HA pair while having client 
>> able to connect to both, you may encounter message loss, and 
>> replication is not a right solution.
>>
>>> On 27 May 2022, at 01:42, Justin Bertram <jb...@apache.org>
>> wrote:
>>>
>>>> But in case of replication failure, for example network failure it 
>>>> will
>>> not fail a transaction.
>>>
>>> I'm not exactly sure what you mean here. Are you saying that if a 
>>> primary is replicating to a backup and a client is in the middle of 
>>> a transaction and the replication connection between the primary and 
>>> backup fails then the client's transaction will complete 
>>> successfully? If so, that's the behavior I would expect. The loss of 
>>> the replication connection
>> potentially
>>> represents the crashing of the backup. At the very least it 
>>> represents
>> some
>>> kind of network problem. In any case, there is nothing wrong with 
>>> the primary broker in this circumstance so there is no reason to 
>>> fail the client's transaction. If failures with the backup caused 
>>> the primary
>> broker
>>> to fail client operations that would otherwise succeed then adding a
>> backup
>>> would *increase* the likelihood of client-facing failures instead of 
>>> decreasing them. This is typically the opposite of what is wanted 
>>> when configuring HA.
>>>
>>> I can imagine use-cases where it is absolutely critical for data to 
>>> be replicated successfully and any failure to replicate should be 
>>> considered fatal for any client operation. However, that behavior is 
>>> not supported
>> via
>>> replication. You would need to use shared-storage to get this kind 
>>> of behavior.
>>>
>>>
>>> Justin
>>>
>>> On Thu, May 26, 2022 at 4:51 PM Илья Грушевский <il...@gmail.com>
>> wrote:
>>>
>>>> You are right, I should not have used the term asynchronous.
>>>> But in case of replication failure, for example network failure it 
>>>> will not fail a transaction.
>>>> So if I gradually lose connections, first between primary and backup
>>>> and then between primary and client, I will lose all messages sent
>>>> between those events.
>>>> In the case of a cluster, I may lose the connection between primary and
>>>> client if the primary node decides to turn itself off after a quorum vote.
>>>>
>>>>> On 27 May 2022, at 00:31, Justin Bertram <jb...@apache.org>
>>>> wrote:
>>>>>
>>>>>> I think this is due to the fact that HA replication is 
>>>>>> asynchronous
>> and
>>>>> replica server may not catch up with primary.
>>>>>
>>>>> To be clear, message replication between a primary and a backup is 
>>>>> *synchronous*.
>>>>>
>>>>>
>>>>> Justin
>>>>>
>>>>> On Thu, May 26, 2022 at 3:11 AM Iliya Grushevskiy 
>>>>> <il...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, Aaron
>>>>>>
>>>>>> We are currently testing similar deployment and have encountered
>>>>>> several issues:
>>>>>>
>>>>>> - message loss on send on network failure between data centers
>>>>>>  I think this is due to the fact that HA replication is asynchronous
>>>>>>  and the replica server may not catch up with the primary.
>>>>>>
>>>>>> - message loss or duplication (depending on error handling strategy)
>>>>>>  on the consumer on network failure between data centers
>>>>>>  I think this was caused by two factors: the duplicate id cache is
>>>>>>  consistent only within an HA pair, and message redistribution was on.
>>>>>>  Switching off redistribution (or, as an option, increasing the delay)
>>>>>>  should fix this issue.
>>>>>>
>>>>>> - message duplication on the mirrored server
>>>>>>  This is addressed in pull request:
>>>>>>  https://github.com/apache/activemq-artemis/pull/4066
>>>>>>
>>>>>> Regards
>>>>>> Iliya Grushevskiy
>>>>>>
>>>>>>


Re: Cross data center HA cluster

Posted by Iliya Grushevskiy <il...@gmail.com>.
Suppose we have a cluster. 
And we have an HA pair of primary (P) and backup (B) nodes in this cluster.

1. Client connects to P and starts sending messages
2. Network between P and B fails
3. Client continues sending messages to P
4. Quorum vote completes. B becomes active and P stops

All messages between 2 and 4 will be lost.
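Whether P actually stops at step 4 is governed by the replication ha-policy. A hedged sketch of the relevant broker.xml knobs on the primary (values are illustrative; exact behavior varies by Artemis version):

```xml
<ha-policy>
   <replication>
      <master>
         <check-for-live-server>true</check-for-live-server>
         <!-- on loss of the replication connection, ask the quorum
              whether to stay live or shut down -->
         <vote-on-replication-failure>true</vote-on-replication-failure>
         <quorum-size>2</quorum-size>
      </master>
   </replication>
</ha-policy>
```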

Regards
Iliya Grushevskiy






Re: Cross data center HA cluster

Posted by Justin Bertram <jb...@apache.org>.
> If you expect any connection loss between the HA pair while having the client
able to connect to both, you may encounter message loss, and replication is
not the right solution.

Can you elaborate on where the message loss would be in your example here?
Please be precise about the details as they are especially important in
situations like this.


Justin


Re: Cross data center HA cluster

Posted by Илья Грушевский <il...@gmail.com>.
Justin, thanks for replying.
Now I understand the reasoning behind that behavior.

Yes, that is exactly what I meant.
If you expect any connection loss between the HA pair while the client is still able to connect to both brokers,
you may encounter message loss, and replication is not the right solution.

> On May 27, 2022, at 01:42, Justin Bertram <jb...@apache.org> wrote:
> 
>> But in case of replication failure, for example network failure it will
> not fail a transaction.
> 
> I'm not exactly sure what you mean here. Are you saying that if a primary
> is replicating to a backup and a client is in the middle of a transaction
> and the replication connection between the primary and backup fails then
> the client's transaction will complete successfully? If so, that's the
> behavior I would expect. The loss of the replication connection potentially
> represents the crashing of the backup. At the very least it represents some
> kind of network problem. In any case, there is nothing wrong with the
> primary broker in this circumstance so there is no reason to fail the
> client's transaction. If failures with the backup caused the primary broker
> to fail client operations that would otherwise succeed then adding a backup
> would *increase* the likelihood of client-facing failures instead of
> decreasing them. This is typically the opposite of what is wanted when
> configuring HA.
> 
> I can imagine use-cases where it is absolutely critical for data to be
> replicated successfully and any failure to replicate should be considered
> fatal for any client operation. However, that behavior is not supported via
> replication. You would need to use shared-storage to get this kind of
> behavior.
> 
> 
> Justin


Re: Cross data center HA cluster

Posted by Justin Bertram <jb...@apache.org>.
> But in case of replication failure, for example network failure it will
not fail a transaction.

I'm not exactly sure what you mean here. Are you saying that if a primary
is replicating to a backup and a client is in the middle of a transaction
and the replication connection between the primary and backup fails then
the client's transaction will complete successfully? If so, that's the
behavior I would expect. The loss of the replication connection potentially
represents the crashing of the backup. At the very least it represents some
kind of network problem. In any case, there is nothing wrong with the
primary broker in this circumstance so there is no reason to fail the
client's transaction. If failures with the backup caused the primary broker
to fail client operations that would otherwise succeed then adding a backup
would *increase* the likelihood of client-facing failures instead of
decreasing them. This is typically the opposite of what is wanted when
configuring HA.

I can imagine use-cases where it is absolutely critical for data to be
replicated successfully and any failure to replicate should be considered
fatal for any client operation. However, that behavior is not supported via
replication. You would need to use shared-storage to get this kind of
behavior.
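To illustrate, shared storage is selected via the ha-policy element in broker.xml. A minimal sketch for the live broker, assuming the journal lives on shared storage (e.g. a SAN or NFSv4 mount) that the backup can also reach:

```xml
<!-- broker.xml (live broker): with shared storage a send is only acknowledged
     once it reaches the shared journal, so a backup taking over after failover
     sees every acknowledged message -->
<ha-policy>
   <shared-store>
      <master>
         <failover-on-shutdown>true</failover-on-shutdown>
      </master>
   </shared-store>
</ha-policy>
```

The backup broker points its journal and paging directories at the same shared location and uses a slave (backup) element in place of master.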


Justin

On Thu, May 26, 2022 at 4:51 PM Илья Грушевский <il...@gmail.com> wrote:

> You are right, I should have not use the term asynchronous.
> But in case of replication failure, for example network failure it will
> not fail a transaction.
> So if I gradually lose connections, first between primary and backup and
> then between primary and client I will lose all send message between those
> events.
> In case of cluster I may lose connection between primary and client if
> primary node decides to turn itself off after quorum vote.

Re: Cross data center HA cluster

Posted by Илья Грушевский <il...@gmail.com>.
You are right, I should not have used the term asynchronous.
But in case of a replication failure, for example a network failure, it will not fail a transaction.
So if I gradually lose connections, first between primary and backup and then between primary and client, I will lose all messages sent between those events.
In the case of a cluster I may lose the connection between primary and client if the primary node decides to shut itself down after a quorum vote.

> On May 27, 2022, at 00:31, Justin Bertram <jb...@apache.org> wrote:
> 
>> I think this is due to the fact that HA replication is asynchronous and
> replica server may not catch up with primary.
> 
> To be clear, message replication between a primary and a backup is
> *synchronous*.
> 
> 
> Justin


Re: Cross data center HA cluster

Posted by Justin Bertram <jb...@apache.org>.
> I think this is due to the fact that HA replication is asynchronous and
replica server may not catch up with primary.

To be clear, message replication between a primary and a backup is
*synchronous*.


Justin

On Thu, May 26, 2022 at 3:11 AM Iliya Grushevskiy <il...@gmail.com>
wrote:

> Hi, Aaron
>
> We are currently testing similar deployment and have encountered several
> issues:
>
> - message lose on send on network failure between data centers
>   I think this is due to the fact that HA replication is asynchronous and
> replica server may not catch up with primary.
>
> - message lose or duplicate (depending on error handling strategy) on
> consumer on network failure between data centers
>   I think this was caused by two factors: duplicate id cache is consistent
> only in HA pair and message redistribution was on.
>   Switching off redistribution (or as an option increasing delay) should
> fix this issue.
>
> - message duplicate on mirrored server
>   This is addressed in pull request:
> https://github.com/apache/activemq-artemis/pull/4066
>
> Regards
> Iliya Grushevskiy

Re: Cross data center HA cluster

Posted by Iliya Grushevskiy <il...@gmail.com>.
Hi, Aaron

We are currently testing a similar deployment and have encountered several issues:

- message loss on send during a network failure between data centers
  I think this is because HA replication is asynchronous and the replica server may not catch up with the primary.

- message loss or duplication (depending on the error-handling strategy) on the consumer during a network failure between data centers
  I think this was caused by two factors: the duplicate ID cache is consistent only within an HA pair, and message redistribution was on.
  Switching off redistribution (or, as an option, increasing the redistribution delay) should fix this issue.

- message duplication on the mirrored server
  This is addressed in pull request: https://github.com/apache/activemq-artemis/pull/4066
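For reference, redistribution is controlled per address in broker.xml. A minimal sketch of the workaround described above (the `#` wildcard match is just an example — scope it to your own addresses):

```xml
<address-settings>
   <address-setting match="#">
      <!-- -1 disables redistribution entirely; a positive value is a delay
           in milliseconds before messages on a queue with no local consumers
           are redistributed to consumers on other cluster nodes -->
      <redistribution-delay>-1</redistribution-delay>
   </address-setting>
</address-settings>
```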

Regards
Iliya Grushevskiy


> On May 26, 2022, at 07:46, Justin Bertram <jb...@apache.org> wrote:
> 
> I'm not aware of such a production deployment and I would be surprised if
> there was one given that clustering was designed for local area networks
> with low latency which typically isn't what is found between data centers.
> 
> I recommend you pursue your mirroring approach as that is what mirroring
> was designed for (i.e. cross data-center disaster-recovery use-cases).
> 
> 
> Justin


Re: Cross data center HA cluster

Posted by Justin Bertram <jb...@apache.org>.
I'm not aware of such a production deployment and I would be surprised if
there was one given that clustering was designed for local area networks
with low latency which typically isn't what is found between data centers.

I recommend you pursue your mirroring approach as that is what mirroring
was designed for (i.e. cross data-center disaster-recovery use-cases).
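For anyone following along, mirroring is configured with an AMQP broker connection in broker.xml on the source broker. A minimal sketch, where the URI, connection name, and retry interval are placeholders to adapt for your environment:

```xml
<broker-connections>
   <amqp-connection uri="tcp://dr-broker.example.com:61616"
                    name="dr-mirror" retry-interval="5000">
      <!-- asynchronously forwards messages, acknowledgements, and
           queue/address events to the disaster-recovery broker -->
      <mirror/>
   </amqp-connection>
</broker-connections>
```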


Justin

On Wed, May 25, 2022 at 10:36 PM Steigerwald, Aaron
<as...@brandesassociates.com.invalid> wrote:

> Hello,
>
> Is anyone aware of a production deployment of an Artemis "cross data
> center" HA cluster? For example, a cluster spread across 3 data centers.
> Each data center contains a master/slave pair.
>
> I would like to know what kind of issues anyone has overcome with such a
> configuration. I understand there are many configuration and operational
> variables. Any info would be helpful.
>
> Note that we are considering asynchronously mirroring each master/slave
> pair's queues to a dedicated asynchronous target node. The asynchronous
> target node would exist in a different data center and would not service
> any other connections. A custom plugin would automatically scale down the
> messages into a live cluster node if the connections to the master/slave
> mirror sources were disconnected for a period of time.
>
> Thank you,
> Aaron Steigerwald
>