Posted to users@activemq.apache.org by Cosmin Petre Florea <fo...@gmail.com> on 2020/10/07 12:46:26 UTC

ActiveMQ advisory message and failover

Hi,

  We are using the ActiveMQ C++ library with the failover transport. We
have:

  - 2 brokers configured as "shared file system master slave" (only 1
    is active at a time)

  - 2 clients connected to the active broker, Broker_1:

      Client_1 has a consumer for ActiveMQ.Advisory.Consumer.Topic

      Client_2 has a consumer C2 for Topic_2

  When Client_1 finds out that C2 is no longer a consumer on Topic_2
(via ConsumerInfo/RemoveInfo), Client_1 assumes that Client_2 has
suddenly died and stops communicating with Client_2. We want that to
happen only when Client_2 is actually down.

  If Broker_1 is up and Client_2 stops, Client_1 receives the
RemoveInfo and everything works as expected.

  If Broker_1 goes down and Client_1 and Client_2 keep running but move
to Broker_2, then Client_1 receives a RemoveInfo + ConsumerInfo for C2,
and the content of the advisory is the same as when Client_2 actually
went down (we can't tell just by reading the advisory message whether
the broker or the client went down).

We noticed that, at failover, Client_1 first receives a RemoveInfo and
then a call to TransportListener::transportInterrupted(). So, in order
to "ignore" the RemoveInfo, we don't process it when it arrives but wait
N milliseconds to see whether transportInterrupted() is called too. This
works almost all the time, but under heavy load the gap between the two
events tends to exceed our configured N (sometimes it is more than 3
seconds), so we may need a different approach.
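
To make the workaround concrete, here is a minimal sketch of the deferral
logic, written without any ActiveMQ-CPP dependency so it can be read on its
own. The class DeferredRemovals and all of its method names are hypothetical
(they are not part of the library); the idea is that the advisory listener
and the TransportListener callbacks would forward into it. It also assumes
that a ConsumerInfo arriving after failover carries the same consumer ID as
the earlier RemoveInfo, which we have not verified. The sketch does not
remove the underlying weakness: if transportInterrupted() arrives later than
the grace period, the removal is still reported.

```cpp
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of ActiveMQ-CPP: defers acting on a
// consumer-removal advisory until a grace period has elapsed.
class DeferredRemovals {
public:
    using Clock = std::chrono::steady_clock;

    explicit DeferredRemovals(std::chrono::milliseconds grace) : grace_(grace) {}

    // Call when a RemoveInfo advisory arrives for consumerId.
    void onRemoveAdvisory(const std::string& consumerId, Clock::time_point now) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_[consumerId] = now + grace_;
    }

    // Call when a ConsumerInfo advisory arrives: the consumer is (back) up.
    // Assumption: the ID matches the one seen in the earlier RemoveInfo.
    void onConsumerAdvisory(const std::string& consumerId) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.erase(consumerId);
    }

    // Call from TransportListener::transportInterrupted(): removals that
    // are still pending are attributed to the broker going away.
    void onTransportInterrupted() {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.clear();
    }

    // Call periodically. Returns the consumers whose grace period expired
    // with no cancelling event, i.e. the ones treated as really gone.
    std::vector<std::string> collectExpired(Clock::time_point now) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> expired;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (it->second <= now) {
                expired.push_back(it->first);
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        return expired;
    }

private:
    std::chrono::milliseconds grace_;
    std::mutex mutex_;  // callbacks may arrive on different threads
    std::map<std::string, Clock::time_point> pending_;
};
```

Passing the current time in as an argument, rather than reading the clock
inside the class, is only a choice to keep the sketch easy to exercise in a
unit test; a real integration would pass Clock::now().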

Do you know of any approach to distinguish between a client failure and
a broker failure when failover is enabled? Is there a separate channel
for advisory messages, or are they sent in the same stream as regular
data? How reliable are the advisory messages sent during failover:
should we always expect a RemoveInfo + ConsumerInfo? (Sometimes we don't
receive the ConsumerInfo; we don't know whether that is a bug in our
application.)

Thank you,

Cosmin



Re: ActiveMQ advisory message and failover

Posted by Tim Bain <tb...@alumni.duke.edu>.
Sorry for the delay in responding to your questions.

I'm not aware of a better way to do what you're looking for. The advisories
are only about clients rather than the brokers themselves, and each broker
is only aware of its own state rather than anything network-wide, so
unfortunately there's no built-in way to tell the difference between a
client failure and a broker failover that will result in client
reconnection.

There is no separate channel for advisory messages; they travel with the
"normal" message stream, and would be delayed if either the message stream
gets backed up or the broker's CPU is heavily used.

The sending of advisory messages is supposed to be reliable even under
load, so you should receive both messages but possibly with a delay between
them if the CPU is heavily contended. There is no way to set an upper limit
on how long that delay is because that's a product of how contended the CPU
is, but if the CPU is contended to the level that a meaningful delay would
exist, that's an indication that either the broker should be running on a
more powerful machine or you should make changes to your application design
to reduce the usage or spread it across multiple brokers.

If you're not seeing the second advisory message consistently, you could
submit a bug in JIRA, but these types of bugs (if it even is a bug in the
broker, rather than a bug in your code, as you said) are hard to find
without a reliable reproducer. Are you able to reproduce the problem
reliably via code/config that you can attach to the bug? If not, that may
reduce the likelihood of someone successfully investigating and finding a
solution, so the more you can do to identify a way to reproduce the
problem, the greater the odds of it getting fixed.

Tim
