Posted to dev@qpid.apache.org by Kerry Bonin <ke...@gmail.com> on 2011/06/27 21:19:14 UTC

Client broker failover notification/callback?

I was wondering if there was any existing way to know when a broker failover
occurs (and which broker is active), other than hacking the client?

Re: Client broker failover notification/callback?

Posted by Gordon Sim <gs...@redhat.com>.

On 08/09/2011 05:44 PM, Kerry Bonin wrote:
> I missed replying to this earlier.
>
> We spin to receive messages. The older API had a callback for a received
> message, but we see no equivalent in the messaging API.  There is an
> asynchronous fetch, so we have a receiving thread that spins on those.  I
> haven't looked at the recent library (we're currently using 0.8), although
> I see QPID-2451 is dead (sigh).  In a multithreaded app, I really hate to
> have a thread burning CPU in an idiotic spin loop.

It shouldn't need to burn CPU while waiting for a message. The issue at
present is that you need a thread per session. However, that was also the
case with the older API; it's just that the threads were hidden from you.
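
A minimal sketch of waiting without spinning, using only the C++ standard
library. BlockingInbox is a hypothetical stand-in for a per-session message
source, not a Qpid class. The receiving thread sleeps on a condition variable
until a message arrives or the timeout expires, which is the same idea as
passing a timeout to Receiver::fetch() in the messaging API.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a per-session message source; not a Qpid class.
class BlockingInbox {
public:
    // Called by the delivering side when a message arrives.
    void deliver(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(msg));
        }
        ready_.notify_one();
    }

    // Blocks (sleeping, not spinning) until a message is available or the
    // timeout expires. Returns false on timeout.
    bool fetch(std::string& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!ready_.wait_for(lock, timeout, [this] { return !queue_.empty(); }))
            return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> queue_;
};
```

One thread per session would call fetch() in a loop with a long timeout; the
thread consumes no CPU while it is blocked.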

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org

Re: Client broker failover notification/callback?

Posted by Kerry Bonin <ke...@gmail.com>.

I missed replying to this earlier.

We spin to receive messages. The older API had a callback for a received
message, but we see no equivalent in the messaging API.  There is an
asynchronous fetch, so we have a receiving thread that spins on those.  I
haven't looked at the recent library (we're currently using 0.8), although
I see QPID-2451 is dead (sigh).  In a multithreaded app, I really hate to
have a thread burning CPU in an idiotic spin loop.


Re: Client broker failover notification/callback?

Posted by Gordon Sim <gs...@redhat.com>.

On 06/28/2011 04:35 PM, Kerry Bonin wrote:
> Regardless of how it is accomplished, more control is needed over
> failover to prevent network splits.  The current model is
> essentially to accept the broker as a single point of failure, or
> deploy Linux clustering.

Without replication or persistence of the queue state, broker failure
implies potential message loss. With persistence, the availability of
the messages is tied to the availability of a broker using that store.

Assuming you can tolerate message loss, the issue is simply to ensure
that communication remains possible i.e. that producers and consumers
always use (or at least gravitate to) the same broker instance.

You could perhaps use a QMF-based approach to this.

E.g. you could have an application that connected to all the brokers in
a list, kept track of their availability and retried periodically to
connect to unavailable brokers in the list. It would then control which
of these brokers was the 'primary' and would be able to close all other
connections on the other brokers using QMF commands.
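
A minimal sketch of the bookkeeping such a monitoring application might
keep, assuming the primary is simply the first available broker in list
order. BrokerMonitor and its methods are hypothetical names; the connection
probing and the QMF commands that would actually close connections are
deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical per-broker record kept by the monitoring application.
struct BrokerStatus {
    std::string address;  // e.g. "host:port"
    bool available;
};

class BrokerMonitor {
public:
    explicit BrokerMonitor(const std::vector<std::string>& addresses) {
        for (const auto& a : addresses) brokers_.push_back(BrokerStatus{a, false});
    }

    // Record the result of the latest connection attempt to one broker.
    void report(const std::string& address, bool available) {
        for (auto& b : brokers_)
            if (b.address == address) b.available = available;
    }

    // The primary is the first available broker in list order.
    // Returns an empty string when no broker is available.
    std::string primary() const {
        for (const auto& b : brokers_)
            if (b.available) return b.address;
        return std::string();
    }

    // Available brokers other than the primary: the ones whose client
    // connections the monitor would ask to have dropped.
    std::vector<std::string> toEvict() const {
        std::vector<std::string> result;
        const std::string p = primary();
        for (const auto& b : brokers_)
            if (b.available && b.address != p) result.push_back(b.address);
        return result;
    }

private:
    std::vector<BrokerStatus> brokers_;
};
```

The monitor would call report() after each periodic connection attempt and
then act on toEvict().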

Of course you would need to ensure that this application didn't itself
become a single point of failure. However it would be simple enough to
have a couple of redundant instances waiting to take over.

The one fly in the ointment at present is that a QMF close of a
connection will result in a connection exception on the associated
client rather than triggering failover. However adding another command,
abort say, that simply disconnected the client with no explicit
handshake would fix that (e.g. see attached patch).

Does this approach sound workable for you? The benefit is that it
doesn't require any client modification and would also provide a quite
valuable tool for centralised monitoring of general failover behaviour.

> I'd recommend an ordered broker list with monitoring and automatic
> fallback, subject to some flap mitigation rules. I also would expose
> more client state. I understand the desire to hide stuff so people
> don't use unsupported interfaces, but it is useful for
> serviceability and diagnostics to have a library expose basic health
> information.

Yes, I agree that being able to determine the remote peer address would
be valuable, as well as perhaps other aspects of retry such as time
since last connected etc. (You can at present determine whether you are
currently connected using Connection::isOpen()).
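
Until something better exists, an occasional check of Connection::isOpen()
can be turned into disconnect/reconnect events. A minimal sketch, assuming
the predicate passed in reflects the connection state you care about;
ConnectionWatcher is a hypothetical helper, and the std::function stands in
for a call such as Connection::isOpen() so that nothing here depends on the
client library.

```cpp
#include <functional>
#include <utility>

// Turns successive samples of an "is connected?" predicate into
// disconnect/reconnect events. Hypothetical helper, not part of Qpid.
class ConnectionWatcher {
public:
    enum Event { NONE, DISCONNECTED, RECONNECTED };

    explicit ConnectionWatcher(std::function<bool()> isOpen)
        : isOpen_(std::move(isOpen)), lastState_(isOpen_()) {}

    // Sample the predicate once and report whether the state changed
    // since the previous sample.
    Event poll() {
        const bool now = isOpen_();
        Event e = NONE;
        if (lastState_ && !now) e = DISCONNECTED;
        else if (!lastState_ && now) e = RECONNECTED;
        lastState_ = now;
        return e;
    }

private:
    std::function<bool()> isOpen_;  // declared first: initialised before lastState_
    bool lastState_;
};
```

This is meant to be called from a timer (say once a second), not a tight
loop, and it still cannot say which broker the client is connected to.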

> Out of honest curiosity, why don't you like callbacks?  We've had to
> use threading with spin loops to get around the lack of callbacks
> for the messaging APIs, and we don't like having CPU load float
> high under low load. Even if the spin loop load drops gracefully
> under pressure, it feels inelegant and it raises power consumption.

Yes, I don't like the need to spin either. I want to expose a more
general notification system to avoid the need for any polling for
changes (including changes to failover/connection related state). I 
prefer that to callbacks at the level the API is operating at. Callback 
based approaches could then be built on top of this.
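
A minimal sketch of how callbacks could be layered on a pull-style
notification source. Everything here is hypothetical: the Event type, the
CallbackDispatcher class and the waitForEvent function, which stands in for
whatever blocking wait the lower-level API would provide (it fills in the
next event and returns true, or returns false on timeout).

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event type; the thread does not define one.
struct Event {
    std::string kind;    // e.g. "disconnected", "reconnected"
    std::string detail;  // e.g. a peer address
};

// Callback layer built on top of a pull-style notification source.
class CallbackDispatcher {
public:
    typedef std::function<bool(Event&)> WaitFn;
    typedef std::function<void(const Event&)> Handler;

    explicit CallbackDispatcher(WaitFn waitForEvent)
        : waitForEvent_(std::move(waitForEvent)) {}

    // Register a handler for one kind of event.
    void on(const std::string& kind, Handler h) {
        handlers_[kind].push_back(std::move(h));
    }

    // Wait for one event and run its handlers on the calling thread.
    // Returns false if the wait timed out.
    bool dispatchOne() {
        Event e;
        if (!waitForEvent_(e)) return false;
        auto it = handlers_.find(e.kind);
        if (it != handlers_.end())
            for (const auto& h : it->second) h(e);
        return true;
    }

private:
    WaitFn waitForEvent_;
    std::map<std::string, std::vector<Handler> > handlers_;
};
```

Because handlers run on whichever thread calls dispatchOne(), the
application rather than the library decides the threading model.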

Out of curiosity, what is it you are spinning for?

Re: Client broker failover notification/callback?

Posted by Kerry Bonin <ke...@gmail.com>.

Thank you for the replies!

I'll probably end up creating a wrapper library to address this.  Not very
clean with the boost object wrapper behavior, but I'm not willing to write
my own client at this time, so not much of an alternative.  (I don't like
the deeper code structure; it's a pain to hack anything that needlessly
complex.)

Regardless of how it is accomplished, more control is needed over failover
to prevent network splits.  The current model is essentially to accept the
broker as a single point of failure, or deploy Linux clustering.  I'd
recommend an ordered broker list with monitoring and automatic fallback,
subject to some flap mitigation rules.  I also would expose more client
state.  I understand the desire to hide stuff so people don't use unsupported
interfaces, but it is useful for serviceability and diagnostics to have a
library expose basic health information.
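
To make the flap mitigation idea concrete, here is a minimal sketch of one
possible rule: fail over to a lower-priority broker immediately, but fall
back to a higher-priority one only after it has been up continuously for a
hold-down period. FailoverPolicy, its methods and the hold-down rule are all
illustrative assumptions, and times are plain non-negative seconds supplied
by the caller.

```cpp
#include <cstddef>
#include <vector>

// Picks a broker from an ordered list (index 0 = most preferred).
// Hypothetical policy class; not part of any Qpid API.
class FailoverPolicy {
public:
    FailoverPolicy(std::size_t brokerCount, double holdDown)
        : upSince_(brokerCount, -1.0), holdDown_(holdDown), active_(-1) {}

    // Record a health-check result for broker i at time `now` (seconds >= 0).
    // upSince_[i] < 0 means "down"; otherwise it is when the broker came up.
    void report(std::size_t i, bool up, double now) {
        if (!up) upSince_[i] = -1.0;
        else if (upSince_[i] < 0.0) upSince_[i] = now;
    }

    // Returns the index of the broker to use, or -1 if none is up.
    int select(double now) {
        const bool activeUp =
            active_ >= 0 && upSince_[static_cast<std::size_t>(active_)] >= 0.0;
        if (!activeUp) active_ = -1;
        for (std::size_t i = 0; i < upSince_.size(); ++i) {
            if (upSince_[i] < 0.0) continue;            // broker i is down
            if (static_cast<int>(i) == active_) break;  // nothing better is ready
            const bool stable = now - upSince_[i] >= holdDown_;
            if (active_ < 0 || stable) {  // fail over now, or fall back once stable
                active_ = static_cast<int>(i);
                break;
            }
        }
        return active_;
    }

private:
    std::vector<double> upSince_;
    double holdDown_;
    int active_;
};
```

With a 30 second hold-down, a preferred broker that bounces every few
seconds never regains the primary role, so clients are not dragged back and
forth between brokers.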

Out of honest curiosity, why don't you like callbacks?  We've had to use
threading with spin loops to get around the lack of callbacks for the
messaging APIs, and we don't like having CPU load float high under low
load.  Even if the spin loop load drops gracefully under pressure, it feels
inelegant and it raises power consumption.  Personally, I like callbacks as
long as threading models are clearly documented so it's understood what can
and can't be done safely, especially in conjunction with a decent sig/slot
library to decouple threading models of subsystems.  I've built servers that
handle nearly 100k active sessions (video games) this way, FWIW...

It would be nice if the Apache Qpid team gave a little more support to
Windows.  There is no (official) Windows service, and still no broker
federation / clustering.

Kerry Bonin

Re: Client broker failover notification/callback?

Posted by Gordon Sim <gs...@redhat.com>.

On 06/27/2011 08:19 PM, Kerry Bonin wrote:
> I was wondering if there was any existing way to know when a broker failover
> occurs (and which broker is active), other than hacking the client?

In a clustered broker you can get notifications of changes to cluster
membership. However, other than that the C++ client library doesn't have
any callbacks to notify you of automatic reconnection attempts, or any way
to determine where a connection is connected to at any point in time
(assuming it is connected).

I don't like callbacks, at least at this level of the API, but I would
agree that these are concerns worth addressing. Having some way to get
some form of peer address would be nice, as would a way to be notified
of disconnect and reconnect events.

(Apologies for the delayed response on user thread btw)
