Posted to dev@qpid.apache.org by Alan Conway <ac...@redhat.com> on 2009/11/18 15:44:30 UTC

[c++ cluster] request for opinions, cluster behavior during update.

At the moment a clustered broker stalls clients while it is initializing, giving 
or receiving an update.  It's been pointed out that this can result in long 
delays for clients connected to a broker that elects to give an update to a 
newcomer, and it might be better for the broker to disconnect clients so they 
can fail over to another broker not busy with an update.

There are 3 cases to consider:

  - new member joining/getting update, new client: stall or reject?
  - established member giving update, new client: stall or reject?
  - established member giving update, connected client: stall or disconnect?

On the 3rd point I would note that clients can use heartbeats to disconnect 
themselves if the broker is unresponsive, and that not all clients can fail 
over, so I'd lean towards stalling on that one, but I think rejecting new 
clients may make sense here.
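
To make the heartbeat workaround concrete, here is a minimal sketch from the 
client side, assuming the Python qpid.messaging client and its heartbeat / 
reconnect connection options; the broker addresses are placeholders:

    # Sketch only: a client that declares the broker dead after missed
    # heartbeats and fails over to another cluster member.
    from qpid.messaging import Connection

    conn = Connection("broker1.example.com:5672",
                      heartbeat=2,          # detect an unresponsive broker quickly
                      reconnect=True,       # fail over automatically
                      reconnect_urls=["broker2.example.com:5672",
                                      "broker3.example.com:5672"],
                      reconnect_limit=10)
    conn.open()
    try:
        session = conn.session()
        sender = session.sender("amq.fanout")
        sender.send("ping")
    finally:
        conn.close()

Clients that cannot do this (the "not all clients can fail over" point above) 
are the ones a stall protects.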

Part of the original motivation for stalling is that it makes it easy to write 
tests. You can start a broker and immediately start a client without worrying 
about waiting till the broker is ready. That's a nice property but there are 
other ways to achieve that. Current qpidd -d returns as soon as the broker is 
ready to listen for TCP requests, which may be before the broker has joined 
the cluster. We could change that behavior to wait till all plugins report 
"ready". For tests we could also grep the log output for the ready message.

Thoughts appreciated!

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org


Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Alan Conway <ac...@redhat.com>.
On 11/19/2009 08:11 AM, Carl Trieloff wrote:
> Alan Conway wrote:
>> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>>> Alan Conway wrote:
>>>>> At the moment a clustered broker stalls clients while it is
>>>>> initializing, giving or receiving an update. It's been pointed out
>>>>> that this can result in long delays for clients connected to a broker
>>>>> that elects to give an update to a newcomer, and it might be better
>>>>> for the broker to disconnect clients so they can fail over to another
>>>>> broker not busy with an update.
>>>>>
>>>>> There are 3 cases to consider:
>>>>>
>>>>> - new member joining/getting update, new client: stall or reject?
>>>>> - established member giving update, new client: stall or reject?
>>>>> - established member giving update, connected client: stall or
>>>>> disconnect?
>>>>>
>>>>> On the 3rd point I would note that it's possible for clients to
>>>>> disconnect themselves if the broker is unresponsive by using
>>>>> heartbeats, and that not all clients can fail-over so I'd lean towards
>>>>> stall on that one, but I think rejecting new clients may make sense
>>>>> here.
>>>>>
>>>>> Part of the original motivation for stalling is that it makes it easy
>>>>> to write tests. You can start a broker and immediately start a client
>>>>> without worrying about waiting till the broker is ready. That's a nice
>>>>> property but there are other ways to achieve that. Current qpidd -d
>>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>>> which may be before the broker is has joined the cluster. We could
>>>>> change that behavior to wait till all plugins report "ready". For
>>>>> tests we could also grep the log output for the ready message.
>>>
>>> I think it would be better not to stall existing clients at all in the
>>> established member giving update case.
>>>
>>> It would be better to not be available for connect at all until up to
>>> date in the new member case
>>>
>>> It's arguable what the best behaviour is in the established member
>>> giving update gets a new client case. However I would note that the low
>>> level code isn't capable of stopping accepting connections and then
>>> starting again once it has started accepting connections. So they would
>>> have to connect then be disconnected with an exception.
>>>
>>> I would also suggest that considering the number of likely cluster
>>> members is important here - I'd expect very installations to run more
>>> than 4 machines in a cluster and 2 is probably the norm.
>>>
>>> So if a single broker goes down and restarts it's going to be made up to
>>> date by the only other cluster member. In this case if that member
>>> stalls no more work can get done until the rejoining member is now up to
>>> date.
>>>
>>> I guess this sort of case can be dealt with in the current scheme by
>>> having multiple cluster members on a single piece of hardware.
>>>
>>> Andrew
>>>
>>>>>
>>>>> Thoughts appreciated!
>>>>
>>>> I would dis-allow connections to the new broker until it is synced. I
>>>> would not bump any active connections, but rather leave that to
>>>> heartbeat.
>>>>
>>>> One other idea would be to add an option to cluster config which could
>>>> specify the preferred nodes to update from, and it would try this list
>>>> first. I.e. in a 4 node cluster, all updates are made from node 4
>>>> (preferred) if there, and then from an app point of view I connect to
>>>> node 1-3 for example. This way updates have no effect on my clients and
>>>> if I care about being stalled I set this option. if the prefered node/s
>>>> are not running it would just pick one as it does today
>>>>
>>>> Carl.
>>>>
>>
>> Good ideas here. To bring it together, how about this:
>>
>> There are 2 kinds of broker process:
>> - Service brokers serve clients, they never give updates.
>> - Update brokers give updates, they never serve clients.
>>
>> We create them automatically in pairs: a service broker forks an
>> update broker and restarts it if it dies. The update broker never
>> accepts connections and is not advertised to clients for failover.
>>
>> So the 3 cases are now
>> - new member joining/getting update: rejected (with exception) until
>> ready.
>> - established member giving update, new client: never happens.
>> - established member giving update, connected client: never happens.
>>
>> We could further constrain things and say a service broker can *only*
>> get an update from its own update broker (once the update broker is up
>> to date). The advantage is they'll be on the same host so less network
>> traffic, the disadvantage is they can't update in parallel if there
>> are multiple update brokers available.
>>
>> Does that address all the issues? There is some extra complexity in
>> having 2 processes per broker, but for the moment I can't see any
>> insurmountable hurdles. The nice thing is that we can do this with 0
>> new configuration so it will Just Work when its installed.
>
> Isn't having two processes a lot more management, and a bit of a big
> hammer, for an item which most people probably don't care about?
>

It may be. The issue is delays for clients connected to a broker receiving an 
update. They can work around it with heartbeats + failover.

If we want to eliminate the delays then the easiest way to do that would be to 
have a second broker process that can do updates while the other broker services 
clients. It doesn't have to be automatically forked, but then we have to explain 
to the customer how to start this update broker. As Andrew points out, you don't 
want to reserve a host just for updates, and you don't need to since having an 
update broker on the same host as a service broker will give you what you want. 
From the point of view of reliability, you want updates to be as reliable as 
client service, hence the idea of pairs. However, it may be going too far; 
maybe some configuration with a "preferred update" is sufficient.

I'll definitely make the fix to reject new clients instead of stalling them. 
Preferred/update-only brokers need more thought...
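
Seen from the client, "reject" just means the connection attempt fails and 
the client moves on; a minimal sketch of that retry loop (generic, not tied 
to a particular client API - connect and the URL list are placeholders):

    # Sketch only: a new client coping with a broker that rejects connections
    # while it is getting an update.  'connect' is whatever the client library
    # provides; brokers that reject or are unreachable are simply skipped.
    import time

    def connect_with_retry(connect, urls, attempts=10, backoff=1.0):
        for _ in range(attempts):
            for url in urls:
                try:
                    return connect(url)   # raises if this broker rejects us
                except Exception:
                    continue              # busy updating/unreachable: next one
            time.sleep(backoff)
        raise RuntimeError("no broker accepted the connection")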




Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Carl Trieloff <cc...@redhat.com>.
Alan Conway wrote:
> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>> Alan Conway wrote:
>>>> At the moment a clustered broker stalls clients while it is
>>>> initializing, giving or receiving an update.  It's been pointed out
>>>> that this can result in long delays for clients connected to a broker
>>>> that elects to give an update to a newcomer, and it might be better
>>>> for the broker to disconnect clients so they can fail over to another
>>>> broker not busy with an update.
>>>>
>>>> There are 3 cases to consider:
>>>>
>>>>   - new member joining/getting update, new client: stall or reject?
>>>>   - established member giving update, new client: stall or reject?
>>>>   - established member giving update, connected client: stall or
>>>> disconnect?
>>>>
>>>> On the 3rd point I would note that it's possible for clients to
>>>> disconnect themselves if the broker is unresponsive by using
>>>> heartbeats, and that not all clients can fail-over so I'd lean towards
>>>> stall on that one, but I think rejecting new clients may make sense 
>>>> here.
>>>>
>>>> Part of the original motivation for stalling is that it makes it easy
>>>> to write tests. You can start a broker and immediately start a client
>>>> without worrying about waiting till the broker is ready. That's a nice
>>>> property but there are other ways to achieve that. Current qpidd -d
>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>> which may be before the broker is has joined the cluster. We could
>>>> change that behavior to wait till all plugins report "ready". For
>>>> tests we could also grep the log output for the ready message.
>>
>> I think it would be better not to stall existing clients at all in the
>> established member giving update case.
>>
>> It would be better to not be available for connect at all until up to
>> date in the new member case
>>
>> It's arguable what the best behaviour is in the established member
>> giving update gets a new client case. However I would note that the low
>> level code isn't capable of stopping accepting connections and then
>> starting again once it has started accepting connections. So they would
>> have to connect then be disconnected with an exception.
>>
>> I would also suggest that considering the number of likely cluster
>> members is important here - I'd expect very installations to run more
>> than 4 machines in a cluster and 2 is probably the norm.
>>
>> So if a single broker goes down and restarts it's going to be made up to
>> date by the only other cluster member. In this case if that member
>> stalls no more work can get done until the rejoining member is now up to
>> date.
>>
>> I guess this sort of case can be dealt with in the current scheme by
>> having multiple cluster members on a single piece of hardware.
>>
>> Andrew
>>
>>>>
>>>> Thoughts appreciated!
>>>
>>> I would dis-allow connections to the new broker until it is synced.  I
>>> would not bump any active connections, but rather leave that to 
>>> heartbeat.
>>>
>>> One other idea would be to add an option to cluster config which could
>>> specify the preferred nodes to update from, and it would try this list
>>> first.  I.e. in a 4 node cluster, all updates are made from node 4
>>> (preferred) if there, and then from an app point of view I connect to
>>> node 1-3 for example.  This way updates have no effect on my clients 
>>> and
>>> if I care about being stalled I set this option. if the prefered node/s
>>> are not running it would just pick one as it does today
>>>
>>> Carl.
>>>
>
> Good ideas here. To bring it together, how about this:
>
> There are 2 kinds of broker process:
>  - Service brokers serve clients, they  never give updates.
>  - Update brokers give updates, they never serve clients.
>
> We create them automatically in pairs: a service broker forks an 
> update broker and restarts it if it dies. The update broker never 
> accepts connections and is not advertised to clients for failover.
>
> So the 3 cases are now
> - new member joining/getting update: rejected (with exception) until 
> ready.
> - established member giving update, new client: never happens.
> - established member giving update, connected client: never happens.
>
> We could further constrain things and say a service broker can *only* 
> get an update from its own update broker (once the update broker is up 
> to date). The advantage is they'll be on the same host so less network 
> traffic, the disadvantage is they can't update in parallel if there 
> are multiple update brokers available.
>
> Does that address all the issues? There is some extra complexity in 
> having 2 processes per broker, but for the moment I can't see any 
> insurmountable hurdles. The nice thing is that we can do this with 0 
> new configuration so it will Just Work when its installed. 

Isn't having two processes a lot more management, and a bit of a big hammer, 
for an item which most people probably don't care about?

Carl.



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Andrew Stitcher <as...@redhat.com>.
On Wed, 2009-11-18 at 17:23 -0500, Alan Conway wrote:
> ...
> Good ideas here. To bring it together, how about this:
> 
> There are 2 kinds of broker process:
>   - Service brokers serve clients, they  never give updates.
>   - Update brokers give updates, they never serve clients.
> 
> We create them automatically in pairs: a service broker forks an update broker 
> and restarts it if it dies. The update broker never accepts connections and is 
> not advertised to clients for failover.
> 
> So the 3 cases are now
> - new member joining/getting update: rejected (with exception) until ready.
> - established member giving update, new client: never happens.
> - established member giving update, connected client: never happens.
> 
> We could further constrain things and say a service broker can *only* get an 
> update from its own update broker (once the update broker is up to date). The 
> advantage is they'll be on the same host so less network traffic, the 
> disadvantage is they can't update in parallel if there are multiple update 
> brokers available.
> 
> Does that address all the issues? There is some extra complexity in having 2 
> processes per broker, but for the moment I can't see any insurmountable hurdles. 
> The nice thing is that we can do this with 0 new configuration so it will Just 
> Work when its installed.

The main problem I can see here is that of introducing new points of
failure, but I guess it's a little unavoidable.

The other small issue is the process restarting logic - in many ways
this is duplicative of the functionality of init etc. and is also fiddly
and hard to get right the first time you do it. Perhaps we can use the
existing system capability to do this.

Andrew





Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Alan Conway <ac...@redhat.com>.
On 11/19/2009 09:31 AM, Gordon Sim wrote:
> On 11/19/2009 02:29 PM, Alan Conway wrote:
>> On 11/19/2009 09:04 AM, Gordon Sim wrote:
>>> On 11/18/2009 10:23 PM, Alan Conway wrote:
>>>> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>>>>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>>>>> Alan Conway wrote:
>>>>>>> At the moment a clustered broker stalls clients while it is
>>>>>>> initializing, giving or receiving an update. It's been pointed out
>>>>>>> that this can result in long delays for clients connected to a
>>>>>>> broker
>>>>>>> that elects to give an update to a newcomer, and it might be better
>>>>>>> for the broker to disconnect clients so they can fail over to
>>>>>>> another
>>>>>>> broker not busy with an update.
>>>>>>>
>>>>>>> There are 3 cases to consider:
>>>>>>>
>>>>>>> - new member joining/getting update, new client: stall or reject?
>>>>>>> - established member giving update, new client: stall or reject?
>>>>>>> - established member giving update, connected client: stall or
>>>>>>> disconnect?
>>>>>>>
>>>>>>> On the 3rd point I would note that it's possible for clients to
>>>>>>> disconnect themselves if the broker is unresponsive by using
>>>>>>> heartbeats, and that not all clients can fail-over so I'd lean
>>>>>>> towards
>>>>>>> stall on that one, but I think rejecting new clients may make sense
>>>>>>> here.
>>>>>>>
>>>>>>> Part of the original motivation for stalling is that it makes it
>>>>>>> easy
>>>>>>> to write tests. You can start a broker and immediately start a
>>>>>>> client
>>>>>>> without worrying about waiting till the broker is ready. That's a
>>>>>>> nice
>>>>>>> property but there are other ways to achieve that. Current qpidd -d
>>>>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>>>>> which may be before the broker is has joined the cluster. We could
>>>>>>> change that behavior to wait till all plugins report "ready". For
>>>>>>> tests we could also grep the log output for the ready message.
>>>>>
>>>>> I think it would be better not to stall existing clients at all in the
>>>>> established member giving update case.
>>>>>
>>>>> It would be better to not be available for connect at all until up to
>>>>> date in the new member case
>>>>>
>>>>> It's arguable what the best behaviour is in the established member
>>>>> giving update gets a new client case. However I would note that the
>>>>> low
>>>>> level code isn't capable of stopping accepting connections and then
>>>>> starting again once it has started accepting connections. So they
>>>>> would
>>>>> have to connect then be disconnected with an exception.
>>>>>
>>>>> I would also suggest that considering the number of likely cluster
>>>>> members is important here - I'd expect very installations to run more
>>>>> than 4 machines in a cluster and 2 is probably the norm.
>>>>>
>>>>> So if a single broker goes down and restarts it's going to be made
>>>>> up to
>>>>> date by the only other cluster member. In this case if that member
>>>>> stalls no more work can get done until the rejoining member is now
>>>>> up to
>>>>> date.
>>>>>
>>>>> I guess this sort of case can be dealt with in the current scheme by
>>>>> having multiple cluster members on a single piece of hardware.
>>>>>
>>>>> Andrew
>>>>>
>>>>>>>
>>>>>>> Thoughts appreciated!
>>>>>>
>>>>>> I would dis-allow connections to the new broker until it is synced. I
>>>>>> would not bump any active connections, but rather leave that to
>>>>>> heartbeat.
>>>>>>
>>>>>> One other idea would be to add an option to cluster config which
>>>>>> could
>>>>>> specify the preferred nodes to update from, and it would try this
>>>>>> list
>>>>>> first. I.e. in a 4 node cluster, all updates are made from node 4
>>>>>> (preferred) if there, and then from an app point of view I connect to
>>>>>> node 1-3 for example. This way updates have no effect on my clients
>>>>>> and
>>>>>> if I care about being stalled I set this option. if the prefered
>>>>>> node/s
>>>>>> are not running it would just pick one as it does today
>>>>>>
>>>>>> Carl.
>>>>>>
>>>>
>>>> Good ideas here. To bring it together, how about this:
>>>>
>>>> There are 2 kinds of broker process:
>>>> - Service brokers serve clients, they never give updates.
>>>> - Update brokers give updates, they never serve clients.
>>>>
>>>> We create them automatically in pairs: a service broker forks an update
>>>> broker and restarts it if it dies. The update broker never accepts
>>>> connections and is not advertised to clients for failover.
>>>>
>>>> So the 3 cases are now
>>>> - new member joining/getting update: rejected (with exception) until
>>>> ready.
>>>> - established member giving update, new client: never happens.
>>>> - established member giving update, connected client: never happens.
>>>>
>>>> We could further constrain things and say a service broker can *only*
>>>> get an update from its own update broker (once the update broker is up
>>>> to date). The advantage is they'll be on the same host so less network
>>>> traffic, the disadvantage is they can't update in parallel if there are
>>>> multiple update brokers available.
>>>>
>>>> Does that address all the issues? There is some extra complexity in
>>>> having 2 processes per broker, but for the moment I can't see any
>>>> insurmountable hurdles. The nice thing is that we can do this with 0
>>>> new
>>>> configuration so it will Just Work when its installed.
>>>
>>> With Carl's suggestion of an option to allow you to restrict which
>>> servers are used for updating you could get the set up the same thing.
>>> That allows those who don't want or need the extra complexity to avoid
>>> it while allowing the full flexibility to those who do need it.
>>
>> Yes its probably better to make it an option than an automatic behavior.
>>
>>> On a separate but related point, how are nodes not doing the update
>>> affected by the updater stalling? If there is an application error (e.g.
>>> queue not found) does the whole cluster hang until the update is
>>> complete?
>> Yes, unfortunately it would. Another good reason to split the work of
>> updating from servicing clients, so the update can complete more quickly.
>>
>>> If there is a high load on the other nodes is there any danger
>>> that the updater will never be able to catch up after the update is
>>> complete and the cpg queue would keep filling up?
>>
>> Yes, if the load is continuously at the max the update broker can
>> handle it
>> could end up lagging behind the other brokers. We could introduce a
>> limit on the size of the brokers CPG queue to push back on CPG and let
>> its flow control slow down the other brokers if one starts to lag too
>> far behind.
>>
>> This is again another reason for making the updater faster by separating
>> updates from client service.
>
> I thought that client service was stalled during an update anyway on the
> node performing the update?
Yes indeed. Good point.



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Gordon Sim <gs...@redhat.com>.
On 11/19/2009 02:29 PM, Alan Conway wrote:
> On 11/19/2009 09:04 AM, Gordon Sim wrote:
>> On 11/18/2009 10:23 PM, Alan Conway wrote:
>>> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>>>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>>>> Alan Conway wrote:
>>>>>> At the moment a clustered broker stalls clients while it is
>>>>>> initializing, giving or receiving an update. It's been pointed out
>>>>>> that this can result in long delays for clients connected to a broker
>>>>>> that elects to give an update to a newcomer, and it might be better
>>>>>> for the broker to disconnect clients so they can fail over to another
>>>>>> broker not busy with an update.
>>>>>>
>>>>>> There are 3 cases to consider:
>>>>>>
>>>>>> - new member joining/getting update, new client: stall or reject?
>>>>>> - established member giving update, new client: stall or reject?
>>>>>> - established member giving update, connected client: stall or
>>>>>> disconnect?
>>>>>>
>>>>>> On the 3rd point I would note that it's possible for clients to
>>>>>> disconnect themselves if the broker is unresponsive by using
>>>>>> heartbeats, and that not all clients can fail-over so I'd lean
>>>>>> towards
>>>>>> stall on that one, but I think rejecting new clients may make sense
>>>>>> here.
>>>>>>
>>>>>> Part of the original motivation for stalling is that it makes it easy
>>>>>> to write tests. You can start a broker and immediately start a client
>>>>>> without worrying about waiting till the broker is ready. That's a
>>>>>> nice
>>>>>> property but there are other ways to achieve that. Current qpidd -d
>>>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>>>> which may be before the broker is has joined the cluster. We could
>>>>>> change that behavior to wait till all plugins report "ready". For
>>>>>> tests we could also grep the log output for the ready message.
>>>>
>>>> I think it would be better not to stall existing clients at all in the
>>>> established member giving update case.
>>>>
>>>> It would be better to not be available for connect at all until up to
>>>> date in the new member case
>>>>
>>>> It's arguable what the best behaviour is in the established member
>>>> giving update gets a new client case. However I would note that the low
>>>> level code isn't capable of stopping accepting connections and then
>>>> starting again once it has started accepting connections. So they would
>>>> have to connect then be disconnected with an exception.
>>>>
>>>> I would also suggest that considering the number of likely cluster
>>>> members is important here - I'd expect very installations to run more
>>>> than 4 machines in a cluster and 2 is probably the norm.
>>>>
>>>> So if a single broker goes down and restarts it's going to be made
>>>> up to
>>>> date by the only other cluster member. In this case if that member
>>>> stalls no more work can get done until the rejoining member is now
>>>> up to
>>>> date.
>>>>
>>>> I guess this sort of case can be dealt with in the current scheme by
>>>> having multiple cluster members on a single piece of hardware.
>>>>
>>>> Andrew
>>>>
>>>>>>
>>>>>> Thoughts appreciated!
>>>>>
>>>>> I would dis-allow connections to the new broker until it is synced. I
>>>>> would not bump any active connections, but rather leave that to
>>>>> heartbeat.
>>>>>
>>>>> One other idea would be to add an option to cluster config which could
>>>>> specify the preferred nodes to update from, and it would try this list
>>>>> first. I.e. in a 4 node cluster, all updates are made from node 4
>>>>> (preferred) if there, and then from an app point of view I connect to
>>>>> node 1-3 for example. This way updates have no effect on my clients
>>>>> and
>>>>> if I care about being stalled I set this option. if the prefered
>>>>> node/s
>>>>> are not running it would just pick one as it does today
>>>>>
>>>>> Carl.
>>>>>
>>>
>>> Good ideas here. To bring it together, how about this:
>>>
>>> There are 2 kinds of broker process:
>>> - Service brokers serve clients, they never give updates.
>>> - Update brokers give updates, they never serve clients.
>>>
>>> We create them automatically in pairs: a service broker forks an update
>>> broker and restarts it if it dies. The update broker never accepts
>>> connections and is not advertised to clients for failover.
>>>
>>> So the 3 cases are now
>>> - new member joining/getting update: rejected (with exception) until
>>> ready.
>>> - established member giving update, new client: never happens.
>>> - established member giving update, connected client: never happens.
>>>
>>> We could further constrain things and say a service broker can *only*
>>> get an update from its own update broker (once the update broker is up
>>> to date). The advantage is they'll be on the same host so less network
>>> traffic, the disadvantage is they can't update in parallel if there are
>>> multiple update brokers available.
>>>
>>> Does that address all the issues? There is some extra complexity in
>>> having 2 processes per broker, but for the moment I can't see any
>>> insurmountable hurdles. The nice thing is that we can do this with 0 new
>>> configuration so it will Just Work when its installed.
>>
>> With Carl's suggestion of an option to allow you to restrict which
>> servers are used for updating you could get the set up the same thing.
>> That allows those who don't want or need the extra complexity to avoid
>> it while allowing the full flexibility to those who do need it.
>
> Yes its probably better to make it an option than an automatic behavior.
>
>> On a separate but related point, how are nodes not doing the update
>> affected by the updater stalling? If there is an application error (e.g.
>> queue not found) does the whole cluster hang until the update is
>> complete?
> Yes, unfortunately it would. Another good reason to split the work of
> updating from servicing clients, so the update can complete more quickly.
>
>> If there is a high load on the other nodes is there any danger
>> that the updater will never be able to catch up after the update is
>> complete and the cpg queue would keep filling up?
>
> Yes, if the load is continuously at the max the update broker can handle it
> could end up lagging behind the other brokers. We could introduce a
> limit on the size of the brokers CPG queue to push back on CPG and let
> its flow control slow down the other brokers if one starts to lag too
> far behind.
>
> This is again another reason for making the updater faster by separating
> updates from client service.

I thought that client service was stalled during an update anyway on the 
node performing the update?



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Alan Conway <ac...@redhat.com>.
On 11/19/2009 09:04 AM, Gordon Sim wrote:
> On 11/18/2009 10:23 PM, Alan Conway wrote:
>> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>>> Alan Conway wrote:
>>>>> At the moment a clustered broker stalls clients while it is
>>>>> initializing, giving or receiving an update. It's been pointed out
>>>>> that this can result in long delays for clients connected to a broker
>>>>> that elects to give an update to a newcomer, and it might be better
>>>>> for the broker to disconnect clients so they can fail over to another
>>>>> broker not busy with an update.
>>>>>
>>>>> There are 3 cases to consider:
>>>>>
>>>>> - new member joining/getting update, new client: stall or reject?
>>>>> - established member giving update, new client: stall or reject?
>>>>> - established member giving update, connected client: stall or
>>>>> disconnect?
>>>>>
>>>>> On the 3rd point I would note that it's possible for clients to
>>>>> disconnect themselves if the broker is unresponsive by using
>>>>> heartbeats, and that not all clients can fail-over so I'd lean towards
>>>>> stall on that one, but I think rejecting new clients may make sense
>>>>> here.
>>>>>
>>>>> Part of the original motivation for stalling is that it makes it easy
>>>>> to write tests. You can start a broker and immediately start a client
>>>>> without worrying about waiting till the broker is ready. That's a nice
>>>>> property but there are other ways to achieve that. Current qpidd -d
>>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>>> which may be before the broker is has joined the cluster. We could
>>>>> change that behavior to wait till all plugins report "ready". For
>>>>> tests we could also grep the log output for the ready message.
>>>
>>> I think it would be better not to stall existing clients at all in the
>>> established member giving update case.
>>>
>>> It would be better to not be available for connect at all until up to
>>> date in the new member case
>>>
>>> It's arguable what the best behaviour is in the established member
>>> giving update gets a new client case. However I would note that the low
>>> level code isn't capable of stopping accepting connections and then
>>> starting again once it has started accepting connections. So they would
>>> have to connect then be disconnected with an exception.
>>>
>>> I would also suggest that considering the number of likely cluster
>>> members is important here - I'd expect very installations to run more
>>> than 4 machines in a cluster and 2 is probably the norm.
>>>
>>> So if a single broker goes down and restarts it's going to be made up to
>>> date by the only other cluster member. In this case if that member
>>> stalls no more work can get done until the rejoining member is now up to
>>> date.
>>>
>>> I guess this sort of case can be dealt with in the current scheme by
>>> having multiple cluster members on a single piece of hardware.
>>>
>>> Andrew
>>>
>>>>>
>>>>> Thoughts appreciated!
>>>>
>>>> I would dis-allow connections to the new broker until it is synced. I
>>>> would not bump any active connections, but rather leave that to
>>>> heartbeat.
>>>>
>>>> One other idea would be to add an option to cluster config which could
>>>> specify the preferred nodes to update from, and it would try this list
>>>> first. I.e. in a 4 node cluster, all updates are made from node 4
>>>> (preferred) if there, and then from an app point of view I connect to
>>>> node 1-3 for example. This way updates have no effect on my clients and
>>>> if I care about being stalled I set this option. if the prefered node/s
>>>> are not running it would just pick one as it does today
>>>>
>>>> Carl.
>>>>
>>
>> Good ideas here. To bring it together, how about this:
>>
>> There are 2 kinds of broker process:
>> - Service brokers serve clients, they never give updates.
>> - Update brokers give updates, they never serve clients.
>>
>> We create them automatically in pairs: a service broker forks an update
>> broker and restarts it if it dies. The update broker never accepts
>> connections and is not advertised to clients for failover.
>>
>> So the 3 cases are now
>> - new member joining/getting update: rejected (with exception) until
>> ready.
>> - established member giving update, new client: never happens.
>> - established member giving update, connected client: never happens.
>>
>> We could further constrain things and say a service broker can *only*
>> get an update from its own update broker (once the update broker is up
>> to date). The advantage is they'll be on the same host so less network
>> traffic, the disadvantage is they can't update in parallel if there are
>> multiple update brokers available.
>>
>> Does that address all the issues? There is some extra complexity in
>> having 2 processes per broker, but for the moment I can't see any
>> insurmountable hurdles. The nice thing is that we can do this with 0 new
>> configuration so it will Just Work when its installed.
>
> With Carl's suggestion of an option to allow you to restrict which
> servers are used for updating you could get the set up the same thing.
> That allows those who don't want or need the extra complexity to avoid
> it while allowing the full flexibility to those who do need it.

Yes, it's probably better to make it an option than an automatic behavior.

> On a separate but related point, how are nodes not doing the update
> affected by the updater stalling? If there is an application error (e.g.
> queue not found) does the whole cluster hang until the update is
> complete?
Yes, unfortunately it would. Another good reason to split the work of updating 
from servicing clients, so the update can complete more quickly.

> If there is a high load on the other nodes is there any danger
> that the updater will never be able to catch up after the update is
> complete and the cpg queue would keep filling up?

Yes, if the load is continuously at the max the update broker can handle, it 
could end up lagging behind the other brokers. We could introduce a limit on 
the size of the broker's CPG queue to push back on CPG and let its flow 
control slow down the other brokers if one starts to lag too far behind.
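
To illustrate the intended effect (this is a toy model, not CPG or broker 
code): capping the per-broker input queue makes a fast producer block until 
a lagging consumer catches up, which is what pushing back on CPG's flow 
control would do for a lagging updater:

    # Toy sketch of the proposed back-pressure; not real CPG code.
    import queue, threading, time

    events = queue.Queue(maxsize=100)       # the proposed size limit

    def lagging_consumer():                 # stands in for the busy updater
        while True:
            events.get()
            time.sleep(0.01)                # slower than traffic arrives
            events.task_done()

    threading.Thread(target=lagging_consumer, daemon=True).start()

    for i in range(1000):                   # stands in for the other brokers
        events.put(i)                       # blocks once the queue is full
    events.join()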

This is again another reason for making the updater faster by separating updates 
from client service.



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Gordon Sim <gs...@redhat.com>.
On 11/18/2009 10:23 PM, Alan Conway wrote:
> On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
>> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>>> Alan Conway wrote:
>>>> At the moment a clustered broker stalls clients while it is
>>>> initializing, giving or receiving an update. It's been pointed out
>>>> that this can result in long delays for clients connected to a broker
>>>> that elects to give an update to a newcomer, and it might be better
>>>> for the broker to disconnect clients so they can fail over to another
>>>> broker not busy with an update.
>>>>
>>>> There are 3 cases to consider:
>>>>
>>>> - new member joining/getting update, new client: stall or reject?
>>>> - established member giving update, new client: stall or reject?
>>>> - established member giving update, connected client: stall or
>>>> disconnect?
>>>>
>>>> On the 3rd point I would note that it's possible for clients to
>>>> disconnect themselves if the broker is unresponsive by using
>>>> heartbeats, and that not all clients can fail-over so I'd lean towards
>>>> stall on that one, but I think rejecting new clients may make sense
>>>> here.
>>>>
>>>> Part of the original motivation for stalling is that it makes it easy
>>>> to write tests. You can start a broker and immediately start a client
>>>> without worrying about waiting till the broker is ready. That's a nice
>>>> property but there are other ways to achieve that. Current qpidd -d
>>>> returns as soon as the broker is ready to listen for TCP requests,
>>>> which may be before the broker is has joined the cluster. We could
>>>> change that behavior to wait till all plugins report "ready". For
>>>> tests we could also grep the log output for the ready message.
>>
>> I think it would be better not to stall existing clients at all in the
>> established member giving update case.
>>
>> It would be better to not be available for connect at all until up to
>> date in the new member case
>>
>> It's arguable what the best behaviour is in the established member
>> giving update gets a new client case. However I would note that the low
>> level code isn't capable of stopping accepting connections and then
>> starting again once it has started accepting connections. So they would
>> have to connect then be disconnected with an exception.
>>
>> I would also suggest that considering the number of likely cluster
>> members is important here - I'd expect very installations to run more
>> than 4 machines in a cluster and 2 is probably the norm.
>>
>> So if a single broker goes down and restarts it's going to be made up to
>> date by the only other cluster member. In this case if that member
>> stalls no more work can get done until the rejoining member is now up to
>> date.
>>
>> I guess this sort of case can be dealt with in the current scheme by
>> having multiple cluster members on a single piece of hardware.
>>
>> Andrew
>>
>>>>
>>>> Thoughts appreciated!
>>>
>>> I would dis-allow connections to the new broker until it is synced. I
>>> would not bump any active connections, but rather leave that to
>>> heartbeat.
>>>
>>> One other idea would be to add an option to cluster config which could
>>> specify the preferred nodes to update from, and it would try this list
>>> first. I.e. in a 4 node cluster, all updates are made from node 4
>>> (preferred) if there, and then from an app point of view I connect to
>>> node 1-3 for example. This way updates have no effect on my clients and
>>> if I care about being stalled I set this option. if the prefered node/s
>>> are not running it would just pick one as it does today
>>>
>>> Carl.
>>>
>
> Good ideas here. To bring it together, how about this:
>
> There are 2 kinds of broker process:
> - Service brokers serve clients, they never give updates.
> - Update brokers give updates, they never serve clients.
>
> We create them automatically in pairs: a service broker forks an update
> broker and restarts it if it dies. The update broker never accepts
> connections and is not advertised to clients for failover.
>
> So the 3 cases are now
> - new member joining/getting update: rejected (with exception) until ready.
> - established member giving update, new client: never happens.
> - established member giving update, connected client: never happens.
>
> We could further constrain things and say a service broker can *only*
> get an update from its own update broker (once the update broker is up
> to date). The advantage is they'll be on the same host so less network
> traffic, the disadvantage is they can't update in parallel if there are
> multiple update brokers available.
>
> Does that address all the issues? There is some extra complexity in
> having 2 processes per broker, but for the moment I can't see any
> insurmountable hurdles. The nice thing is that we can do this with 0 new
> configuration so it will Just Work when its installed.

With Carl's suggestion of an option to allow you to restrict which servers 
are used for updating, you could set up the same thing. 
That allows those who don't want or need the extra complexity to avoid 
it while allowing the full flexibility to those who do need it.

On a separate but related point, how are nodes not doing the update 
affected by the updater stalling? If there is an application error (e.g. 
queue not found) does the whole cluster hang until the update is 
complete? If there is a high load on the other nodes is there any danger 
that the updater will never be able to catch up after the update is 
complete and the cpg queue would keep filling up?



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Alan Conway <ac...@redhat.com>.
On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
> On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
>> Alan Conway wrote:
>>> At the moment a clustered broker stalls clients while it is
>>> initializing, giving or receiving an update.  It's been pointed out
>>> that this can result in long delays for clients connected to a broker
>>> that elects to give an update to a newcomer, and it might be better
>>> for the broker to disconnect clients so they can fail over to another
>>> broker not busy with an update.
>>>
>>> There are 3 cases to consider:
>>>
>>>   - new member joining/getting update, new client: stall or reject?
>>>   - established member giving update, new client: stall or reject?
>>>   - established member giving update, connected client: stall or
>>> disconnect?
>>>
>>> On the 3rd point I would note that it's possible for clients to
>>> disconnect themselves if the broker is unresponsive by using
>>> heartbeats, and that not all clients can fail-over so I'd lean towards
>>> stall on that one, but I think rejecting new clients may make sense here.
>>>
>>> Part of the original motivation for stalling is that it makes it easy
>>> to write tests. You can start a broker and immediately start a client
>>> without worrying about waiting till the broker is ready. That's a nice
>>> property but there are other ways to achieve that. Current qpidd -d
>>> returns as soon as the broker is ready to listen for TCP requests,
>>> which may be before the broker is has joined the cluster. We could
>>> change that behavior to wait till all plugins report "ready". For
>>> tests we could also grep the log output for the ready message.
>
> I think it would be better not to stall existing clients at all in the
> established member giving update case.
>
> It would be better to not be available for connect at all until up to
> date in the new member case
>
> It's arguable what the best behaviour is in the established member
> giving update gets a new client case. However I would note that the low
> level code isn't capable of stopping accepting connections and then
> starting again once it has started accepting connections. So they would
> have to connect then be disconnected with an exception.
>
> I would also suggest that considering the number of likely cluster
> members is important here - I'd expect very installations to run more
> than 4 machines in a cluster and 2 is probably the norm.
>
> So if a single broker goes down and restarts it's going to be made up to
> date by the only other cluster member. In this case if that member
> stalls no more work can get done until the rejoining member is now up to
> date.
>
> I guess this sort of case can be dealt with in the current scheme by
> having multiple cluster members on a single piece of hardware.
>
> Andrew
>
>>>
>>> Thoughts appreciated!
>>
>> I would dis-allow connections to the new broker until it is synced.  I
>> would not bump any active connections, but rather leave that to heartbeat.
>>
>> One other idea would be to add an option to cluster config which could
>> specify the preferred nodes to update from, and it would try this list
>> first.  I.e. in a 4 node cluster, all updates are made from node 4
>> (preferred) if there, and then from an app point of view I connect to
>> node 1-3 for example.  This way updates have no effect on my clients and
>> if I care about being stalled I set this option. if the prefered node/s
>> are not running it would just pick one as it does today
>>
>> Carl.
>>

Good ideas here. To bring it together, how about this:

There are 2 kinds of broker process:
  - Service brokers serve clients, they never give updates.
  - Update brokers give updates, they never serve clients.

We create them automatically in pairs: a service broker forks an update broker 
and restarts it if it dies. The update broker never accepts connections and is 
not advertised to clients for failover.

So the 3 cases are now
- new member joining/getting update: rejected (with exception) until ready.
- established member giving update, new client: never happens.
- established member giving update, connected client: never happens.

We could further constrain things and say a service broker can *only* get an 
update from its own update broker (once the update broker is up to date). The 
advantage is they'll be on the same host so less network traffic, the 
disadvantage is they can't update in parallel if there are multiple update 
brokers available.

Does that address all the issues? There is some extra complexity in having 2 
processes per broker, but for the moment I can't see any insurmountable hurdles. 
The nice thing is that we can do this with 0 new configuration so it will Just 
Work when it's installed.
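
To sketch the pairing (the --cluster-update-only flag below is hypothetical, 
just to show the shape of the idea; it is not an existing qpidd option):

    # Sketch only: a service broker supervising its update broker, restarting
    # it if it dies.  The update broker would join the cluster purely to give
    # updates and would never be advertised to clients.
    import subprocess, time

    def supervise_update_broker(cluster_name):
        while True:
            proc = subprocess.Popen([
                "qpidd",
                "--cluster-name", cluster_name,
                "--cluster-update-only",   # hypothetical flag
                "--no-data-dir",
            ])
            proc.wait()                    # update broker died
            time.sleep(1)                  # brief back-off, then respawn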



Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Andrew Stitcher <as...@redhat.com>.
On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
> Alan Conway wrote:
> > At the moment a clustered broker stalls clients while it is 
> > initializing, giving or receiving an update.  It's been pointed out 
> > that this can result in long delays for clients connected to a broker 
> > that elects to give an update to a newcomer, and it might be better 
> > for the broker to disconnect clients so they can fail over to another 
> > broker not busy with an update.
> >
> > There are 3 cases to consider:
> >
> >  - new member joining/getting update, new client: stall or reject?
> >  - established member giving update, new client: stall or reject?
> >  - established member giving update, connected client: stall or 
> > disconnect?
> >
> > On the 3rd point I would note that it's possible for clients to 
> > disconnect themselves if the broker is unresponsive by using 
> > heartbeats, and that not all clients can fail-over so I'd lean towards 
> > stall on that one, but I think rejecting new clients may make sense here.
> >
> > Part of the original motivation for stalling is that it makes it easy 
> > to write tests. You can start a broker and immediately start a client 
> > without worrying about waiting till the broker is ready. That's a nice 
> > property but there are other ways to achieve that. Current qpidd -d 
> > returns as soon as the broker is ready to listen for TCP requests, 
> > which may be before the broker is has joined the cluster. We could 
> > change that behavior to wait till all plugins report "ready". For 
> > tests we could also grep the log output for the ready message.

I think it would be better not to stall existing clients at all in the
established member giving update case.

It would be better not to be available for connection at all until up to
date in the new member case.

It's arguable what the best behaviour is in the "established member giving
update gets a new client" case. However, I would note that the low
level code isn't capable of stopping accepting connections and then
starting again once it has started accepting connections. So they would
have to connect then be disconnected with an exception.

I would also suggest that considering the number of likely cluster
members is important here - I'd expect very few installations to run more
than 4 machines in a cluster, and 2 is probably the norm.

So if a single broker goes down and restarts, it's going to be made up to
date by the only other cluster member. In this case, if that member stalls,
no more work can get done until the rejoining member is up to date.

I guess this sort of case can be dealt with in the current scheme by
having multiple cluster members on a single piece of hardware.

Andrew

> >
> > Thoughts appreciated! 
> 
> I would dis-allow connections to the new broker until it is synced.  I 
> would not bump any active connections, but rather leave that to heartbeat.
> 
> One other idea would be to add an option to cluster config which could 
> specify the preferred nodes to update from, and it would try this list 
> first.  I.e. in a 4 node cluster, all updates are made from node 4 
> (preferred) if there, and then from an app point of view I connect to 
> node 1-3 for example.  This way updates have no effect on my clients and 
> if I care about being stalled I set this option. if the prefered node/s 
> are not running it would just pick one as it does today
> 
> Carl.
> 




Re: [c++ cluster] request for opinions, cluster behavior during update.

Posted by Carl Trieloff <cc...@redhat.com>.
Alan Conway wrote:
> At the moment a clustered broker stalls clients while it is 
> initializing, giving or receiving an update.  It's been pointed out 
> that this can result in long delays for clients connected to a broker 
> that elects to give an update to a newcomer, and it might be better 
> for the broker to disconnect clients so they can fail over to another 
> broker not busy with an update.
>
> There are 3 cases to consider:
>
>  - new member joining/getting update, new client: stall or reject?
>  - established member giving update, new client: stall or reject?
>  - established member giving update, connected client: stall or 
> disconnect?
>
> On the 3rd point I would note that it's possible for clients to 
> disconnect themselves if the broker is unresponsive by using 
> heartbeats, and that not all clients can fail-over so I'd lean towards 
> stall on that one, but I think rejecting new clients may make sense here.
>
> Part of the original motivation for stalling is that it makes it easy 
> to write tests. You can start a broker and immediately start a client 
> without worrying about waiting till the broker is ready. That's a nice 
> property but there are other ways to achieve that. Current qpidd -d 
> returns as soon as the broker is ready to listen for TCP requests, 
> which may be before the broker is has joined the cluster. We could 
> change that behavior to wait till all plugins report "ready". For 
> tests we could also grep the log output for the ready message.
>
> Thoughts appreciated! 

I would disallow connections to the new broker until it is synced. I would 
not bump any active connections, but rather leave that to heartbeat.

One other idea would be to add an option to the cluster config which could 
specify the preferred nodes to update from, and it would try this list 
first. I.e. in a 4-node cluster, all updates are made from node 4 (the 
preferred node) if it is there, and then from an app point of view I connect 
to nodes 1-3, for example. This way updates have no effect on my clients, 
and if I care about being stalled I set this option. If the preferred 
node(s) are not running it would just pick one as it does today.
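
A tiny sketch of the selection rule that option implies (the option name and 
member identifiers are illustrative only):

    # Sketch of the "preferred updater" choice; not existing broker code.
    def choose_updater(preferred, live_members):
        # preferred:    ordered ids from a hypothetical config option
        # live_members: members currently in the cluster
        for member in preferred:
            if member in live_members:
                return member          # first preferred node that is running
        return live_members[0]         # otherwise pick one as it does today

    # e.g. choose_updater(["node4"], ["node1", "node2", "node4"]) -> "node4"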

Carl.
