Posted to dev@httpd.apache.org by jean-frederic clere <jf...@gmail.com> on 2009/05/05 10:45:41 UTC

mod_proxy / mod_proxy_balancer

Hi,

There are 2 weird things in the logic.
- In ap_proxy_add_worker_to_balancer() we make a copy of the worker; why
not just store its address?
If you look at child_init() in mod_proxy and mod_proxy_balancer, you see
that mod_proxy initialises one copy and mod_proxy_balancer the other; it
works, but one of the copies is never used.

- The child_init of mod_proxy must run before the one of
mod_proxy_balancer, which prevents the reset() of the balancer_method
from controlling the creation of the worker.

Comments?

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.

> -----Original Message-----
> From: jean-frederic clere
> Sent: Tuesday, May 5, 2009 10:46 AM
> To: dev@httpd.apache.org
> Subject: mod_proxy / mod_proxy_balancer
> 
> Hi,
> 
> There are 2 weird things in the logic.

As you say, the logic is weird, and IMHO this needs serious
reconstruction.

Regards

Rüdiger


Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 8:08 AM, Jim Jagielski wrote:

>
> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
>
>> Hi,
>>
>> There are 2 weird things in the logic.
>> - In ap_proxy_add_worker_to_balancer() we make a copy of the  
>> worker, why not just the address?
>> If you looks to child_init() in mod_proxy and mod_proxy_balancer we  
>> see that mod_proxy initialise one copy and mod_proxy_balancer the  
>> other, it is working but one of the copies is never used.
>>
>> - We want the child_init of mod_proxy before mod_proxy_balancer,  
>> that prevents reset() of the balancer_method to control the  
>> creation of the worker.
>>
>
> Yeah, all on target.
>


The rub, of course, is that the inits in child_init/mod_proxy *are*
required presently... In fact, didn't we explicitly add the "already
inited" test due to this interaction?
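
As a minimal illustration of the "already inited" guard being discussed
(made-up types and names, not the real httpd structures): the guard makes
the per-child setup idempotent, so it is harmless that both mod_proxy's
and mod_proxy_balancer's child_init may reach the same worker.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    const char *name;
    int initialized;   /* the "already inited" flag */
    int init_count;    /* counts how often the real setup ran */
} worker_t;

static void worker_child_init(worker_t *w)
{
    if (w->initialized)
        return;            /* second (and later) callers are no-ops */
    /* ... expensive per-child setup would happen here ... */
    w->init_count++;
    w->initialized = 1;
}
```

Whichever module's child_init runs first does the work; the other
becomes a no-op, which is why the ordering currently "works" even though
one copy is never used.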

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 9:32 AM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
>>> Hi,
>>>
>>> There are 2 weird things in the logic.
>>> - In ap_proxy_add_worker_to_balancer() we make a copy of the  
>>> worker, why not just the address?
>>> If you looks to child_init() in mod_proxy and mod_proxy_balancer  
>>> we see that mod_proxy initialise one copy and mod_proxy_balancer  
>>> the other, it is working but one of the copies is never used.
>>>
>>> - We want the child_init of mod_proxy before mod_proxy_balancer,  
>>> that prevents reset() of the balancer_method to control the  
>>> creation of the worker.
>>>
>> Yeah, all on target.
>
> The next thing I am on is the ap_proxy_create_worker() called for
> reverse and forward (conf->reverse and conf->forward).
> ap_proxy_create_worker() fills the worker->id and they use
> ap_proxy_initialize_worker_share(). Do we really need shared
> information for those?
>

Hoping this goes thru: having major issues with SMTP while in
Atlanta....

The history was that we assumed that mod_proxy_balancer was
an "optional" package for mod_proxy, and so we would create the
worker entities, and then add (memcpy) them to the balancer's
entry after the fact (or as needed). So that's why we are copying
instead of simply passing pointers... Ideally, it would be
good to get back to that, and just have mod_proxy worry about
the default forward and reverse proxy workers and m_p_b worry
about balancers. Otherwise, we have even more nasty overlap
than we already do...
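
A minimal sketch of the copy-vs-pointer difference being described,
using simplified hypothetical types rather than the real proxy_worker
structures: after a memcpy the balancer holds its own snapshot, so an
update made through mod_proxy's record is invisible through the copy,
while a stored address always observes the current state.

```c
#include <assert.h>

typedef struct {
    const char *name;
    int lbstatus;            /* balancing state, mutated at runtime */
} worker_t;

typedef struct {
    worker_t  copy;          /* add-by-copy, as done today */
    worker_t *ptr;           /* add-by-address, the proposed alternative */
} balancer_member_t;

static void add_to_balancer(balancer_member_t *m, worker_t *w)
{
    m->copy = *w;            /* snapshot: later changes to *w are not seen here */
    m->ptr  = w;             /* shared: both sides see the same record */
}
```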



Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 9:32 AM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
>>> Hi,
>>>
>>> There are 2 weird things in the logic.
>>> - In ap_proxy_add_worker_to_balancer() we make a copy of the  
>>> worker, why not just the address?
>>> If you looks to child_init() in mod_proxy and mod_proxy_balancer  
>>> we see that mod_proxy initialise one copy and mod_proxy_balancer  
>>> the other, it is working but one of the copies is never used.
>>>
>>> - We want the child_init of mod_proxy before mod_proxy_balancer,  
>>> that prevents reset() of the balancer_method to control the  
>>> creation of the worker.
>>>
>> Yeah, all on target.
>
> The next thing I am on is the ap_proxy_create_worker() called for
> reverse and forward (conf->reverse and conf->forward).
> ap_proxy_create_worker() fills the worker->id and they use
> ap_proxy_initialize_worker_share(). Do we really need shared
> information for those?
>

I think keeping these shared makes sense...

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 4:41 PM, jean-frederic clere wrote:

>
> I think we need it for a few reasons:
> - When a worker is idle the information about its load is irrelevant.

Agreed... but I'm not sure how age affects this :)

>
> - Being able to calculate throughput and load balance using that  
> information is only valid if you have a kind of ticker.

For a balancer that worries about time-related balancing, yes,
I agree.

>
> - In some tests I have made with a mixture of long sessions and  
> single request "sessions" you need to "forget" the load caused by  
> the long sessions.
>

Now that is weird :)

> The next question is how we invoke the ageing:
> - Via a thread that calls it after an elapsed time.
> - When there is a request and the actual time is greater than the
> time at which we should have called it.
>


I prefer the latter... IMO, I would prefer performance over
accuracy, and so I'd like faster evaluation even if it means
instead of 60/40 (for example), we have 58/42 or so...
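
The second option (ageing applied lazily on the request path once the
interval has elapsed, instead of a dedicated ticker thread) could look
roughly like this; the types and names are hypothetical:

```c
#include <assert.h>
#include <time.h>

typedef struct {
    double load;        /* accumulated load metric */
    time_t last_aged;   /* when decay was last applied */
} worker_stats_t;

static void age_on_request(worker_stats_t *s, time_t now,
                           int interval, double factor)
{
    /* Catch up one interval at a time, so a long idle period decays
     * the same way it would have under a running ticker. */
    while (now - s->last_aged >= interval) {
        s->load *= factor;       /* "forget" part of the old load */
        s->last_aged += interval;
    }
}
```

This trades a little accuracy for performance, exactly in the 60/40 vs
58/42 sense above: decay only happens when a request arrives.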

Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jess Holle wrote:
> jean-frederic clere wrote:
>> Jess Holle wrote:
>>> An ability to balance based on new sessions with an idle time out on 
>>> such sessions would be close enough to reality in cases where 
>>> sessions expire rather than being explicitly invalidated (e.g. by a 
>>> logout).
>> Storing the session IDs to share the load depending on the number of
>> active sessions brings a security problem, no?
> To the degree that you consider Apache vulnerable to attack to retrieve 
> these, yes.
> 
> I prefer the health check request approach below for this and other 
> reasons (amount of required bookkeeping, etc).
>>> Of course that redoes what a servlet engine would be doing and does 
>>> so with lower fidelity.  An ability to ask a backend for its current 
>>> session count and load balance new requests on that basis would be 
>>> really helpful.  Whether this ability is buried into AJP, for 
>>> instance, or is simply a separate request to a designated URL is 
>>> another question, but the latter approach seems fairly general and 
>>> the number of such requests could be throttled by a time-to-live 
>>> setting on the last such count obtained.
>>>
>>> Actually this could and should be generalized beyond active sessions 
>>> to a back-end health metric.  Each backend could compute and respond 
>>> with a relative measure of busyness/health, and the load
>>> balancer could then balance new (session-less) requests to the least
>>> busy / most healthy backend.  This would seem to be a *huge* step
>>> forward in load balancing capability/fidelity.
>>>
>>> It's my understanding that mod_cluster is pursuing just this sort of 
>>> thing to some degree -- but currently only works for JBoss backends.
>> This is wrong; it works with Tomcat too.
> mod_cluster works with Tomcat, but according to the docs I've seen the 
> dynamic (health/session metric based rather than static) load balancing 
> only worked with JBoss backends.
> 
> Or has this changed?

No, it is still like that: the singleton logic used in JBossAS
requires the JBoss clustering logic. But it should be available in the
next version.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
jean-frederic clere wrote:
> Jess Holle wrote:
>> An ability to balance based on new sessions with an idle time out on 
>> such sessions would be close enough to reality in cases where 
>> sessions expire rather than being explicitly invalidated (e.g. by a 
>> logout).
> Storing the session IDs to share the load depending on the number of
> active sessions brings a security problem, no?
To the degree that you consider Apache vulnerable to attack to retrieve 
these, yes.

I prefer the health check request approach below for this and other 
reasons (amount of required bookkeeping, etc).
>> Of course that redoes what a servlet engine would be doing and does 
>> so with lower fidelity.  An ability to ask a backend for its current 
>> session count and load balance new requests on that basis would be 
>> really helpful.  Whether this ability is buried into AJP, for 
>> instance, or is simply a separate request to a designated URL is 
>> another question, but the latter approach seems fairly general and 
>> the number of such requests could be throttled by a time-to-live 
>> setting on the last such count obtained.
>>
>> Actually this could and should be generalized beyond active sessions 
>> to a back-end health metric.  Each backend could compute and respond 
>> with a relative measure of busyness/health, and the load
>> balancer could then balance new (session-less) requests to the least
>> busy / most healthy backend.  This would seem to be a *huge* step
>> forward in load balancing capability/fidelity.
>>
>> It's my understanding that mod_cluster is pursuing just this sort of 
>> thing to some degree -- but currently only works for JBoss backends.
> This is wrong; it works with Tomcat too.
mod_cluster works with Tomcat, but according to the docs I've seen the 
dynamic (health/session metric based rather than static) load balancing 
only worked with JBoss backends.

Or has this changed?

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jess Holle wrote:
> Rainer Jung wrote:
>> In most situations applications need stickiness. So balancing will not
>> happen in an ideal situation; instead it tries to keep load equal
>> although most requests are sticky.
>>
>> Because of the influence of sticky requests it can happen that
>> accumulated load distributes very unevenly between the nodes. Should the
>> balancer try to correct such accumulated differences?
>>   Other applications are memory bound. Memory is needed by request
>> handling but also by session handling. Data accumulation is more
>> important here, because of the sessions. Again, we can not be perfect,
>> because we don't get a notification when a session expires or a user
>> logs out. So we can only count the "new" sessions. This counter in my
>> opinion also needs some aging, so that we won't compensate historic
>> inequality without bounds. I must confess that I don't have an example
>> here of how this inequality can happen for sessions when balancing new
>> session requests (stickiness doesn't influence this), but I think
>> balancing based on old data is the wrong model here too.
>>   
> An ability to balance based on new sessions with an idle time out on 
> such sessions would be close enough to reality in cases where sessions 
> expire rather than being explicitly invalidated (e.g. by a logout).

Storing the session IDs to share the load depending on the number of
active sessions brings a security problem, no?

> 
> Of course that redoes what a servlet engine would be doing and does so 
> with lower fidelity.  An ability to ask a backend for its current 
> session count and load balance new requests on that basis would be 
> really helpful.  Whether this ability is buried into AJP, for instance, 
> or is simply a separate request to a designated URL is another question, 
> but the latter approach seems fairly general and the number of such 
> requests could be throttled by a time-to-live setting on the last such 
> count obtained.
> 
> Actually this could and should be generalized beyond active sessions to 
> a back-end health metric.  Each backend could compute and respond with a 
> relative measure of busyness/health, and the load balancer
> could then balance new (session-less) requests to the least busy / most
> healthy backend.  This would seem to be a *huge* step forward in load
> balancing capability/fidelity.
> 
> It's my understanding that mod_cluster is pursuing just this sort of 
> thing to some degree -- but currently only works for JBoss backends.

This is wrong; it works with Tomcat too.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
Rainer Jung wrote:
> On 06.05.2009 14:35, jean-frederic clere wrote:
>   
>> Jess Holle wrote:
>>     
>>> Rainer Jung wrote:
>>>       
>>>> Yes, I think the counter/aging discussion is for the baseline, i.e. when
>>>> we do not have any information channel to or from the backend nodes.
>>>>
>>>> As soon as mod_cluster comes into play, we can use more up-to-date real
>>>> data and only need to decide how to interpret it and how to interpolate
>>>> during the update interval.
>>>>   
>>>>         
>>> Should general support for a query URL be provided in
>>> mod_proxy_balancer?  Or should this be left to mod_cluster?
>>>       
>> Can you explain more? I don't get the question.
>>
>>     
>>>  Does mod_cluster provide yet another approach top to bottom (separate
>>> from mod_jk and mod_proxy/mod_proxy_ajp)?
>>>       
>> Mod_cluster is just a balancer for mod_proxy but due to the dynamic
>> creation of balancers and workers it can't get in the httpd-trunk code
>> right now.
>>
>>     
>>>  It would seem nice to me if mod_jk and/or mod_proxy_balancer could do
>>> health checks, but you have to draw the line somewhere on growing any
>>> given module and if mod_jk and mod_proxy_balancer are not going in
>>> that direction at some point mod_cluster may be in my future.
>>>       
>> Cool :-)
>>     
>
> There are several different subsystems, and as I understood
> mod_cluster, it already carefully separates them:
>
> 1) Dynamic topology detection (optional)
>
> What are our backend nodes? If you do not want to statically configure
> them, you need some mechanism based on either
>
> - registration: backend nodes register at one or multiple topology
> management nodes; the addresses of those are either configured, or they
> announce themselves on the network via broad- or multicast).
>
> - detection: topology manager receives broad- or multicast packets of
> the backend nodes. They do not need to know the topology manager, only
> the multicast address
>
> A further enhancement would be to also learn the forwarding rules (e.g. URLs
> to map) from the backend nodes.
>
> In the simpler case, the topology would be configured statically.
>
> 2) Dynamic state detection
>
> a) Liveness
> b) Load numbers
>
> Both could be either polled by a state manager (maybe with scalability
> issues) or pushed to it. Push could be done via tcp (the address could
> be sent to the backend once it was detected in 1), or defined
> statically). Maybe one would use both ways, e.g. push for active state
> changes, like when an admin stops a node, and poll for state manager
> driven things. Not sure.
>
> 3) Balancing
>
> Would be done based on the data collected by the state manager.
>
> It's not clear at all, whether those three should be glued together
> tightly, or kept in different pieces. I had the impression the general
> direction is more about separating them and to allow multiple
> experiments, like mod_cluster and mod_heartbeat.
>
> The interaction would be done via some common data container, e.g.
> slotmem or in a distributed (multiple Apaches) situation memcache or
> similar.
>
> Does this make sense?
>   
Yes.

I've been working around #1 by using pre-designated port ranges for 
backends, e.g. configuring for balancing over a port range of 10 and 
only having a couple of servers running in this range at most given 
times.  That's fine as long as one quiets Apache's error logging so that 
it only complains about backends that are *newly* unreachable rather 
than complaining each time a backend is retried.  I supplied a patch for 
this some time back.

#2 and #3 are huge, however, and it would be good to see something firm 
rather than experimental in these areas sooner than later.

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
FWIW, I've been looking into using Tribes for httpd.

Re: mod_proxy / mod_proxy_balancer

Posted by Rainer Jung <ra...@kippdata.de>.
On 06.05.2009 14:35, jean-frederic clere wrote:
> Jess Holle wrote:
>> Rainer Jung wrote:
>>> Yes, I think the counter/aging discussion is for the baseline, i.e. when
>>> we do not have any information channel to or from the backend nodes.
>>>
>>> As soon as mod_cluster comes into play, we can use more up-to-date real
>>> data and only need to decide how to interpret it and how to interpolate
>>> during the update interval.
>>>   
>> Should general support for a query URL be provided in
>> mod_proxy_balancer?  Or should this be left to mod_cluster?
> 
> Can you explain more? I don't get the question.
> 
>>  Does mod_cluster provide yet another approach top to bottom (separate
>> from mod_jk and mod_proxy/mod_proxy_ajp)?
> 
> Mod_cluster is just a balancer for mod_proxy but due to the dynamic
> creation of balancers and workers it can't get in the httpd-trunk code
> right now.
> 
>>  It would seem nice to me if mod_jk and/or mod_proxy_balancer could do
>> health checks, but you have to draw the line somewhere on growing any
>> given module and if mod_jk and mod_proxy_balancer are not going in
>> that direction at some point mod_cluster may be in my future.
> 
> Cool :-)

There are several different subsystems, and as I understood
mod_cluster, it already carefully separates them:

1) Dynamic topology detection (optional)

What are our backend nodes? If you do not want to statically configure
them, you need some mechanism based on either

- registration: backend nodes register at one or multiple topology
management nodes; the addresses of those are either configured, or they
announce themselves on the network via broad- or multicast.

- detection: topology manager receives broad- or multicast packets of
the backend nodes. They do not need to know the topology manager, only
the multicast address

A further enhancement would be to also learn the forwarding rules (e.g. URLs
to map) from the backend nodes.

In the simpler case, the topology would be configured statically.

2) Dynamic state detection

a) Liveness
b) Load numbers

Both could be either polled by a state manager (maybe with scalability
issues) or pushed to it. Push could be done via tcp (the address could
be sent to the backend once it was detected in 1), or defined
statically). Maybe one would use both ways, e.g. push for active state
changes, like when an admin stops a node, and poll for state manager
driven things. Not sure.

3) Balancing

Would be done based on the data collected by the state manager.

It's not clear at all whether those three should be glued together
tightly or kept in different pieces. I had the impression the general
direction is more about separating them and allowing multiple
experiments, like mod_cluster and mod_heartbeat.

The interaction would be done via some common data container, e.g.
slotmem or in a distributed (multiple Apaches) situation memcache or
similar.
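
As a rough illustration of such a common data container (made-up types
and names; the real slotmem API is different): a fixed array of per-node
slots that topology detection (1) claims, state detection (2) updates,
and the balancer (3) only reads when picking a node.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    char name[64];
    int  in_use;   /* slot claimed by topology detection */
    int  alive;    /* 2a) liveness */
    int  load;     /* 2b) load number; lower is better */
} node_slot_t;

#define MAX_NODES 8

static int register_node(node_slot_t *slots, const char *name)
{
    for (int i = 0; i < MAX_NODES; i++) {
        if (!slots[i].in_use) {
            strncpy(slots[i].name, name, sizeof(slots[i].name) - 1);
            slots[i].in_use = 1;
            slots[i].alive = 1;
            return i;
        }
    }
    return -1;     /* no free slot */
}

static int pick_least_loaded(const node_slot_t *slots)
{
    int best = -1;
    for (int i = 0; i < MAX_NODES; i++) {
        if (slots[i].in_use && slots[i].alive &&
            (best < 0 || slots[i].load < slots[best].load))
            best = i;
    }
    return best;
}
```

The point of the separation is visible here: nothing in the balancing
step knows how the slots were filled, whether by registration,
multicast detection, push, or poll.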

Does this make sense?

Regards,

Rainer

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 6, 2009, at 9:23 AM, Jess Holle wrote:

> You're right -- I was being weird.  Sorry.
>

No apology needed :)

> I guess part of the reason for my asking was whether the ASF was  
> basically saying "we're not chasing this problem, see mod_cluster  
> folk if you need it solved" -- and, if so, hoping to get a little  
> starting info as to what I'd be getting into chasing mod_cluster.
>
> I'd like to see this capability in httpd itself -- or at least have  
> it very easy to add in a very seamless fashion via a pluggable  
> custom balancer algorithm (without other larger configuration side  
> effects) -- and thus would hope the ASF sees this as within the  
> scope of httpd's core suite of modules.
>

I think it's safe to say that there is enough interest here in the
httpd dev team for this capability to be part of httpd itself...

What we'd like to do is provide the basic implementation and capability
and allow others to build on top of that if needed.


Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
Jim Jagielski wrote:
> On May 6, 2009, at 9:07 AM, Jess Holle wrote:
>> jean-frederic clere wrote:
>>>> Should general support for a query URL be provided in 
>>>> mod_proxy_balancer?  Or should this be left to mod_cluster?
>>> Can you explain more? I don't get the question.
>> What I mean is
>>     • Should mod_proxy_balancer be extended to provide a balancer 
>> algorithm in which one specifies a backend URL that will provide a 
>> single numeric health metric, throttle the number of such requests 
>> via a time-to-live associated with this information, and balance on 
>> this basis or
>>     • Should mod_cluster handle this issue?
>>     • Or both?
> Please recall that, afaik, mod_cluster is not AL nor is it part
> of Apache. So asking for direction for what is basically an external
> project on the Apache httpd dev list is kinda weird :)
>
> In any case, I think the hope of the ASF is that this capability is
> part of httpd, and you can see, with mod_heartbeat and the like,
> efforts in the direction.
>
> But the world is big enough for different implementations...
You're right -- I was being weird.  Sorry.

I guess part of the reason for my asking was whether the ASF was 
basically saying "we're not chasing this problem, see mod_cluster folk 
if you need it solved" -- and, if so, hoping to get a little starting 
info as to what I'd be getting into chasing mod_cluster.

I'd like to see this capability in httpd itself -- or at least have it 
very easy to add in a very seamless fashion via a pluggable custom 
balancer algorithm (without other larger configuration side effects) -- 
and thus would hope the ASF sees this as within the scope of httpd's 
core suite of modules.

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 6, 2009, at 9:07 AM, Jess Holle wrote:
> 
>> jean-frederic clere wrote:
>>>
>>>> Should general support for a query URL be provided in 
>>>> mod_proxy_balancer?  Or should this be left to mod_cluster?
>>> Can you explain more? I don't get the question.
>> What I mean is
>>     • Should mod_proxy_balancer be extended to provide a balancer 
>> algorithm in which one specifies a backend URL that will provide a 
>> single numeric health metric, throttle the number of such requests via 
>> a time-to-live associated with this information, and balance on this 
>> basis or
>>     • Should mod_cluster handle this issue?
>>     • Or both?
> 
> Please recall that, afaik, mod_cluster is not AL nor is it part
> of Apache. So asking for direction for what is basically an external
> project on the Apache httpd dev list is kinda weird :)

Yep there is a JBoss list for that: mod_cluster-dev@lists.jboss.org

> 
> In any case, I think the hope of the ASF is that this capability is
> part of httpd, and you can see, with mod_heartbeat and the like,
> efforts in the direction.

Yes I am experimenting there too.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 6, 2009, at 9:07 AM, Jess Holle wrote:

> jean-frederic clere wrote:
>>
>>> Should general support for a query URL be provided in  
>>> mod_proxy_balancer?  Or should this be left to mod_cluster?
>> Can you explain more? I don't get the question.
> What I mean is
> 	• Should mod_proxy_balancer be extended to provide a balancer  
> algorithm in which one specifies a backend URL that will provide a  
> single numeric health metric, throttle the number of such requests  
> via a time-to-live associated with this information, and balance on  
> this basis or
> 	• Should mod_cluster handle this issue?
> 	• Or both?

Please recall that, afaik, mod_cluster is not AL nor is it part
of Apache. So asking for direction for what is basically an external
project on the Apache httpd dev list is kinda weird :)

In any case, I think the hope of the ASF is that this capability is
part of httpd, and you can see, with mod_heartbeat and the like,
efforts in the direction.

But the world is big enough for different implementations...

Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jess Holle wrote:
> jean-frederic clere wrote:
>>> Should general support for a query URL be provided in 
>>> mod_proxy_balancer?  Or should this be left to mod_cluster?
>> Can you explain more? I don't get the question.
> What I mean is
> 
>    1. Should mod_proxy_balancer be extended to provide a balancer
>       algorithm in which one specifies a backend URL that will provide a
>       single numeric health metric, throttle the number of such requests
>       via a time-to-live associated with this information, and balance
>       on this basis or
>    2. Should mod_cluster handle this issue?
>    3. Or both?
>           * For instance, mod_cluster might leverage special nuances in
>             AJP, JBoss, and Tomcat, whereas mod_proxy_balancer might
>             provide more generic support for health checks on any back
>             end server that can expose a health metric URL.
> 
> From your response below, it sounds like you're saying it's #2, which
> is largely fine and good -- but this raises questions:
> 
>    1. How general is the health check metric in mod_cluster?
>           * I only care about Tomcat backends myself, but control over
>             the metric would be good.
>    2. Does this require special JBoss nuggets in Tomcat?
>           * I'd hope not, i.e. that this is a simple matter of a
>             pre-designated URL or a very simple standalone socket protocol.
>    3. When will mod_cluster support health metric based balancing of Tomcat?
>    4. How "disruptive" to an existing configuration using
>       mod_proxy_balancer/mod_proxy_ajp is mod_cluster?
>           * How much needs to be changed?
>    5. How portable is the mod_cluster code?
>           * Does it build on Windows?  HPUX?  AIX?

Please ask the mod_cluster questions in the 
mod_cluster-dev@lists.jboss.org list. I will answer there.

Cheers

Jean-Frederic

> 
> I say this is largely fine and good as I'd like to see just the 
> health-metric based balancing algorithm in Apache 2.2.x itself.
>>> Does mod_cluster provide yet another approach top to bottom (separate 
>>> from mod_jk and mod_proxy/mod_proxy_ajp)?
>> Mod_cluster is just a balancer for mod_proxy but due to the dynamic 
>> creation of balancers and workers it can't get in the httpd-trunk code 
>> right now.
>>>  It would seem nice to me if mod_jk and/or mod_proxy_balancer could 
>>> do health checks, but you have to draw the line somewhere on growing 
>>> any given module and if mod_jk and mod_proxy_balancer are not going 
>>> in that direction at some point mod_cluster may be in my future. 
> --
> Jess Holle
> 


Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
jean-frederic clere wrote:
>> Should general support for a query URL be provided in 
>> mod_proxy_balancer?  Or should this be left to mod_cluster?
> Can you explain more? I don't get the question.
What I mean is

   1. Should mod_proxy_balancer be extended to provide a balancer
      algorithm in which one specifies a backend URL that will provide a
      single numeric health metric, throttle the number of such requests
      via a time-to-live associated with this information, and balance
      on this basis or
   2. Should mod_cluster handle this issue?
   3. Or both?
          * For instance, mod_cluster might leverage special nuances in
            AJP, JBoss, and Tomcat, whereas mod_proxy_balancer might
            provide more generic support for health checks on any back
            end server that can expose a health metric URL.
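
The time-to-live throttle described in item 1 could be sketched like
this (hypothetical names, not an existing mod_proxy_balancer API): the
balancer caches the last health metric obtained from the backend's
designated URL and only re-queries once the TTL has expired, so the
number of health requests stays bounded regardless of traffic.

```c
#include <assert.h>
#include <time.h>

typedef struct {
    int    metric;       /* last health value the backend reported */
    time_t fetched_at;   /* when it was fetched */
    int    valid;        /* have we ever fetched? */
} health_cache_t;

static int cached_health(health_cache_t *c, time_t now, int ttl,
                         int (*query_backend)(void))
{
    if (!c->valid || now - c->fetched_at >= ttl) {
        c->metric = query_backend();  /* e.g. HTTP GET of the metric URL */
        c->fetched_at = now;
        c->valid = 1;
    }
    return c->metric;                 /* otherwise reuse the cached value */
}

/* Stand-in for the real backend query; counts calls for demonstration. */
static int query_calls = 0;
static int stub_query(void) { query_calls++; return 42; }
```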

 From your response below, it sounds like you're saying it's #2, which
is largely fine and good -- but this raises questions:

   1. How general is the health check metric in mod_cluster?
          * I only care about Tomcat backends myself, but control over
            the metric would be good.
   2. Does this require special JBoss nuggets in Tomcat?
          * I'd hope not, i.e. that this is a simple matter of a
            pre-designated URL or a very simple standalone socket protocol.
   3. When will mod_cluster support health metric based balancing of Tomcat?
   4. How "disruptive" to an existing configuration using
      mod_proxy_balancer/mod_proxy_ajp is mod_cluster?
          * How much needs to be changed?
   5. How portable is the mod_cluster code?
          * Does it build on Windows?  HPUX?  AIX?

I say this is largely fine and good as I'd like to see just the 
health-metric based balancing algorithm in Apache 2.2.x itself.
>> Does mod_cluster provide yet another approach top to bottom (separate 
>> from mod_jk and mod_proxy/mod_proxy_ajp)?
> Mod_cluster is just a balancer for mod_proxy but due to the dynamic 
> creation of balancers and workers it can't get in the httpd-trunk code 
> right now.
>>  It would seem nice to me if mod_jk and/or mod_proxy_balancer could 
>> do health checks, but you have to draw the line somewhere on growing 
>> any given module and if mod_jk and mod_proxy_balancer are not going 
>> in that direction at some point mod_cluster may be in my future. 
--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jess Holle wrote:
> Rainer Jung wrote:
>>> An ability to balance based on new sessions with an idle time out on
>>> such sessions would be close enough to reality in cases where sessions
>>> expire rather than being explicitly invalidated (e.g. by a logout).
>>>     
>> But then we end up in a stateful situation. This is a serious design
>> decision. If we want to track idleness for sessions, we need to track a
>> list of sessions (session ids) the balancer has seen. This makes things
>> much more complex. Combined with the inability to track logouts and
>> the errors coming in from a global situation (more than one Apache
>> instance), I think it will be more of a problem than a solution.
>>   
> The more I think about this the more I agree.
> 
> From the start I preferred the session/health query to the back-end 
> with a time-to-live; on further consideration I *greatly* prefer this 
> approach.
>>> Of course that redoes what a servlet engine would be doing and does so
>>> with lower fidelity.  An ability to ask a backend for its current
>>> session count and load balance new requests on that basis would be
>>> really helpful.
>>>     
>> Seems much nicer.
>>   
> Agreed.
>>> Actually this could and should be generalized beyond active sessions to
>>> a back-end health metric.  Each backend could compute and respond with a
>>> relative measure of busyness/health, and the load balancer
>>> could then balance new (session-less) requests to the least busy / most
>>> healthy backend.  This would seem to be a *huge* step forward in load
>>> balancing capability/fidelity.
>>>
>>> It's my understanding that mod_cluster is pursuing just this sort of
>>> thing to some degree -- but currently only works for JBoss backends.
>>>     
>> Yes, I think the counter/aging discussion is for the baseline, i.e. when
>> we do not have any information channel to or from the backend nodes.
>>
>> As soon as mod_cluster comes into play, we can use more up-to-date real
>> data and only need to decide how to interpret it and how to interpolate
>> during the update interval.
>>   
> Should general support for a query URL be provided in 
> mod_proxy_balancer?  Or should this be left to mod_cluster?

Can you explain more? I don't get the question.

>  Does 
> mod_cluster provide yet another approach top to bottom (separate from 
> mod_jk and mod_proxy/mod_proxy_ajp)?

Mod_cluster is just a balancer for mod_proxy but due to the dynamic 
creation of balancers and workers it can't get in the httpd-trunk code 
right now.

>  It would seem nice to me if mod_jk 
> and/or mod_proxy_balancer could do health checks, but you have to draw 
> the line somewhere on growing any given module and if mod_jk and 
> mod_proxy_balancer are not going in that direction, at some point 
> mod_cluster may be in my future.

Cool :-)

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
Rainer Jung wrote:
>> An ability to balance based on new sessions with an idle time out on
>> such sessions would be close enough to reality in cases where sessions
>> expire rather than being explicitly invalidated (e.g. by a logout).
>>     
> But then we end up in a stateful situation. This is a serious design
> decision. If we want to track idleness for sessions, we need to track a
> list of sessions (session ids) the balancer has seen. This makes things
> much more complex. Combined with the inability to track logouts and
> the errors coming in from a global situation (more than one Apache
> instance), I think it will be more of a problem than a solution.
>   
The more I think about this the more I agree.

From the start I preferred the session/health query to the back-end 
with a time-to-live; on further consideration I *greatly* prefer this 
approach.
>> Of course that redoes what a servlet engine would be doing and does so
>> with lower fidelity.  An ability to ask a backend for its current
>> session count and load balance new requests on that basis would be
>> really helpful.
>>     
> Seems much nicer.
>   
Agreed.
>> Actually this could and should be generalized beyond active sessions to
>> a back-end health metric.  Each backend could compute and respond with a
> relative measure of busyness/health, and the load balancer
> could then balance new (session-less) requests to the least busy / most
> healthy backend.  This would seem to be a *huge* step forward in load
>> balancing capability/fidelity.
>>
>> It's my understanding that mod_cluster is pursuing just this sort of
>> thing to some degree -- but currently only works for JBoss backends.
>>     
> Yes, I think the counter/aging discussion is for the baseline, i.e. when
> we do not have any information channel to or from the backend nodes.
>
> As soon as mod_cluster comes into play, we can use more up-to-date real
> data and only need to decide how to interpret it and how to interpolate
> during the update interval.
>   
Should general support for a query URL be provided in 
mod_proxy_balancer?  Or should this be left to mod_cluster?  Does 
mod_cluster provide yet another approach top to bottom (separate from 
mod_jk and mod_proxy/mod_proxy_ajp)?  It would seem nice to me if mod_jk 
and/or mod_proxy_balancer could do health checks, but you have to draw 
the line somewhere on growing any given module and if mod_jk and 
mod_proxy_balancer are not going in that direction, at some point 
mod_cluster may be in my future.

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by Rainer Jung <ra...@kippdata.de>.
On 06.05.2009 10:35, Jess Holle wrote:
> Rainer Jung wrote:
>> In most situations applications need stickiness. So balancing will not
>> happen in an ideal situation, instead it tries to keep load equal
>> although most requests are sticky.
>>
>> Because of the influence of sticky requests it can happen that
>> accumulated load distributes very unevenly between the nodes. Should the
>> balancer try to correct such accumulated differences?
>>   Other applications are memory bound. Memory is needed by request
>> handling but also by session handling. Data accumulation is more
>> important here, because of the sessions. Again, we cannot be perfect,
>> because we don't get a notification, when a session expires or a user
>> logs out. So we can only count the "new" sessions. This counter in my
>> opinion also needs some aging, so that we won't compensate historic
>> inequality without bounds. I must confess, that I don't have an example
>> here, how this inequality can happen for sessions when balancing new
>> session requests (stickiness doesn't influence this), but I think
>> balancing based on old data is the wrong model here too.
>>   
> An ability to balance based on new sessions with an idle time out on
> such sessions would be close enough to reality in cases where sessions
> expire rather than being explicitly invalidated (e.g. by a logout).

But then we end up in a stateful situation. This is a serious design
decision. If we want to track idleness for sessions, we need to track a
list of sessions (session ids) the balancer has seen. This makes things
much more complex. Combined with the inability to track logouts and
the errors coming in from a global situation (more than one Apache
instance), I think it will be more of a problem than a solution.

> Of course that redoes what a servlet engine would be doing and does so
> with lower fidelity.  An ability to ask a backend for its current
> session count and load balance new requests on that basis would be
> really helpful.

Seems much nicer.

> Actually this could and should be generalized beyond active sessions to
> a back-end health metric.  Each backend could compute and respond with a
> relative measure of busyness/health, and the load balancer
> could then balance new (session-less) requests to the least busy / most
> healthy backend.  This would seem to be a *huge* step forward in load
> balancing capability/fidelity.
> 
> It's my understanding that mod_cluster is pursuing just this sort of
> thing to some degree -- but currently only works for JBoss backends.

Yes, I think the counter/aging discussion is for the baseline, i.e. when
we do not have any information channel to or from the backend nodes.

As soon as mod_cluster comes into play, we can use more up-to-date real
data and only need to decide how to interpret it and how to interpolate
during the update interval.

Regards,

Rainer

Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
Jim Jagielski wrote:
> On May 6, 2009, at 4:35 AM, Jess Holle wrote:
>> Of course that redoes what a servlet engine would be doing and does 
>> so with lower fidelity.  An ability to ask a backend for its current 
>> session count and load balance new requests on that basis would be 
>> really helpful.  Whether this ability is buried into AJP, for 
>> instance, or is simply a separate request to a designated URL is 
>> another question, but the latter approach seems fairly general and 
>> the number of such requests could be throttled by a time-to-live 
>> setting on the last such count obtained.
>>
>> Actually this could and should be generalized beyond active sessions 
>> to a back-end health metric.  Each backend could compute and respond 
>> with a relative measure of busyness/health, and the load 
>> balancer could then balance new (session-less) requests to the least 
>> busy / most healthy backend.  This would seem to be a *huge* step 
>> forward in load balancing capability/fidelity.
> The trick, of course, at least with HTTP, is that the querying of
> the backend is, of course, a request, and so one needs to worry about
> such things as keepalives and persistent connections, and how long
> do we wait for responses, etc...
>
> That's why oob-like health-and-status chatter is nice, because
> it doesn't interfere with the normal reverse-proxy/host logic.
>
> An idea: Instead of asking for this info before sending the
> request, what about the backend sending it as part of the response,
> as a response header. You don't know the status of the machine
> "now", but you do know the status of it right after it handled the last
> request (the last time you saw it) and, assuming nothing else touched
> it, that status is likely still "good". Latency will be an issue,
> of course... Overlapping requests where you don't have the response
> from req1 before you send req2 means that both requests think the
> server is at the same state, whereas of course, they aren't, but it
> may even out since req3, for example, (which happens after req1 is done)
> thinks that the backend has 2 concurrent requests, instead of the 1
> (req2) and so maybe isn't selected... The hysteresis would be interesting
> to model :)
There's inherent hysteresis in this sort of thing.

Including health information (e.g. via a custom response header) on all 
responses is an interesting notion.

Exposing a URL on Apache through which the backend can push its health 
information (e.g. upon starting a new session or invalidating a session 
or detecting a low memory condition) also makes sense.

If these do not suffice, a watchdog thread (as in mod_jk) could do 
periodic health checks on the backends in a separate thread or requests 
could pre-request health information for a backend if that backend's 
health information is sufficiently old.

There's lots of possibilities here.

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 6, 2009, at 1:00 PM, William A. Rowe, Jr. wrote:

> Jim Jagielski wrote:
>>
>> That's why oob-like health-and-status chatter is nice, because
>> it doesn't interfere with the normal reverse-proxy/host logic.
>
> +1, for a backend of unknown status (let's just say it's a few minutes
> old, effectively useless information now) ping/pong is the right first
> approach.  But...
>
>> An idea: Instead of asking for this info before sending the
>> request, what about the backend sending it as part of the response,
>> as a response header. You don't know the status of the machine
>> "now", but you do know the status of it right after it handled the  
>> last
>> request (the last time you saw it) and, assuming nothing else touched
>> it, that status is likely still "good".
>
> Yup; that seems like the only sane approach, add an X-Backend-Status  
> or
> whatnot to report the load or other health data.

For example, how long it took me (the backend server) to handle
this request... would be useful to know *that* in addition to
the typical "round-trip" time :)


Re: mod_proxy / mod_proxy_balancer

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> 
> That's why oob-like health-and-status chatter is nice, because
> it doesn't interfere with the normal reverse-proxy/host logic.

+1, for a backend of unknown status (let's just say it's a few minutes
old, effectively useless information now) ping/pong is the right first
approach.  But...

> An idea: Instead of asking for this info before sending the
> request, what about the backend sending it as part of the response,
> as a response header. You don't know the status of the machine
> "now", but you do know the status of it right after it handled the last
> request (the last time you saw it) and, assuming nothing else touched
> it, that status is likely still "good".

Yup; that seems like the only sane approach, add an X-Backend-Status or
whatnot to report the load or other health data.  It's easily consumed
(erased) from the front end response.  If done correctly in a backend
server, it can convey information from the ultimate back end resources
that actually cause the congestion (DB servers or whatnot) rather than
the default response (CPU or whatnot) at the middle tier.



Re: mod_proxy / mod_proxy_balancer

Posted by Rainer Jung <ra...@kippdata.de>.
On 06.05.2009 15:08, Jim Jagielski wrote:
> 
> On May 6, 2009, at 4:35 AM, Jess Holle wrote:
> 
>>
>> Of course that redoes what a servlet engine would be doing and does so
>> with lower fidelity.  An ability to ask a backend for its current
>> session count and load balance new requests on that basis would be
>> really helpful.  Whether this ability is buried into AJP, for
>> instance, or is simply a separate request to a designated URL is
>> another question, but the latter approach seems fairly general and the
>> number of such requests could be throttled by a time-to-live setting
>> on the last such count obtained.
>>
>> Actually this could and should be generalized beyond active sessions
>> to a back-end health metric.  Each backend could compute and respond
>> with a relative measure of busyness/health, and the load
>> balancer could then balance new (session-less) requests to the least
>> busy / most healthy backend.  This would seem to be a *huge* step
>> forward in load balancing capability/fidelity.
>>
> 
> The trick, of course, at least with HTTP, is that the querying of
> the backend is, of course, a request, and so one needs to worry about
> such things as keepalives and persistent connections, and how long
> do we wait for responses, etc...
> 
> That's why oob-like health-and-status chatter is nice, because
> it doesn't interfere with the normal reverse-proxy/host logic.
> 
> An idea: Instead of asking for this info before sending the
> request, what about the backend sending it as part of the response,
> as a response header. You don't know the status of the machine
> "now", but you do know the status of it right after it handled the last
> request (the last time you saw it) and, assuming nothing else touched
> it, that status is likely still "good". Latency will be an issue,
> of course... Overlapping requests where you don't have the response
> from req1 before you send req2 means that both requests think the
> server is at the same state, whereas of course, they aren't, but it
> may even out since req3, for example, (which happens after req1 is done)
> thinks that the backend has 2 concurrent requests, instead of the 1
> (req2) and so maybe isn't selected... The hysteresis would be interesting
> to model :)

I think asking each time before sending data is too much overhead in
general. Of course it depends on how accurately you try to distribute
load. I would expect that in most situations the overhead for a per
request accurate decision does not pay off, especially when under high
load there is always a time window between getting the data and handling
the request, and a lot of concurrent requests will already again have
changed the data.

I expect in most cases a granularity of status data between once per
second and once per minute will be appropriate (still a factor of 60 to
decide or configure).

When sending the data back as part of the response: some load numbers
might be too expensive to retrieve, say, 500 times a second. Other load
numbers might not really make sense as a snapshot (per request), only as
an average value (like: what does CPU load as a snapshot mean? Since
your load data collecting code is on the CPU, a one CPU system will be
100% busy at this point in time. So CPU measurement mostly makes sense
as average values over relatively short intervals).

So we should already expect the backend to send data which is not
necessarily up-to-date w.r.t. each request. I would assume that when
data comes with each response, one would use some sort of floating average.

Piggybacking will be easier to implement (no real protocol needed etc.);
out-of-band communication will be more flexible.

Regards,

Rainer

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 6, 2009, at 4:35 AM, Jess Holle wrote:

>
> Of course that redoes what a servlet engine would be doing and does  
> so with lower fidelity.  An ability to ask a backend for its current  
> session count and load balance new requests on that basis would be  
> really helpful.  Whether this ability is buried into AJP, for  
> instance, or is simply a separate request to a designated URL is  
> another question, but the latter approach seems fairly general and  
> the number of such requests could be throttled by a time-to-live  
> setting on the last such count obtained.
>
> Actually this could and should be generalized beyond active sessions  
> to a back-end health metric.  Each backend could compute and respond  
> with a relative measure of busyness/health, and the load  
> balancer could then balance new (session-less) requests to the least  
> busy / most healthy backend.  This would seem to be a *huge* step  
> forward in load balancing capability/fidelity.
>

The trick, of course, at least with HTTP, is that the querying of
the backend is, of course, a request, and so one needs to worry about
such things as keepalives and persistent connections, and how long
do we wait for responses, etc...

That's why oob-like health-and-status chatter is nice, because
it doesn't interfere with the normal reverse-proxy/host logic.

An idea: Instead of asking for this info before sending the
request, what about the backend sending it as part of the response,
as a response header. You don't know the status of the machine
"now", but you do know the status of it right after it handled the last
request (the last time you saw it) and, assuming nothing else touched
it, that status is likely still "good". Latency will be an issue,
of course... Overlapping requests where you don't have the response
from req1 before you send req2 means that both requests think the
server is at the same state, whereas of course, they aren't, but it
may even out since req3, for example, (which happens after req1 is done)
thinks that the backend has 2 concurrent requests, instead of the 1
(req2) and so maybe isn't selected... The hysteresis would be  
interesting
to model :)

Re: mod_proxy / mod_proxy_balancer

Posted by Jess Holle <je...@ptc.com>.
Rainer Jung wrote:
> In most situations applications need stickiness. So balancing will not
> happen in an ideal situation, instead it tries to keep load equal
> although most requests are sticky.
>
> Because of the influence of sticky requests it can happen that
> accumulated load distributes very unevenly between the nodes. Should the
> balancer try to correct such accumulated differences?
>   
> Other applications are memory bound. Memory is needed by request
> handling but also by session handling. Data accumulation is more
> important here, because of the sessions. Again, we cannot be perfect,
> because we don't get a notification, when a session expires or a user
> logs out. So we can only count the "new" sessions. This counter in my
> opinion also needs some aging, so that we won't compensate historic
> inequality without bounds. I must confess, that I don't have an example
> here, how this inequality can happen for sessions when balancing new
> session requests (stickiness doesn't influence this), but I think
> balancing based on old data is the wrong model here too.
>   
An ability to balance based on new sessions with an idle time out on 
such sessions would be close enough to reality in cases where sessions 
expire rather than being explicitly invalidated (e.g. by a logout).

Of course that redoes what a servlet engine would be doing and does so 
with lower fidelity.  An ability to ask a backend for its current 
session count and load balance new requests on that basis would be 
really helpful.  Whether this ability is buried into AJP, for instance, 
or is simply a separate request to a designated URL is another question, 
but the latter approach seems fairly general and the number of such 
requests could be throttled by a time-to-live setting on the last such 
count obtained.

Actually this could and should be generalized beyond active sessions to 
a back-end health metric.  Each backend could compute and respond with a 
relative measure of busyness/health, and the load balancer 
could then balance new (session-less) requests to the least busy / most 
healthy backend.  This would seem to be a *huge* step forward in load 
balancing capability/fidelity.

It's my understanding that mod_cluster is pursuing just this sort of 
thing to some degree -- but currently only works for JBoss backends.

--
Jess Holle


Re: mod_proxy / mod_proxy_balancer

Posted by Rainer Jung <ra...@kippdata.de>.
Caution: long response!

On 05.05.2009 22:41, jean-frederic clere wrote:
> Jim Jagielski wrote:
>>
>> On May 5, 2009, at 3:02 PM, jean-frederic clere wrote:
>>
>>> Jim Jagielski wrote:
>>>> On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:
>>>>> Jim Jagielski wrote:
>>>>>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>>>>>> Jim Jagielski wrote:
>>>>>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>>>>>
>>>>>>>>> I am trying to get the worker->id and the scoreboard associated
>>>>>>>>> logic moved in the reset() when using a balancer, those workers
>>>>>>>>> need a different handling if we want to have a shared
>>>>>>>>> information area for them.
>>>>>>>>>
>>>>>>>> The thing is that those workers are not really handled
>>>>>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>>>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>>>>>> and m_p_b should handle the balancer-related ones.
>>>>>>>
>>>>>>> Ok by running first the m_p_b child_init() the worker is
>>>>>>> initialised by the m_p_b logic and mod_proxy won't change it later.
>>>>>>>
>>>>>>>>
>>>>>> Yeah... a quick test indicates, at least as far as the perl
>>>>>> framework is concerned, changing so that m_p_b runs 1st in
>>>>>> child_init
>>>>>> results in normal and expected behavior.... Need to do some more
>>>>>> tracing to see if we can copy the pointer instead of the whole
>>>>>> data set with this ordering.
>>>>>
>>>>> I have committed the code... It works for my tests.
>>>>>
>>>> Beat me to it :)
>>>> BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
>>>> lot of what we do in trunk we can backport to 2.2.x....
>>>
>>> Yep but I think we should first have the reset()/age() stuff working
>>> in trunk before backporting to httpd-2.2-proxy :-)
>>>
>>
>> For sure!!
>>
>> BTW: it seems to me that aging is only really needed when the
>> environment changes,
>> mostly when a worker comes back, or when the actual limits are changed
>> in real-time during runtime. Except for these, aging doesn't seem to
>> really add much... long-term steady state only gets invalid when the
>> steady-state changes, after all :)
>>
>> Comments?
>>
>>
> 
> I think we need it for a few reasons:
> - When a worker is idle the information about its load is irrelevant.
> - Being able to calculate throughput and load balance using that
> information is only valid if you have a kind of ticker.
> - In some tests I have made with a mixture of long sessions and single
> request "sessions" you need to "forget" the load caused by the long
> sessions.

Balancing and stickiness are conflicting goals. Stickiness dictates the
node once a session is created; balancing tries to distribute load
equally, so it needs to choose the least loaded node.

In most situations applications need stickiness, so balancing will not
happen in an ideal situation; instead it tries to keep load equal
although most requests are sticky.

Because of the influence of sticky requests it can happen that
accumulated load distributes very unevenly between the nodes. Should the
balancer try to correct such accumulated differences?

It depends (yeah, as always): what we actually mean by "load" varies
with the application. Abstractly we are talking about resource usage. The
backend nodes have limited resources and we want to make optimal use of
them by distributing the resource usage equally.

For some applications CPU is the limiting resource. This resource is
typically coupled to actual requests in flight and not to longer living
objects like sessions. Of course not all requests need an equal amount
of CPU, but as long as we can't actually measure the CPU load, balancing
the number of requests in the sense of "busyness" (parallel requests)
should be best for CPU. Because CPU monitoring is often done on the
basis of averages (and not the maximum short term use per interval),
some request count accumulation as a basis for the balancing will result
in better measured numbers (not necessarily in better "smallest maximum
load"). If we do not age, then strongly unequal historic distribution
(caused by stickiness) will result in an opposite unequal distribution
as soon as a lot of non-sticky requests come in. I think that's not optimal.

Other applications are memory bound. Memory is needed by request
handling but also by session handling. Data accumulation is more
important here, because of the sessions. Again, we cannot be perfect,
because we don't get a notification, when a session expires or a user
logs out. So we can only count the "new" sessions. This counter in my
opinion also needs some aging, so that we won't compensate historic
inequality without bounds. I must confess, that I don't have an example
here, how this inequality can happen for sessions when balancing new
session requests (stickiness doesn't influence this), but I think
balancing based on old data is the wrong model here too.

Then another important resource is bandwidth. Here we are more concerned
about the amount of transferred data. Although in all real cases I know
the limiting bandwidth is always in front of the web server and not
between the web server and the backend, this would be a case to consider
at least theoretically. Here again stickiness conflicts with optimal
balancing, so we can get into a very unequal distribution which should
not be compensated in the future, so aging seems appropriate.

Finally the number of backend connections (and depending on the backend
connector the number of threads needed for them on the backend) is often
a limiting resource. That would be best handled by "busyness", which
doesn't need aging, because it is not an accumulating counter but
instead a snapshot number. "busyness" does behave somewhat unexpectedly
under low load though (when the measured busyness is nearly always "0").

> The next question is how do we call the ageing?
> - Via a thread that calls it after an elapsed time.
> - When there is a request and the actual time is greater than the time
> we should have called it.

Since the numbers one would like to age are global over all Apache
children, one needs to use a global mutex in the second case. Another
detail: mod_jk models the aging as dividing by 2 once a minute. Of
course the factor and the interval could be varied. When doing that
coupled with request handling, it is division by 2^n, where n is the
number of times the interval passed since the last request (yeah, that's
not relevant when there is load, but it is relevant when people start
testing without a stress testing tool by issuing single clicks).

As soon as the watchdog is assumed to be fully accepted, I think the
first option should be considered.

Finally: the data used to make the balancing decision should be
kept separate from statistical data used to monitor the usage of the
nodes. Those data can accumulate without aging. Making the decision data
available via monitoring (balancer manager) additionally helps in
understanding the correctness of the balancer.

Regards,

Rainer

Re: mod_proxy / mod_proxy_balancer

Posted by Ruediger Pluem <rp...@apache.org>.

On 05/05/2009 10:41 PM, jean-frederic clere wrote:

> 
> The next question is how do we call the ageing?
> - Via a thread that calls it after an elapsed time.

More accurate, but more complex.

> - When there is a request and the actual time is greater than the time

Less accurate, but less complex.

I tend to say let's go with this approach. It works even for non-threaded MPMs,
seems accurate enough on a somewhat busy system, and isn't as complex as
a separate thread.

Regards

Rüdiger



Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 5, 2009, at 3:02 PM, jean-frederic clere wrote:
> 
>> Jim Jagielski wrote:
>>> On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:
>>>> Jim Jagielski wrote:
>>>>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>>>>> Jim Jagielski wrote:
>>>>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>>>>
>>>>>>>> I am trying to get the worker->id and the scoreboard associated 
>>>>>>>> logic moved in the reset() when using a balancer, those workers 
>>>>>>>> need a different handling if we want to have a shared 
>>>>>>>> information area for them.
>>>>>>>>
>>>>>>> The thing is that those workers are not really handled
>>>>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>>>>> and m_p_b should handle the balancer-related ones.
>>>>>>
>>>>>> Ok by running first the m_p_b child_init() the worker is 
>>>>>> initialised by the m_p_b logic and mod_proxy won't change it later.
>>>>>>
>>>>>>>
>>>>> Yeah... a quick test indicates, at least as far as the perl
>>>>> framework is concerned, changing so that m_p_b runs 1st in child_init
>>>>> results in normal and expected behavior.... Need to do some more
>>>>> tracing to see if we can copy the pointer instead of the whole
>>>>> data set with this ordering.
>>>>
>>>> I have committed the code... It works for my tests.
>>>>
>>> Beat me to it :)
>>> BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
>>> lot of what we do in trunk we can backport to 2.2.x....
>>
>> Yep but I think we should first have the reset()/age() stuff working 
>> in trunk before backporting to httpd-2.2-proxy :-)
>>
> 
> For sure!!
> 
> BTW: it seems to me that aging is only really needed when the 
> environment changes,
> mostly when a worker comes back, or when the actual limits are changed
> in real-time during runtime. Except for these, aging doesn't seem to
> really add much... long-term steady state only gets invalid when the
> steady-state changes, after all :)
> 
> Comments?
> 
> 

I think we need it for a few reasons:
- When a worker is idle, the information about its load is irrelevant.
- Being able to calculate throughput and load balance using that 
information is only valid if you have a kind of ticker.
- In some tests I have made with a mixture of long sessions and single 
request "sessions" you need to "forget" the load caused by the long 
sessions.

The next question is how do we call the ageing?
- Via a thread that calls it after an elapsed time.
- When there is a request and the current time is past the time at which 
we should have called it.
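For what it's worth, the second option can be sketched in plain C. The names here (`aged_worker`, `lbstatus`, the 60-second interval, the halving factor) are hypothetical stand-ins for illustration, not the real proxy_worker fields:

```c
#include <time.h>

/* Hypothetical stand-ins for the real worker fields; this is only a
 * sketch of request-time ageing, not the actual httpd structures. */
typedef struct {
    int    lbstatus;   /* accumulated load used by the lbmethod */
    time_t last_aged;  /* when the counter was last decayed */
} aged_worker;

#define AGE_INTERVAL 60  /* decay once per minute, arbitrary choice */

/* Called on each request: if more than AGE_INTERVAL has elapsed since
 * the last ageing, halve the accumulated load once per elapsed
 * interval, so long-finished sessions stop dominating the decision. */
static void maybe_age(aged_worker *w, time_t now)
{
    while (now - w->last_aged >= AGE_INTERVAL) {
        w->lbstatus /= 2;
        w->last_aged += AGE_INTERVAL;
    }
}
```

The advantage over a dedicated ticker thread is that no extra synchronisation machinery is needed; the cost is that an idle worker is only aged when the next request happens to arrive.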

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 3:02 PM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:
>>> Jim Jagielski wrote:
>>>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>>>> Jim Jagielski wrote:
>>>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>>>
>>>>>>> I am trying to get the worker->id and the scoreboard  
>>>>>>> associated logic moved in the reset() when using a balancer,  
>>>>>>> those workers need a different handling if we want to have a  
>>>>>>> shared information area for them.
>>>>>>>
>>>>>> The thing is that those workers are not really handled
>>>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>>>> and m_p_b should handle the balancer-related ones.
>>>>>
>>>>> Ok by running first the m_p_b child_init() the worker is  
>>>>> initialised by the m_p_b logic and mod_proxy won't change it  
>>>>> later.
>>>>>
>>>>>>
>>>> Yeah... a quick test indicates, at least as far as the perl
>>>> framework is considered, changing to that m_p_b runs 1st in  
>>>> child_init
>>>> results in normal and expected behavior.... Need to do some more
>>>> tracing to see if we can copy the pointer instead of the whole
>>>> data set with this ordering.
>>>
>>> I have committed the code... It works for my tests.
>>>
>> Beat me to it :)
>> BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
>> lot of what we do in trunk we can backport to 2.2.x....
>
> Yep but I think we should first have the reset()/age() stuff working  
> in trunk before backporting to httpd-2.2-proxy :-)
>

For sure!!

BTW: it seems to me that aging is only really needed when the  
environment changes,
mostly when a worker comes back, or when the actual limits are changed
in real-time during runtime. Except for these, aging doesn't seem to
really add much... long-term steady state only gets invalid when the
steady-state changes, after all :)

Comments?


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:
> 
>> Jim Jagielski wrote:
>>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>>> Jim Jagielski wrote:
>>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>>
>>>>>> I am trying to get the worker->id and the scoreboard associated 
>>>>>> logic moved in the reset() when using a balancer, those workers 
>>>>>> need a different handling if we want to have a shared information 
>>>>>> area for them.
>>>>>>
>>>>> The thing is that those workers are not really handled
>>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>>> and m_p_b should handle the balancer-related ones.
>>>>
>>>> Ok by running first the m_p_b child_init() the worker is initialised 
>>>> by the m_p_b logic and mod_proxy won't change it later.
>>>>
>>>>>
>>> Yeah... a quick test indicates, at least as far as the perl
>>> framework is considered, changing to that m_p_b runs 1st in child_init
>>> results in normal and expected behavior.... Need to do some more
>>> tracing to see if we can copy the pointer instead of the whole
>>> data set with this ordering.
>>
>> I have committed the code... It works for my tests.
>>
> 
> Beat me to it :)
> 
> BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
> lot of what we do in trunk we can backport to 2.2.x....
> 
> 

Yep but I think we should first have the reset()/age() stuff working in 
trunk before backporting to httpd-2.2-proxy :-)

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>> Jim Jagielski wrote:
>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>
>>>>> I am trying to get the worker->id and the scoreboard associated  
>>>>> logic moved in the reset() when using a balancer, those workers  
>>>>> need a different handling if we want to have a shared  
>>>>> information area for them.
>>>>>
>>>> The thing is that those workers are not really handled
>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>> and m_p_b should handle the balancer-related ones.
>>>
>>> Ok by running first the m_p_b child_init() the worker is  
>>> initialised by the m_p_b logic and mod_proxy won't change it later.
>>>
>>>>
>> Yeah... a quick test indicates, at least as far as the perl
>> framework is considered, changing to that m_p_b runs 1st in  
>> child_init
>> results in normal and expected behavior.... Need to do some more
>> tracing to see if we can copy the pointer instead of the whole
>> data set with this ordering.
>
> I have committed the code... It works for my tests.
>

Beat me to it :)

BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
lot of what we do in trunk we can backport to 2.2.x....


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
> 
>> Jim Jagielski wrote:
>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>
>>>> I am trying to get the worker->id and the scoreboard associated 
>>>> logic moved in the reset() when using a balancer, those workers need 
>>>> a different handling if we want to have a shared information area 
>>>> for them.
>>>>
>>> The thing is that those workers are not really handled
>>> by the balancer itself (nor should be), so the reset() shouldn't
>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>> and m_p_b should handle the balancer-related ones.
>>
>> Ok by running first the m_p_b child_init() the worker is initialised 
>> by the m_p_b logic and mod_proxy won't change it later.
>>
>>>
> 
> Yeah... a quick test indicates, at least as far as the perl
> framework is considered, changing to that m_p_b runs 1st in child_init
> results in normal and expected behavior.... Need to do some more
> tracing to see if we can copy the pointer instead of the whole
> data set with this ordering.
> 

I have committed the code... It works for my tests.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 1:12 PM, Jim Jagielski wrote:

>
> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>
>> Jim Jagielski wrote:
>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>
>>>> I am trying to get the worker->id and the scoreboard associated  
>>>> logic moved in the reset() when using a balancer, those workers  
>>>> need a different handling if we want to have a shared information  
>>>> area for them.
>>>>
>>> The thing is that those workers are not really handled
>>> by the balancer itself (nor should be), so the reset() shouldn't
>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>> and m_p_b should handle the balancer-related ones.
>>
>> Ok by running first the m_p_b child_init() the worker is  
>> initialised by the m_p_b logic and mod_proxy won't change it later.
>>
>>>
>
> Yeah... a quick test indicates, at least as far as the perl
> framework is considered, changing to that m_p_b runs 1st in child_init
> results in normal and expected behavior.... Need to do some more
> tracing to see if we can copy the pointer instead of the whole
> data set with this ordering.
>

Looks like we can simply copy the pointer with that change...
So far, haven't run into any scoping or lifetime issues... The
change is ugly though, since we need to adjust for ** instead
of * for lots of code fragments... :)
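To illustrate the difference being discussed (with hypothetical simplified structs, not the real proxy_balancer/proxy_worker types from mod_proxy.h): a balancer that stores a full copy of the worker never sees later initialisation of the original, while one that stores the address always does, and the latter is what forces the `**`-style adjustments in the calling code.

```c
/* Simplified sketch: the two layouts under discussion. Names are
 * made up; the real structures live in mod_proxy.h. */
typedef struct {
    int id;
    int lbstatus;
} demo_worker;

/* Old layout: the balancer owns a copy, so changes made later to the
 * original worker (e.g. by another module's child_init) are not seen. */
typedef struct {
    demo_worker worker_copy;
} balancer_by_copy;

/* New layout: the balancer holds only the address, so every module
 * observes the same, single worker. */
typedef struct {
    demo_worker *worker_ptr;
} balancer_by_ptr;

static int id_via_copy(const balancer_by_copy *b) { return b->worker_copy.id; }
static int id_via_ptr(const balancer_by_ptr *b)  { return b->worker_ptr->id; }
```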

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>
>>> I am trying to get the worker->id and the scoreboard associated  
>>> logic moved in the reset() when using a balancer, those workers  
>>> need a different handling if we want to have a shared information  
>>> area for them.
>>>
>> The thing is that those workers are not really handled
>> by the balancer itself (nor should be), so the reset() shouldn't
>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>> and m_p_b should handle the balancer-related ones.
>
> Ok by running first the m_p_b child_init() the worker is initialised  
> by the m_p_b logic and mod_proxy won't change it later.
>
>>

Yeah... a quick test indicates, at least as far as the perl  
framework is concerned, changing so that m_p_b runs 1st in child_init
results in normal and expected behavior.... Need to do some more
tracing to see if we can copy the pointer instead of the whole
data set with this ordering.

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:

> Jim Jagielski wrote:
>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>
>>> I am trying to get the worker->id and the scoreboard associated  
>>> logic moved in the reset() when using a balancer, those workers  
>>> need a different handling if we want to have a shared information  
>>> area for them.
>>>
>> The thing is that those workers are not really handled
>> by the balancer itself (nor should be), so the reset() shouldn't
>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>> and m_p_b should handle the balancer-related ones.
>
> Ok by running first the m_p_b child_init() the worker is initialised  
> by the m_p_b logic and mod_proxy won't change it later.
>
>> And those
>> are the only ones that have lbmethods associated with them
>> and use (or will use :) ) reset().
>
> There is always a lbmethod associated to a balancer (for the moment  
> byrequests) so we should be able to have the worker belonging to a  
> balancer initialised there and the shared information also created/ 
> initialized there too.
>

But the forward and default reverse workers are not associated with
any balancer and not copied to conf->balancers; they remain in conf->workers,
iirc


Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
> 
>>
>> I am trying to get the worker->id and the scoreboard associated logic 
>> moved in the reset() when using a balancer, those workers need a 
>> different handling if we want to have a shared information area for them.
>>
> 
> The thing is that those workers are not really handled
> by the balancer itself (nor should be), so the reset() shouldn't
> apply. IMO, mod_proxy inits the generic forward/reverse workers
> and m_p_b should handle the balancer-related ones.

OK, by running the m_p_b child_init() first, the worker is initialised by 
the m_p_b logic and mod_proxy won't change it later.

> And those
> are the only ones that have lbmethods associated with them
> and use (or will use :) ) reset().
> 

There is always an lbmethod associated with a balancer (for the moment 
byrequests) so we should be able to have the worker belonging to a 
balancer initialised there and the shared information also 
created/initialized there too.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:

>
> I am trying to get the worker->id and the scoreboard associated  
> logic moved in the reset() when using a balancer, those workers need  
> a different handling if we want to have a shared information area  
> for them.
>

The thing is that those workers are not really handled
by the balancer itself (nor should be), so the reset() shouldn't
apply. IMO, mod_proxy inits the generic forward/reverse workers
and m_p_b should handle the balancer-related ones. And those
are the only ones that have lbmethods associated with them
and use (or will use :) ) reset().
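A rough sketch of that ownership split, with made-up names (the real provider struct in mod_proxy differs): the balancer's method exposes a reset() that initialises the workers belonging to it, so child_init() can delegate rather than touch them inline.

```c
/* Hypothetical miniature of an lbmethod provider with a reset() hook;
 * not the real proxy_balancer_method from mod_proxy.h. */
typedef struct {
    int id;
    int initialized;
} lb_worker;

typedef struct {
    const char *name;
    void (*reset)(lb_worker *workers, int n);
} demo_lbmethod;

/* byrequests-style reset(): assign ids and mark the workers as set up
 * by the balancer, mirroring what child_init used to do inline. */
static void byrequests_reset(lb_worker *workers, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        workers[i].id = i;
        workers[i].initialized = 1;
    }
}

static const demo_lbmethod byrequests = { "byrequests", byrequests_reset };

/* What a balancer-aware child_init() would do: delegate to the method
 * instead of initialising the balancer's workers itself. */
static void balancer_child_init(lb_worker *workers, int n)
{
    byrequests.reset(workers, n);
}
```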

Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Mladen Turk wrote:
> jean-frederic clere wrote:
>> Jim Jagielski wrote:
>>>
>>> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
>>>
>>>> Hi,
>>>>
>>>> There are 2 weird things in the logic.
>>>> - In ap_proxy_add_worker_to_balancer() we make a copy of the worker, 
>>>> why not just the address?
>>>> If you look at child_init() in mod_proxy and mod_proxy_balancer we 
>>>> see that mod_proxy initialises one copy and mod_proxy_balancer the 
>>>> other, it is working but one of the copies is never used.
>>>>
>>>> - We want the child_init of mod_proxy before mod_proxy_balancer, 
>>>> that prevents reset() of the balancer_method to control the creation 
>>>> of the worker.
>>>>
>>>
>>> Yeah, all on target.
>>>
>>>
>>
>> The next thing I am on is the ap_proxy_create_worker() called for 
>> reverse and forward (conf->reverse and conf->forward). 
>> ap_proxy_create_worker() fills the worker->id and they use 
>> ap_proxy_initialize_worker_share(). Do we really need shared information 
>> for those?
>>
> 
> I already answered that to you ;)
> 
> The rest of the code doesn't differentiate the worker types,
> so it is presumed that the worker has a share.
> Sure you can use the malloc for the share, but then you will
> have no track of data transfers on those workers.
> 
> May I ask why is that such a problem?

I am trying to get the worker->id and the scoreboard associated logic 
moved in the reset() when using a balancer, those workers need a 
different handling if we want to have a shared information area for them.

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Mladen Turk <mt...@apache.org>.
jean-frederic clere wrote:
> Jim Jagielski wrote:
>>
>> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
>>
>>> Hi,
>>>
>>> There are 2 weird things in the logic.
>>> - In ap_proxy_add_worker_to_balancer() we make a copy of the worker, 
>>> why not just the address?
>>> If you look at child_init() in mod_proxy and mod_proxy_balancer we 
>>> see that mod_proxy initialises one copy and mod_proxy_balancer the 
>>> other, it is working but one of the copies is never used.
>>>
>>> - We want the child_init of mod_proxy before mod_proxy_balancer, that 
>>> prevents reset() of the balancer_method to control the creation of 
>>> the worker.
>>>
>>
>> Yeah, all on target.
>>
>>
> 
> The next thing I am on is the ap_proxy_create_worker() called for 
> reverse and forward (conf->reverse and conf->forward). 
> ap_proxy_create_worker() fills the worker->id and they use 
> ap_proxy_initialize_worker_share(). Do we really need shared information 
> for those?
> 

I already answered that to you ;)

The rest of the code doesn't differentiate the worker types,
so it is presumed that the worker has a share.
Sure you can use malloc for the share, but then you will
have no tracking of data transfers on those workers.

May I ask why is that such a problem?


Regards
-- 
^(TM)

Re: mod_proxy / mod_proxy_balancer

Posted by jean-frederic clere <jf...@gmail.com>.
Jim Jagielski wrote:
> 
> On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:
> 
>> Hi,
>>
>> There are 2 weird things in the logic.
>> - In ap_proxy_add_worker_to_balancer() we make a copy of the worker, 
>> why not just the address?
>> If you look at child_init() in mod_proxy and mod_proxy_balancer we 
>> see that mod_proxy initialises one copy and mod_proxy_balancer the 
>> other, it is working but one of the copies is never used.
>>
>> - We want the child_init of mod_proxy before mod_proxy_balancer, that 
>> prevents reset() of the balancer_method to control the creation of the 
>> worker.
>>
> 
> Yeah, all on target.
> 
> 

The next thing I am on is the ap_proxy_create_worker() called for 
reverse and forward (conf->reverse and conf->forward). 
ap_proxy_create_worker() fills the worker->id and they use 
ap_proxy_initialize_worker_share(). Do we really need shared information 
for those?

Cheers

Jean-Frederic

Re: mod_proxy / mod_proxy_balancer

Posted by Jim Jagielski <ji...@jaguNET.com>.
On May 5, 2009, at 4:45 AM, jean-frederic clere wrote:

> Hi,
>
> There are 2 weird things in the logic.
> - In ap_proxy_add_worker_to_balancer() we make a copy of the worker,  
> why not just the address?
> If you look at child_init() in mod_proxy and mod_proxy_balancer we  
> see that mod_proxy initialises one copy and mod_proxy_balancer the  
> other, it is working but one of the copies is never used.
>
> - We want the child_init of mod_proxy before mod_proxy_balancer,  
> that prevents reset() of the balancer_method to control the creation  
> of the worker.
>

Yeah, all on target.