Posted to dev@httpd.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2005/01/07 20:52:09 UTC

Working on some load balancing methods

I'm currently working on code that extends the lb method
within the 2.1/2.2 proxy from what is basically a
weighted request count to also be a weighted
traffic count (as measured by bytes transferred)
and a weighted "load" count (as measured by response
time). The former is further along and the methods
will be selectable at runtime... This is definitely
a scratch I'm itching, but before I spend too much
(additional) time on it, I'd like some feedback
on whether the concept is one we can all get behind.

I am also toying with the idea of supporting
a CPU load method when the origin servers are
Apache via a custom response header...
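
A minimal sketch of how the three methods described above could share one
selection routine. It is not the proxy code: the Worker fields, the method
names ("requests", "traffic", "load") and the cost formulas are assumptions
made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    weight: int                  # configured share; higher means more work
    requests: int = 0            # requests completed so far
    bytes_transferred: int = 0   # bytes moved through this worker
    response_seconds: float = 0.0  # summed response time


def record(worker, nbytes, seconds):
    """Update a worker's counters after one proxied request completes."""
    worker.requests += 1
    worker.bytes_transferred += nbytes
    worker.response_seconds += seconds


def request_cost(w):
    # Weighted request count.
    return w.requests / w.weight


def traffic_cost(w):
    # Weighted traffic count, measured in bytes transferred.
    return w.bytes_transferred / w.weight


def load_cost(w):
    # Weighted "load", measured as mean response time per request.
    mean = w.response_seconds / w.requests if w.requests else 0.0
    return mean / w.weight


# Hypothetical method names; the method is chosen at runtime by key.
METHODS = {"requests": request_cost, "traffic": traffic_cost, "load": load_cost}


def pick_worker(workers, method):
    """Return the worker with the lowest weighted cost for the chosen method."""
    return min(workers, key=METHODS[method])
```

Because each method is just a cost function, adding another one (for example
a CPU load figure reported by the origin server) would only need a new entry
in the table.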


Re: Working on some load balancing methods

Posted by Graham Leggett <mi...@sharp.fm>.
Justin Erenkrantz wrote:

> mod_backhand doesn't support Apache 2.x:

But mod_backhand can be ported to Apache 2, if people have the itch 
(which is what seems to be the case).

mod_backhand is my next itch to scratch after the LDAP stuff finally works.

Regards,
Graham
--

Re: Working on some load balancing methods

Posted by Theo Schlossnagle <je...@omniti.com>.
On Jan 11, 2005, at 9:44 AM, Jim Jagielski wrote:
> On Jan 11, 2005, at 4:20 AM, Ben Laurie wrote:
>> Justin Erenkrantz wrote:
>>> --On Saturday, January 8, 2005 10:43 PM +0000 Ben Laurie 
>>> <be...@algroup.co.uk> wrote:
>>>> Errr... mod_backhand?
>>> mod_backhand doesn't support Apache 2.x:
>>> <http://www.backhand.org/mod_backhand/FAQ.shtml#question0>
>>
>> Port it?
>>
>
> I think that we can come much further along with extending
> the lb capability in proxy... For more sophisticated and
> demanding environments, an external lb mechanism
> is likely used. So my pers. pref would be to
> see what can be done in proxy before seeing if
> mod_backhand even needs to be ported. I don't think
> that the web server should need to do everything :)

Having mod_backhand use mod_proxy isn't very difficult.  We implemented 
that for a client.  I don't understand the comment about the web server 
doing stuff.  mod_proxy sits inside Apache and adheres to the same 
limitations due to its architectural position as mod_backhand.

mod_backhand already allows for users to write their own arbitrary load 
balancing decision functions and run them.  Perhaps looking at it 
wouldn't be so bad as it has been around for a while and understands 
the problems inherent in building a load balancer inside of a web 
server.

One extremely important thing to consider is that Alteons and all their 
appliance competitors support 100k concurrent SMTP sessions at a _bare 
minimum_.  Apache 2 is hard pressed to do that.  So, you need more than 
one front-end load balancer to share the load balancing over.  At this 
point, the issues of decision making compound dramatically as total 
knowledge is lost.  As the 2+ front-end load balancers are all 
accepting client-originated traffic, they don't know the decisions that 
the other peers are making.  This results in hard contention problems 
to solve -- which mod_backhand accounts for.
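
The loss of total knowledge can be illustrated with a small simulation. This
is only a sketch under stated assumptions: a least-loaded policy, no
connections closing, and front ends that either share one live view or each
work from a private snapshot. It does not show how mod_backhand handles the
problem.

```python
def simulate(initial_load, num_balancers, requests_each, shared_view):
    """Assign requests to backends using a least-loaded policy.

    With shared_view=True every balancer sees the true load. Otherwise each
    balancer starts from the same snapshot and only sees its own decisions.
    """
    actual = dict(initial_load)
    views = [dict(initial_load) for _ in range(num_balancers)]
    for _ in range(requests_each):
        for view in views:
            known = actual if shared_view else view
            target = min(known, key=known.get)  # least loaded as far as known
            actual[target] += 1
            if not shared_view:
                view[target] += 1  # peers never learn about this decision
    return actual


start = {"a": 0, "b": 4, "c": 4}   # backend "a" has just come online
print(simulate(start, num_balancers=2, requests_each=4, shared_view=True))
print(simulate(start, num_balancers=2, requests_each=4, shared_view=False))
```

In the second run both front ends independently decide that "a" is idle and
send it everything, so it ends up with twice the load of its peers.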

So, if you go implement something I have some advice.  Don't speculate 
on what you think is cool or what worked for you in your specific 
environment.  Alteon, BIG-IP, Foundry, etc. all have several 
load balancing policies for a _reason_.  They don't all work 
everywhere.  There is a lot of theory and research behind this stuff.  
Go read some SIGMETRICS issues and make sure you wrap your head around 
the whole problem before you "solve it" -- because you will be 
disappointed with the "it" you wind up with.

// Theo Schlossnagle
// Principal Engineer -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
// Ecelerity: fastest MTA on Earth


Re: Working on some load balancing methods

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Jan 11, 2005, at 4:20 AM, Ben Laurie wrote:

> Justin Erenkrantz wrote:
>> --On Saturday, January 8, 2005 10:43 PM +0000 Ben Laurie 
>> <be...@algroup.co.uk> wrote:
>>> Errr... mod_backhand?
>> mod_backhand doesn't support Apache 2.x:
>> <http://www.backhand.org/mod_backhand/FAQ.shtml#question0>
>
> Port it?
>

I think that we can come much further along with extending
the lb capability in proxy... For more sophisticated and
demanding environments, an external lb mechanism
is likely used. So my pers. pref would be to
see what can be done in proxy before seeing if
mod_backhand even needs to be ported. I don't think
that the web server should need to do everything :)


Re: Working on some load balancing methods

Posted by Ben Laurie <be...@algroup.co.uk>.
Justin Erenkrantz wrote:
> --On Saturday, January 8, 2005 10:43 PM +0000 Ben Laurie 
> <be...@algroup.co.uk> wrote:
> 
>> Errr... mod_backhand?
> 
> 
> mod_backhand doesn't support Apache 2.x:
> 
> <http://www.backhand.org/mod_backhand/FAQ.shtml#question0>

Port it?

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

Re: Working on some load balancing methods

Posted by Paul Querna <ch...@force-elite.com>.
Justin Erenkrantz wrote:
> --On Saturday, January 8, 2005 10:43 PM +0000 Ben Laurie 
> <be...@algroup.co.uk> wrote:
> 
>> Errr... mod_backhand?
> 
> 
> mod_backhand doesn't support Apache 2.x:
> 
> <http://www.backhand.org/mod_backhand/FAQ.shtml#question0>
> 
> HTH.  -- justin
> 

Hence, the APR Multicast stuff I have recently added.  I would like to 
make something *like* mod_backhand for Apache 2....  The APR Multicast 
code was a first step.

-Paul

Re: Working on some load balancing methods

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Saturday, January 8, 2005 10:43 PM +0000 Ben Laurie 
<be...@algroup.co.uk> wrote:

> Errr... mod_backhand?

mod_backhand doesn't support Apache 2.x:

<http://www.backhand.org/mod_backhand/FAQ.shtml#question0>

HTH.  -- justin

Re: Working on some load balancing methods

Posted by Ben Laurie <be...@algroup.co.uk>.
Jim Jagielski wrote:
> I'm currently working on code that extended the lb method
> within the 2.1/2.2 proxy from what is basically a
> weighted request count to also be a weighted
> traffic count (as measured by bytes transferred)
> and a weighted "load" count (as measured by response
> time). The former is further along and the methods
> will be selectable at runtime... This is definitely
> a scratch I'm itching, but before I spend too much
> (additional) time on it, I'd like some feedback
> on whether the concept is one we can all get behind.
> 
> I am also toying with the idea of supporting
> a CPU load method when the origin servers are
> Apache via a custom response header...

Errr... mod_backhand?

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

Re: Working on some load balancing methods

Posted by Paul Querna <ch...@force-elite.com>.
Brian Akins wrote:
> Mladen Turk wrote:
> 
>> Sure, the general idea was to allow different lb methods.
> >> I've started to collect the bytes transferred for various
> >> protocols to be able to do traffic balancing.
> 
> 
> So, there will be an API to allow developers to develop load balancers 
> without having to muck with the proxy code?  If so, yes!  If not, having 
> one would rule.
> 
> 
> What we did is write our own proxy module because of limitations in the 
> 2.0 one.  Basically it's like this:
> 
> -There is an external program (started like a piped log program) that 
> does "active" health checking. Every x seconds it checks all the 
> "origin" servers and records their status in shared memory.

This is more or less what mod_backhand does. (It is more passive, 
waiting for an active server to send out a multicast or an Ethernet 
broadcast saying that it is alive, along with stats.)

> 
> -In our proxy module, it uses the information in the shared memory to 
> maintain an apr_reslist of connections.  If it gets an error before 
> sending data to the client (one origin server hiccup'ed, for example), 
> it will apr_reslist_invalidate that connection, apr_reslist_acquire 
> another and try again.  It will do this x times.
>

I like this approach. It makes sense in most situations to try multiple 
reslist connections.

> -only after x "errors" will an error actually be returned to the client.
> 
> -because of this, we could not use mod_proxy_http and use our own http 
> client code.
> 

I don't see why, with today's framework, it couldn't reuse most of proxy_http.

> 
> Does this sound doable within the 2.1/2.2 proxy lb?  Or will we have 
> to continue to maintain our own proxy?  This functionality is very 
> important for us.

This sounds like a well designed module for your needs, but I think most 
of the features could be well used in other places.  Could your module 
be made open source, and used as a starting point for us?


-Paul Querna


Re: Working on some load balancing methods

Posted by Brian Akins <ba...@web.turner.com>.
Mladen Turk wrote:
> Sure, the general idea was to allow different lb methods.
> I've started to collect the bytes transferred for various
> protocols to be able to do traffic balancing.

So, there will be an API to allow developers to develop load balancers 
without having to muck with the proxy code?  If so, yes!  If not, having 
one would rule.


What we did is write our own proxy module because of limitations in the 
2.0 one.  Basically it's like this:

-There is an external program (started like a piped log program) that 
does "active" health checking. Every x seconds it checks all the 
"origin" servers and records their status in shared memory.

-In our proxy module, it uses the information in the shared memory to 
maintain an apr_reslist of connections.  If it gets an error before 
sending data to the client (one origin server hiccup'ed, for example), 
it will apr_reslist_invalidate that connection, apr_reslist_acquire 
another and try again.  It will do this x times.

-only after x "errors" will an error actually be returned to the client.

-because of this, we could not use mod_proxy_http and use our own http 
client code.
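
The retry behaviour in the list above can be sketched roughly as follows.
ConnectionPool is a toy stand-in written for this example, not the
apr_reslist API, and send() is a hypothetical callable that raises
BackendError when the origin fails before any data has reached the client.

```python
class BackendError(Exception):
    """Raised when an origin server fails before data reaches the client."""


class ConnectionPool:
    """Toy pool with acquire/release/invalidate, loosely like a reslist."""

    def __init__(self, connections):
        self._free = list(connections)

    def acquire(self):
        if not self._free:
            raise BackendError("no connections left in the pool")
        return self._free.pop(0)

    def release(self, conn):
        self._free.append(conn)   # healthy connection goes back for reuse

    def invalidate(self, conn):
        pass                      # broken connection is simply dropped


def proxy_request(pool, send, max_errors):
    """Try up to max_errors connections before reporting an error."""
    errors = 0
    while True:
        conn = pool.acquire()
        try:
            response = send(conn)
        except BackendError:
            pool.invalidate(conn)
            errors += 1
            if errors >= max_errors:
                raise             # only now does the client see a failure
            continue
        pool.release(conn)
        return response
```

The important constraint from the description is that a retry is only safe
while nothing has been sent to the client, which is why the sketch treats the
whole send() call as the retryable unit.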


Does this sound doable within the 2.1/2.2 proxy lb?  Or will we have 
to continue to maintain our own proxy?  This functionality is very 
important for us.


-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: Working on some load balancing methods

Posted by Mladen Turk <mt...@apache.org>.
Jim Jagielski wrote:
> I'm currently working on code that extended the lb method
> within the 2.1/2.2 proxy from what is basically a
> weighted request count to also be a weighted
> traffic count (as measured by bytes transferred)
> and a weighted "load" count (as measured by response
> time).

Sure, the general idea was to allow different lb methods.
I've started to collect the bytes transferred for various
protocols to be able to do traffic balancing.
Those params are already present in the shared
memory, so making sure we collect the real values
transferred is a first step, though.

I'm also working on extending the load balancer to
a domain clustering model for grouping cluster nodes
in groups to lower the session replication transfer
for backends that offer the session replication.

> The former is further along and the methods
> will be selectable at runtime... This is definitely
> a scratch I'm itching, but before I spend too much
> (additional) time on it, I'd like some feedback
> on whether the concept is one we can all get behind.
> 

+1.

> I am also toying with the idea of supporting
> a CPU load method when the origin servers are
> Apache via a custom response header...
> 

Well, that'll be awesome to achieve, not only to
dynamically update the balancer node member, but
to safely remove it from the cluster in a two
phase process if the backend 'decides' so.
Right now this is possible using 'balancer-handler',
so the dynamic logic is there already.
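
As a rough sketch of the idea, a backend could report its load and ask to be
drained through a response header. Everything specific here is an assumption:
the header name X-Backend-Load, its "cpu=...; state=..." syntax, and the
Member fields are invented for illustration and are not part of the proxy.

```python
from dataclasses import dataclass

HEADER = "X-Backend-Load"   # hypothetical header name


@dataclass
class Member:
    name: str
    cpu_load: float = 0.0
    active: int = 0          # requests currently in flight
    state: str = "ok"        # "ok" -> "draining" -> "removed"


def parse_load_header(value):
    """Parse 'key=value; key=value' pairs into a dict."""
    fields = {}
    for part in value.split(";"):
        if "=" in part:
            key, val = part.split("=", 1)
            fields[key.strip().lower()] = val.strip()
    return fields


def update_member(member, headers):
    """Apply a backend's self-reported status from its response headers."""
    value = headers.get(HEADER)
    if value is None:
        return
    fields = parse_load_header(value)
    try:
        member.cpu_load = float(fields["cpu"])
    except (KeyError, ValueError):
        pass                          # ignore missing or malformed values
    if fields.get("state") == "draining" and member.state == "ok":
        member.state = "draining"     # phase one: stop sending new requests


def request_finished(member):
    member.active -= 1
    if member.state == "draining" and member.active == 0:
        member.state = "removed"      # phase two: safe to drop the member


def eligible(members):
    """Members that may still receive new requests."""
    return [m for m in members if m.state == "ok"]
```

The two phases keep in-flight requests alive: the member stops receiving new
work as soon as it asks, but is only removed once its last request finishes.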

The header processing should also allow the topology
to be discovered dynamically (there is
already PROXY_DYNAMIC_BALANCER_LIMIT meant to
be used for that).
I'm not sure if this will be possible on all
protocols, but we have a long-standing AJP13
extension proposal that, if implemented, will
allow that kind of operation.
http://jakarta.apache.org/tomcat/connectors-doc/common/ajpv13ext.html

Regards,
Mladen.

Re: Working on some load balancing methods

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.

On Fri, 7 Jan 2005, Jim Jagielski wrote:

> I'm currently working on code that extended the lb method within the
> 2.1/2.2 proxy from what is basically a weighted request count to also be
> a weighted traffic count (as measured by bytes transferred) and a
> weighted "load" count (as measured by response time). The former is
> further along and the methods will be selectable at runtime... This is
> definitely a scratch I'm itching, but before I spend too much
> (additional) time on it, I'd like some feedback on whether the concept
> is one we can all get behind.

We scratched this one at some point at Covalent as well - using the
scoreboard and some mod_snmp stuff.

> I am also toying with the idea of supporting a CPU load method when the
> origin servers are Apache via a custom response header...

What you'd ideally want is a pluggable 'metric' mechanism - which has two
layers (this is more or less in line with what we did at Covalent and what
you can do inside some of the Cisco LDs):

layer one:

	get_current_SRV(for some container)

which returns a list of

	id, IP, weight, priority

for a given service, uri subspace, etc. The result is a function of the
callee. So this is not a de-facto statement server wide.

This works much the same way as a true SRV record (as in DNS, as used by
various services such as an active directory service, etc) - the priority
allows for fail over and the weight allows for loadbalancing.

and secondly a

	handoff_to(id)

to confirm to the plugin that a certain backend is used for statistics,
N-path, firewall/nat reprogramming, whatever.

The second layer is also pluggable and simply returns state

	get_load(given backend, for some container)

	load: value between -1 (off-line) and 0..255.

this is used by layer one to get the list on a case by case basis.

On top of this you also want a 'ConfigBackEnd <type> [ opaque values]'
which carries the config info to the right plugin.

And finally (and this is where I got stuck last time and used a horrible
hack (private scoreboard)) some neat way of extending the scoreboard.

What I never did - but which you really do want ultimately - is something
which can snarf timing and data out of the returned header to detect 500
errors, time-to-server and other things, such as cookies - to record
these and use in the next round as an indication of reliability.

The reason why I never got this done properly is that in a lot of setups
you do NOT want the frontend apache to properly process the data going
back - but you want to route that TCP traffic around the server. Which is
what that handoff() is for. But in those cases such return sniffing is
of no use, as it never sees the data.

The reason for the two layers is that

->	on layer one you can do three simple implementations:
	static: round robin on a list, fail over, etc
	DNS: based off SRV records in DNS
	real: based on a backend which uses layer 2
->	on layer two you can do simple things; like just
	pinging - and augment as needed with things like
	SNMP later.
->	the reason you have container info (uri space, vhost,
	cookies, ssl session info) is to do all sorts of
	stickiness when you feel like it.
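
A minimal sketch of the two layers, assuming Python classes stand in for
plugins. The names get_current_srv, handoff_to and get_load follow the text
above; the class names, the 256 - load weighting and the choose() helper are
assumptions added for illustration. Selection follows DNS SRV semantics: the
lowest priority value wins, and weight splits traffic within that priority.

```python
import random
from typing import NamedTuple

OFFLINE = -1   # layer two reports -1 for off-line, otherwise 0..255


class Backend(NamedTuple):
    id: str
    ip: str
    weight: int
    priority: int


class StaticLoad:
    """Layer two: returns a load figure per backend (here from a table)."""

    def __init__(self, loads):
        self.loads = loads

    def get_load(self, backend_id, container):
        return self.loads.get(backend_id, OFFLINE)


class RealSelector:
    """Layer one, 'real' flavour: builds its list from a layer-two plugin."""

    def __init__(self, backends, metric):
        self.backends = backends
        self.metric = metric
        self.handoffs = {}

    def get_current_srv(self, container):
        result = []
        for b in self.backends:
            load = self.metric.get_load(b.id, container)
            if load == OFFLINE:
                continue                      # never offer a dead backend
            result.append(b._replace(weight=256 - load))  # assumed mapping
        return result

    def handoff_to(self, backend_id):
        # Confirmation hook: statistics here, NAT reprogramming elsewhere.
        self.handoffs[backend_id] = self.handoffs.get(backend_id, 0) + 1


def choose(srv_list, rng=random):
    """Pick from the best priority group, weighted-randomly within it."""
    if not srv_list:
        return None
    best = min(b.priority for b in srv_list)
    group = [b for b in srv_list if b.priority == best]
    return rng.choices(group, weights=[b.weight for b in group])[0]
```

A static or DNS-based layer one would expose the same two calls without ever
consulting layer two, which is what makes the split useful.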

end of braindump and itch scratching.

Dw

Re: Working on some load balancing methods

Posted by Bill Stoddard <bi...@wstoddard.com>.
Jeffrey Burgoyne wrote:
> I'm using least connections with an Alteon on a recently installed system.
> Least connections, while perhaps crude, is one of the most effective
> methods for load balancing.
> 
> The general reason is that over the long haul, it will be putting
> connections onto those machines which are discharging their old
> connections faster. That is usually due to better performance, although
> not always.
> 
> We have run into issues with more "intelligent" load balancing that kept
> pumping up one machine more than others because its CPU load was low. In
> fact there was a disk issue which kept the machine in a bad IO wait state
> which was not taken into account, hence most of the connections went to a
> "slower" machine.
> 
> In playing around with the various types of load balancing, I came to the
> conclusion that the KISS principle applied. More intelligent load
> balancing is probably better, but in all honesty it's splitting hairs. If I
> can get 98% of my connections going to the best machine, why work harder
> to get to 99%? And also remember, just because it doesn't get to the best
> machine does not mean it will get done that much slower. Maybe 3 seconds
> instead of 2? Not worth the effort in most cases.
> 
> IMO if you ever get to the case where a machine is simply way too busy to
> handle the load, you need to add more hardware. That's a much better
> investment than trying to squeeze every last drop of performance out
> of a system, at least for a larger organization where time is money.
> 
> One caveat on least connections though: it would be nice if it was a
> weighted least connections average where you could target a percentage of
> connections to every machine. I've got one J2EE back end system needing to
> be load balanced now where one machine is a 4 year old Sun 450 and the
> other is a newer V880. In an ideal world I want 70% of the traffic to the
> V880. Least connections will get me close to that, but not all the way.
> 
> Jeffrey Burgoyne

+1

Re: Working on some load balancing methods

Posted by Jeffrey Burgoyne <bu...@keenuh.com>.
I'm using least connections with an Alteon on a recently installed system.
Least connections, while perhaps crude, is one of the most effective
methods for load balancing.

The general reason is that over the long haul, it will be putting
connections onto those machines which are discharging their old
connections faster. That is usually due to better performance, although
not always.

We have run into issues with more "intelligent" load balancing that kept
pumping up one machine more than others because its CPU load was low. In
fact there was a disk issue which kept the machine in a bad IO wait state
which was not taken into account, hence most of the connections went to a
"slower" machine.

In playing around with the various types of load balancing, I came to the
conclusion that the KISS principle applied. More intelligent load
balancing is probably better, but in all honesty it's splitting hairs. If I
can get 98% of my connections going to the best machine, why work harder
to get to 99%? And also remember, just because it doesn't get to the best
machine does not mean it will get done that much slower. Maybe 3 seconds
instead of 2? Not worth the effort in most cases.

IMO if you ever get to the case where a machine is simply way too busy to
handle the load, you need to add more hardware. That's a much better
investment than trying to squeeze every last drop of performance out
of a system, at least for a larger organization where time is money.

One caveat on least connections though: it would be nice if it was a
weighted least connections average where you could target a percentage of
connections to every machine. I've got one J2EE back end system needing to
be load balanced now where one machine is a 4 year old Sun 450 and the
other is a newer V880. In an ideal world I want 70% of the traffic to the
V880. Least connections will get me close to that, but not all the way.
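
One way to get the weighted behaviour described above is to compare
connections per unit of weight instead of raw connection counts. This is a
sketch under assumptions, not how an Alteon works: the 70/30 weights come
from the example above, and the machine names are only labels.

```python
def pick(active, weights):
    """Weighted least connections: lowest active-connections-per-weight wins."""
    return min(active, key=lambda name: active[name] / weights[name])


def fill(weights, total):
    """Open `total` long-lived connections and report where they landed."""
    active = {name: 0 for name in weights}
    for _ in range(total):
        active[pick(active, weights)] += 1
    return active


weights = {"v880": 70, "sun450": 30}
print(fill(weights, 100))
```

When connections close at different rates the split will drift from the
target, but the faster machine still drains sooner and so keeps attracting
new connections, which preserves the self-correcting property of plain least
connections.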

Jeffrey Burgoyne

Chief Technology Architect
KCSI Keenuh Consulting Services Inc
burgoyne@keenuh.com

On Mon, 10 Jan 2005, Stefan Hueneburg wrote:

> Hello,
>
> Jim Jagielski <ji...@jaguNET.com> wrote:
>
> > I'm currently working on code that extended the lb method
> > within the 2.1/2.2 proxy from what is basically a
> > weighted request count to also be a weighted
> > traffic count (as measured by bytes transferred)
> > and a weighted "load" count (as measured by response
> > time). The former is further along and the methods
> > will be selectable at runtime... This is definitely
> > a scratch I'm itching, but before I spend too much
> > (additional) time on it, I'd like some feedback
> > on whether the concept is one we can all get behind.
>
> Maybe "least connection count" is also interesting.
>
> However, on Alteon-LB I experienced some kind of weird load pumping
> when I used "least connection" mode.
>
> See you,
> Stefan Hüneburg
> --
> easynet GmbH (http://www.de.easynet.net)
> System Administrator, Application Services
> Harburger Schlossstrasse 1, D-21079 Hamburg
> fon: +49-40-77175-526, fax: +49-40-77175-519
>

Re: Working on some load balancing methods

Posted by Stefan Hueneburg <sh...@de.easynet.net>.
Hello,

Jim Jagielski <ji...@jaguNET.com> wrote:

> I'm currently working on code that extended the lb method
> within the 2.1/2.2 proxy from what is basically a
> weighted request count to also be a weighted
> traffic count (as measured by bytes transferred)
> and a weighted "load" count (as measured by response
> time). The former is further along and the methods
> will be selectable at runtime... This is definitely
> a scratch I'm itching, but before I spend too much
> (additional) time on it, I'd like some feedback
> on whether the concept is one we can all get behind.

Maybe "least connection count" is also interesting.

However, on Alteon-LB I experienced some kind of weird load pumping
when I used "least connection" mode.

See you,
Stefan Hüneburg
-- 
easynet GmbH (http://www.de.easynet.net)
System Administrator, Application Services
Harburger Schlossstrasse 1, D-21079 Hamburg
fon: +49-40-77175-526, fax: +49-40-77175-519

RE: Working on some load balancing methods

Posted by Sander Striker <st...@apache.org>.
> From: Jim Jagielski [mailto:jim@jaguNET.com] 
> Sent: Friday, January 07, 2005 8:52 PM
> To: dev@httpd.apache.org
> Subject: Working on some load balancing methods
>
> I'm currently working on code that extended the lb method within the
> 2.1/2.2 proxy from what is basically a weighted request count to also
> be a weighted traffic count (as measured by bytes transferred) and a
> weighted "load" count (as measured by response time). The former is
> further along and the methods will be selectable at runtime... This is
> definitely a scratch I'm itching,

I'm sure you are not the only one with that itch.

> but before I spend too much (additional) time on it, I'd like some
> feedback on whether the concept is one we can all get behind.

FWIW, I like it.

> I am also toying with the idea of supporting a CPU load method when
> the origin servers are Apache via a custom response header...

+1!


Sander