Posted to modproxy-dev@apache.org by Bill Stoddard <bi...@wstoddard.com> on 2003/06/13 16:29:21 UTC

modproxy load balancer

Ping to all list citizens (listizens?) ...
Who would be interested in seeing some load balancing function being put 
into mod_proxy?  Anyone given any thought to what you would like to see 
or maybe even have a design proposal you'd like to discuss?

My short requirements list:
- selectable load balancing algorithm: Round robin, LRU,  response time, 
url driven, session affinity, ?
- automatic detection of backend server failure and removal of the 
failed server from the load balancing routing tables (forever? for a 
period of time? other?)
- connection pooling using HTTP keep-alive (this is a no brainer since 
it is a simple extension of what browsers already do, but it needs to be 
designed in from the start)
- must be effective with multiple child processes, each child must make 
routing decisions globally based on stats maintained in a shared memory 
segment
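
A rough sketch of the first two requirements above: round robin selection that skips backends already marked as failed. Everything here is illustrative. The `backend` struct and `pick_round_robin()` are hypothetical names, not existing mod_proxy code, and the table is an ordinary array rather than the shared memory segment a real multi-process implementation would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend record; not a mod_proxy structure. */
typedef struct {
    const char *host;
    int port;
    int down;   /* non-zero once the backend has been marked failed */
} backend;

/*
 * Return the index of the next usable backend after *cursor, wrapping
 * around the table, or -1 if every backend is marked down (or n is 0).
 * *cursor holds the index chosen last time and is updated on success.
 */
static int pick_round_robin(const backend *tbl, size_t n, size_t *cursor)
{
    assert(tbl != NULL && cursor != NULL);
    for (size_t step = 1; step <= n; step++) {
        size_t i = (*cursor + step) % n;
        if (!tbl[i].down) {
            *cursor = i;
            return (int)i;
        }
    }
    return -1;
}
```

With several child processes, both the table and the cursor would have to live in shared memory and be updated under a lock or with atomic operations; that part is deliberately left out here.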

To do this properly, I would think we need some new config directives. 
Perhaps a new container directive to define a group of backend servers,  
another container directive to define URLs served by a particular group 
of backend servers. Need some way to bind a url group to a server group.

Bill



Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> Ping to all list citizens (listizens?) ...
> Who would be interested in seeing some load balancing function being put 
> into mod_proxy?

Very interested: There is a placeholder in the existing code for this, 
along the lines of "order the list of IPs I should try and connect to here".

A better approach would be to turn this into a hook - then we can have 
proxy_balancer in addition to proxy_http, proxy_ftp, etc.

During ApacheCon 2002, there were some discussions on bringing 
mod_backhand in to do this - backhand could handle the load balancing, 
and proxy would handle the protocol.

> - selectable load balancing algorithm: Round robin, LRU,  response time, 
> url driven, session affinity, ?

Each load balancer in its own module.

> - automatic detection of backend server failure and removal of the 
> failed server from the load balancing routing tables (forever? for a 
> period of time? other?)

And the concept of URL retry - example: if the first server returns a 
4xx or a 5xx, then try the next one transparently.
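
A minimal sketch of the URL retry idea, assuming the proxy already has an ordered list of backends. The callback type `try_backend_fn`, the function `fetch_with_retry()` and the helper `canned_status()` are hypothetical and exist only for illustration; `canned_status()` stands in for a real request by returning a canned status code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: send the request to backend i, return the HTTP status. */
typedef int (*try_backend_fn)(size_t i, void *ctx);

/*
 * Try each backend in order and stop at the first response that is not
 * a 4xx or 5xx. Returns the last status seen (0 if n is 0) and stores
 * the index of the backend that produced it in *used.
 */
static int fetch_with_retry(size_t n, try_backend_fn try_backend,
                            void *ctx, size_t *used)
{
    int status = 0;
    assert(try_backend != NULL && used != NULL);
    for (size_t i = 0; i < n; i++) {
        status = try_backend(i, ctx);
        *used = i;
        if (status < 400)
            break;          /* good enough, no further retries */
    }
    return status;
}

/* Stand-in for a real request: ctx is an array of canned status codes. */
static int canned_status(size_t i, void *ctx)
{
    return ((const int *)ctx)[i];
}
```

One caveat worth stating: retrying transparently is only safe for requests that can be repeated without side effects, so a real implementation would probably restrict it to idempotent methods such as GET.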

> - connection pooling using HTTP keep-alive (this is a no brainer since 
> it is a simple extension of what browsers already do, but it needs to be 
> designed in from the start)

Connection pooling was given a lot of thought, and I don't think that 
the performance advantage is worth the effort.

In a reverse proxy situation, the network between the proxy and the 
backend is likely to be fast enough that pooling gives virtually no 
advantage.

In a forward proxy situation, the large spread of URLs being accessed 
means that the vast majority of pooled connections will simply hang 
around unused, eating up server resources.

> - must be effective with multiple child processes, each child must make 
> routing decisions globally based on stats maintained in a shared memory 
> segment
> 
> To do this properly, I would think we need some new config directives. 
> Perhaps a new container directive to define a group of backend servers,  
> another container directive to define URLs served by a particular group 
> of backend servers. Need some way to bind a url group to a server group.

We should just define some sane namespaces for directives, and then do 
them on a per-module basis.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Mathias Herberts <Ma...@gicm.fr>.
How about mod_backhand ?

www.backhand.org

-- 
--  Informatique du Credit Mutuel  ----  Reseaux et Systemes Distribues
--  32 rue Mirabeau -- Le Relecq-Kerhuon -- 29808 Brest Cedex 9, FRANCE
--  Tel +33298004653 - Fax +33298284005 - Mail Mathias.Herberts@gicm.fr
--  Key Fingerprint: 8778 D2FD 3B4A 6B33 10AB  F503 63D0 ADAE 9112 03E4

Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> Thinking out loud..... Should this be a hook or an optional function?  A 
> hook could be useful for iterating across multiple load balancing 
> modules, routing requests for different urls using different algorithms; 
> would this be a common configuration?

If given many options, I would want the ability to select more than one. 
Even though in 90% of the cases the default round robin may suffice, I 
would probably be annoyed if the last 10% of the time I needed the 
ability and it was not available to me.

> The load balance module would 
> also need to be told when the request was complete (it needs to keep 
> track of how many active connections there are to each backend machine) 
> and when an ip address was unsuccessfully tried (so that ip address can 
> be taken out of the list of candidates). The former can be done by 
> registering a cleanup against the request pool. The latter could be done 
> with a callback function, optional function or hook back into the load 
> balance module.

All of these can be achieved by registering hooks.

For example, a simple DNS round robin module would hook into the "give 
me an URL I'll give you some IP addresses" bit, but would leave the 
other hooks alone.

A more advanced backhand module might do the URL to IP translation, then 
would hook into the end of the request to gather stats about that 
request for its own purposes.

I would also like to specify the order in which the modules are tried 
somehow, in the same way that mod_cache chooses either memory or disk 
for its cache.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Bill Stoddard <bi...@wstoddard.com>.
Graham Leggett wrote:

> Theo E. Schlossnagle wrote:
>
>>> Proxy contains a placeholder (which should be replaced with a hook) 
>>> that says "I have a list of IP addresses, decide in what order I 
>>> should try these addresses here".
>>>
>>> My understanding of backhand is that it answers the above question - 
>>> in theory we could pull the code in by hooking it in.
>>
>>
>>
>> mod_backhand can also provide the list of IPs -- in fact, it would be 
>> best that way.
>
>
> To rephrase it, mod_proxy should give an URL to one or more backend 
> modules (most likely backhand), which should return a list of IP 
> addresses saying "try these in this order".
>
> The backend module might do simple DNS round robin in its simplest
> form, going all the way up to all the functionality of backhand.


Thinking out loud..... Should this be a hook or an optional function?  A 
hook could be useful for iterating across multiple load balancing 
modules, routing requests for different urls using different algorithms; 
would this be a common configuration?  The load balance module would 
also need to be told when the request was complete (it needs to keep 
track of how many active connections there are to each backend machine) 
and when an ip address was unsuccessfully tried (so that ip address can 
be taken out of the list of candidates). The former can be done by 
registering a cleanup against the request pool. The latter could be done 
with a callback function, optional function or hook back into the load 
balance module.
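
To make the bookkeeping concrete, here is a small sketch of the per-backend state such a load balance module might keep. The names (`lb_slot`, `lb_acquire`, `lb_release`, `lb_mark_failed`) are made up for illustration. In httpd the release step would presumably be wired up with `apr_pool_cleanup_register()` on the request pool, which is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-backend bookkeeping. */
typedef struct {
    const char *host;
    int active;   /* requests currently in flight to this backend */
    int usable;   /* cleared when a connection attempt to it fails */
} lb_slot;

/*
 * Pick the usable backend with the fewest active requests and count the
 * new request against it. Returns its index, or -1 if none is usable.
 */
static int lb_acquire(lb_slot *s, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (s[i].usable && (best < 0 || s[i].active < s[best].active))
            best = (int)i;
    }
    if (best >= 0)
        s[best].active++;
    return best;
}

/* What the request-pool cleanup would call when the request completes. */
static void lb_release(lb_slot *s, int i)
{
    assert(s[i].active > 0);
    s[i].active--;
}

/* What the "address was unsuccessfully tried" callback would call. */
static void lb_mark_failed(lb_slot *s, int i)
{
    s[i].usable = 0;
}
```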

Bill



Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Theo E. Schlossnagle wrote:

>> Proxy contains a placeholder (which should be replaced with a hook) 
>> that says "I have a list of IP addresses, decide in what order I 
>> should try these addresses here".
>>
>> My understanding of backhand is that it answers the above question - 
>> in theory we could pull the code in by hooking it in.
> 
> 
> mod_backhand can also provide the list of IPs -- in fact, it would be 
> best that way.

To rephrase it, mod_proxy should give an URL to one or more backend 
modules (most likely backhand), which should return a list of IP 
addresses saying "try these in this order".

The backend module might do simple DNS round robin in its simplest form, 
going all the way up to all the functionality of backhand.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> If I am reading this correctly, my polarity must be different from that of the 
> author of this comment.  The way I look at it, most bytes flow from the 
> webserver to the web client.

But clients make connections to webservers, webservers do not make 
connections to clients. The "stream" represents the sense of the data 
connection (client down to webserver, down to backend), not the byte flow.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Bill Stoddard <bi...@wstoddard.com>.
Perhaps I am being silly, but do we need to standardize on a definition 
of 'downstream' and 'upstream'?  Here is a comment from proxy_http.c:

  /* Note: Memory pool allocation.
   * A downstream keepalive connection is always connected to the existence
   * (or not) of an upstream keepalive connection. If this is not done then
   * load balancing against multiple backend servers breaks (one backend
   * server ends up taking 100% of the load), and the risk is run of
   * downstream keepalive connections being kept open unnecessarily. This
   * keeps webservers busy and ties up resources.
   *
   * As a result, we allocate all sockets out of the upstream connection
   * pool, and when we want to reuse a socket, we check first whether the
   * connection ID of the current upstream connection is the same as that
   * of the connection when the socket was opened.
   */

If I am reading this correctly, my polarity must be different from that of the 
author of this comment.  The way I look at it, most bytes flow from the 
webserver to the web client. Analogous to water, the bytes flow 
'downstream' from the server to the client.  Now a proxy maintains two 
connections, one to the client and one to the webserver. I would call 
the connection from the client to the proxy the 'downstream' connection 
and the connection from the proxy to the server the 'upstream' 
connection.  What say you?

Bill



Re: modproxy load balancer

Posted by Bill Stoddard <bi...@wstoddard.com>.
Graham Leggett wrote:

> Theo E. Schlossnagle wrote:
>
>> I am not sure how you envision the hooks being loaded at runtime.  If 
>> they are their own modules, and just place themselves in the 
>> mod_proxy chain, then I can piggyback on the builtin module 
>> initialization functions.  Otherwise, I need some way to initialize 
>> my module.
>
>
> Use the exact same model as is used now for proxy_http, proxy_ftp and 
> proxy_connect. All three of these modules depend on hooks defined 
> inside mod_proxy.
>
> mod_dns, mod_sticky, mod_backhand, etc would simply be the 4th, 5th 
> and 6th modules dependent on mod_proxy.
>
>> I don't think the hook should be responsible for making the 
>> connection.  I think the hook should be solely responsible for 
>> listing, in order of preference, where connections should be 
>> established.  In perl syntax:
>>
>> [
>>   { 'protocol' => 'http',
>>     'IP' => '10.2.3.4',
>>     'port' => '80' },
>>   { 'protocol' => 'http',
>>     'IP' => '10.2.3.8',
>>     'port' => '8080' },
>> ]
>>
>> then mod_proxy should be responsible for taking that list and making 
>> real and usable connections out of them.
>
>
> Ok... my thinking was that it would simplify the notification to the 
> backend of connection success or failure, but then doing it your way 
> simplifies the backend module.
>
> What we could do is have two hooks - the first gets given an 
> URL/hostname/port, and returns a list of IPs to try.
>
> Proxy then tries those IPs in turn.
>
> Then a second hook is run saying "oh by the way, that IP address you 
> gave me is down with status whatever, or it worked fine thanks".
>
> If a connection failed, all the backend modules get to find out and 
> can blacklist that server, whatever. If the connection succeeded, the 
> time difference between the first and second hook would be the total 
> time of connection, which could be used for loading stats. 

Yes, this is exactly what I was thinking.  Thanks for explaining your 
reason for making the hook do the connection.

Bill



Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Theo E. Schlossnagle wrote:

> I am not sure how you envision the hooks being loaded at runtime.  If 
> they are their own modules, and just place themselves in the mod_proxy 
> chain, then I can piggyback on the builtin module initialization 
> functions.  Otherwise, I need some way to initialize my module.

Use the exact same model as is used now for proxy_http, proxy_ftp and 
proxy_connect. All three of these modules depend on hooks defined inside 
mod_proxy.

mod_dns, mod_sticky, mod_backhand, etc would simply be the 4th, 5th and 
6th modules dependent on mod_proxy.

> I don't think the hook should be responsible for making the connection.  
> I think the hook should be solely responsible for listing, in order of 
> preference, where connections should be established.  In perl syntax:
> 
> [
>   { 'protocol' => 'http',
>     'IP' => '10.2.3.4',
>     'port' => '80' },
>   { 'protocol' => 'http',
>     'IP' => '10.2.3.8',
>     'port' => '8080' },
> ]
> 
> then mod_proxy should be responsible for taking that list and making 
> real and usable connections out of them.

Ok... my thinking was that it would simplify the notification to the 
backend of connection success or failure, but then doing it your way 
simplifies the backend module.

What we could do is have two hooks - the first gets given an 
URL/hostname/port, and returns a list of IPs to try.

Proxy then tries those IPs in turn.

Then a second hook is run saying "oh by the way, that IP address you 
gave me is down with status whatever, or it worked fine thanks".

If a connection failed, all the backend modules get to find out and can 
blacklist that server, whatever. If the connection succeeded, the time 
difference between the first and second hook would be the total time of 
connection, which could be used for loading stats.
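
The two-hook idea could look roughly like this. It is only a sketch under stated assumptions: `lb_target`, `lb_list_targets()` and `lb_report()` are invented names, the current time is passed in explicitly so the timing logic is easy to follow, and real hooks would of course go through mod_proxy's hook machinery rather than plain function calls.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical record for one address a backend module can hand out. */
typedef struct {
    const char *ip;
    int port;
    int blacklisted;      /* set when the proxy reports a failed attempt */
    time_t handed_out;    /* when hook 1 last returned this address */
    double last_seconds;  /* duration of the last successful request */
} lb_target;

/*
 * Hook 1: store pointers to up to max non-blacklisted targets in out,
 * in preference order, and return how many were stored.
 */
static size_t lb_list_targets(lb_target *all, size_t n,
                              lb_target **out, size_t max, time_t now)
{
    size_t k = 0;
    assert(all != NULL && out != NULL);
    for (size_t i = 0; i < n && k < max; i++) {
        if (!all[i].blacklisted) {
            all[i].handed_out = now;
            out[k++] = &all[i];
        }
    }
    return k;
}

/* Hook 2: the proxy reports how the attempt on one target went. */
static void lb_report(lb_target *t, int ok, time_t now)
{
    assert(t != NULL);
    if (ok)
        t->last_seconds = difftime(now, t->handed_out);  /* loading stat */
    else
        t->blacklisted = 1;
}
```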

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by "Theo E. Schlossnagle" <je...@omniti.com>.
Graham Leggett wrote:
> In order to complete the request, a function:
> 
> proxy_run_scheme_handler(r, conf, url, ents[i].hostname, ents[i].port);
> 
> is run. This either connects to hostname and port, and asks for URL 
> (forward proxy), or if hostname and port are NULL, it connects to the 
> host in URL (reverse proxy).

I am not sure how you envision the hooks being loaded at runtime.  If they are 
their own modules, and just place themselves in the mod_proxy chain, then I 
can piggyback on the builtin module initialization functions.  Otherwise, 
I need some way to initialize my module.

> I think the hook should go inside the proxy_run_scheme_handler() 
> function, and the hooked code should accept an URL (or a hostname and 
> port) and convert it into a connection, which is passed back to the rest 
> of the code path.

I don't think the hook should be responsible for making the connection.  I 
think the hook should be solely responsible for listing, in order of 
preference, where connections should be established.  In perl syntax:

[
   { 'protocol' => 'http',
     'IP' => '10.2.3.4',
     'port' => '80' },
   { 'protocol' => 'http',
     'IP' => '10.2.3.8',
     'port' => '8080' },
]

then mod_proxy should be responsible for taking that list and making real and 
usable connections out of them.

> 
> The hooked module can then do what it likes with connection failure: 
> retry with a round robin connection, etc until it is happy (or unhappy).
> 
> The existing code can be pulled out of what's there now, and moved into 
> a simple module called "proxy_dns" (or something).
> 
> One other thing that must be looked at is module ordering:
> 
> Take for example the case where you want to support "sticky" 
> connections. You would probably want to watch either a cookie or a 
> request variable called JSESSIONID, and make sure that all requests with 
> that session id go to that server.

mod_backhand can do that now because it has access to the whole request_rec 
structures in 1.3.x.  So, similar access would be very useful.

My approach to mod_backhand 2.0 was to:
   (a) take all systems code, shared segments, etc. and place them in a 
standalone process
   (b) throw out 80% of the code and use mod_proxy :-)
   (c) rewrite the candidacy functions for the new API.

> But what happens if the sticky server is down? The module would say 
> DECLINED and hand it on to the next module, which might be proxy_dns, 
> whatever.

Perfect.  Also, it is important for each "module" or hook to be able to see 
the complete list that resulted from the previous hook.  See this for a clear 
idea of what I mean:

http://www.backhand.org/ApacheCon2000/EU/img24.htm
http://www.backhand.org/ApacheCon2000/EU/img25.htm
http://www.backhand.org/ApacheCon2000/EU/img26.htm

Obviously the API here would need to change a tad, but if the ServerSlot 
structure contained all the information (IP:port) that mod_proxy needed to 
establish a connection, the API actually would pop right in.

Having this ability will allow someone to mix and match modules to really 
achieve complex proxy decision making that matches their needs.  And if it 
doesn't do _exactly_ what they want, they can write a very small link to put 
in the chain instead of writing a big hook that reinvents a lot of working, 
tested code.


> We need some way though of telling proxy that proxy_sticky comes before 
> proxy_dns.
> 
> Perhaps we can have a directive the same as in mod_cache, which 
> specifies the order in which the backend modules are tried.

This ordering is important for mod_backhand integration.  Most of the 
candidacy functions (that reorder and augment the "list of hosts") are very 
simple, and complicated balancing logic is achieved by cascading them -- and 
order matters :-)
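
A sketch of what cascading might look like, written against a made-up `candidate` struct that carries the protocol/IP/port fields from the Perl example above plus two illustrative fields (`load`, `alive`). The functions `by_alive()`, `by_load()` and `run_chain()` are hypothetical and are not mod_backhand's actual candidacy functions or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry; load and alive are illustrative fields. */
typedef struct {
    const char *protocol;
    const char *ip;
    int port;
    int load;    /* lower is better */
    int alive;
} candidate;

/* A candidacy function edits the list in place and returns the new length. */
typedef size_t (*candidacy_fn)(candidate *list, size_t n);

/* Drop entries that are not alive, preserving the order of the rest. */
static size_t by_alive(candidate *list, size_t n)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (list[i].alive)
            list[k++] = list[i];
    return k;
}

/* Move the least loaded entry to the front of the list. */
static size_t by_load(candidate *list, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (list[i].load < list[best].load)
            best = i;
    if (n > 0 && best != 0) {
        candidate tmp = list[0];
        list[0] = list[best];
        list[best] = tmp;
    }
    return n;
}

/* Run the chain in configured order; each step sees the previous result. */
static size_t run_chain(const candidacy_fn *chain, size_t steps,
                        candidate *list, size_t n)
{
    assert(chain != NULL && list != NULL);
    for (size_t s = 0; s < steps; s++)
        n = chain[s](list, n);
    return n;
}
```

Swapping the two steps changes the outcome: running `by_load` first could promote a dead server to the front before `by_alive` removes it, leaving the survivors in their original order rather than with the least loaded one first.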

-- 
Theo Schlossnagle
Principal Consultant
OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Phone:  +1 410 872 4910 x201     Fax:  +1 410 872 4911
1024D/82844984/95FD 30F1 489E 4613 F22E  491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA  3D 90 B9 9F BE 27 24 E7


Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

> I'll start implementing some of the hook calls in mod_proxy. Once I get 
> the hooks in place, you should be able to write a backhand load balancer 
> module that declares interest in these hooks. Hope to start working on 
> the mod_proxy mods within the next few days.

In line 433 of mod_proxy.c, there is the code that (IMHO) needs to be 
changed.

First, if we are configured to connect to one or more further downstream 
proxies, we try to connect to each one in the order they are specified 
in the config file. If we are configured to connect direct (the usual 
case), then we try that direct connection. The result is a connection to 
some remote server.

In order to complete the request, a function:

proxy_run_scheme_handler(r, conf, url, ents[i].hostname, ents[i].port);

is run. This either connects to hostname and port, and asks for URL 
(forward proxy), or if hostname and port are NULL, it connects to the 
host in URL (reverse proxy).

I think the hook should go inside the proxy_run_scheme_handler() 
function, and the hooked code should accept an URL (or a hostname and 
port) and convert it into a connection, which is passed back to the rest 
of the code path.

The hooked module can then do what it likes with connection failure: 
retry with a round robin connection, etc until it is happy (or unhappy).

The existing code can be pulled out of what's there now, and moved into 
a simple module called "proxy_dns" (or something).

One other thing that must be looked at is module ordering:

Take for example the case where you want to support "sticky" 
connections. You would probably want to watch either a cookie or a 
request variable called JSESSIONID, and make sure that all requests with 
that session id go to that server.

But what happens if the sticky server is down? The module would say 
DECLINED and hand it on to the next module, which might be proxy_dns, 
whatever.

We need some way though of telling proxy that proxy_sticky comes before 
proxy_dns.

Perhaps we can have a directive the same as in mod_cache, which 
specifies the order in which the backend modules are tried.
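
As a sketch of the ordering problem, assume each backend module exposes one function that either names a target or declines. All names here (`lb_request`, `sticky_pick`, `dns_pick`, `run_modules`) are hypothetical, the sticky table is reduced to a single hard-coded session, and the return codes are only modelled on httpd's OK and DECLINED.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { LB_OK = 0, LB_DECLINED = -1 };  /* modelled on httpd's OK/DECLINED */

typedef struct {
    const char *session_id;  /* e.g. the JSESSIONID value, or NULL */
    const char *host;        /* host named in the URL */
} lb_request;

typedef int (*lb_module_fn)(const lb_request *req, const char **target);

/* proxy_sticky stand-in: one remembered session and its server. */
static const char *sticky_session = "abc123";
static const char *sticky_server = "10.2.3.4";
static int sticky_server_up = 1;

static int sticky_pick(const lb_request *req, const char **target)
{
    if (req->session_id == NULL ||
        strcmp(req->session_id, sticky_session) != 0)
        return LB_DECLINED;      /* not a session we know about */
    if (!sticky_server_up)
        return LB_DECLINED;      /* sticky server is down, pass it on */
    *target = sticky_server;
    return LB_OK;
}

/* proxy_dns stand-in: always answers with the host from the URL. */
static int dns_pick(const lb_request *req, const char **target)
{
    *target = req->host;
    return LB_OK;
}

/* Try the modules in configured order; the first non-DECLINED answer wins. */
static int run_modules(const lb_module_fn *mods, size_t n,
                       const lb_request *req, const char **target)
{
    assert(mods != NULL && req != NULL && target != NULL);
    for (size_t i = 0; i < n; i++) {
        int rc = mods[i](req, target);
        if (rc != LB_DECLINED)
            return rc;
    }
    return LB_DECLINED;
}
```

The array passed to `run_modules()` plays the role of the ordering directive: putting `sticky_pick` before `dns_pick` gives session affinity with a DNS fallback, while the reverse order would never consult the sticky module at all.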

Thoughts?

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Bill Stoddard <bi...@wstoddard.com>.
Theo E. Schlossnagle wrote:

> Graham Leggett wrote:
>
>> George Schlossnagle wrote:
>>
>>> Isn't there ongoing discussion about incorporating mod_backhand into 
>>> mod_proxy for this?
>>
>>
>> This I think would be the quickest solution.
>>
>> Proxy contains a placeholder (which should be replaced with a hook) 
>> that says "I have a list of IP addresses, decide in what order I 
>> should try these addresses here".
>>
>> My understanding of backhand is that it answers the above question - 
>> in theory we could pull the code in by hooking it in.
>
>
> mod_backhand can also provide the list of IPs -- in fact, it would be 
> best that way.
>
I'll start implementing some of the hook calls in mod_proxy. Once I get 
the hooks in place, you should be able to write a backhand load balancer 
module that declares interest in these hooks. Hope to start working on 
the mod_proxy mods within the next few days.

Bill


Re: modproxy load balancer

Posted by "Theo E. Schlossnagle" <je...@omniti.com>.
Graham Leggett wrote:
> George Schlossnagle wrote:
> 
>> Isn't there ongoing discussion about incorporating mod_backhand into 
>> mod_proxy for this?
> 
> This I think would be the quickest solution.
> 
> Proxy contains a placeholder (which should be replaced with a hook) that 
> says "I have a list of IP addresses, decide in what order I should try 
> these addresses here".
> 
> My understanding of backhand is that it answers the above question - in 
> theory we could pull the code in by hooking it in.

mod_backhand can also provide the list of IPs -- in fact, it would be best 
that way.

-- 
Theo Schlossnagle
Principal Consultant
OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Phone:  +1 410 872 4910 x201     Fax:  +1 410 872 4911
1024D/82844984/95FD 30F1 489E 4613 F22E  491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA  3D 90 B9 9F BE 27 24 E7


Re: modproxy load balancer

Posted by Graham Leggett <mi...@sharp.fm>.
George Schlossnagle wrote:

> Isn't there ongoing discussion about incorporating mod_backhand into 
> mod_proxy for this?

This I think would be the quickest solution.

Proxy contains a placeholder (which should be replaced with a hook) that 
says "I have a list of IP addresses, decide in what order I should try 
these addresses here".

My understanding of backhand is that it answers the above question - in 
theory we could pull the code in by hooking it in.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."


Re: modproxy load balancer

Posted by Bill Stoddard <bi...@wstoddard.com>.
George Schlossnagle wrote:

> Isn't there ongoing discussion about incorporating mod_backhand into 
> mod_proxy for this?
>
>
> George

Hi George,
Thanks for the reminder. I can't say that I have been paying much 
attention lately, but this discussion goes back quite some time.

I am digging into the doc on mod_backhand  but still don't quite grok 
how backhand works (I am still reading).  It appears to rely on all the 
servers in the cluster being 'backhand aware' (for lack of a better 
term) and that the servers communicate their status to each other via 
UDP. Requiring all the servers in the cluster to be backhand aware 
severely limits the usefulness of mod_backhand.  There would also be 
potential security issues with the ports backhand aware servers use to 
communicate with each other (none that could not be fixed if they even 
exist at all).  Am I missing something important here?

Bill


Re: modproxy load balancer

Posted by Chuck Murcko <ch...@topsail.org>.
On Friday, Jun 13, 2003, at 08:03 America/Phoenix, George Schlossnagle 
wrote:

> On Friday, June 13, 2003, at 10:48  AM, Eli Marmor wrote:
>
>> Bill Stoddard wrote:
>>
>>> Who would be interested in seeing some load balancing function being 
>>> put
>>> into mod_proxy?
>>
>> Several million users?
>> Many thousands of webmasters?
>
> Isn't there ongoing discussion about incorporating mod_backhand into 
> mod_proxy for this?
>

I presumed there was some discussion between Theo and Graham at 
ApacheCon about this. The copyrights prevent us from just dropping 
mod_backhand into httpd. However, I have a patch that adds persistent 
connections to mod_proxy that I am packaging and plan to submit over 
the weekend, after I finish some current day job work.

Chuck


Re: modproxy load balancer

Posted by George Schlossnagle <ge...@omniti.com>.
On Friday, June 13, 2003, at 10:48  AM, Eli Marmor wrote:

> Bill Stoddard wrote:
>
>> Who would be interested in seeing some load balancing function being 
>> put
>> into mod_proxy?
>
> Several million users?
> Many thousands of webmasters?

Isn't there ongoing discussion about incorporating mod_backhand into 
mod_proxy for this?


George


Re: modproxy load balancer

Posted by Eli Marmor <ma...@netmask.it>.
Bill Stoddard wrote:

> Who would be interested in seeing some load balancing function being put
> into mod_proxy?

Several million users?
Many thousands of webmasters?

> - selectable load balancing algorithm: Round robin, LRU,  response time,
> url driven, session affinity, ?

Before offering a choice of algorithms, a single basic algorithm (maybe
based on one of the above) would be great too.

> - automatic detection of backend server failure and removal of the
> failed server from the load balancing routing tables (forever? for a
> period of time? other?)

Hmmm...
You keep probing it once per interval (1 minute? more?)
You can also add a directive to define the interval, or the
action to be taken.
But why not just "steal" ideas from the LVS project?
They already passed all these steps (for another layer) and chose/
developed solutions.
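
The "probe it again after an interval" approach might be sketched as follows. The `health` struct and its functions are hypothetical, and `retry_secs` stands for whatever value a configuration directive would supply. This is not how LVS does it, just the simplest form of the idea.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical health record for one backend. */
typedef struct {
    int failed;         /* non-zero while the backend is considered down */
    time_t failed_at;   /* when the failure was recorded */
} health;

static void health_mark_failed(health *h, time_t now)
{
    h->failed = 1;
    h->failed_at = now;
}

static void health_mark_ok(health *h)
{
    h->failed = 0;
}

/*
 * A backend may be tried if it is healthy, or if it failed at least
 * retry_secs ago, in which case one probe request is due.
 */
static int health_eligible(const health *h, time_t now, double retry_secs)
{
    assert(h != NULL);
    if (!h->failed)
        return 1;
    return difftime(now, h->failed_at) >= retry_secs;
}
```

If the probe fails, calling `health_mark_failed()` again restarts the interval; if it succeeds, `health_mark_ok()` puts the server back into normal rotation.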

> Perhaps a new container directive to define a group of backend servers,
> another container directive to define URLs served by a particular group
> of backend servers. Need some way to bind a url group to a server group.

I would re-use the syntax of mod_rewrite, including its parser and
other stuff from that module.
It gives you a bunch of other goodies, such as choosing exactly how
the content is served (proxy/redirect), etc.

-- 
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-23-7338          Kfar-Saba 44641, Israel

Re: modproxy load balancer

Posted by Federico Mennite <fe...@lifeware.ch>.
Hi Bill,
Bill Stoddard wrote:

> Ping to all list citizens (listizens?) ...
> Who would be interested in seeing some load balancing function being 
> put into mod_proxy?  Anyone given any thought to what you would like to 
> see or maybe even have a design proposal you'd like to discuss? 

I'm definitely interested.
I managed to configure mod_proxy in combination with mod_rewrite 
(an internal rewrite to mod_proxy) to do some load balancing.
Using mod_rewrite's map feature, I was able to feed a home-made program 
(a script) with data gathered from the incoming connections. The program 
returns to mod_rewrite the IP addresses which mod_proxy should use for 
the backend connections.
The only thing I'm missing before putting this into a production 
environment is the ability to feed the external program through a unix 
and/or network socket instead of its standard input and output.
This can probably be done without too much rocket science, but I haven't 
had time to try to implement something yet.
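
For readers unfamiliar with the setup, an external map program of this kind reads one lookup key per line on its standard input and writes one answer per line on its standard output. The sketch below shows that loop in its simplest form. It is not the script described above: `serve_map()` and the `backends` table are made up for illustration, and the program ignores the key and simply rotates through the table.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative backend list; a real program would load or compute this. */
static const char *const backends[] = { "10.2.3.4:80", "10.2.3.8:8080" };

/*
 * Answer one line on out for every line read from in, rotating through
 * the backend table. Returns the number of lookups served. The key is
 * read but not inspected in this sketch.
 */
static size_t serve_map(FILE *in, FILE *out)
{
    char key[1024];
    size_t served = 0;
    size_t n = sizeof backends / sizeof backends[0];

    assert(in != NULL && out != NULL);
    while (fgets(key, sizeof key, in) != NULL) {
        fprintf(out, "%s\n", backends[served % n]);
        fflush(out);   /* the caller waits for each answer, so never buffer */
        served++;
    }
    return served;
}
```

A standalone map program would just call `serve_map(stdin, stdout)` from `main()`. Moving the same loop onto a unix or network socket, as wished for above, would mostly be a matter of where the two `FILE *` streams come from.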

>
> My short requirements list:
> - selectable load balancing algorithm: Round robin, LRU,  response 
> time, url driven, session affinity, ?
> - automatic detection of backend server failure and removal of the 
> failed server from the load balancing routing tables (forever? for a 
> period of time? other?)
> - connection pooling using HTTP keep-alive (this is a no brainer since 
> it is a simple extension of what browsers already do, but it needs to 
> be designed in from the start)
> - must be effective with multiple child processes, each child must 
> make routing decisions globally based on stats maintained in a shared 
> memory segment 

This can probably all be handled by an external program as described 
above (which for my requirements would be enough).
However, having your points implemented might have performance advantages 
(and others that I'm missing) over the mod_rewrite solution...

Regards.

--
Federico Mennite.