Posted to dev@httpd.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2018/10/30 12:53:20 UTC

Load balancing and load determination

As some of you know, one of my passions and areas of focus is
the use of Apache httpd as a reverse proxy and, as such, load
balancing, failover, etc. are of vital interest to me.

One topic which I have been mulling over, off and on, is the
idea of some sort of universal load number that could be used
and agreed upon by web servers. Right now, the reverse proxy
"guesses" the load on the backend servers, which is OK and
works well enough, but it would be great if it actually "knew"
the current loads on those servers. I already have code that
shares basic architectural info, such as number of CPUs, available
memory, loadavg, etc., which can help, of course, but again, all
this info can only be used to *infer* the current status of those
backend servers; it doesn't really provide what the current load
actually *is*.

So I was thinking maybe some sort of small, simple and "fast"
benchmark which could be run by the backends as part of their
"status" update to the front-end reverse proxy server... something
that shows general capability at that point in time, like Hanoi or
something similar. Or maybe some hash function. Some simple code
that could be used to create that "universal" load number.
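
A minimal sketch of the kind of benchmark described above, assuming Towers of Hanoi as the fixed workload and wall-clock time as the reported number. The function names and the default disk count are illustrative assumptions, not existing httpd code.

```python
import time


def hanoi(disks: int, src: str, dst: str, via: str) -> int:
    """Solve Towers of Hanoi recursively and return the number of moves."""
    if disks == 0:
        return 0
    # Move the smaller stack aside, move the largest disk, move the stack back.
    return (hanoi(disks - 1, src, via, dst)
            + 1
            + hanoi(disks - 1, via, dst, src))


def benchmark_seconds(disks: int = 12) -> float:
    """Time one fixed-size run; the elapsed time is the candidate load number."""
    start = time.perf_counter()
    moves = hanoi(disks, "A", "C", "B")
    elapsed = time.perf_counter() - start
    # Sanity check: a correct solution always takes 2**n - 1 moves.
    assert moves == 2 ** disks - 1
    return elapsed
```

Each backend would run benchmark_seconds() as part of its status update and report the result. Because every backend runs the same code, a longer time suggests a slower or busier machine, although a single run is noisy and would probably need averaging.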

Thoughts? Ideas? Comments? Suggestions? :)

Re: Load balancing and load determination

Posted by jean-frederic clere <jf...@gmail.com>.

On 05/11/2018 15:05, Jim Jagielski wrote:
> I was thinking about something more robust and usable than heartbeat (due to multicast) but similar in basic concept.

I remember trying mod_heartmonitor with a simple listener like
<Listener className="org.apache.catalina.ha.backend.HeartbeatListener"
Port="8009" ProxyList="127.0.0.1:7779" />

Where ProxyList is the list of httpd instances that are able to proxy to
the Tomcat back-ends. See http://jfclere.blogspot.com/2009/04/ for details.
I guess I need to revisit that to use mod_proxy_balancer logic.


> 
>> On Nov 5, 2018, at 8:48 AM, jean-frederic clere <jf...@gmail.com> wrote:
>>
>> On 30/10/2018 13:53, Jim Jagielski wrote:
>>> As some of you know, one of my passions and area of focus is
>>> on the use of Apache httpd as a reverse proxy and, as such, load
>>> balancing, failover, etc are of vital interest to me.
>>>
>>> One topic which I have mulling over, off and on, has been the
>>> idea of some sort of universal load number, that could be used
>>> and agreed upon by web servers. Right now, the reverse proxy
>>> "guesses" the load on the backend servers which is OK, and
>>> works well enough, but it would be great if it actually "knew"
>>> the current loads on those servers. I already have code that
>>> shares basic architectural info, such as number of CPUs, available
>>> memory, loadavg, etc which can help, of course, but again, all
>>> this info can be used to *infer* the current status of those backend
>>> servers; it doesn't really provide what the current load actually
>>> *is*.
>>>
>>> So I was thinking maybe some sort of small, simple and "fast"
>>> benchmark which could be run by the backends as part of their
>>> "status" update to the front-end reverse proxy server... something
>>> that shows general capability at that point in time, like Hanoi or
>>> something similar. Or maybe some hash function. Some simple code
>>> that could be used to create that "universal" load number.
>>>
>>> Thoughts? Ideas? Comments? Suggestions? :)
>>
>> having the back-ends to provide the load they are able to handle
>> lbfactor (via w_lf or somethere similar. That requires the back-ends to
>> be able to send request to httpd balancer-manager handler.
>>
>>>
>>
>>
>> -- 
>> Cheers
>>
>> Jean-Frederic
> 
> 


-- 
Cheers

Jean-Frederic

Re: Load balancing and load determination

Posted by Jim Jagielski <ji...@jaguNET.com>.

I was thinking about something more robust and usable than heartbeat (due to multicast) but similar in basic concept.

> On Nov 5, 2018, at 8:48 AM, jean-frederic clere <jf...@gmail.com> wrote:
> 
> On 30/10/2018 13:53, Jim Jagielski wrote:
>> As some of you know, one of my passions and area of focus is
>> on the use of Apache httpd as a reverse proxy and, as such, load
>> balancing, failover, etc are of vital interest to me.
>> 
>> One topic which I have mulling over, off and on, has been the
>> idea of some sort of universal load number, that could be used
>> and agreed upon by web servers. Right now, the reverse proxy
>> "guesses" the load on the backend servers which is OK, and
>> works well enough, but it would be great if it actually "knew"
>> the current loads on those servers. I already have code that
>> shares basic architectural info, such as number of CPUs, available
>> memory, loadavg, etc which can help, of course, but again, all
>> this info can be used to *infer* the current status of those backend
>> servers; it doesn't really provide what the current load actually
>> *is*.
>> 
>> So I was thinking maybe some sort of small, simple and "fast"
>> benchmark which could be run by the backends as part of their
>> "status" update to the front-end reverse proxy server... something
>> that shows general capability at that point in time, like Hanoi or
>> something similar. Or maybe some hash function. Some simple code
>> that could be used to create that "universal" load number.
>> 
>> Thoughts? Ideas? Comments? Suggestions? :)
> 
> having the back-ends to provide the load they are able to handle
> lbfactor (via w_lf or somethere similar. That requires the back-ends to
> be able to send request to httpd balancer-manager handler.
> 
>> 
> 
> 
> -- 
> Cheers
> 
> Jean-Frederic

Re: Load balancing and load determination

Posted by jean-frederic clere <jf...@gmail.com>.

On 05/11/2018 16:58, William A Rowe Jr wrote:
> On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere <jfclere@gmail.com
> <ma...@gmail.com>> wrote:
> 
>     On 30/10/2018 13:53, Jim Jagielski wrote:
>     > As some of you know, one of my passions and area of focus is
>     > on the use of Apache httpd as a reverse proxy and, as such, load
>     > balancing, failover, etc are of vital interest to me.
>     >
>     > One topic which I have mulling over, off and on, has been the
>     > idea of some sort of universal load number, that could be used
>     > and agreed upon by web servers. Right now, the reverse proxy
>     > "guesses" the load on the backend servers which is OK, and
>     > works well enough, but it would be great if it actually "knew"
>     > the current loads on those servers. I already have code that
>     > shares basic architectural info, such as number of CPUs, available
>     > memory, loadavg, etc which can help, of course, but again, all
>     > this info can be used to *infer* the current status of those backend
>     > servers; it doesn't really provide what the current load actually
>     > *is*.
>     >
>     > So I was thinking maybe some sort of small, simple and "fast"
>     > benchmark which could be run by the backends as part of their
>     > "status" update to the front-end reverse proxy server... something
>     > that shows general capability at that point in time, like Hanoi or
>     > something similar. Or maybe some hash function. Some simple code
>     > that could be used to create that "universal" load number.
>     >
>     > Thoughts? Ideas? Comments? Suggestions? :)
> 
>     having the back-ends to provide the load they are able to handle
>     lbfactor (via w_lf or somethere similar. That requires the back-ends to
>     be able to send request to httpd balancer-manager handler.
> 
> 
> Not really. I'd suggest a response header, travelling with each response
> back to the balancer, which can be composed quickly enough to share
> a play-by-play snapshot of the availability of that backend. This adds
> next to no traffic and minimal cpu drain if composed cleanly. And it can
> optionally be axed by the balancer in the response to the client.

The problem is that if there are no requests going to a back-end, the
load balancer won't know that the back-end is available again after a
load peak.

> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.
> 
> 

cping/cpong or OPTIONS * allows checking the back-end nodes before sending
requests.

-- 
Cheers

Jean-Frederic

Re: Load balancing and load determination

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 05.11.2018 um 16:58 schrieb William A Rowe Jr <wr...@rowe-clan.net>:
> 
> On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere <jf...@gmail.com> wrote:
> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> > 
> > One topic which I have mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> > 
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> > 
> > Thoughts? Ideas? Comments? Suggestions? :)
> 
> having the back-ends to provide the load they are able to handle
> lbfactor (via w_lf or somethere similar. That requires the back-ends to
> be able to send request to httpd balancer-manager handler.
> 
> Not really. I'd suggest a response header, travelling with each response
> back to the balancer, which can be composed quickly enough to share
> a play-by-play snapshot of the availability of that backend. This adds
> next to no traffic and minimal cpu drain if composed cleanly. And it can
> optionally be axed by the balancer in the response to the client.
> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.

Funnily enough, I did my master's thesis (is that a word?) a long, long
while ago on scheduling in distributed systems. And with "distributed"
the general tricky thing is that there is no global knowledge of the
system state.

While any load indicator reported from the backends might look very
useful, once you deal with several front ends, this degenerates quickly
(where each frontend makes its own decision without talking to each
other).

If you detect and exclude any failing backends (heartbeat), then, with
a growing number of back- and frontends, it's very hard to beat a random
job distribution.

I found that, in general, pulling works slightly better than pushing. The
scenario here would be that backends ask frontends for requests to execute.
That is also very stable in case of backend failures, of course.

tl;dr

If your problem scenario includes more than a single frontend, go for random.
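
A minimal sketch of that recommendation, assuming each frontend keeps its own view of which backends passed the last heartbeat. The names pick_backend and healthy are hypothetical.

```python
import random


def pick_backend(backends, healthy, rng=random):
    """Pick uniformly at random among backends currently marked healthy.

    backends: iterable of backend names.
    healthy:  dict mapping backend name -> bool (last heartbeat result).
    rng:      any object with a choice() method; defaults to the random module.
    """
    candidates = [b for b in backends if healthy.get(b, False)]
    if not candidates:
        return None  # nothing usable; the caller decides how to fail
    return rng.choice(candidates)
```

No frontend needs to know what any other frontend decided, which is the property that makes this approach hold up as the number of frontends grows.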

Cheers,

Stefan

Re: Load balancing and load determination

Posted by Jim Jagielski <ji...@jaguNET.com>.

Which is why we allow for both pre-send checks and out-of-band health checks...

> On Nov 5, 2018, at 10:58 AM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.

Re: Load balancing and load determination

Posted by William A Rowe Jr <wr...@rowe-clan.net>.

On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere <jf...@gmail.com>
wrote:

> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> >
> > One topic which I have mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> >
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> >
> > Thoughts? Ideas? Comments? Suggestions? :)
>
> having the back-ends to provide the load they are able to handle
> lbfactor (via w_lf or somethere similar. That requires the back-ends to
> be able to send request to httpd balancer-manager handler.

Not really. I'd suggest a response header, travelling with each response
back to the balancer, which can be composed quickly enough to share
a play-by-play snapshot of the availability of that backend. This adds
next to no traffic and minimal cpu drain if composed cleanly. And it can
optionally be axed by the balancer in the response to the client.

The last thing we want is the routing headache of contacting an
ever-changing list of one-or-many potential balancers. And we can't
rely on a dying lbmember to "check in" that it isn't functional. Since
the balancer must already start requests to the backend, having that
backend supplement the responses with its health status is simple.
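
A minimal sketch of the response-header approach, under stated assumptions: the header name X-Backend-Availability and both function names are hypothetical, and the value is taken to be a number between 0 and 1.

```python
LOAD_HEADER = "X-Backend-Availability"  # hypothetical header name


def backend_headers(idle_workers: int, total_workers: int) -> dict:
    """Backend side: attach a cheap availability snapshot to a response."""
    availability = idle_workers / total_workers if total_workers else 0.0
    return {"Content-Type": "text/plain", LOAD_HEADER: f"{availability:.2f}"}


def balancer_absorb(backend: str, headers: dict, table: dict) -> dict:
    """Balancer side: record the snapshot, then strip the header.

    Returns the headers that should be forwarded to the client.
    """
    forwarded = dict(headers)
    raw = forwarded.pop(LOAD_HEADER, None)  # "axed" before reaching the client
    if raw is not None:
        try:
            table[backend] = min(1.0, max(0.0, float(raw)))
        except ValueError:
            pass  # ignore a malformed value and keep the previous entry
    return forwarded
```

The balancer's table only changes when a response arrives, so an idle backend keeps its last reported value until an out-of-band check or a new request refreshes it.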

Re: Load balancing and load determination

Posted by jean-frederic clere <jf...@gmail.com>.

On 30/10/2018 13:53, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

Have the back-ends provide the load they are able to handle as the
lbfactor (via w_lf or something similar). That requires the back-ends to
be able to send requests to the httpd balancer-manager handler.

> 


-- 
Cheers

Jean-Frederic

Re: Load balancing and load determination

Posted by Michal Karm <mi...@gmail.com>.

On 10/30/2018 01:53 PM, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

Hello,

It seems that is exactly what https://modcluster.io/ does.

- it has a Tomcat listener / JBoss AS (Wildfly) module that reports
  a worker-side calculated load number
https://docs.modcluster.io/#_worker_side_load_metrics

- httpd side is completely oblivious as to how the number was calculated,
  which worker used which load metric to calculate it etc, it just receives a number

- httpd side dynamically configures mod_proxy balancers according to joining
  and leaving worker nodes

- httpd side uses the load number to balance requests among healthy workers


An obvious downside is that the worker must implement this mod_cluster
logic. Implementations exist for JBoss AS/Wildfly/Tomcat, but we don't have
one for Jetty, for example. On the bright side, the protocol itself is dead simple.

Disclosure: I am involved in the project.


Cheers

Michal Karm Babacek

-- 
Sent from my Hosaka Ono-Sendai Cyberspace 7

Re: Load balancing and load determination

Posted by Eric Covener <co...@gmail.com>.

> The main consideration is one of consistency... unless there is some agreed upon "standard" then comparisons are worthless and the resulting load balancing will be inaccurate. For example, say that Apache is front-ending 10 servers, 5 are Apache and other 5 are Foo, but Foo consistently falsifies it's capability simply to ensure that it gets all the traffic. Sure, you can adjust settings on the front end to offset that, but that defeats the whole purpose of some *accurate*, objective measure of capability.

Wouldn't the servers you load balance between for the same URL
generally be more similar than that?

Re: Load balancing and load determination

Posted by Jim Jagielski <ji...@jaguNET.com>.

The only reason why I brought up the concept of a benchmark is because it is dead easy to provide the source for said benchmark and have backend servers simply time how long it takes to run it on each status update. Each backend would then simply send the "time taken", and that would provide some measure of how beefy and/or loaded said server was.

The main consideration is one of consistency... unless there is some agreed-upon "standard", comparisons are worthless and the resulting load balancing will be inaccurate. For example, say that Apache is front-ending 10 servers, 5 are Apache and the other 5 are Foo, but Foo consistently falsifies its capability simply to ensure that it gets all the traffic. Sure, you can adjust settings on the front end to offset that, but that defeats the whole purpose of some *accurate*, objective measure of capability.

Yeah, I recall JR posting something after I brought up this topic at one of my ApacheCon sessions...

Maybe it's more of an "availability factor" than a load factor... with 0 being "send me nothing" and 1 being "I am completely unloaded" and decimal values between indicating their "availability" to handle traffic.
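
One way a front end might consume such an "availability factor" is a weighted random pick, sketched below. This is an assumption about usage, not existing mod_proxy_balancer behaviour, and choose_weighted is a hypothetical name.

```python
import random


def choose_weighted(availability, rng=random):
    """Pick a backend with probability proportional to its availability.

    availability: dict mapping backend name -> float in [0, 1], where
                  0 means "send me nothing" and 1 means "completely unloaded".
    rng:          any object with a choices() method; defaults to random.
    """
    live = {name: a for name, a in availability.items() if a > 0.0}
    if not live:
        return None  # every backend asked for no traffic
    names = list(live)
    weights = [live[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

A backend reporting 0 is never selected, and one reporting 0.75 gets roughly three times the traffic of one reporting 0.25.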

> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri <DR...@primary.net> wrote:
> 
> Hi, Jim J;
> I recall a while back that Jim Riggs proposed a spec for exactly this a while back... I think it was shared here on list and some light iteration was done. IIUC, he was even planning to present it at ACNA until travel plans fell through.
> 
> Hi, Jim R;
> Any chance you have the latest and greatest, or is the version from the list archives current state?
> 
> 
> One of the things I recall *really liking* from the recommendation is letting the backend decide its factor based on whatever it believes is most important. In some servers, that may be available threads. In others it could be percentage of memory used. Still yet, other servers may decide based on number of idle GPUs on-system. I think this is roughly the same you are suggesting, Jim J, but I struggle to think of a universal benchmark because backends are so varied.
> -- 
> Daniel Ruggeri
> 
> On October 30, 2018 7:53:20 AM CDT, Jim Jagielski <ji...@jaguNET.com> wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

Re: Load balancing and load determination

Posted by Jim Jagielski <ji...@jaguNET.com>.


> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri <DR...@primary.net> wrote:
> 
> Hi, Jim J;
> I recall a while back that Jim Riggs proposed a spec for exactly this a while back... I think it was shared here on list and some light iteration was done. IIUC, he was even planning to present it at ACNA until travel plans fell through.
> 


https://lists.apache.org/thread.html/ca115bd3f21f7da91fa01a4d83af7d73987750e1e48bb2bf76236e52@1430369651@%3Cdev.httpd.apache.org%3E

Re: Load balancing and load determination

Posted by Daniel Ruggeri <dr...@primary.net>.

Hi, Jim J;
   I recall that Jim Riggs proposed a spec for exactly this a while back... I think it was shared here on list and some light iteration was done. IIUC, he was even planning to present it at ACNA until travel plans fell through.

Hi, Jim R;
   Any chance you have the latest and greatest, or is the version from the list archives the current state?


One of the things I recall *really liking* from the recommendation is letting the backend decide its factor based on whatever it believes is most important. In some servers, that may be available threads. In others it could be percentage of memory used. Still other servers may decide based on the number of idle GPUs on-system. I think this is roughly the same as what you are suggesting, Jim J, but I struggle to think of a universal benchmark because backends are so varied.
-- 
Daniel Ruggeri

On October 30, 2018 7:53:20 AM CDT, Jim Jagielski <ji...@jaguNET.com> wrote:
>As some of you know, one of my passions and area of focus is
>on the use of Apache httpd as a reverse proxy and, as such, load
>balancing, failover, etc are of vital interest to me.
>
>One topic which I have mulling over, off and on, has been the
>idea of some sort of universal load number, that could be used
>and agreed upon by web servers. Right now, the reverse proxy
>"guesses" the load on the backend servers which is OK, and
>works well enough, but it would be great if it actually "knew"
>the current loads on those servers. I already have code that
>shares basic architectural info, such as number of CPUs, available
>memory, loadavg, etc which can help, of course, but again, all
>this info can be used to *infer* the current status of those backend
>servers; it doesn't really provide what the current load actually
>*is*.
>
>So I was thinking maybe some sort of small, simple and "fast"
>benchmark which could be run by the backends as part of their
>"status" update to the front-end reverse proxy server... something
>that shows general capability at that point in time, like Hanoi or
>something similar. Or maybe some hash function. Some simple code
>that could be used to create that "universal" load number.
>
>Thoughts? Ideas? Comments? Suggestions? :)

Re: Load balancing and load determination

Posted by Yehuda Katz <ye...@ymkatz.net>.

HAProxy has a similar feature called agent-check (
https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.2-agent-check)
although in their case, the backend server specifies its own weight.
Either way - whether the frontend or backend determines the weight - it
would be useful.
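
For illustration, a minimal sketch of the backend side of such an agent. With agent-check, HAProxy opens a TCP connection to the agent and reads a short ASCII line; a reply such as "75%" adjusts the server's weight relative to its configured value. The function current_availability() and its fixed return value are hypothetical placeholders.

```python
import socketserver


def agent_reply(availability: float) -> bytes:
    """Format an availability in [0, 1] as a percentage line, e.g. b"75%\\n"."""
    clamped = min(1.0, max(0.0, availability))
    return f"{round(clamped * 100)}%\n".encode("ascii")


def current_availability() -> float:
    """Hypothetical stand-in; a real backend would measure something here."""
    return 0.75


class AgentHandler(socketserver.StreamRequestHandler):
    """Answer each agent-check connection with one weight line."""

    def handle(self):
        self.wfile.write(agent_reply(current_availability()))
```

Serving it would be a matter of something like socketserver.TCPServer(("0.0.0.0", 9999), AgentHandler).serve_forever(), with the port chosen here purely as an example and matched by the agent-port setting on the HAProxy server line.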

- Y

Sent from a device with a very small keyboard and hyperactive autocorrect.

On Tue, Oct 30, 2018, 8:53 AM Jim Jagielski <ji...@jagunet.com> wrote:

> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

Re: Load balancing and load determination

Posted by Greg Ames <am...@gmail.com>.

On Tue, Oct 30, 2018 at 8:53 AM Jim Jagielski <ji...@jagunet.com> wrote:

> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

There are a couple of analogous systems from my former employer that used a
"pull from front end queue" concept for load balancing.  I thought that was
very interesting, although I never had any practical experience servicing
those systems.

The idea is that each back end pulls off work as quickly as possible.  If
one back end is slower/faster than the average, it just does less/more work
than the others, and no (bound to fail) clever oracle is required.  The
downside might be the presence of a front end queue which implies latency.
I don't know how such a system would perform when the back ends are only
lightly or moderately loaded and the front end queue is usually empty.
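
The pull idea can be sketched as a small deterministic simulation, assuming a fixed per-job service time for each back end. simulate_pull and its parameters are hypothetical names, and the model ignores network and queueing latency.

```python
import heapq
from collections import deque


def simulate_pull(jobs: int, service_time: dict) -> dict:
    """Simulate back ends pulling jobs from one shared front-end queue.

    jobs:         number of jobs waiting in the queue at time zero.
    service_time: dict mapping back-end name -> seconds needed per job.
    Returns a dict mapping back-end name -> number of jobs it completed.
    """
    queue = deque(range(jobs))
    done = {name: 0 for name in service_time}
    # Heap of (time at which the back end is free again, name); all idle at 0.
    free_at = [(0.0, name) for name in sorted(service_time)]
    heapq.heapify(free_at)
    while queue:
        now, name = heapq.heappop(free_at)  # whoever is free first pulls next
        queue.popleft()
        done[name] += 1
        heapq.heappush(free_at, (now + service_time[name], name))
    return done
```

With 30 queued jobs and service times of 1 and 2 seconds, the faster back end ends up with twice as many jobs, without any load number being reported or compared.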

Greg Ames

Re: Load balancing and load determination

Posted by Mark Blackman <ma...@exonetric.com>.


> On 30 Oct 2018, at 12:53, Jim Jagielski <ji...@jaguNET.com> wrote:
> 
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

What problem are you trying to solve? Broadly, I think the best you can do is ask the backends to include a response header indicating their current appetite for more connections.

- Mark

Re: Load balancing and load determination

Posted by William A Rowe Jr <wr...@rowe-clan.net>.

On Thu, Nov 8, 2018 at 1:48 PM Jim Jagielski <ji...@jagunet.com> wrote:

> I have a semi-working implementation that I'll be committing to trunk in a
> bit...

I'm confused. Semi-working would seem to be orthogonal to keeping trunk in
a releasable state, but it depends on what you mean. But before you commit
a significant change, please first consider posting the patch or, simpler,
please consider a sandbox fork for iterative development. From the project
bylaws:

When to Commit a Change
<http://httpd.apache.org/dev/guidelines.html#when-to-commit-a-change>
Ideas must be review-then-commit; patches can be commit-then-review. With a
commit-then-review process, we trust that the developer doing the commit
has a high degree of confidence in the change. Doubtful changes, new
features, and large-scale overhauls need to be discussed before being
committed to a repository. Any change that affects the semantics of
arguments to configurable directives, significantly adds to the runtime
size of the program, or changes the semantics of an existing API function
must receive consensus approval on the mailing list before being committed.

Re: Load balancing and load determination

Posted by Jim Jagielski <ji...@jaguNET.com>.

I have a semi-working implementation that I'll be committing to trunk in a bit...

> On Nov 8, 2018, at 1:33 AM, Mladen Turk <mt...@apache.org> wrote:
> 
> On 30.10.2018. 13:53, Jim Jagielski wrote:
>> As some of you know, one of my passions and area of focus is
>> on the use of Apache httpd as a reverse proxy and, as such, load
>> balancing, failover, etc are of vital interest to me.
> 
> Been a while, but seems I'm back :D
> Love the idea to have more intelligent then "lets guess"
> way of deducting the load balancer score.
> 
> What we did for heartbeat/heartmonitor/watchdog can be used
> for collecting backend data.
> 
> The thing I'm trying to do is the way that backend can
> register or remove itself as node inside load balancer.
> That would also require some sort of backend-server communication,
> shared memory management (mod_slotmem maybe), and a way to
> survive graceful restart.
> 
> Backend sending its load status at regular intervals would
> be addition to "I'm here, count me in" or
> "I'm out, bye, good luck with other nodes".
> 
> What do you think?
> 
> 
> 
> Regards
> -- 
> ^TM

Re: Load balancing and load determination

Posted by Mladen Turk <mt...@apache.org>.

On 30.10.2018. 13:53, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 

Been a while, but seems I'm back :D
Love the idea of having a more intelligent than "let's guess"
way of deducing the load balancer score.

What we did for heartbeat/heartmonitor/watchdog can be used
for collecting backend data.

The thing I'm trying to do is find a way for a backend to
register or remove itself as a node inside the load balancer.
That would also require some sort of backend-server communication,
shared memory management (mod_slotmem maybe), and a way to
survive graceful restart.

A backend sending its load status at regular intervals would
be an addition to "I'm here, count me in" or
"I'm out, bye, good luck with other nodes".

What do you think?
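
A rough sketch of the join / status / leave bookkeeping described above, kept in plain process memory. NodeRegistry and its methods are hypothetical names; shared memory (mod_slotmem or otherwise) and surviving a graceful restart are deliberately not modelled.

```python
import time


class NodeRegistry:
    """Track backends that announced themselves and their last reported load."""

    def __init__(self, timeout: float = 10.0, clock=time.monotonic):
        self.timeout = timeout  # seconds of silence before a node is ignored
        self.clock = clock      # injectable so the logic can be tested
        self.nodes = {}         # name -> (load, time of last message)

    def join(self, name: str, load: float = 0.0) -> None:
        """Handle "I'm here, count me in"."""
        self.nodes[name] = (load, self.clock())

    def report(self, name: str, load: float) -> None:
        """Handle a periodic load report; an unknown node is simply added."""
        self.nodes[name] = (load, self.clock())

    def leave(self, name: str) -> None:
        """Handle "I'm out, bye"."""
        self.nodes.pop(name, None)

    def active(self) -> dict:
        """Return name -> load for nodes heard from within the timeout."""
        now = self.clock()
        return {name: load
                for name, (load, seen) in self.nodes.items()
                if now - seen <= self.timeout}
```

Treating silence as departure means a backend that dies without saying goodbye drops out of active() once the timeout passes.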

Regards
-- 
^TM