Posted to dev@httpd.apache.org by Ed Korthof <ed...@organic.com> on 1997/01/08 00:08:48 UTC

Patch to slow down children's deaths

I'd like to get a vote on whether or not to include this in the next 1.2
beta. 

A while ago, I posted about the rate at which children die.  It's
basically immediate, and children die until we're down to MaxFreeServers;
this interferes with StartServers for large web sites, since
StartServers > MaxFreeServers (as is reasonable) means the following
occurs:

The 'main' (parent) server starts, and
        1) It forks StartServers children.
        2) (StartServers-MaxFreeServers) die off (this happens
           before any significant number of requests can be read,
           because of the point at which children check to see if
           they should die).
        3) Requests come in, rapidly fill MaxFreeServers, and
           the parent server forks one child a second until
           equilibrium is reached.

The result is a significant slowdown at server restart time --
exactly what StartServers is designed to prevent.
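
As a rough illustration of the sequence above, here is a minimal sketch
in Python.  It is a toy model, not Apache code: the function name, the
parameter names and the numbers in the usage note are all hypothetical,
and it assumes surplus idle children exit before any traffic arrives.

```python
def simulate_restart(start_servers, max_spare, needed, forks_per_second=1):
    """Return the seconds until `needed` children exist after a restart.

    Assumed model (not taken from the Apache source): idle children above
    max_spare exit before traffic arrives, then the parent forks at a
    fixed rate until `needed` children exist.
    """
    # Step 1: the parent forks start_servers children.
    children = start_servers
    # Step 2: the surplus idle children die off almost immediately.
    children = min(children, max_spare)
    # Step 3: the parent replaces them at forks_per_second.
    seconds = 0
    while children < needed:
        children += forks_per_second
        seconds += 1
    return seconds
```

With start_servers=60, max_spare=10 and needed=60, this model takes 50
seconds to get back to the 60 children that existed right after startup.
With max_spare=60 it takes none.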

This patch is designed to prevent that problem, and to make more
rational use of server resources: since requests may fluctuate rapidly,
it's probably better not to kill all the extra servers at once -- it's
better, imo, to kill them one per second, as this does.  This way fewer
forks should be necessary.  Watching the number of httpd children in
existence, I see times when the number drops dramatically and then
slowly climbs (with few spare children) back up to a reasonable level.
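
To make the "fewer forks" argument concrete, here is a small sketch
comparing the two kill policies under a fluctuating load.  It is an
assumed toy model, not the patch itself: count_forks and its parameters
are hypothetical names, the load figures are made up, and the one fork
per second limit is ignored so that only the number of forks is counted.

```python
def count_forks(load, max_spare, kills_per_second=None):
    """Count the forks needed to follow `load`, a list of busy children
    per second.

    kills_per_second=None means surplus idle children are all killed at
    once; an integer caps how many are killed each second.
    """
    children = load[0]
    forks = 0
    for busy in load:
        if children < busy:
            # Fork whatever is missing (fork rate limit ignored here).
            forks += busy - children
            children = busy
        # Idle children beyond max_spare are candidates for killing.
        surplus = (children - busy) - max_spare
        if surplus > 0:
            if kills_per_second is None:
                children -= surplus
            else:
                children -= min(surplus, kills_per_second)
    return forks
```

For a load of [50, 10, 50, 10, 50] busy children and max_spare=5,
killing the surplus at once costs 70 forks in this model, while killing
one per second costs 2.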

Criticism of this patch would be appreciated; I send a SIGKILL.  It's
unlikely that any children will have received a request but not yet
registered themselves as SERVER_BUSY_READ, but remotely possible.  If
there is a signal which will cause children to answer any current request
(if one exists) and otherwise to die immediately, that would be ideal to
use here -- is there such a thing?

The wait_or_timeout & associated stuff is necessary to prevent this from
killing all the children w/o a second between each one.

     -- Ed Korthof        |  Web Server Engineer --
     -- ed@organic.com    |  Organic Online, Inc --
     -- (415) 278-5676    |  Fax: (415) 284-6891 --


Re: Patch to slow down children's deaths

Posted by Dean Gaudet <dg...@arctic.org>.
I've done this with side-by-side machines handling equal portions of the
load.  The machine that had been running would be in steady state with 70
to 80 active children, and another 40 waiting, and 13 hits per second. 
The other machine, which I had just restarted, would have 0 inactive
children for most of a minute and ramp from 3 to 4 hits per second up to
13 hits per second after a minute.  So essentially for that minute the
machine was hardly serving the load it could serve. 
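
A back-of-the-envelope estimate of what that minute costs, assuming the
hit rate ramps linearly (my measurements don't establish the shape of
the ramp, so treat this as a sketch; the function name is hypothetical):

```python
def shortfall(start_rate, steady_rate, ramp_seconds):
    """Requests not served during a linear ramp from start_rate up to
    steady_rate, relative to serving steady_rate the whole time."""
    # Area of the triangle between the steady-state line and the ramp.
    return 0.5 * ramp_seconds * (steady_rate - start_rate)
```

Taking 3.5 hits per second as the starting rate, shortfall(3.5, 13, 60)
comes to 285 requests that the restarted machine did not pick up during
that minute.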

Yeah I could just beef up maxfree, but I'm not sure that's correct either. 
Consider that the same machine might do a lot of log crunching at night
when it doesn't need 90 spare servers...

Dean

On Tue, 7 Jan 1997, Cliff Skolnick wrote:

> 
> So if your load is 20/second, start 20 or 25 and why would you have
> maxfree <20 or 30.  Your machine will mostly be pretty beefy and dedicated
> with a load like that :)
> 
> One per second...so most servers would reach steady state in less than a
> minute, maybe a couple minutes for very large servers.  It would be nice 
> to have some hard data here.  Does anyone record hits/second versus
> number of active children on a web server?  It would be interesting to
> look at the graph and figure out what to do from there.  Anyone care to 
> share?
> 
> On Tue, 7 Jan 1997, Dean Gaudet wrote:
> 
> > It starts one per second, and if your load is 20 hits per second that is a
> > serious performance bottleneck.
> > 
> > Dean
> > 
> > On Tue, 7 Jan 1997, Cliff Skolnick wrote:
> > 
> > > On Tue, 7 Jan 1997, Ed Korthof wrote:
> > > 
> > > > A while ago, I posted about the rate at which children die.  It's
> > > > basically immediate, and children die until we're down to MaxFreeServers;
> > > > this interferes with StartServers for large web sites, since
> > > > StartServers > MaxFreeServers (as is reasonable) means the following
> > > > occurs:
> > > 
> > > I think apache should print a warning, since these values conflict with
> > > each other in a logic sense to me and apparently apache does not like it
> > > either. 
> > > 
> > > Also can someone explain why a large start servers is better than letting
> > > apache start more as the load rises?  I would think you simply trade off a
> > > few seconds of unavailability at startup for a slightly higher latency on
> > > http requests until a steady state is reached.
> > > 
> > > --
> > > Cliff Skolnick, Technical Consultant
> > > Steam Tunnel Operations
> > > cliff@steam.com, 415.297.5938
> > > http://www.steam.com/
> > > 
> > > 
> > 
> > 
> 
> --
> Cliff Skolnick, Technical Consultant
> Steam Tunnel Operations
> cliff@steam.com, 415.297.5938
> http://www.steam.com/
> 
> 


Re: Patch to slow down children's deaths

Posted by Cliff Skolnick <cl...@steam.com>.
So if your load is 20/second, start 20 or 25 and why would you have
maxfree <20 or 30.  Your machine will mostly be pretty beefy and dedicated
with a load like that :)

One per second...so most servers would reach steady state in less than a
minute, maybe a couple minutes for very large servers.  It would be nice 
to have some hard data here.  Does anyone record hits/second versus
number of active children on a web server?  It would be interesting to
look at the graph and figure out what to do from there.  Anyone care to 
share?

On Tue, 7 Jan 1997, Dean Gaudet wrote:

> It starts one per second, and if your load is 20 hits per second that is a
> serious performance bottleneck.
> 
> Dean
> 
> On Tue, 7 Jan 1997, Cliff Skolnick wrote:
> 
> > On Tue, 7 Jan 1997, Ed Korthof wrote:
> > 
> > > A while ago, I posted about the rate at which children die.  It's
> > > basically immediate, and children die until we're down to MaxFreeServers;
> > > this interferes with StartServers for large web sites, since
> > > StartServers > MaxFreeServers (as is reasonable) means the following
> > > occurs:
> > 
> > I think apache should print a warning, since these values conflict with
> > each other in a logic sense to me and apparently apache does not like it
> > either. 
> > 
> > Also can someone explain why a large start servers is better than letting
> > apache start more as the load rises?  I would think you simply trade off a
> > few seconds of unavailability at startup for a slightly higher latency on
> > http requests until a steady state is reached.
> > 
> > --
> > Cliff Skolnick, Technical Consultant
> > Steam Tunnel Operations
> > cliff@steam.com, 415.297.5938
> > http://www.steam.com/
> > 
> > 
> 
> 

--
Cliff Skolnick, Technical Consultant
Steam Tunnel Operations
cliff@steam.com, 415.297.5938
http://www.steam.com/


Re: Patch to slow down children's deaths

Posted by Dean Gaudet <dg...@arctic.org>.
It starts one per second, and if your load is 20 hits per second that is a
serious performance bottleneck.

Dean

On Tue, 7 Jan 1997, Cliff Skolnick wrote:

> On Tue, 7 Jan 1997, Ed Korthof wrote:
> 
> > A while ago, I posted about the rate at which children die.  It's
> > basically immediate, and children die until we're down to MaxFreeServers;
> > this interferes with StartServers for large web sites, since
> > StartServers > MaxFreeServers (as is reasonable) means the following
> > occurs:
> 
> I think apache should print a warning, since these values conflict with
> each other in a logic sense to me and apparently apache does not like it
> either. 
> 
> Also can someone explain why a large start servers is better than letting
> apache start more as the load rises?  I would think you simply trade off a
> few seconds of unavailability at startup for a slightly higher latency on
> http requests until a steady state is reached.
> 
> --
> Cliff Skolnick, Technical Consultant
> Steam Tunnel Operations
> cliff@steam.com, 415.297.5938
> http://www.steam.com/
> 
> 


Re: Patch to slow down children's deaths

Posted by Ed Korthof <ed...@organic.com>.
On Tue, 7 Jan 1997, Cliff Skolnick wrote:
> On Tue, 7 Jan 1997, Ed Korthof wrote:
> I think apache should print a warning, since these values conflict with
> each other in a logic sense to me and apparently apache does not like it
> either. 

Yeah -- if we don't do a patch of some sort, I'd also like to see a
warning about this.
 
> Also can someone explain why a large start servers is better than letting
> apache start more as the load rises?  I would think you simply trade off a
> few seconds of unavailability at startup for a slightly higher latency on
> http requests until a steady state is reached.

Well, here's my reasoning.  Apache isn't designed to deal with a rapidly
increasing load, but those are exactly the conditions a server experiences
at start up time.  With a large server, load will rise extremely fast
until there are sufficiently many servers in existence; if the server has
a reasonably fast CPU, it could have easily forked the extra children
(which doesn't take all that long) at start up time, but since it's
finished the start up procedure, the 1 fork per second limitation will
slow things down significantly. If there aren't enough servers, then
additional requests are simply delayed until more are created. 

The server I help administer is fairly large; after a restart, when I
attempt to load a page -- any page -- there is a delay of at least 10-15
seconds.  Clearly not everyone will experience that (after restarting, it
takes about 5 seconds to request the page), but I'm confident that the
average delay is 5 seconds or more during the 30 seconds starting about 3
seconds after server restart.  I'm afraid I don't have any hard data,
however, other than my observation that the servers increase at one per
second for more than a minute.  I'm inclined to try to avoid these delays
wherever possible.

I do know that we reach equilibrium at more than 256 servers (mostly
Keep-Alives, but we get more than 20 hits per second). 

Anyway, on a dedicated server, MaxSpareServers can be made high to
compensate, but then what's the point of having StartServers at all?

> --
> Cliff Skolnick, Technical Consultant
> Steam Tunnel Operations
> cliff@steam.com, 415.297.5938
> http://www.steam.com/
> 


     -- Ed Korthof        |  Web Server Engineer --
     -- ed@organic.com    |  Organic Online, Inc --
     -- (415) 278-5676    |  Fax: (415) 284-6891 --




Re: Patch to slow down children's deaths

Posted by Cliff Skolnick <cl...@steam.com>.
On Tue, 7 Jan 1997, Ed Korthof wrote:

> A while ago, I posted about the rate at which children die.  It's
> basically immediate, and children die until we're down to MaxFreeServers;
> this interferes with StartServers for large web sites, since
> StartServers > MaxFreeServers (as is reasonable) means the following
> occurs:

I think apache should print a warning, since these values conflict with
each other in a logic sense to me and apparently apache does not like it
either. 

Also can someone explain why a large start servers is better than letting
apache start more as the load rises?  I would think you simply trade off a
few seconds of unavailability at startup for a slightly higher latency on
http requests until a steady state is reached.

--
Cliff Skolnick, Technical Consultant
Steam Tunnel Operations
cliff@steam.com, 415.297.5938
http://www.steam.com/