Posted to dev@httpd.apache.org by Bill Stoddard <bi...@wstoddard.com> on 2002/04/26 17:00:24 UTC

Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Paul has been doing some testing and this patch seems to help (but not solve) the reported
problem.  The idea behind the patch is to start as many worker threads as possible on the
first pass through the for() loop before starting the listener.  Starting the listener
earlier on a loaded server will drive a lot of additional thread context switches
accepting connections which will just delay the new process getting to steady state.

This patch fixes a couple of problems I noticed with the first patch... with the previous
patch, we could get multiple listeners or in one case we could start the listener w/o
having any worker threads.

Jeff, note that this patch really does not break the graceful restart case since the call
to create_listener_thread is still made inside the while (1) loop.

Bill

Index: worker.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
retrieving revision 1.117
diff -u -r1.117 worker.c
--- worker.c 18 Apr 2002 17:46:20 -0000 1.117
+++ worker.c 26 Apr 2002 14:50:43 -0000
@@ -948,6 +948,7 @@
     apr_status_t rv;
     int i;
     int threads_created = 0;
+    int listener_started = 0;
     int loops;
     int prev_threads_created;

@@ -999,16 +1000,18 @@
                 clean_child_exit(APEXIT_CHILDFATAL);
             }
             threads_created++;
-            if (threads_created == 1) {
-                /* now that we have a worker thread, it makes sense to create
-                 * a listener thread (we don't want a listener without a worker!)
-                 */
-                create_listener_thread(ts);
-            }
+
+        }
+        /* Start the listener only when there are workers available */
+        if (!listener_started && threads_created) {
+            create_listener_thread(ts);
+            listener_started = 1;
         }
+
         if (start_thread_may_exit || threads_created == ap_threads_per_child) {
             break;
         }
+
         /* wait for previous generation to clean up an entry */
         apr_sleep(1 * APR_USEC_PER_SEC);
         ++loops;
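
The control flow of the updated patch can be sketched as a small model. This is an illustration only, not code from worker.c: model_start_threads(), struct startup_result and the slots_free array (how many scoreboard slots happen to be free on each pass) are hypothetical names, and thread creation is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one child's startup, for illustration only. */
struct startup_result {
    int listener_starts;            /* times the listener was created */
    int workers_at_listener_start;  /* workers existing at that moment */
    int threads_created;            /* workers created in total */
};

/* Toy model of the patched loop. slots_free[pass] is an assumed input:
 * the number of scoreboard slots that are free on that pass. */
struct startup_result model_start_threads(const int *slots_free, int passes,
                                          int threads_per_child)
{
    struct startup_result r = {0, 0, 0};
    int listener_started = 0;

    assert(slots_free != NULL && passes > 0 && threads_per_child > 0);
    for (int pass = 0; pass < passes; pass++) {
        /* inner for() loop: create a worker for every free slot */
        for (int i = 0; i < slots_free[pass]
                        && r.threads_created < threads_per_child; i++) {
            r.threads_created++;
        }
        /* start the listener once, and only when a worker exists */
        if (!listener_started && r.threads_created) {
            r.listener_starts++;
            r.workers_at_listener_start = r.threads_created;
            listener_started = 1;
        }
        if (r.threads_created == threads_per_child) {
            break;
        }
        /* the real loop sleeps here until the previous generation
         * frees more slots */
    }
    return r;
}
```

With slots_free = {0, 3, 61} and 64 threads per child, the model starts the listener exactly once, on the second pass, with 3 workers already running. A pass that creates no workers never starts the listener.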


> Would someone care to see if this fixes the worker MPM performance problem reported
> earlier on the list (request-per-second dropping when clients exceeded threadsperchild)?
> This patch defers starting the listener until -all- the workers have started.
>
> Bill
>
> Index: worker.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
> retrieving revision 1.117
> diff -u -r1.117 worker.c
> --- worker.c 18 Apr 2002 17:46:20 -0000 1.117
> +++ worker.c 25 Apr 2002 15:24:51 -0000
> @@ -999,16 +999,15 @@
>                  clean_child_exit(APEXIT_CHILDFATAL);
>              }
>              threads_created++;
> -            if (threads_created == 1) {
> -                /* now that we have a worker thread, it makes sense to create
> -                 * a listener thread (we don't want a listener without a worker!)
> -                 */
> -                create_listener_thread(ts);
> -            }
> +
>          }
>          if (start_thread_may_exit || threads_created == ap_threads_per_child) {
>              break;
>          }
> +
> +        /* All the workers have started. Now start the listener thread */
> +        create_listener_thread(ts);
> +
>          /* wait for previous generation to clean up an entry */
>          apr_sleep(1 * APR_USEC_PER_SEC);
>          ++loops;
>
>


Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Posted by Bill Stoddard <bi...@wstoddard.com>.

> On Fri, Apr 26, 2002 at 11:32:19AM -0400, Paul J. Reder wrote:
> > In my tests, this patch allows existing worker threads to continue
> > processing requests while the new threads are started.
> >
> > In the previous code the server would pause while new threads were
> > being created. The new threads started accepting work immediately,
> > causing the existing threads to starve even though there are a
> > small (but growing) number of new threads.
> >
> > This patch allows the server to maintain a higher level of responsiveness
> > during the ramp up time.
>
> I don't quite understand what you are saying here. AIUI the worker MPM
> creates all threads as soon as it is started, and as an optimization it
> creates the listener thread as soon as there is at least one worker
> thread available. By delaying the startup of the listener thread we're
> merely increasing the amount of time it takes to start a new child and
> start accepting connections.

By deferring the start-up of the listener, we are decreasing the amount of time it takes
to start the new process. My speculation in creating the patch was that we could save time
spent context switching between a few active workers and the listen thread and use that
time to start up the new threads. More speculation...context switching may be particularly
expensive when threads are starting, or conversely, thread starting may be really expensive
when lots of context switches are happening in the process. What is interesting is that,
at least by Paul's measurements, the patch does make a difference.

I think Jeff's comment was close to on target as well. If the listener thread can
efficiently defer accepting connections when there are no workers available, that would
probably accomplish much the same.

Bill

> Please correct me if I'm missing something.
>
> The reason I think you were seeing a pause while new threads were being
> created, as Jeff points out, was because our listener thread was able
> to accept far more connections than we had available workers or would
> have available workers. In the worst case, since we create the listener
> as soon as there is 1 worker, it is possible to have a queue filled
> with ap_threads_per_child accept()ed connections and only 1 worker.
> As soon as the next worker is created the listener is able to accept()
> yet another connection and stuff that into the queue.
>
> And I think I've just realized something else. Since the scoreboard
> is not updated until a worker thread pulls the connection off of the
> queue, the parent is not going to create another child in accordance
> with how many connections are accept()ed. This means that we are able to
> accept up to 2*ThreadsPerChild*number_of_children connections while the
> parent will only count us as having 1/2 that amount of concurrency, and
> therefore will not match the demand. This is another bug in the worker
> MPM that would be fixed if we prevented the listener from accepting more
> connections than workers.

Yep and that is closely related to another problem Paul is tracking down.
process_idle_server maintenance is thrashing a bit when a load spike comes in (ie,
processes are actually being told to shutdown in the midst of a load spike).

Bill

>
> -aaron
>


Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Posted by Aaron Bannert <aa...@clove.org>.
On Fri, Apr 26, 2002 at 11:32:19AM -0400, Paul J. Reder wrote:
> In my tests, this patch allows existing worker threads to continue
> processing requests while the new threads are started.
> 
> In the previous code the server would pause while new threads were
> being created. The new threads started accepting work immediately,
> causing the existing threads to starve even though there are a
> small (but growing) number of new threads.
> 
> This patch allows the server to maintain a higher level of responsiveness
> during the ramp up time.

I don't quite understand what you are saying here. AIUI the worker MPM
creates all threads as soon as it is started, and as an optimization it
creates the listener thread as soon as there is at least one worker
thread available. By delaying the startup of the listener thread we're
merely increasing the amount of time it takes to start a new child and
start accepting connections. Please correct me if I'm missing something.

The reason I think you were seeing a pause while new threads were being
created, as Jeff points out, was because our listener thread was able
to accept far more connections than we had available workers or would
have available workers. In the worst case, since we create the listener
as soon as there is 1 worker, it is possible to have a queue filled
with ap_threads_per_child accept()ed connections and only 1 worker.
As soon as the next worker is created the listener is able to accept()
yet another connection and stuff that into the queue.

And I think I've just realized something else. Since the scoreboard
is not updated until a worker thread pulls the connection off of the
queue, the parent is not going to create another child in accordance
with how many connections are accept()ed. This means that we are able to
accept up to 2*ThreadsPerChild*number_of_children connections while the
parent will only count us as having 1/2 that amount of concurrency, and
therefore will not match the demand. This is another bug in the worker
MPM that would be fixed if we prevented the listener from accepting more
connections than workers.

-aaron
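
The accounting gap described above comes down to simple arithmetic. The helpers below are a sketch with hypothetical names, not worker.c code. They assume what the message states: the queue holds up to ap_threads_per_child accepted connections, and the scoreboard only reflects a connection once a worker has dequeued it.

```c
#include <assert.h>

/* Hypothetical helper: most connections one child can hold at once,
 * counting a full queue plus one connection per busy worker. */
int max_accepted_per_child(int threads_per_child)
{
    assert(threads_per_child > 0);
    return 2 * threads_per_child;   /* queued + in service */
}

/* Hypothetical helper: most connections the parent can see for that
 * child, since queued connections are not yet in the scoreboard. */
int max_visible_per_child(int threads_per_child)
{
    assert(threads_per_child > 0);
    return threads_per_child;       /* in service only */
}
```

Taking ThreadsPerChild 64 and 16 children (the values from the test configuration posted elsewhere in this thread), the children could hold up to 2 * 64 * 16 = 2048 connections while the parent accounts for at most 1024.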

Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Posted by "Paul J. Reder" <re...@remulak.net>.
Jeff,

In my tests, this patch allows existing worker threads to continue
processing requests while the new threads are started.

In the previous code the server would pause while new threads were
being created. The new threads started accepting work immediately,
causing the existing threads to starve even though there are a
small (but growing) number of new threads.

This patch allows the server to maintain a higher level of responsiveness
during the ramp up time.

Paul J. Reder

Jeff Trawick wrote:

> "Bill Stoddard" <bi...@wstoddard.com> writes:
> 
> (I would have quoted but the text is way out at columns 89-92 or so)
> 
> I think the patch is fine, but I can't help but suspect that some of
> the pain you are alleviating is caused by the known problem where the
> listener thread can accept connections when there are no spare workers
> to handle it.
> 
> Yes, the listener thread may do a little more at startup than a worker
> thread, but if it knew better than to go grab connections it couldn't
> service then would it really cause the system to thrash any more than
> it is going to thrash anyway in order to get those threads created and
> dispatched?
> 
> 
>>Index: worker.c
>>===================================================================
>>RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
>>retrieving revision 1.117
>>diff -u -r1.117 worker.c
>>--- worker.c 18 Apr 2002 17:46:20 -0000 1.117
>>+++ worker.c 26 Apr 2002 14:50:43 -0000
>>@@ -948,6 +948,7 @@
>>     apr_status_t rv;
>>     int i;
>>     int threads_created = 0;
>>+    int listener_started = 0;
>>     int loops;
>>     int prev_threads_created;
>>
>>@@ -999,16 +1000,18 @@
>>                 clean_child_exit(APEXIT_CHILDFATAL);
>>             }
>>             threads_created++;
>>-            if (threads_created == 1) {
>>-                /* now that we have a worker thread, it makes sense to create
>>-                 * a listener thread (we don't want a listener without a worker!)
>>-                 */
>>-                create_listener_thread(ts);
>>-            }
>>+
>>+        }
>>+        /* Start the listener only when there are workers available */
>>+        if (!listener_started && threads_created) {
>>+            create_listener_thread(ts);
>>+            listener_started = 1;
>>         }
>>+
>>         if (start_thread_may_exit || threads_created == ap_threads_per_child) {
>>             break;
>>         }
>>+
>>         /* wait for previous generation to clean up an entry */
>>         apr_sleep(1 * APR_USEC_PER_SEC);
>>         ++loops;
>>
> 


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein



Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Posted by Jeff Trawick <tr...@attglobal.net>.
"Bill Stoddard" <bi...@wstoddard.com> writes:

(I would have quoted but the text is way out at columns 89-92 or so)

I think the patch is fine, but I can't help but suspect that some of
the pain you are alleviating is caused by the known problem where the
listener thread can accept connections when there are no spare workers
to handle it.

Yes, the listener thread may do a little more at startup than a worker
thread, but if it knew better than to go grab connections it couldn't
service then would it really cause the system to thrash any more than
it is going to thrash anyway in order to get those threads created and
dispatched?
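
A toy, single-threaded model can illustrate the difference a listener that "knew better" would make. It is a sketch under loud assumptions, not httpd code: one worker finishes starting per tick, a fixed number of connections arrive per tick, no request completes during the modelled window, and peak_backlog() and its parameters are hypothetical names.

```c
#include <assert.h>

/* Returns the largest number of accepted-but-unserviced connections left
 * in the queue after any tick. gate_on_idle != 0 models a listener that
 * only accepts while an idle worker is not already spoken for. */
int peak_backlog(int threads_per_child, int arrivals_per_tick,
                 int gate_on_idle)
{
    int workers = 0, busy = 0, queued = 0, peak = 0;

    assert(threads_per_child > 0 && arrivals_per_tick >= 0);
    for (int tick = 0; tick < threads_per_child; tick++) {
        workers++;                          /* one more worker is up */
        for (int a = 0; a < arrivals_per_tick; a++) {
            int idle = workers - busy;
            if (gate_on_idle && idle - queued <= 0) {
                break;      /* leave it in the kernel's listen backlog */
            }
            if (queued >= threads_per_child) {
                break;      /* queue is full */
            }
            queued++;       /* accept() and push onto the queue */
        }
        /* idle workers pull connections off the queue and stay busy */
        while (queued > 0 && busy < workers) {
            queued--;
            busy++;
        }
        if (queued > peak) {
            peak = queued;
        }
    }
    return peak;
}
```

In this model the gated listener never leaves anything waiting in the queue (the peak is 0), while the ungated one fills the queue toward its capacity as long as arrivals outpace thread creation.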

> Index: worker.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
> retrieving revision 1.117
> diff -u -r1.117 worker.c
> --- worker.c 18 Apr 2002 17:46:20 -0000 1.117
> +++ worker.c 26 Apr 2002 14:50:43 -0000
> @@ -948,6 +948,7 @@
>      apr_status_t rv;
>      int i;
>      int threads_created = 0;
> +    int listener_started = 0;
>      int loops;
>      int prev_threads_created;
> 
> @@ -999,16 +1000,18 @@
>                  clean_child_exit(APEXIT_CHILDFATAL);
>              }
>              threads_created++;
> -            if (threads_created == 1) {
> -                /* now that we have a worker thread, it makes sense to create
> -                 * a listener thread (we don't want a listener without a worker!)
> -                 */
> -                create_listener_thread(ts);
> -            }
> +
> +        }
> +        /* Start the listener only when there are workers available */
> +        if (!listener_started && threads_created) {
> +            create_listener_thread(ts);
> +            listener_started = 1;
>          }
> +
>          if (start_thread_may_exit || threads_created == ap_threads_per_child) {
>              break;
>          }
> +
>          /* wait for previous generation to clean up an entry */
>          apr_sleep(1 * APR_USEC_PER_SEC);
>          ++loops;

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] Possible fix for worker MPM performance problem (Updated patch)

Posted by "Paul J. Reder" <re...@remulak.net>.
I have just finished testing cvs-head against cvs-head + Bill's patch to delay listener
creation. I believe that this patch is still *very* useful and should be applied. Here
is what I did:

Started an Apache with the following config:

<IfModule worker.c>
StartServers         1
ThreadsPerChild      64
MaxClients           1024
MinSpareThreads      64
MaxSpareThreads      192
MaxRequestsPerChild  500000
</IfModule>

Then I started 26 copies of Jeff Trawick's testing program "b" on 7 different machines (all
on a private local network). I used the following parameters for the calls to "b":

./b -c 15 -n 120000 -f replay_file_XXX -v

where XXX is a file from 000 to 026, each of which contains a 120000 line segment from an
actual apache.org 24 hour access log.

Summary of results:

cvs-head displayed an immense amount of process/thread churn (repeatedly starting
more than it needed to, then killing more than it needed to...). This churn continued
during the entire 45-60 minutes of each of the 3 runs I did.

cvs-head + Bill's patch displayed very stable process/thread characteristics right
from the start, growing to handle incoming requests and killing an appropriate number
of processes/threads to keep within bounds. Overall, cvs-head + Bill's had higher
total traffic with higher requests per second and a lower number of requests currently
being processed. cvs-head + Bill's did display a higher cpu load, but I believe this
relates to the fact that cvs-head + Bill's is spending a lot less time doing task
switching and other overhead related to having more (or fewer) processes/threads than
are needed.

Now for the long version, with data for your perusing pleasure...
=================================================================

cvs-head
===================================

Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:34:12 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 44 seconds
Total accesses: 4935 - Total Traffic: 52.7 MB
CPU Usage: u3.28 s11.69 cu0 cs0 - 34% CPU load
112 requests/sec - 1.2 MB/second - 10.9 kB/request
142 requests currently being processed, 49 idle workers

_WWRRR_WWRW_RWW_____R_RWWWW_WWWW_W_WWWW_WWWW___RW__W_W_R__WRW__R
WWWW__WWWWWWW_W_WRRWW_W__WW__WW_W_W_WW_WW_WW__WW_RWWRRWW_W____W_
WWWWWRWWWKWRWS..................................................
RWWWWWWWWWSS....................................................
WRWWWSS.........................................................
WWWWSS..........................................................
WRKWWWS.........................................................
KWWW_CSS........................................................
WRWWWS..........................................................
_S..............................................................
S...............................................................
................................................................
................................................................
................................................................
................................................................
................................................................
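
For reading the scoreboard rows, a small tally helper may be convenient. The sketch below uses the standard mod_status key ('_' waiting for connection, 'S' starting up, 'G' gracefully finishing, '.' open slot, other letters such as 'R', 'W', 'K', 'C' for slots handling a connection). The struct and function names are hypothetical, and the totals are not claimed to match how mod_status itself derives its "requests currently being processed" and "idle workers" figures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-state counters for one or more scoreboard rows. */
struct tally {
    int idle;       /* '_' waiting for connection */
    int starting;   /* 'S' starting up */
    int finishing;  /* 'G' gracefully finishing */
    int open;       /* '.' open slot with no current process */
    int active;     /* any other state character (R, W, K, C, ...) */
};

struct tally tally_scoreboard(const char *rows)
{
    struct tally t = {0, 0, 0, 0, 0};

    assert(rows != NULL);
    for (const char *p = rows; *p != '\0'; p++) {
        switch (*p) {
        case '_':  t.idle++;      break;
        case 'S':  t.starting++;  break;
        case 'G':  t.finishing++; break;
        case '.':  t.open++;      break;
        case '\n':
        case ' ':  break;         /* ignore layout characters */
        default:   t.active++;    break;
        }
    }
    return t;
}
```

Passing the 16 rows of a dump as one newline-separated string gives per-state counts for the whole server. A single row can also be passed to inspect one child.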



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:35:42 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 2 minutes 14 seconds
Total accesses: 17491 - Total Traffic: 478.8 MB
CPU Usage: u12.36 s67.61 cu0 cs0 - 59.7% CPU load
131 requests/sec - 3.6 MB/second - 28.0 kB/request
207 requests currently being processed, 134 idle workers

_W_W__W______WW__W__W___WW____W_WW__W____WW___W_________________
W___WW____W_W_W____WW__W__W_____W_W____W_____________________W__
WRWCWWWCWW_WWWWW_CCWCWCSS.......................................
W__W_W_W_W__W_W_W_____SS........................................
WCW_WWWWW_WKWCWWCSS.............................................
____WWW___WW____SS..............................................
WWWWKWWWWWWWWCWRWSS.............................................
RWWWWWWWWWWWCWWWWCSS............................................
WWWWRRWWRCWWCWWCSS..............................................
WWW____W___SS...................................................
RWWWWCRWWWSS....................................................
RWKWWWWWWSS.....................................................
WWWWWWSS........................................................
WCWWWWCKSS......................................................
WCWWWWRSS.......................................................
RWWCCSS.........................................................



Now if all 16 of those processes start all the way up, with 64 threads each,
we will have 1024 worker threads. Do we *really* need that many?



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:36:32 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 3 minutes 4 seconds
Total accesses: 24094 - Total Traffic: 785.7 MB
CPU Usage: u17.32 s105.95 cu0 cs0 - 67% CPU load
131 requests/sec - 4.3 MB/second - 33.4 kB/request
200 requests currently being processed, 82 idle workers

GWGWGGWGGGGGGGGG.GWGGGGGGW.GGGGG.GGGGGGGGWWWG.GGGGGGWGGGGGGG.GGG
WGGGGGGG.GWGWGWGGGGWGGGGGGGWGWGGWGGGGGGGGWGGGGGGGWGGWGGGGGGGGGGG
W_WW__W_WW___WWWW_________W_SS..................................
WW_WWWWW__W_WW_W_W__W__W___SS...................................
WWW__WWW_WW_WWW_CW___WSS........................................
WWWRWWW_WWW_WWWWWWW_RSS.........................................
RWW__WW_W__W_W_WWW_W__SS........................................
WWWWWWWCW_WWWWWWWKWWWCCSS.......................................
WWRWKWWCWWWCWRWWWWCWWSS.........................................
W___W____W____WWS...............................................
__WW_W_____W___SS...............................................
_W_W_WW_W___W_SS................................................
CWCCWWCCCWWSS...................................................
RWWCWWWWWCCRCSS.................................................
CCCWCWWCWCCSSS..................................................
WWWWWWWWWRSS....................................................


I guess not. Above, we see the first churn happen at just over three minutes.
It eventually got to the point where it had more than 192 idle workers so
it started killing off processes. But we still have a long ways to go before
all the workers for all the processes have been started...
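
The spare-thread thresholds behind this churn can be sketched as a simple decision function. This is a simplified illustration with hypothetical names; the worker MPM's real idle-maintenance logic looks at more state than a single idle count.

```c
#include <assert.h>

enum spare_action { SPAWN_CHILD, KILL_CHILD, NO_CHANGE };

/* Simplified threshold policy: compare the idle-worker count against
 * the MinSpareThreads / MaxSpareThreads settings. */
enum spare_action spare_policy(int idle_workers, int min_spare,
                               int max_spare)
{
    assert(idle_workers >= 0 && min_spare <= max_spare);
    if (idle_workers > max_spare) {
        return KILL_CHILD;      /* too many idle workers */
    }
    if (idle_workers < min_spare) {
        return SPAWN_CHILD;     /* too few idle workers */
    }
    return NO_CHANGE;
}
```

With MinSpareThreads 64 and MaxSpareThreads 192 as configured for this test, the 49 idle workers in the first snapshot call for another child, the 134 in the second call for no change, and the 454 reported later in this run call for killing a child. Because the idle count lags while children are still starting or shutting down, a policy like this can overshoot in both directions.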



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:39:11 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 5 minutes 43 seconds
Total accesses: 42437 - Total Traffic: 1.7 GB
CPU Usage: u12.41 s72.36 cu0 cs0 - 24.7% CPU load
124 requests/sec - 5.1 MB/second - 42.5 kB/request
381 requests currently being processed, 175 idle workers

WGGGWGGGW..G.W.WWWGGGGGGWWWGGGWWGGWGWGGGWGWGGGGGGGGGGGGGGGGGGGGS
GGGGG.GGGGGGGGGGGGGGG.GGGGGGGGGGWGGGGWGGG.GGGGGG................
_WW.WWW_WWW__WW___W__W___WWSSGGW..WGGGG.GGGGGGGG.G.W..GG........
_WW___W_____W___W_SSG..WW.GGWWG.GG.GGG.GWGG.G..GGGGGG.GG........
WW_W._______WW___W_W_WWWW______W_WW_SSGGG.WG....................
WW_WWWW___._________W_W_WSS.....................................
_W_W_WWR_WRSSGGGWG.GGGGGGGGGGGGGGGGWGGGGWGGGWGGWGGGGGGGGGGGG....
RCWWWWWWRWWWCWWRCWWWWW.WWWWCCW___WW________SS.GWGGGG.GGGGGGG....
W_WW_.___W_________W____W____________SSGG.......................
WWW.WWWWWWWWCWWWWWWWWWWWR_________SGGGGGGGGGGGGGGGGGGGG.........
CCWWCWC.C.CRCCWCCWCCC..CC________W_W_W_SSGGGGG.GGGGGGG..........
WCWWWWWWCCWWWWW.WWCWCCW________W___SSGGGWGGGG..GGGGGGG..........
WWCKWCWWCWCWWRC.K____________SSGGGGGGGG.GGGGGGGGGG..............
____WW__WW_____WW_______W________SS.....W.......................
GG.GGGGGGWGGWWGGGGWGGGGGG.GGGGGGGGGGGGG.GGGGGGGGGGG.............
WGGG.GWGG.G.GGGGGGGGG..GGGGGGG.GGGWGGGGGGGGGGG.G................


Now with almost 6 minutes under our belt (above), we see that Apache has
indeed discovered that it started far too many processes and has begun
cleaning many of them up. I believe, because they die slowly, it
kills too many...

Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:41:12 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 7 minutes 44 seconds
Total accesses: 54592 - Total Traffic: 2.4 GB
CPU Usage: u16.24 s126.24 cu0 cs0 - 30.7% CPU load
118 requests/sec - 5.3 MB/second - 46.3 kB/request
10 requests currently being processed, 50 idle workers

WGGG.GGG...G.W..W.GGGGGG.WWGGG..GG.G.GGG.GWGGGGGGGGGGGGGGGGGGGG.
W__W________W___WWW_____W_____________W___________W________S....
W.W.WW.WW....WW..W....WWWW.WW.W..W.W.W......W....W..............
.W..WWW..WW.W...WW..WW..W..WW..W.W..............................
WGG..WGGGGGGWWWGGWG.GW.WWGGGWGG.G...GGGGGGGGGGGG.GGGGG..........
WGWG.GWGWW.WGWGGGGGWWG.G.GGWWGGGWGWWGGGWGGGGGGGGGGGG.GGG........
GGGWWWWWWWRWWGGGGG.GGGGGGGGGGGGGWGGGGG.GGGGGGG.GGGGG............
WWG.GG.GG.WWWGWWGGGGGG.G...GG.WGG..GGGGWGGGGGGG.GGGGG...........
WG.WW.G..GGWWWGGGGG.G.GG.WGGGGWGGWWWGGGGGGGWGGGG................
.WG.GGWGWGG.GWGWGW.WWGWWGGGWGGWW.G.WGGGGGGGGGGGG................
GRGWWWW.G.GGGG.GGWWGG..GGGGGGGGGG.G.G.GGGGGWWGWWGGG.............
WWG.W..WGWWG.W..WWWGWWGWWGGWGWG.WGWGWGGGGGGGGGGGGGGGGGGGG.......
WW.WWGW.GGWGGGW.G.GWGGGGG.GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG......
GWWGW.GGW.GGWGGW.GGG.WGG.GGGGG.GGGGGGGGG.GGGGGG.................
................................................................
WGGG.G.GG.G.GGGGGGGGG..GGGGGGG.GGG.GGGGGGGGGGG.G................


...and by 8 minutes (above) you can see that it is not in good shape. It is
still responding to users, but the responses have continued to get slower in
arriving.


Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:41:52 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 8 minutes 24 seconds
Total accesses: 57795 - Total Traffic: 2.6 GB
CPU Usage: u9.62 s72.62 cu0 cs0 - 16.3% CPU load
115 requests/sec - 5.3 MB/second - 47.4 kB/request
233 requests currently being processed, 209 idle workers

WWWWWWCCW_.__W__C______SSW................W.....................
.__WRWW_WWW_WWWWWWWWW___WW_WWRWRW_C_R______S....................
WWWW.WRWWWWW______W_W____W_.SW..................................
W.WWWWWWC_W_W___WW___W__WSSWW...................................
WWWWWWWWWWWWWWWR_W___W_.W___W_SS................................
WCCC__W__WWW_W_____WW_SGGWGWWWWGW..WGGGWGGGGG.GG.GGGWGGGG.......
____K_______W________________S..................................
.___________.________________SS.................................
.CCWCCRCWRCC___________SS.......................................
____W_W_W___W__W___WW_SWW.......................................
C.CW_W___W_______W_S.......................W..WW................
WWWWCCWCRW________.__SS.........................................
_W___W__W____________S..........................................
WWWKW_W_W___W__W__SG.WGG.GGGGG.GGGGGGGGG.GGGGGG.................
_WW___.______W_SS...............................................
WGGG.G.GG.G.GGGGGGGGG..GGGGGGG.GGG.GGGGGGGGGGG.G................


Now that we realize we need more (above) we start the creation process
again, but because we are sharing space with workers that are shutting down,
we start more than we need to, now ending up with > 192 idle workers again
(and likely to grow, due to more workers coming)...



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 16:45:32 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 12 minutes 4 seconds
Total accesses: 78517 - Total Traffic: 3.9 GB
CPU Usage: u22.21 s192.46 cu0 cs0 - 29.7% CPU load
108 requests/sec - 5.5 MB/second - 51.7 kB/request
210 requests currently being processed, 454 idle workers

W__.W_W___.W_W__.________W__________________________SS..........
.GGWGGGGGGGG.GGGGGGGGGGG.GGWGGGGGGGGGGGGGGGGGGGGGGGGGGG.........
__....W__.W.___W__._.____.___W_____WW___________________SS......
_.__..._WW.WWW_W.W___._WW__WWW_WW__W___________________SS.......
...__.___WWW..WW_.__W.W..___.___W__WWWWWW___________________SS..
W_WW_WW_R..._W_____..__W_.WWW__WW_W__SS.GGGGG.GG.GGG.GGGG.......
___W_W__________________W___W____W__W_W_______SS................
________W_W_____WWW____W___W_________________W____________SS....
._W____WW_____________WWW____WW_____W_______________SS..........
____._.W.___.__.___..W_.WW_WW______________________SS...........
_._.WW_W_.__WW_WW.W_W__WW_____________________SS................
____WWW___.WW_WWW_WW_W_W_______________________SS...............
_WW_W____________W_____WWW_W_____SS.............................
_..W.__W.__W._WW_____._____W____SSGGGGGG.GGGGGG.................
W..WWW.WWCWWW.WWWWWWWWWWWWWWWRWWWWWWWW_______SS.................
WGGG.G.GG.G.GGGGGGGGG..GGGGGGG.GGG.GGGGGGGGGGG.G................


...and sure enough, we've hit 454 idle workers.

This pattern is repeated over and over again during the whole hour
of testing. I have more data if anyone wants to see it, but it roughly
mimics the above. The hour was finished out with the following final
status.


Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 13:48:50

Current Time: Monday, 29-Apr-2002 17:36:31 EDT
Restart Time: Monday, 29-Apr-2002 16:33:28 EDT
Parent Server Generation: 0
Server uptime: 1 hour 3 minutes 3 seconds
Total accesses: 320561 - Total Traffic: 20.7 GB
CPU Usage: u14.51 s150.62 cu0 cs0 - 4.37% CPU load
84.7 requests/sec - 5.6 MB/second - 67.9 kB/request
357 requests currently being processed, 181 idle workers

WWWWRWCWWRWSS....WG..GG..G..G..G....G...G.G.....................
WWWR____WW_WWW_WR_W___W_W_SS....................................
_____W_._W_W____W___.W____SSGGGGGGGGGGGGGGGGGGGGGGGGGGG.GGGGGGG.
W.WWCWWWWCWRSS..................................................
_W_W_._WW_.W___WWWW__WWW_W_SS.GGWGG.GGG.GGGWGG.GG.GG..GG.GG.....
W_.__._R_WW___WW_W__WSSGG.G.GGG.G.GG.G.G.G.GGGGGGGGGGGGG........
RR__RW____WR__WW___W______SS....................................
WCWCCWWWWCCWWWCWCWCSS...........................................
____WW_____________W_W_W_W__W__W__W_________W______W__WW_W___SS.
_____W__W___W_W_______SSGGGGGGGG.GGGGGG.........................
__.______.__W._W.__W____SSGGGW.G.GGGGGGGGGGWGGG.WGGG.GGG.GG.GG..
___CWWWCWW_WWWW_WWWWCWSS........................................
WWRWWWWWRWWWWWSS................................................
GWGGGGGGG.GGGGGGWGGG.GW.G.GGGGG.GWG.GGGGGGGGGGGGGGG.GGGGGGWGGGG.
_W___W__W__W____W___________SS..................................
KWWWWW.WWWWWWCWWWWCCSS.GGGGGGGGGGGGGG.GGWG.GG.GGGGGG............


============================================================================
============================================================================
============================================================================
============================================================================

cvs-head + Bill's
============================================


Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:09:59 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 1 minute 13 seconds
Total accesses: 4607 - Total Traffic: 124.9 MB
CPU Usage: u3.22 s15.85 cu0 cs0 - 26.1% CPU load
63.1 requests/sec - 1.7 MB/second - 27.8 kB/request
117 requests currently being processed, 75 idle workers

WWWWRW_WRWWWWR___WWWW_WW__WW_WWRWWWW_W_W___W_W_WWWR_WW_WWWRW____
WW___WWWWWW_WRWRWWWWW_W__WWWWWWWWWWWRWWWWWWWWWWW_WRWWR_WWWWRWWWW
W________________W__________R_W__W__W____WWW__R_WWR_WWR___WW___W
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................



The rps ramp-up is a little slower for this run because there aren't as many
processes starting threads in parallel...



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:10:26 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 1 minute 40 seconds
Total accesses: 8351 - Total Traffic: 216.0 MB
CPU Usage: u5.79 s31.74 cu0 cs0 - 37.5% CPU load
83.5 requests/sec - 2.2 MB/second - 26.5 kB/request
128 requests currently being processed, 128 idle workers

WWWWWWWWWWWWWW__WWWWWWWW_WWWWWWWWWWWWWWWWWWWWWWWWW__WWW_WWWWW_WW
R___W__W_W__________W_________WW_W___W_____R_W___________WW____W
WWWWWWWWW___WWWWWW_WW__WWWWWWW_WW___W_W_WWWWWWWWW__WWWWW_WWW__WW
__W_____W________WW_____W___R_____W______R__RR__________________
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................



But you can see that about a minute into handling the load spike (above)
it is up to a respectable rps level (compared to the cvs-head only run)
and has displayed no churn.
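
A side note on reading the numbers: the requests/sec line on the status page
appears to be a lifetime average (total accesses divided by uptime), so during
ramp-up it understates the current rate. A minimal Python sketch, using only
the two snapshots above (the function names are mine, purely for illustration):

```python
def lifetime_rps(total_accesses, uptime_s):
    """Average request rate since the last restart."""
    return total_accesses / uptime_s


def interval_rps(a0, t0, a1, t1):
    """Request rate between two snapshots (accesses, uptime in seconds)."""
    return (a1 - a0) / (t1 - t0)


first = (4607, 73)    # 18:09:59 snapshot: 1 minute 13 seconds of uptime
second = (8351, 100)  # 18:10:26 snapshot: 1 minute 40 seconds of uptime

print(round(lifetime_rps(*first), 1))            # 63.1, matches the status page
print(round(lifetime_rps(*second), 1))           # 83.5, matches the status page
print(round(interval_rps(*first, *second), 1))   # 138.7 over the 27 seconds between
```

So over those 27 seconds the server was handling requests faster than the
averaged figure suggests.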




Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:12:02 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 3 minutes 16 seconds
Total accesses: 22300 - Total Traffic: 643.1 MB
CPU Usage: u15.93 s87.42 cu0 cs0 - 52.7% CPU load
114 requests/sec - 3.3 MB/second - 29.5 kB/request
175 requests currently being processed, 81 idle workers

__WW___W_WW_WW___WWW___WW__WWWWWWWWWWWWWWW__WWW_RWW_W_W_WWWWWW__
WWW_WW_WW_WWWW____WWW_WW__WWWWWW_WWW_WWWWWWW_WWWWW_W_WWWWWWWWWWW
WWWWWWWWWWW_WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
_W____WWW___W__W___W__W_W__WWWW_W__W_R______WW_____W______WW_W__
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................


By the 3 minute mark (above) Apache is in full swing and has never
had to kill off any processes. It started just what it needed to handle
a nice smooth ramp up.



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:12:35 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 3 minutes 49 seconds
Total accesses: 26517 - Total Traffic: 802.9 MB
CPU Usage: u18.9 s106.51 cu0 cs0 - 54.8% CPU load
116 requests/sec - 3.5 MB/second - 31.0 kB/request
171 requests currently being processed, 149 idle workers

___WW_WWWW____W__WWW_WWWW_W_WWWWW__WWWWWWWWWWWW__W_WW_WWWW_W__W_
_WWWWWWWWWW_WWWWW_WWWWWWWWWWWWWWW_WW_WW_W_W__WWWWWWWWWWWWWWWW_WW
W_WW___WWW__W__WWW_W_WW_WW_W_W__WW_____WWWW___WW__WW__WWWWW__W__
__WWWWWWWWW_WWWW___WWWWWWWWW_WW_WWW___W_W__WWW__W_WWWW_W_W_WWWW_
________________________________________________________________
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................


At about 3 1/2 minutes (above) it hits its peak. The number of long-running
responses has started to impact the rps. From this point on the rps
will drop slightly (eventually down to about 95 rps) and the average
request size will grow (to about 70 kB/request). In the status above it is
starting to add another process's worth of threads.
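
For anyone who wants to eyeball these scoreboards mechanically, here is a rough
Python sketch that labels each 64-slot row. It assumes each row corresponds to
one child process (i.e. ThreadsPerChild is 64, which I am inferring from the
output, not from the config), and it uses the usual mod_status key: "_" waiting,
"W" sending reply, "G" gracefully finishing, "." open slot. The labels and the
function name are made up for illustration.

```python
ROW_WIDTH = 64  # width of each scoreboard line in the status output


def summarize_rows(scoreboard):
    """Return one label per 64-slot row of a scoreboard string."""
    slots = "".join(scoreboard.split())  # drop newlines and spaces
    labels = []
    for start in range(0, len(slots), ROW_WIDTH):
        row = slots[start:start + ROW_WIDTH]
        states = set(row)
        if states == {"."}:
            labels.append("unused")
        elif states == {"_"}:
            labels.append("all idle")   # e.g. a process that has just come up
        elif "G" in states:
            labels.append("exiting")    # threads finishing gracefully
        else:
            labels.append("active")
    return labels


# Synthetic three-row example, not copied from the output above.
demo = "W_" * 32 + "\n" + "_" * 64 + "\n" + "." * 64
print(summarize_rows(demo))  # ['active', 'all idle', 'unused']
```

A row of nothing but "_" is what a freshly added process looks like before it
has picked up any connections.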




Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:34:11 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 25 minutes 25 seconds
Total accesses: 151003 - Total Traffic: 9.3 GB
CPU Usage: u116.53 s831.64 cu0 cs0 - 62.2% CPU load
99 requests/sec - 6.3 MB/second - 64.8 kB/request
224 requests currently being processed, 160 idle workers

WW_WWW_W_WWWWWWW__WW_WWWWW_W_W_WWWW_W__WWW_WWW_W_W_WWW_WWW_WW_WW
_WW_WWW_WWWWWWW_R_WWWWWWW_WW__WWWWWWWW_W_WWWWWWWWW__WW_WWWWWWWW_
___W_W_W__WW___W_W____WW__W__W____WW_W_WW___W_W__W_____W__W__WWW
WW_W_WW_W_WW___WW_W___WWWWW_WW___W____W___WW_W_WWW_W___W________
WWWWW_WWWWW_W_WWWWWWWW_W___W__WWW__WWWWWWW_WW_W_WWW__W_WWWWW_WW_
W_W_W_WWWW____W_W_____WW__WWWW_WW_W_WWW_WWW_W_W__WWW_W____WW____
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................



By 25 minutes it has added two more processes to handle the workload
backlog generated by the growing number of long-running responses.




Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:45:39 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 36 minutes 53 seconds
Total accesses: 209872 - Total Traffic: 13.6 GB
CPU Usage: u162.62 s1233.66 cu0 cs0 - 63.1% CPU load
94.8 requests/sec - 6.3 MB/second - 68.0 kB/request
177 requests currently being processed, 143 idle workers

GGGGWWWWWGWW.G..G.GWWG.W.W..GGGGGGGG.GGWG.G..G.WGWGGG.GGWW.WWGG.
WWWWWW_WWW_WW_WW__WWWW_W_W_W_W_WWWWWW__W__WW_W_W__WWWWW_WWWW_WWW
W_WW__W__W_W_WW__WW_W_____WW_WWWW_WW_W_WWWWWWW___WW___WWW_W_WW_W
__WW_W__WW_____W_W______WW__W__WWWWW_WWW_WWW___WWWW_W_W_WW__WWW_
WW__WWW___WWW_W_WWW_WWW_W__WW_WW_WWW__WW___W_WWW__WW_WWWW_WWW_W_
____W__W__WW__W_WWW_WWWW__W__WWW_WW_W_______W___W__WWW__W___W___
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................


At about 35 minutes (above), enough of the long-running responses had cleared
out that the idle worker count rose above 192, so it killed off one of the
processes.



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:48:49 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 40 minutes 3 seconds
Total accesses: 228745 - Total Traffic: 14.8 GB
CPU Usage: u177.24 s1360.57 cu0 cs0 - 64% CPU load
95.2 requests/sec - 6.3 MB/second - 67.7 kB/request
187 requests currently being processed, 133 idle workers

GGGGW....GW..G..G.G..G.W....GGGGGGGG.GG.G.G..G..G.GGG.GG.W..WGG.
G..WG.W.G...WG.GWGWW.GGGGG...GGWWW.WW.GWGG.WG.GW.G...G.G.G..GWWG
W__W_W___W____WW_WW__WW__W_W_W______WWWW__WWWW___WW__W_W_W___W_W
WWWWWWW__WW___WWW_W___W__WWW_W__WW_WWWW_W____WWWWW_WWW_WW_WWWWW_
WW__WWWW__W__WW_W_W_WW_WWW__WWWW__WWWWW_WWW_WW_WWWWW_WW____WWW__
__W_WWWWWW____WWWWWWWWW_W__W_WWWW_W_W_WWW___WWW__W_W_W__WWWW_WWW
_W___W_W_W_WWWW_____WWWW__W_W_WW_W_WWW_W_WW__W_WW_WW_WWWWWWWWWWW
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................


By 40 minutes (above) the number of idle workers rose above 192 again,
so it killed off another process. This actually turned out to be slightly
too many workers killed...
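
The behaviour described here can be sketched roughly as follows. This is a
simplified Python model of the idle-worker check, not the actual worker.c
logic: the 192 threshold comes from the numbers above, while the lower
threshold of 128 is a hypothetical value picked for the example, and the
sketch ignores details such as how many processes get started at once.

```python
def maintenance_action(idle_workers, min_spare, max_spare):
    """Decide what a periodic idle-worker check would do (simplified)."""
    if idle_workers > max_spare:
        return "kill one process"
    if idle_workers < min_spare:
        return "start one process"
    return "no change"


MAX_SPARE = 192  # threshold mentioned in the text above
MIN_SPARE = 128  # hypothetical, for illustration only

print(maintenance_action(193, MIN_SPARE, MAX_SPARE))  # kill one process
print(maintenance_action(150, MIN_SPARE, MAX_SPARE))  # no change
print(maintenance_action(120, MIN_SPARE, MAX_SPARE))  # start one process
```

Since a whole process's worth of threads goes away at once, crossing the upper
threshold by a small margin can leave the server short of idle workers a little
later, which would then trigger a new process.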




Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 18:59:18 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 50 minutes 32 seconds
Total accesses: 288883 - Total Traffic: 18.6 GB
CPU Usage: u223.01 s1740.45 cu0 cs0 - 64.8% CPU load
95.3 requests/sec - 6.3 MB/second - 67.5 kB/request
214 requests currently being processed, 170 idle workers

................................................................
G..WG...G....G.G.G...GGGGG...GGW...WW.G.GG..G.G..G...G.G.G..G..G
WWW_WW__WW__W_W_W_W_WW__W__W___WWW__WW_WWWW___W_W_WW_WWW___W__WW
_W_W____WWW_WWW__WWW__W__WWW_WW__WWW_WW_W__W__W_WW_WW_WW_W_WW_WW
WWWWWWWW_W_WWWWWW_WWWW_WWW_WW_W_WW__W__W_W_WWWWWW__W__W_WWW_WWW_
_W_W_W_WW_WWW_WWWW__W_WWW__WWW___WW_WW_WW__W____W__W__WW_WWW__WW
_WW_WWWWWWW_W_WWWWW_W_WWWWW_W_WWWWWWW___WWW_____WW_WW__WWW_WWWWW
_WW_W__W______WW___W_W_W__W__W____W_______WW_W__WW___WW_______WW
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................



...so by the 50 minute mark it had started up another process. By this
time (above) you can see that the first killed process has finished
being cleaned up.



Apache Server Status for www.apache.org

Server Version: Apache/2.0.37-dev (Unix)
Server Built: Apr 29 2002 17:54:46

Current Time: Monday, 29-Apr-2002 19:08:53 EDT
Restart Time: Monday, 29-Apr-2002 18:08:46 EDT
Parent Server Generation: 0
Server uptime: 1 hour 7 seconds
Total accesses: 339622 - Total Traffic: 22.2 GB
CPU Usage: u263.19 s2077.39 cu0 cs0 - 64.9% CPU load
94.2 requests/sec - 6.3 MB/second - 68.4 kB/request
219 requests currently being processed, 165 idle workers

................................................................
G..WG...G....G.G.G...GGGGG...GG....W..G.GG..G.G..G...G.G.G..G..G
W_WWWWWWWWWWWWWWWWWWWWWWWWW_W_WWW_WWWWW_W_WWWWWWW_WW_WW_W_WW__WW
W____WW__WWWWW___W_W_W__WW_WWWWW__WW_WW____W__W_W_WW__W___W__WW_
__WWWWWWWW_WW___RW_W_WW_W_W_WWWWW_WWW_WWWWWWWWWWWWWWWWW__WW__WWW
W___WWW_W______WW_____W__W_WWWW_W___WW__W____W____________WW_W__
WWW_WWWWWW_WW_WW_WWW_WWWWWW_WW_WW__WWW_W_WWW_WW_WW_WWWWW_W_WWRWW
WW___WW___W_________WW_W__W_W_____WWW___________WWW_WW______W_W_
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................



And for an apples-to-apples comparison, here is the status at the
1 hour mark (above). Requests per second has remained close to 100
the whole time. An extra 2 GB of data was sent and about 20,000 extra
requests were processed, with less system thrashing.


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein