You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "Paul J. Reder" <re...@raleigh.ibm.com> on 2001/03/20 11:32:45 UTC

[Patch]: Suggested patch to fix perform_idle_server_cleanup window.

This patch closes the window where a child in the startup state (with locks)
blocks the cleanup processing from occurring, thus setting a lower limit for
the number of child processes that can be achieved during idle state.

If this patch is acceptable, then I can look at applying the same principle
to the other mpms (as appropriate), then commit them all at once (once I
have enough karma).

I deliberately used a different flag for the idle kill to avoid overloading
the shutdown and restart concepts.

The shift in the position of the break in mpm_common is due to the fact that,
although the kids are cleaned up successfully, they go away indirectly rather
than directly by signal which takes slightly longer. This just avoids the
annoying set of messages indicating that "some kids didn't go away yet so they
are being killed again".

-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein


Index: httpd-2.0/include/ap_mpm.h
===================================================================
RCS file: /home/cvspublic/httpd-2.0/include/ap_mpm.h,v
retrieving revision 1.24
diff -u -r1.24 ap_mpm.h
--- httpd-2.0/include/ap_mpm.h	2001/03/02 22:46:31	1.24
+++ httpd-2.0/include/ap_mpm.h	2001/03/20 14:52:00
@@ -121,9 +121,17 @@
  * predicate indicating if a graceful stop has been requested ...
  * used by the connection loop 
  * @return 1 if a graceful stop has been requested, 0 otherwise
- * @deffunc int ap_graceful_stop_signalled*void)
+ * @deffunc int ap_graceful_stop_signalled(*void)
  */
 AP_DECLARE(int) ap_graceful_stop_signalled(void);
+
+/**
+ * predicate indicating if an idle cleanup has been requested ...
+ * Used by the perform_idle_server_maintenance code. 
+ * @return 1 if an idle kill has been requested, 0 otherwise
+ * @deffunc int ap_idle_die_signalled(*void)
+ */
+AP_DECLARE(int) ap_idle_die_signalled(void);
 
 /**
  * Spawn a process with privileges that another module has requested
Index: httpd-2.0/include/scoreboard.h
===================================================================
RCS file: /home/cvspublic/httpd-2.0/include/scoreboard.h,v
retrieving revision 1.14
diff -u -r1.14 scoreboard.h
--- httpd-2.0/include/scoreboard.h	2001/02/24 03:40:48	1.14
+++ httpd-2.0/include/scoreboard.h	2001/03/20 14:52:01
@@ -100,7 +100,8 @@
 #define SERVER_GRACEFUL 8	/* server is gracefully finishing request */
 #define SERVER_ACCEPTING 9	/* thread is accepting connections */
 #define SERVER_QUEUEING	10      /* thread is putting connection on the queue */
-#define SERVER_NUM_STATUS 11	/* number of status settings */
+#define SERVER_IDLE_KILL 11     /* Server is cleaning up idle children. */
+#define SERVER_NUM_STATUS 12	/* number of status settings */
 
 /* A "virtual time" is simply a counter that indicates that a child is
  * making progress.  The parent checks up on each child, and when they have
@@ -148,6 +149,10 @@
     SB_NOT_SHARED = 2
 } ap_scoreboard_e;
 
+#define SB_WORKING  0  /* The server is busy and the child is useful. */
+#define SB_IDLE_DIE 1  /* The server is idle and the child is superfluous. */
+                       /*   The child should check for this and exit gracefully. */
+
 /* stuff which is thread/process specific */
 typedef struct {
 #ifdef OPTIMIZE_TIMEOUTS
@@ -165,6 +170,7 @@
     unsigned long my_bytes_served;
     unsigned long conn_bytes;
     unsigned short conn_count;
+    unsigned short life_status;    /* Either SB_WORKING or SB_IDLE_DIE */
     apr_time_t start_time;
     apr_time_t stop_time;
 #ifdef HAVE_TIMES
Index: httpd-2.0/modules/generators/mod_status.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/modules/generators/mod_status.c,v
retrieving revision 1.33
diff -u -r1.33 mod_status.c
--- httpd-2.0/modules/generators/mod_status.c	2001/02/28 18:41:44	1.33
+++ httpd-2.0/modules/generators/mod_status.c	2001/03/20 14:52:55
@@ -764,6 +764,8 @@
     status_flags[SERVER_BUSY_LOG] = 'L';
     status_flags[SERVER_BUSY_DNS] = 'D';
     status_flags[SERVER_GRACEFUL] = 'G';
+    status_flags[SERVER_IDLE_KILL] = 'I';
+
 }
 
 static void register_hooks(apr_pool_t *p)
Index: httpd-2.0/server/mpm_common.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/mpm_common.c,v
retrieving revision 1.43
diff -u -r1.43 mpm_common.c
--- httpd-2.0/server/mpm_common.c	2001/03/02 22:46:31	1.43
+++ httpd-2.0/server/mpm_common.c	2001/03/20 14:54:09
@@ -126,9 +126,9 @@
             switch (tries) {
             case 1:     /*  16ms */
             case 2:     /*  82ms */
-                break;
             case 3:     /* 344ms */
             case 4:     /*  16ms */
+                break;
             case 5:     /*  82ms */
             case 6:     /* 344ms */
             case 7:     /* 1.4sec */
Index: httpd-2.0/server/mpm/prefork/prefork.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/mpm/prefork/prefork.c,v
retrieving revision 1.170
diff -u -r1.170 prefork.c
--- httpd-2.0/server/mpm/prefork/prefork.c	2001/03/02 22:46:32	1.170
+++ httpd-2.0/server/mpm/prefork/prefork.c	2001/03/20 14:54:30
@@ -362,7 +362,8 @@
 
 static void just_die(int sig)
 {
-    clean_child_exit(0);
+    /* clean_child_exit(0); */
+    ap_scoreboard_image->servers[my_child_num][0].life_status = SB_IDLE_DIE;
 }
 
 static int volatile deferred_die;
@@ -530,6 +531,16 @@
     return 0;
 }
 
+int ap_idle_die_signalled(void)
+{
+    ap_sync_scoreboard_image();
+    if (deferred_die ||
+	ap_scoreboard_image->servers[my_child_num][0].life_status == SB_IDLE_DIE) {
+	return 1;
+    }
+    return 0;
+}
+
 static void child_main(int child_num_arg)
 {
     ap_listen_rec *lr;
@@ -540,9 +551,9 @@
     apr_status_t stat = APR_EINIT;
     int sockdes;
 
+    my_child_num = child_num_arg;
     ap_my_pid = getpid();
     csd = NULL;
-    my_child_num = child_num_arg;
     requests_this_child = 0;
     last_lr = NULL;
 
@@ -568,7 +579,7 @@
     apr_signal(SIGHUP, just_die);
     apr_signal(SIGTERM, just_die);
 
-    while (!ap_graceful_stop_signalled()) {
+    while ((!ap_graceful_stop_signalled()) && (!ap_idle_die_signalled())) {
 
 	/* Prepare to receive a SIGWINCH due to graceful restart so that
 	 * we can exit cleanly.
@@ -656,7 +667,7 @@
 	     */
 	    usr1_just_die = 0;
 	    for (;;) {
-		if (deferred_die) {
+		if (ap_idle_die_signalled()) {
 		    /* we didn't get a socket, and we were told to die */
 		    clean_child_exit(0);
 		}
@@ -763,7 +774,7 @@
 		}
 	    }
 
-	    if (ap_graceful_stop_signalled()) {
+	    if ((ap_graceful_stop_signalled()) || (ap_idle_die_signalled())) {
 		clean_child_exit(0);
 	    }
 	    usr1_just_die = 1;
@@ -827,6 +838,7 @@
 	apr_signal(SIGQUIT, SIG_DFL);
 #endif
 	apr_signal(SIGTERM, just_die);
+        ap_scoreboard_image->servers[slot][0].life_status = SB_WORKING;
 	child_main(slot);
     }
 
@@ -877,6 +889,7 @@
 	apr_signal(SIGHUP, just_die);
 	apr_signal(SIGWINCH, just_die);
 	apr_signal(SIGTERM, just_die);
+        ap_scoreboard_image->servers[slot][0].life_status = SB_WORKING;
 	child_main(slot);
     }

Re: [Patch]: Suggested patch to fix perform_idle_server_cleanup window.

Posted by "Paul J. Reder" <re...@raleigh.ibm.com>.
Greg Ames wrote:
> 
> "Paul J. Reder" wrote:
> >
> >
> > These two functions could easily be merged,
> 
> OK, that sounds like a good first step.  But better still would be to
> make this so simple for the mainline code that no function or macro is
> called for.  Just test one variable.
 ....
> I'm suggesting that the question, "Should I keep working?" is very
> simple conceptually, so the code to perform the test in the main loop
> should be dead simple.  We can use your new field in the worker's slot
> of the scoreboard as the one and only place that needs to be checked to
> answer such a question.  If we do that, there's no need for function
> calls or macros, which makes things simpler still.  Yes, we would have
> to make the code that initiates graceful restart a tiny bit more
> complex, but I think that's a fair tradeoff.  The generation numbers can
> remain separate, but unused for this purpose.
> 
> Without your patch, we are touching 4 separate variables (deferred_die,
> ap_my_generation, ap_scoreboard_image, and
> ap_scoreboard_image->global.running_generation ) to answer "Should I
> keep working?".  Worse case, that's 4 data cache lines that must be
> either already present in the cache or fetched, at two different spots
> in the request loop separated by what could be a long time delay.  That
> seems like a lot, so I am advocating making the situation better by
> reducing rather than increasing the number of variables accessed.

I agree. When I wrote the initial version I was trying to maintain the
current separation of concepts, but felt it made for excessive code path.
I have rewritten the code to check for just one variable. I am currently
working to move the code to set the idle_die flag to the server during
restart to eliminate this code in the worker.

-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein

Re: [Patch]: Suggested patch to fix perform_idle_server_cleanup window.

Posted by Greg Ames <gr...@remulak.net>.
"Paul J. Reder" wrote:
> 
> 
> These two functions could easily be merged, 

OK, that sounds like a good first step.  But better still would be to
make this so simple for the mainline code that no function or macro is
called for.  Just test one variable.
> 
> >
> > I could see moving some of the path length into the graceful
> > restart/shutdown code, since it doesn't run very often.  It could walk
> > thru the scoreboard and put some appropriate DIE value into the
> > life_status field for each worker.  The signal handler that now sets
> > deferred_die could set some appropriate DIE value into its own
> > scoreboard slot instead. Then the checks in the request loop could be
> > simplified to something like (scoreboard[my_slot].life_status !=
> > SB_WORKING).  Does that make sense?
> 
> Are you suggesting merging the restart (generation) and idle cleanup flags?

I'm suggesting that the question, "Should I keep working?" is very
simple conceptually, so the code to perform the test in the main loop
should be dead simple.  We can use your new field in the worker's slot
of the scoreboard as the one and only place that needs to be checked to
answer such a question.  If we do that, there's no need for function
calls or macros, which makes things simpler still.  Yes, we would have
to make the code that initiates graceful restart a tiny bit more
complex, but I think that's a fair tradeoff.  The generation numbers can
remain separate, but unused for this purpose.

Without your patch, we are touching 4 separate variables (deferred_die,
ap_my_generation, ap_scoreboard_image, and
ap_scoreboard_image->global.running_generation ) to answer "Should I
keep working?".  Worse case, that's 4 data cache lines that must be
either already present in the cache or fetched, at two different spots
in the request loop separated by what could be a long time delay.  That
seems like a lot, so I am advocating making the situation better by
reducing rather than increasing the number of variables accessed.  

Thanks,
Greg

Re: [Patch]: Suggested patch to fix perform_idle_server_cleanup window.

Posted by "Paul J. Reder" <re...@raleigh.ibm.com>.
Greg Ames wrote:
> 
> "Paul J. Reder" wrote:
> >
> > @@ -568,7 +579,7 @@
> >      apr_signal(SIGHUP, just_die);
> >      apr_signal(SIGTERM, just_die);
> >
> > -    while (!ap_graceful_stop_signalled()) {
> > +    while ((!ap_graceful_stop_signalled()) && (!ap_idle_die_signalled())) {
> >
> 
> This is part of the mainline request loop.  Couldn't this be optimized
> so that we don't need to check two conditions, each of which is a
> function call checking two conditions, one of which is deferred_die in
> both functions?

These two functions could easily be merged, and with a very small amount of
effort turned into macros. My main reason for not doing that was to make the
idle cleanup code clear and separate.

> 
> I could see moving some of the path length into the graceful
> restart/shutdown code, since it doesn't run very often.  It could walk
> thru the scoreboard and put some appropriate DIE value into the
> life_status field for each worker.  The signal handler that now sets
> deferred_die could set some appropriate DIE value into its own
> scoreboard slot instead. Then the checks in the request loop could be
> simplified to something like (scoreboard[my_slot].life_status !=
> SB_WORKING).  Does that make sense?

Are you suggesting merging the restart (generation) and idle cleanup flags? 
I was deliberately keeping those concepts separate due to the difference in
scope (all kids versus just the few extras). If folks agree that merging the
flag from the workers perspective isn't a problem then the code to initiate a
graceful restart would just tag all the kids, and the idle cleanup code would
just tag the one to be cleaned up.

Anyone else want to voice an opinion.

> 
> > -           if (ap_graceful_stop_signalled()) {
> > +           if ((ap_graceful_stop_signalled()) || (ap_idle_die_signalled())) {
> >                 clean_child_exit(0);
> >             }
> >             usr1_just_die = 1;
> 
> same comment as above.

Same answer as above.

> 
> Greg

-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein

Re: [Patch]: Suggested patch to fix perform_idle_server_cleanup window.

Posted by Greg Ames <gr...@remulak.net>.
"Paul J. Reder" wrote:
> 
> @@ -568,7 +579,7 @@
>      apr_signal(SIGHUP, just_die);
>      apr_signal(SIGTERM, just_die);
> 
> -    while (!ap_graceful_stop_signalled()) {
> +    while ((!ap_graceful_stop_signalled()) && (!ap_idle_die_signalled())) {
> 

This is part of the mainline request loop.  Couldn't this be optimized
so that we don't need to check two conditions, each of which is a
function call checking two conditions, one of which is deferred_die in
both functions?  

I could see moving some of the path length into the graceful
restart/shutdown code, since it doesn't run very often.  It could walk
thru the scoreboard and put some appropriate DIE value into the
life_status field for each worker.  The signal handler that now sets
deferred_die could set some appropriate DIE value into its own
scoreboard slot instead. Then the checks in the request loop could be
simplified to something like (scoreboard[my_slot].life_status !=
SB_WORKING).  Does that make sense?

> -           if (ap_graceful_stop_signalled()) {
> +           if ((ap_graceful_stop_signalled()) || (ap_idle_die_signalled())) {
>                 clean_child_exit(0);
>             }
>             usr1_just_die = 1;

same comment as above.

Greg