You are viewing a plain text version of this content. The canonical link for it is here.
Posted to cvs@httpd.apache.org by yl...@apache.org on 2022/04/14 14:38:03 UTC

svn commit: r1899858 - /httpd/httpd/trunk/server/mpm/event/event.c

Author: ylavic
Date: Thu Apr 14 14:38:03 2022
New Revision: 1899858

URL: http://svn.apache.org/viewvc?rev=1899858&view=rev
Log:
mpm_event: Handle children killed pathologically.

If children processes get killed (SIGSEGV/SIGABRT/..) early after starting or
frequently enough then we never enter perform_idle_server_maintenance() to
try something.

Below three successive children killed restart them immediately, above three
let's sleep the usual 1s (to avoid fork()s flood) and do the idle maintenance.


Modified:
    httpd/httpd/trunk/server/mpm/event/event.c

Modified: httpd/httpd/trunk/server/mpm/event/event.c
URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/mpm/event/event.c?rev=1899858&r1=1899857&r2=1899858&view=diff
==============================================================================
--- httpd/httpd/trunk/server/mpm/event/event.c (original)
+++ httpd/httpd/trunk/server/mpm/event/event.c Thu Apr 14 14:38:03 2022
@@ -3382,6 +3382,7 @@ static void server_main_loop(int remaini
 {
     int num_buckets = retained->mpm->num_buckets;
     int max_daemon_used = 0;
+    int successive_signals = 0;
     int child_slot;
     apr_exit_why_e exitwhy;
     int status, processed_status;
@@ -3460,11 +3461,31 @@ static void server_main_loop(int remaini
             /* Don't perform idle maintenance when a child dies,
              * only do it when there's a timeout.  Remember only a
              * finite number of children can die, and it's pretty
-             * pathological for a lot to die suddenly.
+             * pathological for a lot to die suddenly.  If that happens
+             * anyway, protect against fork()+kill() flood by not restarting
+             * more than 3 children if no timeout happened in between,
+             * otherwise we keep going with idle maintenance.
              */
-            continue;
+            if (child_slot < 0 || !APR_PROC_CHECK_SIGNALED(exitwhy)) {
+                continue;
+            }
+            if (++successive_signals >= 3) {
+                if (successive_signals % 10 == 3) {
+                    ap_log_error(APLOG_MARK, APLOG_WARNING, 0,
+                                 ap_server_conf, APLOGNO(10392)
+                                 "children are killed successively!");
+                }
+                apr_sleep(apr_time_from_sec(1));
+            }
+            else {
+                ++remaining_children_to_start;
+            }
+        }
+        else {
+            successive_signals = 0;
         }
-        else if (remaining_children_to_start) {
+
+        if (remaining_children_to_start) {
             /* we hit a 1 second timeout in which none of the previous
              * generation of children needed to be reaped... so assume
              * they're all done, and pick up the slack if any is left.



Re: svn commit: r1899858 - /httpd/httpd/trunk/server/mpm/event/event.c

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Apr 14, 2022 at 6:11 PM Ruediger Pluem <rp...@apache.org> wrote:
>
> On 4/14/22 4:38 PM, ylavic@apache.org wrote:
> >
> > +            if (++successive_signals >= 3) {
> > +                if (successive_signals % 10 == 3) {
> > +                    ap_log_error(APLOG_MARK, APLOG_WARNING, 0,
> > +                                 ap_server_conf, APLOGNO(10392)
> > +                                 "children are killed successively!");
> > +                }
> > +                apr_sleep(apr_time_from_sec(1));
>
> Why do we need the sleep here? Why can't we just do continue?

Much better thanks, r1899865.

Regards;
Yann.

Re: svn commit: r1899858 - /httpd/httpd/trunk/server/mpm/event/event.c

Posted by Ruediger Pluem <rp...@apache.org>.

On 4/14/22 4:38 PM, ylavic@apache.org wrote:
> Author: ylavic
> Date: Thu Apr 14 14:38:03 2022
> New Revision: 1899858
> 
> URL: http://svn.apache.org/viewvc?rev=1899858&view=rev
> Log:
> mpm_event: Handle children killed pathologically.
> 
> If children processes get killed (SIGSEGV/SIGABRT/..) early after starting or
> frequently enough then we never enter perform_idle_server_maintenance() to
> try something.
> 
> Below three successive children killed restart them immediately, above three
> let's sleep the usual 1s (to avoid fork()s flood) and do the idle maintenance.
> 
> 
> Modified:
>     httpd/httpd/trunk/server/mpm/event/event.c
> 
> Modified: httpd/httpd/trunk/server/mpm/event/event.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/mpm/event/event.c?rev=1899858&r1=1899857&r2=1899858&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/mpm/event/event.c (original)
> +++ httpd/httpd/trunk/server/mpm/event/event.c Thu Apr 14 14:38:03 2022
> @@ -3382,6 +3382,7 @@ static void server_main_loop(int remaini
>  {
>      int num_buckets = retained->mpm->num_buckets;
>      int max_daemon_used = 0;
> +    int successive_signals = 0;
>      int child_slot;
>      apr_exit_why_e exitwhy;
>      int status, processed_status;
> @@ -3460,11 +3461,31 @@ static void server_main_loop(int remaini
>              /* Don't perform idle maintenance when a child dies,
>               * only do it when there's a timeout.  Remember only a
>               * finite number of children can die, and it's pretty
> -             * pathological for a lot to die suddenly.
> +             * pathological for a lot to die suddenly.  If that happens
> +             * anyway, protect against fork()+kill() flood by not restarting
> +             * more than 3 children if no timeout happened in between,
> +             * otherwise we keep going with idle maintenance.
>               */
> -            continue;
> +            if (child_slot < 0 || !APR_PROC_CHECK_SIGNALED(exitwhy)) {
> +                continue;
> +            }
> +            if (++successive_signals >= 3) {
> +                if (successive_signals % 10 == 3) {
> +                    ap_log_error(APLOG_MARK, APLOG_WARNING, 0,
> +                                 ap_server_conf, APLOGNO(10392)
> +                                 "children are killed successively!");
> +                }
> +                apr_sleep(apr_time_from_sec(1));

Why do we need the sleep here? Why can't we just do continue?

> +            }
> +            else {
> +                ++remaining_children_to_start;
> +            }
> +        }
> +        else {
> +            successive_signals = 0;
>          }
> -        else if (remaining_children_to_start) {
> +
> +        if (remaining_children_to_start) {
>              /* we hit a 1 second timeout in which none of the previous
>               * generation of children needed to be reaped... so assume
>               * they're all done, and pick up the slack if any is left.
> 
> 
> 

Regards

RĂ¼diger