You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Arkadiusz Miśkiewicz <ar...@maven.pl> on 2013/04/12 09:22:07 UTC

Re: 2.4.4 graceful restart taking long time [PATCH]

On Thursday 11 of April 2013, Arkadiusz Miśkiewicz wrote:
> Hello,
> 
> On apache 2.2.22, 2.2.23 and 2.4.4 I'm able to reproduce a problem
> where graceful restart takes very long time. Linux 3.7.10, glibc 2.17 here.

Summary of my findings

1) main process on gracefull restart calls 
/* kill off the idle ones */
ap_mpm_pod_killpg(pod, retained->max_daemons_limit);

and that does dummy_connection() for each pod. This works fast
(1-2s) in most cases where childrens are alive but not in this
case - children exit earlier.

2) children exit in 
        else if (retained->my_generation !=
                 ap_scoreboard_image->global->running_generation) { /* 
restart? */
            /* yeah, this could be non-graceful restart, in which case the
             * parent will kill us soon enough, but why bother checking?
             */
            die_now = 1;
        }

3) in my case main process works slowly enough that it doesn't finish
doing ap_mpm_pod_killpg before all children exit like in 2)

4) so we are ending up in situation where main process is in middle
of ap_mpm_pod_killpg while all children already exited in 2). Main
process is continuing to do dummy_connection and these connect()s
+ dummy data send, polling is taking very long time (like 1-2s for each
dummy_connection). This is the primary reason why graceful is painfully
slow here (1-5 minutes) sometimes.

5) why connect() and socket sending in dummy_connect succeeds?
If it failed then ap_mpm_pod_killpg loop would break and things would
be fast.

Because when all children exited there is main process that still
holds listening socket and connection go to this socket. Verified that
by doing external telnet connection to the same IP/port dummy_connection
is doing - telnet connect() succeeded and telnet waits until next
generation comes up. This is why connect()/apr_socket_send()
in dummy_connect() dont fail.

6) possible solution - detect that sending dummy_connection data weren't
processed by children by trying to read reply

diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c
--- httpd-2.4.4.org/server/mpm_unix.c	2012-07-03 21:38:58.000000000 +0200
+++ httpd-2.4.4/server/mpm_unix.c	2013-04-12 09:14:58.282929959 +0200
@@ -604,7 +604,17 @@
         len = strlen(data);
     }
 
-    apr_socket_send(sock, data, &len);
+    rv = apr_socket_send(sock, data, &len);
+    if (rv == APR_SUCCESS) {
+        char *buffer[10]; // dummy buffer to see if child is alive
+
+        len = sizeof(buffer);
+        rv = apr_socket_recv(sock, buffer, &len);
+        if (rv != APR_SUCCESS) {
+            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf, APLOGNO(00057)
+                         "receiving response to dummy data to listener on %pI", lp->bind_addr);
+        }
+    }
     apr_socket_close(sock);
     apr_pool_destroy(p);
 

without patch (40s):
[Fri Apr 12 09:08:40.785226 2013] [mpm_prefork:notice] [pid 6790] AH00171: Graceful restart requested, doing restart
[Fri Apr 12 09:09:20.968224 2013] [mpm_prefork:notice] [pid 6790] AH00163: Apache/2.4.4 (Unix) configured -- resuming normal operations

with patch (3s):
[Fri Apr 12 09:10:01.902779 2013] [mpm_prefork:notice] [pid 7532] AH00171: Graceful restart requested, doing restart
[Fri Apr 12 09:10:04.973715 2013] [mpm_prefork:notice] [pid 7532] AH00163: Apache/2.4.4 (Unix) configured -- resuming normal operations

strace with the patch:
http://ixion.pld-linux.org/~arekm/apache4.txt

7) How to reproduce?
Main process need to be slowed down, so all children have a chance to exit
in 2) before 1) finishes.

In my case "strace -ttT -s 300 -p pido-of-main-apache-process" is enough.

Then run ab -n 100000 -c 100 http://apache.ip and in parallel issue graceful restart before ab finishes.

Look at error_log if graceful time was longer than it should be. Strace should show
dummy_connection connects(), polling etc.

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl

Re: 2.4.4 graceful restart taking long time [PATCH]

Posted by Jeff Trawick <tr...@gmail.com>.
On Fri, Apr 12, 2013 at 7:36 AM, Arkadiusz Miśkiewicz <ar...@maven.pl>wrote:

> On Friday 12 of April 2013, Eric Covener wrote:
> > > diff -ur httpd-2.4.4.org/server/mpm_unix.chttpd-2.4.4/server/mpm_unix.c
> > > --- httpd-2.4.4.org/server/mpm_unix.c   2012-07-03 21:38:58.000000000
> > > +0200 +++ httpd-2.4.4/server/mpm_unix.c       2013-04-12
> > > 09:14:58.282929959 +0200 @@ -604,7 +604,17 @@
> > >
> > >          len = strlen(data);
> > >
> > >      }
> > >
> > > -    apr_socket_send(sock, data, &len);
> > > +    rv = apr_socket_send(sock, data, &len);
> > > +    if (rv == APR_SUCCESS) {
> > > +        char *buffer[10]; // dummy buffer to see if child is alive
> > > +
> > > +        len = sizeof(buffer);
> > > +        rv = apr_socket_recv(sock, buffer, &len);
> > > +        if (rv != APR_SUCCESS) {
> > > +            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf,
> > > APLOGNO(00057) +                         "receiving response to dummy
> > > data to listener on %pI", lp->bind_addr); +        }
> > > +    }
> > >
> > >      apr_socket_close(sock);
> > >      apr_pool_destroy(p);
>
> End of this is:
>
>     apr_socket_close(sock);
>     apr_pool_destroy(p);
>
>     return rv;
> >
> > Interesting analysis, I'm confused on how the patch helps though if it
> > doesn't actually do anything different when there is no reply (which
> > is more what I expected from the bullets)
>
> It returns error in rv


The implication of the recv was quite confusing to me initially when I
reviewed this again.  Rüdiger had reviewed an strace and determined that
connect() was taking longer and longer over time (presumably as the queue
in the kernel built up).  So with a timeout of 3 seconds you might have
connects that took increasing amounts of time all under 3 seconds that
could add up to extremely long delays.  Add the recv call and you cap the
total delay at essentially 3 seconds since getting connect+recv to work
within the timeout forces there to be a non-zombie, non-busy child on the
other end.

I don't see that Jim's proposal for the 2.4 branch matches Arkadiusz's
exact scenario/PR or suggested fix though I do understand that it could
quicken graceful restart for some scenarios.  I'll remove my note from
2.4.x/STATUS.


> which is propagated up to ap_mpm_pod_killpg
> which does
>     for (i = 0; i < num && rv == APR_SUCCESS; i++) {
>         rv = dummy_connection(pod);
>     }
>
> and on error will break the loop and stop doing dummy connections as there
> are
> no children anyway. graceful restart then can proceed without a delay.
>
> --
> Arkadiusz Miśkiewicz, arekm / maven.pl
>



-- 
Born in Roswell... married an alien...
http://emptyhammock.com/

Re: 2.4.4 graceful restart taking long time [PATCH]

Posted by Arkadiusz Miśkiewicz <ar...@maven.pl>.
On Friday 12 of April 2013, Eric Covener wrote:
> > diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c
> > --- httpd-2.4.4.org/server/mpm_unix.c   2012-07-03 21:38:58.000000000
> > +0200 +++ httpd-2.4.4/server/mpm_unix.c       2013-04-12
> > 09:14:58.282929959 +0200 @@ -604,7 +604,17 @@
> > 
> >          len = strlen(data);
> >      
> >      }
> > 
> > -    apr_socket_send(sock, data, &len);
> > +    rv = apr_socket_send(sock, data, &len);
> > +    if (rv == APR_SUCCESS) {
> > +        char *buffer[10]; // dummy buffer to see if child is alive
> > +
> > +        len = sizeof(buffer);
> > +        rv = apr_socket_recv(sock, buffer, &len);
> > +        if (rv != APR_SUCCESS) {
> > +            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf,
> > APLOGNO(00057) +                         "receiving response to dummy
> > data to listener on %pI", lp->bind_addr); +        }
> > +    }
> > 
> >      apr_socket_close(sock);
> >      apr_pool_destroy(p);

End of this is:

    apr_socket_close(sock);
    apr_pool_destroy(p);

    return rv;
> 
> Interesting analysis, I'm confused on how the patch helps though if it
> doesn't actually do anything different when there is no reply (which
> is more what I expected from the bullets)

It returns error in rv which is propagated up to ap_mpm_pod_killpg
which does
    for (i = 0; i < num && rv == APR_SUCCESS; i++) {
        rv = dummy_connection(pod);
    }

and on error will break the loop and stop doing dummy connections as there are 
no children anyway. graceful restart then can proceed without a delay.

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl

Re: 2.4.4 graceful restart taking long time [PATCH]

Posted by Eric Covener <co...@gmail.com>.
> diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c
> --- httpd-2.4.4.org/server/mpm_unix.c   2012-07-03 21:38:58.000000000 +0200
> +++ httpd-2.4.4/server/mpm_unix.c       2013-04-12 09:14:58.282929959 +0200
> @@ -604,7 +604,17 @@
>          len = strlen(data);
>      }
>
> -    apr_socket_send(sock, data, &len);
> +    rv = apr_socket_send(sock, data, &len);
> +    if (rv == APR_SUCCESS) {
> +        char *buffer[10]; // dummy buffer to see if child is alive
> +
> +        len = sizeof(buffer);
> +        rv = apr_socket_recv(sock, buffer, &len);
> +        if (rv != APR_SUCCESS) {
> +            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf, APLOGNO(00057)
> +                         "receiving response to dummy data to listener on %pI", lp->bind_addr);
> +        }
> +    }
>      apr_socket_close(sock);
>      apr_pool_destroy(p);

Interesting analysis, I'm confused on how the patch helps though if it
doesn't actually do anything different when there is no reply (which
is more what I expected from the bullets)

Re: 2.4.4 graceful restart taking long time [PATCH]

Posted by Arkadiusz Miśkiewicz <ar...@maven.pl>.
On Friday 12 of April 2013, Arkadiusz Miśkiewicz wrote:

> 6) possible solution - detect that sending dummy_connection data weren't
> processed by children by trying to read reply
> 
> diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c
> --- httpd-2.4.4.org/server/mpm_unix.c	2012-07-03 21:38:58.000000000 +0200
> +++ httpd-2.4.4/server/mpm_unix.c	2013-04-12 09:14:58.282929959 +0200
> @@ -604,7 +604,17 @@
>          len = strlen(data);
>      }
> 
> -    apr_socket_send(sock, data, &len);
> +    rv = apr_socket_send(sock, data, &len);
> +    if (rv == APR_SUCCESS) {
> +        char *buffer[10]; // dummy buffer to see if child is alive
> +
> +        len = sizeof(buffer);
> +        rv = apr_socket_recv(sock, buffer, &len);
> +        if (rv != APR_SUCCESS) {
> +            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf,
> APLOGNO(00057) +                         "receiving response to dummy data
> to listener on %pI", lp->bind_addr); +        }
> +    }
>      apr_socket_close(sock);
>      apr_pool_destroy(p);

Not sure how to deal with SSL dummy connection though. With SSL children sends 
no reply in response to tls10_close_notify.

apr_socket_recv method works only for non-SSL connections.

Maybe there is another way to check which children are alive at 
dummy_connection level (and avoiding sending anything if there are no 
children) ?

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl