You are viewing a plain text version of this content. The canonical link for it is here.
Posted to bugs@httpd.apache.org by bu...@apache.org on 2012/07/21 01:20:48 UTC

[Bug 53579] New: httpd looping writing on a pipe and unresponsive (httpd )

https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

          Priority: P2
            Bug ID: 53579
          Assignee: bugs@httpd.apache.org
           Summary: httpd looping writing on a pipe and unresponsive
                    (httpd <defunct>)
          Severity: critical
    Classification: Unclassified
          Reporter: yrd_us@yahoo.com
          Hardware: PC
            Status: NEW
           Version: 2.2.22
         Component: mpm_prefork
           Product: Apache httpd-2

httpd shows 3 defunct children and is unable to serve any further requets. A
lelnet connection to port 80 is successful though.

Version output 

httpd -V
Server version: Apache/2.2.22 (Unix)
Server built:   Jun 27 2012 23:19:33
Server's Module Magic Number: 20051115:30
Server loaded:  APR 1.4.5, APR-Util 1.4.1
Compiled using: APR 1.4.5, APR-Util 1.4.1
Architecture:   64-bit
Server MPM:     Prefork
  threaded:     no
    forked:     yes (variable process count)
Server compiled with....
 -D APACHE_MPM_DIR="server/mpm/prefork"
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128
 -D HTTPD_ROOT="/opt/tms"
 -D SUEXEC_BIN="/opt/tms/bin/suexec"
 -D DEFAULT_PIDLOG="logs/httpd.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_LOCKFILE="logs/accept.lock"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="conf/mime.types"
 -D SERVER_CONFIG_FILE="conf/httpd.conf"



httpd -l
Compiled in modules:
  core.c
  mod_authn_file.c
  mod_authn_default.c
  mod_authz_host.c
  mod_authz_groupfile.c
  mod_authz_user.c
  mod_authz_default.c
  mod_auth_basic.c
  mod_filter.c
  mod_deflate.c
  mod_log_config.c
  mod_env.c
  mod_headers.c
  mod_version.c
  mod_ssl.c
  prefork.c
  http_core.c
  mod_mime.c
  mod_autoindex.c
  mod_cgi.c
  mod_dir.c
  mod_actions.c
  mod_alias.c
  mod_rewrite.c
  mod_so.c

Config file snippet

Timeout 3600
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15

StartServers         1
MinSpareServers      1
MaxSpareServers      2
MaxClients          16
MaxRequestsPerChild  0

Listen 80
Listen 443



Following is the output of netstat and status of the httpd

# netstat -nt | grep ":80"
tcp      296      0 ::ffff:10.0.1.233:80        ::ffff:10.18.3.14:62429    
CLOSE_WAIT
tcp      287      0 ::ffff:10.0.1.233:80        ::ffff:10.18.0.233:51590   
CLOSE_WAIT
tcp      304      0 ::ffff:10.0.1.233:80        ::ffff:10.18.0.233:51284   
CLOSE_WAIT
tcp      304      0 ::ffff:10.0.1.233:80        ::ffff:10.18.0.233:51576   
CLOSE_WAIT
tcp      296      0 ::ffff:10.0.1.233:80        ::ffff:10.18.3.14:58146    
CLOSE_WAIT
tcp      287      0 ::ffff:10.0.1.233:80        ::ffff:10.18.0.233:51783   
CLOSE_WAIT
tcp      287      0 ::ffff:10.0.1.233:80        ::ffff:10.18.0.233:51761   
CLOSE_WAIT

# ps -ef | grep httpd
admin    11463 19173  0 21:53 pts/0    00:00:00 grep httpd
admin    12084 24519  0 Jun29 ?        00:00:19 /usr/sbin/httpd -D NO_DETACH -f
/etc/opt/tms/output/httpd.conf
apache   21089 12084  0 Jul05 ?        00:00:00 [httpd] <defunct>
apache   31445 12084  0 Jul05 ?        00:00:00 [httpd] <defunct>
apache   31596 12084  0 Jul05 ?        00:00:00 [httpd] <defunct>

gdb -p 12084 /usr/sbin/httpd shows that the top level process is here:

(gdb) where
#0  0x00007f638884e470 in __write_nocancel () from /lib64/libpthread.so.0
#1  0x0000000000483c87 in apr_file_write ()
#2  0x00000000004273af in ap_mpm_pod_signal ()
#3  0x000000000045f3c3 in ap_mpm_run ()
#4  0x000000000040a664 in main ()

Apparently, it is getting stuck writing something to a file. 
ap_mpm_pod_signal() effectively just called apr_file_write(), where it is
getting stuck.

Using an httpd built with -g , the stack trace from gdb looks like this:

(gdb) where
#0  0x00007f638884e470 in __write_nocancel () from /lib64/libpthread.so.0
#1  0x0000000000483c87 in apr_file_write (thefile=0x19314b0, 
    buf=0x7fffb849709f, nbytes=0x7fffb8497090) at file_io/unix/readwrite.c:188
#2  0x00000000004273af in pod_signal_internal (pod=0x1931420)
    at mpm_common.c:630
#3  ap_mpm_pod_signal (pod=0x1931420) at mpm_common.c:726
#4  0x000000000045f3c3 in perform_idle_server_maintenance (
    _pconf=<value optimized out>, plog=<value optimized out>, 
    s=<value optimized out>) at prefork.c:867
#5  ap_mpm_run (_pconf=<value optimized out>, plog=<value optimized out>, 
    s=<value optimized out>) at prefork.c:1107
#6  0x000000000040a664 in main (argc=5, argv=0x7fffb8497448) at main.c:753

thefile is as follows:

(gdb) frame 1
#1  0x0000000000483c87 in apr_file_write (thefile=0x19314b0, 
    buf=0x7fffb849709f, nbytes=0x7fffb8497090) at file_io/unix/readwrite.c:188
188     file_io/unix/readwrite.c: No such file or directory.
        in file_io/unix/readwrite.c
(gdb) print thefile
$1 = (apr_file_t *) 0x19314b0
(gdb) print *thefile
$2 = {
  pool = 0x18fa138, 
  filedes = 8, 
  fname = 0x0, 
  flags = 0, 
  eof_hit = 0, 
  is_pipe = 1, 
  timeout = -1, 
  buffered = 0, 
  blocking = BLK_ON, 
  ungetchar = 0, 
  buffer = 0x0, 
  bufpos = 0, 
  bufsize = 0, 
  dataRead = 0, 
  direction = 0, 
  filePtr = 0, 
  thlock = 0x0
}
(gdb) 

In other words, it is writing to a pipe (thefile->ispipe == 1).

Note that line 188 of file_io/unix/readwrite.c is this:

187         do {
188             rv = write(thefile->filedes, buf, *nbytes);
189         } while (rv == (apr_size_t)-1 && errno == EINTR);

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

Loic Etienne <lo...@tech.swisssign.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |loic.etienne@tech.swisssign
                   |                            |.com

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #6 from george <gc...@gmail.com> ---
After looking at the source code and doing some more testing, here are the
results:

-This problem kicks in when the pipe-of-death becomes full (64 kbytes on a
Linux system). When this happens, then the httpd parent process is stuck in a
write() call, since the pod is blocking (?!) 
-The pipe-of-death is written to by the httpd parent process, one byte at a
time, in its main loop, as long as the number or httpd child processes exceeds
the "MaxSpareServers" setting in the httpd.conf file.
-The pipe-of death is read by the httpd child processes, in their main loop, as
long as there are HTTP requests sent to the server.
-If the httpd server is idle for a long time (5 days or so), then the child
processes are sleeping, either pending on a semaphore or in a system
epoll_wait() call with a very long timeout (100 days ?!).
-If the number of child processes exceeds the MaxSpareServers setting at the
time the httpd server becomes idle, then with only the httpd parent process
spinning, the pipe-of-death is constantly written to and no one is reading from
it.
-When the parent httpd process writes to the pipe-of-death, it also sends a
"dummy" HTTP request to the child processes, making a TCP connection to the
system loopback interface on the httpd listener port.
-If for some reason the loopback interface and/or listener port are blocked by
the Linux firewall for example, then this "dummy" HTTP request fails and never
wakes the child processes up.
-When, after the pipe-of-death becomes full, the first external HTTP request
comes in, the httpd child processes wake up, process the requests, read the
pipe-of-death and go <defunct>.
-The parent httpd process cannot collect the <defunct> child processes, since
it it blocked in the write() system call.

In conclusion this problem is caused by a combination of:

internal httpd design:
-pipe-of-death blocks in write().
-child httpd epoll_wait() call does not timeout.

system problems:
-httpd listener port on the loopback interface is blocked.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

yd <yr...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P1

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

yd <yr...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |blocker

--- Comment #4 from yd <yr...@yahoo.com> ---
Could reproduce it again. Basically it is writing to the pod (pipe of death)
and is stuck in it forever. All the child processes are in defunct state. httpd
is no more able to server any further requests.. Can someone please respond. I
am increasing the severity to blocker as it stops serving any requests and is
practically not usable.

Please let me know if any more info is needed as I still have the system in
this state.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #8 from Loic Etienne <lo...@tech.swisssign.com> ---
We observed that some processes try to read from the pipe of death from the
file descriptor 0 (instead of 7 on our system, from which other processes read
the pipe of death).

dummy request read from file descriptor 0
dummy request fails (no ssl)
file descriptor 0 closed
read from file descriptor 0 (instead of pipe of death), error bad file
descriptor
loop to the next connection

Note that the correct file descriptor 7 is open in the child process in
question, but the variable supposed to store it contains 0 instead of 7.

We also observed that all our processes are gradually affected by this problem,
but we have no hypothesis what event triggers this dysfunction.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

santhosh jacob <sa...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |santhoshmukkadan@gmail.com
           Severity|blocker                     |critical

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

Arkadiusz Miskiewicz <ar...@maven.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arekm@maven.pl

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #2 from yd <yr...@yahoo.com> ---
Not very easy to reproduce, happens after around 48 hours of normal testing.
The error log has the following errors.

connect to listener on [::]:443
[Thu Jul 05 00:00:01 2012] [warn] (101)Network is unreachable: connect to
listener on [::]:443
[Thu Jul 05 00:00:01 2012] [warn] (101)Network is unreachable: connect to
listener on [::]:443
[Thu Jul 05 00:00:01 2012] [notice] Graceful restart requested, doing restart
[Thu Jul 05 00:00:01 2012] [warn] (101)Network is unreachable: connect to
listener on [::]:443

The Network unreachable message is seen continous for few hours, above are the
last messages in the error log

Following are the last messages in the access log

product.js?v=1a95dd38c7ca8035134db9ac4aa167f6 HTTP/1.1" 304 - (259 us)
10.0.3.45 - - [05/Jul/2012:00:02:46 -0700] "GET
/rollup.css?v=1a95dd38c7ca8035134db9ac4aa167f6 HTTP/1.1" 304 - (91 us)
10.0.3.45 - - [05/Jul/2012:00:02:46 -0700] "GET
/rollup-product.css?v=1a95dd38c7ca8035134db9ac4aa167f6 HTTP/1.1" 304 - (155 us)
10.0.3.45 - - [05/Jul/2012:00:02:47 -0700] "POST /mgmt/xmldata?p=dynamicStatus
HTTP/1.1" 200 138 (10943 us)

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #3 from yd <yr...@yahoo.com> ---
I suspect this has something to do with graceful restart from the logrotate. We
did not see this problem before , but it is more prominent once we started
getting the warning 
[warn] (101)Network is unreachable: connect to listener on [::]:443
which is repeated frequently in the log file.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

Jeff Trawick <tr...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #1 from Jeff Trawick <tr...@apache.org> ---
Hmmm, parent blocked writing to a full pipe with no readers...  Unclear if the
parent needs to use a timeout, or it is writing too many times, or ???

Do you have an easy way to reproduce this?  Does this happen at about the same
time that the error log shows other issues?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #7 from Loic Etienne <lo...@tech.swisssign.com> ---
We could also observe the problem with 2.2.14-5ubuntu8.12

Reading (with cat) from the affected file descriptor (pipe of death) actually
made the parent process stop blocking on write. The affected application was a
test instance, with little traffic. This experimentally confirms the the
analysis above.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


[Bug 53579] httpd looping writing on a pipe and unresponsive (httpd )

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53579

--- Comment #5 from george <gc...@gmail.com> ---
(In reply to comment #4)
> Could reproduce it again. Basically it is writing to the pod (pipe of death)
> and is stuck in it forever. All the child processes are in defunct state.
> httpd is no more able to server any further requests.. Can someone please
> respond. I am increasing the severity to blocker as it stops serving any
> requests and is practically not usable.

Please let me know if any more info
> is needed as I still have the system in this state.

I am bumping into the exact same problem, but I cannot reproduce it.
Could you please let me know how did you reproduce this problem?
Thank you


Backtrace:

(gdb) #0  0xb7cd330e in write () from /lib/libpthread.so.0
(gdb) #1  0x080e1947 in apr_file_write ()
(gdb) #2  0x08080147 in ap_mpm_pod_signal ()
(gdb) #3  0x080b1aa1 in ap_mpm_run ()
(gdb) #4  0x08067ef2 in main ()

All child processes are <defunct>


Server version: Apache/2.2.3
Server built:   Oct  6 2011 18:06:51
Server's Module Magic Number: 20051115:3
Server loaded:  APR 1.2.7, APR-Util 1.2.7
Compiled using: APR 1.2.7, APR-Util 1.2.7
Architecture:   32-bit
Server MPM:     Prefork
  threaded:     no
    forked:     yes (variable process count)
Server compiled with....
 -D APACHE_MPM_DIR="server/mpm/prefork"
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128
 -D HTTPD_ROOT="/"
 -D SUEXEC_BIN="//bin/suexec"
 -D DEFAULT_PIDLOG="/var/run/apache2.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_LOCKFILE="var/run/accept.lock"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="etc/apache/mime.types"
 -D SERVER_CONFIG_FILE="etc/apache/httpd.conf"

Compiled in modules:
  core.c
  mod_authn_file.c
  mod_authn_default.c
  mod_authz_host.c
  mod_authz_groupfile.c
  mod_authz_user.c
  mod_authz_default.c
  mod_auth_basic.c
  mod_include.c
  mod_filter.c
  mod_log_config.c
  mod_env.c
  mod_setenvif.c
  mod_ssl.c
  prefork.c
  http_core.c
  mod_mime.c
  mod_status.c
  mod_autoindex.c
  mod_asis.c
  mod_cgi.c
  mod_negotiation.c
  mod_dir.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_so.c

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org