Posted to users@httpd.apache.org by "Daniel A. Ramaley" <da...@DRAKE.EDU> on 2006/08/07 17:50:26 UTC

[users@httpd] Segfault & MaxClients problems [long]

Hello. I have been struggling for a few months to solve a problem with 
an Apache web server. First i'll try to describe the symptoms, then 
give details about the configuration and what i have tried so far. I 
would greatly appreciate any suggestions on how to diagnose and correct 
the problem(s). I suspect there are basically 2 issues, one with 
segmentation faults that cause core dumps, the other with MaxClients 
being reached even if the server is not under heavy load. I don't know 
if the problems are distinct or related.


Symptom 1:

After starting Apache during periods of high load, the server appears to
run fine for a while ("a while" is highly variable, but on the order of
90 minutes), then starts returning empty pages. When the empty pages
start, Apache's error_log fills with lines like this:

[Fri Aug 04 14:18:29 2006] [notice] child pid 16021 exit signal 
Segmentation fault (11), possible coredump in /etc/httpd/core

These errors are very intermittent (occurring every few hours to every
few weeks) during periods of light load. I have had CoreDumpDirectory
defined for a while and have amassed quite a collection of core files.
These are the first few lines of what gdb reports for one of the dumps:

(gdb) bt
#0  0x0000002a99ff2492 in preg_replace_impl (ht=Variable "ht" is not 
available.)
    at /usr/src/redhat/BUILD/php-4.3.9/ext/pcre/php_pcre.c:1154
#1  0x0000002a9a0ac255 in execute (op_array=0x552afe2798)
    at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1640
#2  0x0000002a9a0a9386 in execute (op_array=0x552aff7fb8)
    at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684
#3  0x0000002a9a0a9386 in execute (op_array=0x552b0e26c8)
    at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684
#4  0x0000002a9a0a9386 in execute (op_array=0x552af80db8)
    at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684

Symptom 2:

At other times, messages about MaxClients show up. I don't believe the
server is actually running out of processes to serve requests: even
during the academic year, when it is very busy, requests are handled
quickly enough that there are seldom more than a dozen or two in
progress at any moment.

[Fri Aug 04 02:30:02 2006] [error] server reached MaxClients setting, 
consider raising the MaxClients setting

For reference, the section of httpd.conf that controls process numbers 
looks like this:
    StartServers       8
    MinSpareServers    5
    MaxSpareServers   20
    ServerLimit      256
    MaxClients       128
    MaxRequestsPerChild  1024

I have configured the server-status handler and set a log monitoring 
daemon to grab the status page from Apache when MaxClients is mentioned 
in error_log. During those times the scoreboard looks like this:

    119 processes closing connections
      2 processes reading requests
      4 processes sending replies
      3 processes waiting for connections

During normal operation i consulted the scoreboard and got the following 
as typical results when everything is fine:
      2 closing connections
     11 reading requests
      3 sending replies
      9 waiting for connections
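
For anyone who wants to collect the same tallies automatically, here is
a minimal sketch in Python. It assumes the monitoring daemon saves the
machine-readable status output (the "?auto" form of the server-status
page), which contains a "Scoreboard:" line made of one character per
slot. The function names are made up for illustration and are not part
of Apache. Note that the four counts reported when MaxClients was hit
(119 + 2 + 4 + 3) add up to exactly 128, the configured MaxClients.

```python
from collections import Counter

# Scoreboard key as printed on the mod_status page.
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def extract_scoreboard(auto_status_text):
    """Return the scoreboard string from '?auto' output, or '' if absent."""
    for line in auto_status_text.splitlines():
        if line.startswith("Scoreboard:"):
            return line.split(":", 1)[1].strip()
    return ""


def tally_scoreboard(scoreboard):
    """Map each state description to the number of slots in that state."""
    counts = Counter(scoreboard)
    return {
        SCOREBOARD_KEY.get(ch, "unknown (%s)" % ch): n
        for ch, n in counts.most_common()
    }


if __name__ == "__main__":
    # Sample rebuilt from the counts reported above, not a real capture.
    sample = "Scoreboard: " + "C" * 119 + "R" * 2 + "W" * 4 + "_" * 3
    for state, n in tally_scoreboard(extract_scoreboard(sample)).items():
        print("%4d %s" % (n, state))
```

Logging these tallies at a regular interval, rather than only when the
MaxClients message appears, would show whether the closing connections
build up gradually or arrive in a burst.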

Symptom 3:

A further symptom of problems is found in PostgreSQL's logs. The web
applications the server runs rely on PostgreSQL, which logs messages
like this several times per minute:

Aug  7 10:07:29 sun12 postgres[4479]: [1-1] LOG:  unexpected EOF on 
client connection

I believe that error occurs whenever an Apache child process dies
without properly closing its PostgreSQL connection. PostgreSQL is
configured to accept 256 connections, twice the number of children
Apache should spawn.

Symptom 4:

I don't know if this is important, but when Apache's LogLevel is cranked
up to "debug", entries like this appear in the error_log at a rate of
several per minute:

[Fri Jun 09 13:56:17 2006] [debug] util_ldap.c(1441): INIT global 
mutex /tmp/filessX9mx in child 25783 


Configuration:

The server is a Sun v40z, which is a dual Opteron box with 2 GB RAM. It 
is running Red Hat Enterprise Linux AS 4.3 64-bit, with Apache 2.0.52,
PHP 4.3.9, and eAccelerator 0.9.4. The server's purpose is to run Horde 
and Imp, with the latest versions of those packages and a few other 
Horde components (Ingo, Passwd, and Turba).

Apache is installed with Red Hat's RPMs. PHP is compiled from Red Hat's
source RPMs; i added mcrypt support, which is not included by default
(but is needed to improve the performance of Horde).

Below is a list of things i have tried that did not work. After each 
test that failed to provide results i reverted to the previous 
configuration.

1) Lowering MaxRequestsPerChild to 64. This increased the error rate.

2) Using Red Hat's RPMs even though they don't have mcrypt support. This 
did nothing.

3) Upgrading PHP to the latest 4.4.x (x==2 at the time i tried that), 
compiled from source. This had no effect.

4) Removing eAccelerator. The problems remained, and the server
responded to requests a bit more slowly (as would be expected).

5) Running with a uniprocessor kernel. No effect on the errors.

6) Switching Apache and PHP to 32-bit while the rest of the system
stayed 64-bit. This also had no effect.

Over the last few months i have exchanged many messages on the Horde and 
Imp mailing lists trying to track down the problems. The consensus 
seems to be that Horde is not the problem, hence i turn to this list 
for help instead. If you can provide any assistance, i will be very 
grateful. Thank you in advance for your time and for reading this long 
message.

------------------------------------------------------------------------
Dan Ramaley                            Dial Center 118, Drake University
Network Programmer/Analyst             2407 Carpenter Ave
+1 515 271-4540                        Des Moines IA 50311 USA

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

Re: [users@httpd] Segfault & MaxClients problems [long]

Posted by Joshua Slive <jo...@slive.ca>.

On 8/7/06, Daniel A. Ramaley <da...@drake.edu> wrote:

> Roughly half of the connections were from IPs that had less than 5
> connections open (most had just 1). I would think those are normal
> users. The rest of the connections were taken up by half a dozen IPs
> with this number of connections open each: 5, 5, 8, 9, 12, 24. Needing
> 24 connections seems suspicious to me. But those connections were used
> to grab images that make up the design of the web site. Perhaps it is a
> user with a buggy web browser? Is there a way to tell Apache not to
> leave connections in the closing state for so long, and just to hurry
> up and close them?

Lowering the Timeout directive might help.

Joshua.


Re: [users@httpd] Segfault & MaxClients problems [long]

Posted by "Daniel A. Ramaley" <da...@DRAKE.EDU>.

On Monday 07 August 2006 11:05, Joshua Slive wrote:
>That is clearly a programming error either in PHP or PCRE.  You might
>want to try PHP forums or the PHP bug database to see if you can find
>the solution.

Thank you for taking the time to read my message and post a response. 
I'll check the PHP forums and mailing lists.

>This problem seems unrelated to the first.  It does seem unusual to
>have all those processes in the closing phase.  When you look at the
>full server-status output, do many of the lines come from the same
>IP?  This would be an indication of a (deliberate or accidental)
>denial of service attack.  Otherwise, you could look at commonality
>between the types of requests they are processing.

Roughly half of the connections were from IPs that had less than 5 
connections open (most had just 1). I would think those are normal 
users. The rest of the connections were taken up by half a dozen IPs 
with this number of connections open each: 5, 5, 8, 9, 12, 24. Needing 
24 connections seems suspicious to me. But those connections were used 
to grab images that make up the design of the web site. Perhaps it is a 
user with a buggy web browser? Is there a way to tell Apache not to 
leave connections in the closing state for so long, and just to hurry 
up and close them?

------------------------------------------------------------------------
Dan Ramaley                            Dial Center 118, Drake University
Network Programmer/Analyst             2407 Carpenter Ave
+1 515 271-4540                        Des Moines IA 50311 USA


Re: [users@httpd] Segfault & MaxClients problems [long]

Posted by Joshua Slive <jo...@slive.ca>.

On 8/7/06, Daniel A. Ramaley <da...@drake.edu> wrote:

> Symptom 1:
>
> After starting Apache during periods of high load, the server appears to
> run fine for a while ("a while" is highly variable, but on the order of
> 90 minutes), then starts returning empty pages. When the empty pages
> start, Apache's error_log fills with lines like this:
>
> [Fri Aug 04 14:18:29 2006] [notice] child pid 16021 exit signal
> Segmentation fault (11), possible coredump in /etc/httpd/core
>
> These errors are very intermittent (occurring every few hours to every
> few weeks) during periods of light load. I have had CoreDumpDirectory
> defined for a while and have amassed quite a collection of core files.
> These are the first few lines of what gdb reports for one of the dumps:
>
> (gdb) bt
> #0  0x0000002a99ff2492 in preg_replace_impl (ht=Variable "ht" is not
> available.)
>     at /usr/src/redhat/BUILD/php-4.3.9/ext/pcre/php_pcre.c:1154
> #1  0x0000002a9a0ac255 in execute (op_array=0x552afe2798)
>     at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1640
> #2  0x0000002a9a0a9386 in execute (op_array=0x552aff7fb8)
>     at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684
> #3  0x0000002a9a0a9386 in execute (op_array=0x552b0e26c8)
>     at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684
> #4  0x0000002a9a0a9386 in execute (op_array=0x552af80db8)
>     at /usr/src/redhat/BUILD/php-4.3.9/Zend/zend_execute.c:1684

That is clearly a programming error either in PHP or PCRE.  You might
want to try PHP forums or the PHP bug database to see if you can find
the solution.


>
> Symptom 2:
>
> At other times messages about MaxClients show up. I don't believe the
> server is actually running out of processes to serve requests since
> even during the academic year when it is very busy requests are handled
> quickly enough that there are seldom more than a dozen or two in
> progress at any moment.
>
> [Fri Aug 04 02:30:02 2006] [error] server reached MaxClients setting,
> consider raising the MaxClients setting

>     119 processes closing connections
>       2 processes reading requests
>       4 processes sending replies
>       3 processes waiting for connections

This problem seems unrelated to the first.  It does seem unusual to
have all those processes in the closing phase.  When you look at the
full server-status output, do many of the lines come from the same
IP?  This would be an indication of a (deliberate or accidental)
denial of service attack.  Otherwise, you could look at commonality
between the types of requests they are processing.
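
One way to run the per-IP check is sketched below in Python. It assumes
the client address and mode of each slot have already been pulled out of
the full server-status table into (client, mode) pairs; that extraction
step is not shown, and the function names and addresses are hypothetical
placeholders.

```python
from collections import Counter


def connections_per_client(rows, mode="C"):
    """Count slots in the given mode per client address, busiest first.

    rows is an iterable of (client_address, mode) pairs; mode uses the
    scoreboard letters, so "C" selects slots that are closing.
    """
    counts = Counter(client for client, m in rows if m == mode)
    return counts.most_common()


def share_of_top_clients(per_client, top_n):
    """Fraction of the counted connections held by the top_n clients."""
    total = sum(n for _, n in per_client)
    if total == 0:
        return 0.0
    return sum(n for _, n in per_client[:top_n]) / total


if __name__ == "__main__":
    # Made-up rows using documentation-only addresses (192.0.2.0/24).
    rows = (
        [("192.0.2.10", "C")] * 3
        + [("192.0.2.20", "C")]
        + [("192.0.2.10", "W"), ("192.0.2.30", "K")]
    )
    per_client = connections_per_client(rows)
    for client, n in per_client:
        print("%3d %s" % (n, client))
    print("top client share: %.2f" % share_of_top_clients(per_client, 1))
```

If a handful of addresses hold most of the closing slots, that points
toward a single misbehaving client or an attack; an even spread points
toward something on the server side.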

Joshua.
