Posted to users@httpd.apache.org by Tyler Morgan <ty...@xmission.com> on 2003/10/02 03:03:56 UTC

[users@httpd] Question about some strace output -- Is this normal?

Hello,

I am having some problems with runaway apache child processes (1 or 2 at a
time) that consume 100% cpu and eventually get killed after consuming 100%
of memory (all swap too). This is on an i386 Debian/testing machine running
kernel 2.4.21-grsec with apache 1.3.27.0-2 (.deb distro) on typical hardware:
2.4 GHz P4, Intel 82845G/GL [Brookdale-G] chipset, PC2700 DDR, and IDE
Seagate drives.

Anywhere from 30 seconds to 12 hours after starting apache, 1 or 2 processes
will begin to consume tons of resources and bring the machine to its
knees. As far as I can tell from the apache and system logs, no specific
event triggers this.

So far I have:
- Compiled 1.3.28 from source and used that instead of the deb from testing
- Confirmed md5 checksums of apache binaries and linked libraries against an
  identical (not identical in hardware, just software) problem-free machine
- Reverted to a 100% default configuration for testing purposes (minus some
  prefix stuff)
- Replaced all hardware in the machine (we're in an environment with
  hundreds of identical machines, so hardware is easy to swap around)
- Tested under both 2.4.20 and 2.4.22 without grsec patches
- Watched the server-status page before and during, always shows a couple
  dozen connections (about average) with no abnormal requests

When the problem is occurring, top looks like this:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18950 www-data  16   0 1367m 856m 1732 R 75.2 85.7   1:54.47 apache

Today I started looking at strace output of pids that are runaway, and I
see this happening over and over hundreds (thousands?) of times:

<-- strace -->
# strace -p 18950
mremap(0x4a778000, 1638899712, 1638899712, MREMAP_MAYMOVE) = 0x4a778000
open("/etc/passwd", O_RDONLY)           = 4
fcntl64(4, F_GETFD)                     = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
_llseek(4, 0, [0], SEEK_CUR)            = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=4165, ...}) = 0
mmap2(NULL, 4165, PROT_READ, MAP_SHARED, 4, 0) = 0x46437000
_llseek(4, 4165, [4165], SEEK_SET)      = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=4165, ...}) = 0
munmap(0x46437000, 4165)                = 0
close(4)                                = 0
<-- end strace -->

That repeats over and over.
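
To put a number on "over and over", a saved trace can be tallied by syscall
and by opened path. The following is only a minimal Python sketch: the names
SAMPLE and summarize are made up for illustration, and the regular
expression assumes the simple one-call-per-line strace format shown above
(no timestamps, no "<unfinished ...>" lines). It also records the largest
size passed to mremap, which in the trace above is about 1.5 GiB.

```python
import re
from collections import Counter

# Matches a completed strace line such as: open("/etc/passwd", O_RDONLY) = 4
SYSCALL_RE = re.compile(r'^(\w+)\((.*)\)\s+=\s+(\S+)')
# First argument of open() is a quoted path
OPEN_PATH_RE = re.compile(r'^"([^"]*)"')

# A few lines copied from the trace in this post (illustrative input only)
SAMPLE = '''\
mremap(0x4a778000, 1638899712, 1638899712, MREMAP_MAYMOVE) = 0x4a778000
open("/etc/passwd", O_RDONLY)           = 4
fcntl64(4, F_GETFD)                     = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
close(4)                                = 0
+++ killed by SIGKILL +++
'''

def summarize(lines):
    """Return (syscall counts, opened-path counts, largest mremap new size)."""
    syscalls = Counter()
    opened = Counter()
    largest_mremap = 0
    for line in lines:
        m = SYSCALL_RE.match(line.strip())
        if not m:
            continue  # skip exit notices and anything that is not a call
        name, args, _ret = m.groups()
        syscalls[name] += 1
        if name == "open":
            p = OPEN_PATH_RE.match(args)
            if p:
                opened[p.group(1)] += 1
        elif name == "mremap":
            # mremap(old_address, old_size, new_size, flags)
            parts = [a.strip() for a in args.split(",")]
            largest_mremap = max(largest_mremap, int(parts[2]))
    return syscalls, opened, largest_mremap

if __name__ == "__main__":
    calls, paths, biggest = summarize(SAMPLE.splitlines())
    print(calls.most_common())
    print(paths.most_common())
    print("largest mremap size: %.1f MiB" % (biggest / 2**20))
```

To run it on a real trace, capture one with "strace -p PID -o trace.txt"
(trace.txt is a placeholder name) and pass open("trace.txt") to summarize
instead of the sample.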

Eventually I get:
...
_llseek(4, 0, [0], SEEK_CUR)            = 0
+++ killed by SIGKILL +++
<dead>

And dmesg shows:
Out of Memory: Killed process 18950 (apache).

Neither error.log nor syslog shows anything relevant before, during, or
after this sequence of events.

My most important question is: Why is apache trying to open /etc/passwd
hundreds of times per minute? Is this normal? I straced apache on another
problem-free machine and did not notice /etc/passwd being opened. Perhaps I
am misinterpreting the strace (not a serious programmer) and something else
is happening?
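
For context on what kind of code reads /etc/passwd at all: on a system that
resolves users from local files, and with no lookup cache such as nscd in
front, a C library user lookup (getpwnam, getpwuid and friends) typically
opens and reads /etc/passwd on every call. Whether that is what the runaway
process is doing is an assumption, not something the trace proves. The
Python sketch below only shows the general effect; lookup_user_name and
repeated_lookups are illustrative names, and Python's pwd module is used
here as a thin wrapper over the same C library lookups.

```python
import os
import pwd

def lookup_user_name(uid):
    """Resolve a numeric uid to a login name, or None if there is no entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None

def repeated_lookups(uid, count):
    """Do the same lookup `count` times, as a loop in a server might."""
    return [lookup_user_name(uid) for _ in range(count)]

if __name__ == "__main__":
    # Assumption: run under strace with file-opening calls traced to see
    # whether each lookup opens /etc/passwd on this particular system.
    print(repeated_lookups(os.getuid(), 5))
```

If a loop like this runs without ever finishing, the trace would show the
passwd file being opened and closed continuously, which is at least
consistent with the pattern above.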

And, of course, any ideas as to the cause of these runaway pids are greatly
appreciated, because I'm just about all out of them :)

Thanks for reading,
Tyler Morgan

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org