Posted to bugs@httpd.apache.org by bu...@apache.org on 2002/07/02 23:34:46 UTC
DO NOT REPLY [Bug 10426] New: -
load average high when httpd doing nothing
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=10426>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=10426
Summary: load average high when httpd doing nothing
Product: Apache httpd-2.0
Version: 2.0.39
Platform: PC
OS/Version: Other
Status: NEW
Severity: Normal
Priority: Other
Component: Platform
AssignedTo: bugs@httpd.apache.org
ReportedBy: pab@balancewd.com
On a lightly loaded system, the load average goes up by 1 permanently
for each httpd process that handles any HTTP traffic at all.
When you start the daemon and it forks the minimum number of httpd children,
the load is near zero.
After you fetch any document, such as /index.html, the child httpd process
that served it raises the load average by 1.00 from then on, even with no
other incoming connections on any of the daemons. If you do it again and get
a different child httpd, that one does the same thing and the load is now
around 2.00, and so on.
top and ps do not show any processes using a lot of CPU. The available (idle)
CPU is always 95-100%, which is really weird! If you leave it this way for 12
hours with the load high, the accumulated CPU time per process is very
low (less than 1 minute). If these processes were really using up CPU, they
should each have used at least an hour or two of CPU time!
If you SIGHUP the main httpd, it kills & restarts its children, so the load
drops back down to around 0.00. Same if you completely kill & restart them.
No errors about this go to syslog or to apache's error_log.
The system is completely usable as a web server, and logging in
and typing in a telnet window feels fine
(it doesn't feel like a load of 5.00).
The web server is fully functional, from what we can see; only the load is high.
Operating System: BSDI 4.1 with all patches up to date
Platform: Compaq DL380 (Pentium III)
Running ktrace yields some interesting info: it looks like some
sort of SysV semaphore issue. A ktrace of a "good" child httpd
(before the problem occurs):
11781 httpd CALL sigprocmask(0x3,0)
11781 httpd RET sigprocmask -65809/0xfffefeef
11781 httpd CALL gettimeofday(0x8047570,0)
11781 httpd RET gettimeofday 0
11781 httpd CALL setitimer(0,0x8047568,0)
11781 httpd RET setitimer 0
11781 httpd CALL sigreturn(0)
11781 httpd RET sigreturn JUSTRETURN
11781 httpd PSIG SIGalrm caught handler=0x281c5d20 mask=0x0 code=0x0
11781 httpd CALL sigprocmask(0x3,0)
11781 httpd RET sigprocmask -65809/0xfffefeef
11781 httpd CALL gettimeofday(0x8047570,0)
11781 httpd RET gettimeofday 0
11781 httpd CALL setitimer(0,0x8047568,0)
11781 httpd RET setitimer 0
11781 httpd CALL sigreturn(0)
11781 httpd RET sigreturn JUSTRETURN
11781 httpd PSIG SIGalrm caught handler=0x281c5d20 mask=0x0 code=0x0
And now, after you ask for 1 document and the load goes up,
here's the same ktrace on an httpd child process:
11766 httpd CALL sigprocmask(0x3,0)
11766 httpd RET sigprocmask -65809/0xfffefeef
11766 httpd CALL gettimeofday(0x8047910,0)
11766 httpd RET gettimeofday 0
11766 httpd CALL setitimer(0,0x8047908,0)
11766 httpd RET setitimer 0
11766 httpd CALL sigreturn(0)
11766 httpd RET sigreturn JUSTRETURN
11766 httpd CALL semop(0xd0000,0x280faf1c,0x1)
11766 httpd PSIG SIGalrm caught handler=0x281c5d20 mask=0x0 code=0x0
11766 httpd RET semop -1 errno 4 Interrupted system call
11766 httpd CALL sigprocmask(0x3,0)
11766 httpd RET sigprocmask -65809/0xfffefeef
11766 httpd CALL gettimeofday(0x8047910,0)
11766 httpd RET gettimeofday 0
11766 httpd CALL setitimer(0,0x8047908,0)
11766 httpd RET setitimer 0
11766 httpd CALL sigreturn(0)
11766 httpd RET sigreturn JUSTRETURN
11766 httpd CALL semop(0xd0000,0x280faf1c,0x1)
11766 httpd PSIG SIGalrm caught handler=0x281c5d20 mask=0x0 code=0x0
11766 httpd RET semop -1 errno 4 Interrupted system call
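To put a number on the pattern in the trace above, one option is to count how often semop() returns with errno 4 in the dumped output. The following is only a rough sketch: count_semop_eintr is a hypothetical helper name, and it assumes the text is spaced exactly as in the excerpt above (real kdump output may pad its columns differently).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the places in a ktrace/kdump text where semop() returned -1
 * with errno 4 (EINTR). Hypothetical helper; it matches the literal
 * text from the excerpt, so the column spacing is an assumption.
 */
size_t count_semop_eintr(const char *dump)
{
    static const char needle[] = "RET semop -1 errno 4";
    const size_t len = sizeof(needle) - 1;
    size_t hits = 0;
    const char *p = dump;

    while ((p = strstr(p, needle)) != NULL) {
        /* Skip matches such as "errno 40" that only share the prefix. */
        if (!isdigit((unsigned char)p[len]))
            hits++;
        p += len;
    }
    return hits;
}
```

Feeding it the text of a per-child dump gives one count per process; a "good" child would be expected to report 0.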
Searching groups.google.com for "apache bsdi load" shows that some people
were having our very same problem back in 1997, with Apache 1.0 and 1.1.
I couldn't find any messages newer than about 1998 reporting this problem.
No real resolution was listed, but someone recommended the ktrace idea above.
From the ktrace, it looks like the itimer is going off (maybe semop() is
blocking indefinitely?), which sends SIGALRM to the process, which interrupts
the semop() with errno 4 (EINTR). In the "good" output above, no semop() is
left waiting to be interrupted by SIGALRM.
So maybe a semaphore (lock) is not being unlocked?
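For anyone trying to reproduce the semop()/SIGALRM interaction outside of httpd, here is a minimal sketch of what the "bad" trace appears to show: a process blocks in semop() on a SysV semaphore that nobody releases, an interval timer fires, and semop() comes back with -1 and EINTR. This is not Apache code. blocked_semop_errno and on_alarm are made-up names, the 100 ms timer value is arbitrary, and the union semun definition is only there because Linux leaves it to the caller (an assumption about the build platform).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/time.h>

#if defined(__linux__)
/* Linux headers leave this union for the caller to define. */
union semun { int val; struct semid_ds *buf; unsigned short *array; };
#endif

/* Empty handler: its only job is to make SIGALRM interrupt semop(). */
static void on_alarm(int signo) { (void)signo; }

/*
 * Block in semop() on a semaphore that is never released while a
 * one-shot interval timer is pending. Returns the errno left by
 * semop(), 0 if semop() unexpectedly succeeded, or -1 if setup failed.
 */
int blocked_semop_errno(void)
{
    struct sigaction sa, old_sa;
    struct itimerval tv;
    struct sembuf op;
    union semun arg;
    int semid, rc, result;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;
    arg.val = 0;                      /* "locked": nothing to acquire */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;         /* no SA_RESTART requested */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    memset(&tv, 0, sizeof(tv));
    tv.it_value.tv_usec = 100000;     /* fire once after 100 ms */
    setitimer(ITIMER_REAL, &tv, NULL);

    op.sem_num = 0;
    op.sem_op = -1;                   /* acquire: blocks while value is 0 */
    op.sem_flg = 0;
    rc = semop(semid, &op, 1);
    result = (rc == -1) ? errno : 0;

    sigaction(SIGALRM, &old_sa, NULL);
    semctl(semid, 0, IPC_RMID);       /* remove the semaphore set */
    return result;
}
```

Calling blocked_semop_errno() from a small main() and comparing the result against EINTR should show the same "semop -1 errno 4" return as in the trace, assuming the platform delivers the timer signal while the call is blocked.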
We did not experience this issue with Apache 1.3.9 on this platform,
which we use on 50+ systems today.
We are going to try Apache 1.3.<latest> next.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org