Posted to dev@httpd.apache.org by Greg Ames <gr...@remulak.net> on 2001/11/06 22:10:24 UTC

[Fwd: httpd's CPU hogging on daedalus]

Folks, I'm afraid we have a 2.0.27 showstopper.  We had 4 looping
processes on daedalus.  I'm going to leave 2.0.27 up for now but keep a
close watch on it, then bounce us back to 2.0.24 when the usage eases
up.

There are 2 dumps in /usr/local/apache2.0.27/corefiles.  I attached to a
couple of the loopers with gdb so I could be sure I understood the loop
before killing them all.  The key to this problem appears to be getting
a request body that's smaller than the Content-Length header says it
should be.  

The request is a POST to search.apache.org/.  ap_get_client_block is
looping, continually getting empty brigades.  r->remaining still matches
the Content-Length header, so no body bytes were ever returned.  I don't
quite understand that, because it looks like we have a few bytes of body
data in the input buffers.  len_read (*readbytes in the filters) is
currently zero (my bug...ooops...it should have been reset to non-zero
after getting an empty brigade), but on the first iteration it should
have been the same as r->remaining.

The loop goes as far as core_input_filter, who has an empty brigade in
its context.  I was surprised not to see a socket bucket there.  Should
I be worried about that?  

I'll change ap_get_client_block to keep its calculated length in a local
variable so it can be restored in case ap_get_brigade returns an empty
brigade.  It looks like this will cause core_input_filter to return
APR_EOF, which should get us out of the loop.  I'll also tweak
http_filter so the AP_DEBUG_ASSERT(*readbytes) will catch this case.
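
The proposed fix can be sketched with a toy model.  This is not the httpd
code: toy_input, toy_get_brigade and toy_get_client_block are hypothetical
stand-ins, and the "empty but successful" result is injected by hand as an
assumption, since the message does not establish why the real filters
returned one.  The point is only that the wanted length lives in a local and
is restored before every call, so the in/out length never stays at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_status { TOY_SUCCESS, TOY_EOF };

/* Hypothetical stand-in for the input filter chain; not httpd code. */
struct toy_input {
    const char *data;    /* body bytes that actually arrived */
    size_t avail;        /* how many of them are still unread */
    int empty_results;   /* empty-but-successful results to hand out first */
};

/* *readbytes is in/out: bytes wanted on entry, bytes delivered on return. */
static enum toy_status toy_get_brigade(struct toy_input *in, char *out,
                                       size_t *readbytes)
{
    size_t n;

    assert(*readbytes != 0);      /* plays the role of AP_DEBUG_ASSERT(*readbytes) */
    if (in->empty_results > 0) {  /* assumed: an empty result that is not an error */
        in->empty_results--;
        *readbytes = 0;
        return TOY_SUCCESS;
    }
    if (in->avail == 0) {         /* nothing left to read: report EOF */
        *readbytes = 0;
        return TOY_EOF;
    }
    n = *readbytes < in->avail ? *readbytes : in->avail;
    memcpy(out, in->data, n);
    in->data += n;
    in->avail -= n;
    *readbytes = n;
    return TOY_SUCCESS;
}

/* Returns bytes copied, 0 once the body is complete, -1 on a short body. */
static long toy_get_client_block(struct toy_input *in, size_t *remaining,
                                 char *buf, size_t bufsiz)
{
    const size_t want = *remaining < bufsiz ? *remaining : bufsiz;
    size_t len_read;

    if (want == 0)
        return 0;
    do {
        len_read = want;  /* restore the calculated length before every call */
        if (toy_get_brigade(in, buf, &len_read) != TOY_SUCCESS)
            return -1;
    } while (len_read == 0);
    *remaining -= len_read;
    return (long)len_read;
}
```

With a Content-Length of 56 and only the 6 bytes "applic" available, the
first call in this model survives one empty result and returns 6; the second
call hits TOY_EOF and returns -1 instead of spinning.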

Greg
---------------------------------------------------------------------------------------------
input buffers:

POST / HTTP/1.1
Host: search.apache.org
Connection: keep-alive
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/msword, application/vnd.ms-powerpoint,
application/vnd.ms-excel, */*
Referer: http://lynx.abode.com/manual/
Accept-Language: en-gb
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90;
Supplied by blueyonder)
Content-Length: 56
Pragma: no-cache
Via: 1.0 cache-edi (NetCache NetApp/5.1R2D7DEBUG2)
X-Forwarded-For: 213.48.4.108
 
applicÄ

then there's an empty buffer.

corefiles/httpd.core.looper1
#0  0x805d2d8 in ap_http_filter (f=0x82e4a44, b=0x82e4774,
    mode=AP_MODE_BLOCKING, readbytes=0xbfbf7644) at http_protocol.c:588
#1  0x806cfa1 in ap_get_brigade (next=0x82e4a44, bb=0x82e4774,
    mode=AP_MODE_BLOCKING, readbytes=0xbfbf7644) at util_filter.c:250
#2  0x805e5a9 in ap_get_client_block (r=0x82e43fc,
    buffer=0xbfbf96cc "\xfc\226\xbf\xbf\xd0X\f(\f`%\b$\235%\b
\xd5\013(\xb6X\f(\xec\210\f(\xc8\236%\b\xc4\236%\b\xe2\022\031(\220\020\013(\xc8\236%\b,\227\xbf\xbf\xaf\xd7\013(\f`%\b$\235%\b
\xd5\013(\224\xd7\013(\xf4x\e(\xc4\236%\bL\227\xbf\xbf\xb5\xcf\006\b\xf4x\e(\xc8\236%\b\xbc\xd7\xbf\xbf\xa0Y\e($\235%\b",
bufsiz=8192)
    at http_protocol.c:1392
#3  0x281b57b0 in cgi_handler (r=0x82e43fc) at mod_cgi.c:629
#4  0x80630eb in ap_run_handler (r=0x82e43fc) at config.c:185
#5  0x8063683 in ap_invoke_handler (r=0x82e43fc) at config.c:344
#6  0x8060bb5 in ap_internal_redirect (new_uri=0x82e43d4 "/index.cgi",
    r=0x82e103c) at http_request.c:446
#7  0x281e6c9d in handle_dir (r=0x82e103c) at mod_dir.c:194
#8  0x80630eb in ap_run_handler (r=0x82e103c) at config.c:185
#9  0x8063683 in ap_invoke_handler (r=0x82e103c) at config.c:344
#10 0x8060751 in ap_process_request (r=0x82e103c) at http_request.c:286
#11 0x805c472 in ap_process_http_connection (c=0x8125114) at http_core.c:289
#12 0x806b663 in ap_run_process_connection (c=0x8125114) at connection.c:82
-------- Original Message --------
Subject: httpd's CPU hogging on daedalus
Date: Tue, 6 Nov 2001 08:46:24 -0800 (PST)
From: Brian Behlendorf <br...@collab.net>
To: <gr...@apache.org>, <tr...@apache.org>
CC: <ro...@apache.org>


This doesn't look good:

last pid: 29771;  load averages:  5.83,  6.65,  6.67         up 4+08:21:09  08:44:22
524 processes: 6 running, 517 sleeping, 1 zombie
CPU states: 86.6% user,  0.0% nice,  8.0% system,  5.3% interrupt,  0.0% idle
Mem: 526M Active, 239M Inact, 178M Wired, 53M Cache, 112M Buf, 7228K Free
Swap: 1000M Total, 460K Used, 999M Free

  PID USERNAME     PRI NICE  SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
55690 nobody        64   0  4464K  3428K RUN    0 209:39 45.17% 45.17% httpd
58597 nobody        64   0  3436K  2620K CPU0   0 115:45 38.77% 38.77% httpd
67215 nobody        64   0  4848K  3808K RUN    1  15:05 38.72% 38.72% httpd
83986 nobody        64   0  3316K  2472K RUN    0  15:01 34.86% 34.86% httpd
55902 qmails         2   0  1136K   696K select 0  16:59  1.12%  1.12% qmail-se
29739 brian         34   0  2548K  1700K CPU1   1   0:00  6.30%  0.88% top
55904 qmaill        -6   0   888K   396K piperd 0   6:51  0.05%  0.05% multilog

Could one of you look at this?  Need me to gcore something?  Please look
at it soon, it's probably affecting other things...

	Brian

Re: [Fwd: httpd's CPU hogging on daedalus]

Posted by Justin Erenkrantz <je...@ebuilt.com>.
On Tue, Nov 06, 2001 at 04:10:24PM -0500, Greg Ames wrote:
> The loop goes as far as core_input_filter, who has an empty brigade in
> its context.  I was surprised not to see a socket bucket there.  Should
> I be worried about that?  

If the socket bucket is exhausted (i.e. read all it could), the core 
filter removes the bucket.  So that seems correct.

> I'll change ap_get_client_block to keep its calculated length in a local
> variable so it can be restored in case ap_get_brigade returns an empty
> brigade.  It looks like this will cause core_input_filter to return
> APR_EOF, which should get us out of the loop.  I'll also tweak
> http_filter so the AP_DEBUG_ASSERT(*readbytes) will catch this case.

Hmph.  I haven't had a chance to look at ap_get_client_block in a
while - hopefully will have time tomorrow.  When I redid the
input filtering, I purposely avoided ap_get_client_block, but it
may not be operating under the correct assumptions now.  If someone
can beat me to it, great.  =)  -- justin