Posted to dev@httpd.apache.org by Greg Ames <gr...@remulak.net> on 2001/07/14 03:28:02 UTC

memory leak on daedalus

Brian e-mailed me earlier today because he noticed the httpd processes on 
daedalus were starting to grow.  They were up to around 10-12M when he 
noticed it, so he restarted the server.  

I had bumped up MaxRequestsPerChild from 1000 to 2000 shortly after I put up 
the new 2.0.21-dev build, so after capturing a core dump of a process which 
was up to 5.8M, I lowered MRPC back to 1K and did a graceful restart.  That 
helped some (process size maxed out at around 5.5M) but still looked bigger 
than I remember with our old build (3-4M).  So I set MRPC down to 400 and did 
another graceful restart.  There are still a few at 5.5M, but most are 
smaller.

Anyway, the core dump is in /usr/local/apache2.0.21-dev/corefiles, named 
httpd.core.fat (5.8M), along with another named httpd.core.skinny for 
comparison.  If anyone has any cool tools to analyse memory consumption, 
please speak up.  Otherwise, I'll resort to looking at random memory 
addresses to see what's there.

Greg

Re: memory leak on daedalus

Posted by Greg Ames <gr...@remulak.net>.
Greg Ames wrote:

> Digging thru the core dump from the new build, I see we are leaking
> apr_sockaddr_t's.  They are being allocated out of the pconf pool, so
> they will never be cleaned up until the process dies

ap_mpm_pod_signal in mpm_common.c is guilty.  Every time this function
is called by perform_idle_server_maintenance, the parent leaks another
apr_sockaddr_t.  When the parent forks off new children, they inherit
the bloat factor.  Setting MaxRequestsPerChild small doesn't help in
this case; restarting the server is the only circumvention.

Here's my plan:  change the pod creation code to create a single
apr_sockaddr_t containing the loopback address etc.  Hang it off the
pod, and use it from there when needed in ap_mpm_pod_signal.
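
The difference between the current behaviour and the plan can be pictured with a toy model. This is a sketch in Python, not APR or httpd code: `Pool`, `LeakyPod`, `CachedPod`, and `make_loopback_addr` are hypothetical names invented for the illustration, and the pool here only mimics the one property that matters (nothing is released while the pool lives).

```python
class Pool:
    """Toy stand-in for a long-lived pool: allocations are never freed
    individually, so they accumulate for the lifetime of the pool."""

    def __init__(self):
        self.allocations = []

    def alloc(self, obj):
        self.allocations.append(obj)
        return obj


def make_loopback_addr(pool, port):
    """Allocate a (hypothetical) loopback address record out of `pool`."""
    return pool.alloc({"host": "127.0.0.1", "port": port})


class LeakyPod:
    """Builds a fresh address on every signal, so the pool grows per call."""

    def __init__(self, pool, port):
        self.pool = pool
        self.port = port

    def signal(self):
        return make_loopback_addr(self.pool, self.port)


class CachedPod:
    """Builds the address once at creation time and reuses it afterwards."""

    def __init__(self, pool, port):
        self.addr = make_loopback_addr(pool, port)

    def signal(self):
        return self.addr


if __name__ == "__main__":
    leaky_pool, cached_pool = Pool(), Pool()
    leaky, cached = LeakyPod(leaky_pool, 80), CachedPod(cached_pool, 80)
    for _ in range(1000):
        leaky.signal()
        cached.signal()
    print(len(leaky_pool.allocations), len(cached_pool.allocations))
```

In the model, the leaky variant's pool grows by one record per call while the cached variant's pool stays at a single record, which is the effect the plan is after.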

Greg

Re: memory leak on daedalus

Posted by Greg Ames <gr...@remulak.net>.
Greg Ames wrote:
> 
> Brian e-mailed me earlier today because he noticed the httpd processes on
> daedalus were starting to grow.  They were up to around 10-12M when he
> noticed it, so he restarted the server.

We were back up to 10M per process with MaxRequestsPerChild at 400, so I
bounced the server back to the old 2.0.16 build at 08:58:10 PDT.  The
old build is running fine as far as memory usage goes.

Digging thru the core dump from the new build, I see we are leaking
apr_sockaddr_t's.  They are being allocated out of the pconf pool, so
they will never be cleaned up until the process dies due to
MaxRequestsPerChild (which can take a lot longer than you might think
with HTTP/1.1) or ap_perform_idle_server_maintenance.  I don't know yet
which piece of code is doing this, so feel free to beat me to it.  

There's other leaked stuff I don't recognize yet, but if we solve the
apr_sockaddr_t problem, the rest will probably fall out for free.

Greg

gory details:

here's some leaked stuff from
/usr/local/apache2.0.21-dev/corefiles/httpd.core.fat

(gdb) x/100xw 0x8700000
0x8700000:      0x00000000      0x00000000      0x00000000      0x00000000
0x8700010:      0x00000000      0x00000010      0x00000004      0x00000010
0x8700020:      0x086ffffc      0x00000000      0x00000000      0x086fff6c
0x8700030:      0x08085f2c      0x08089d7c      0x082b0964      0x080c000c
0x8700040:      0x08700084      0x00000000      0x00000050      0x00000002
0x8700050:      0x50000210      0x0100007f      0x00000000      0x00000000
0x8700060:      0x00000000      0x00000000      0x00000000      0x00000010
0x8700070:      0x00000004      0x00000010      0x08700054      0x00000000
0x8700080:      0x00000000      0x2e373231      0x2e302e30      0x00000031
0x8700090:      0x00000000      0x080c000c      0xffffffff      0x00000001
0x87000a0:      0x087000c4      0x0870003c      0x002dc6c0      0x00000000
0x87000b0:      0x00000001      0x00000001      0x00000020      0x00000000
0x87000c0:      0x00000000      0x080c000c      0x00000000      0x00000000
0x87000d0:      0x00000000      0x00000002      0x00000200      0x00000000
0x87000e0:      0x00000000      0x00000000      0x00000000      0x00000000
0x87000f0:      0x00000000      0x00000010      0x00000004      0x00000010
0x8700100:      0x087000dc      0x00000000      0x00000000      0x080c000c
0x8700110:      0x00000000      0x00000000      0x00000000      0x00000002
0x8700120:      0x00000200      0x00000000      0x00000000      0x00000000
0x8700130:      0x00000000      0x00000000      0x00000000      0x00000010
0x8700140:      0x00000004      0x00000010      0x08700124      0x00000000
0x8700150:      0x00000000      0x08700094      0x08085f2c      0x08089d7c
0x8700160:      0x082b0964      0x080c000c      0x087001ac      0x00000000
0x8700170:      0x00000050      0x00000002      0x50000210      0x0100007f
0x8700180:      0x00000000      0x00000000      0x00000000      0x00000000

there's an apr_sockaddr_t starting at 0x870003c, another at 0x8700164. 
(hmmm, both of these offset 4 bytes from an 8 byte boundary). 
0x080c000c is the pconf pool.  starting at 0x8700084 is the string
"127.0.0.1" - localhost.  0x00000050 == 80, our port.  0x00000002 is the
address family.
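
The readings above can be double-checked mechanically. The following is a minimal Python sketch, assuming the dump comes from a little-endian host and that the two words at 0x8700050 are the start of a BSD-style sockaddr_in (one length byte, one family byte, a two-byte port in network order, then the four address bytes). The helper names are made up for this example.

```python
import socket
import struct

# Words copied from the gdb dump above (x/xw prints host-order 32-bit words).
# Assumption: the host is little-endian.
SOCKADDR_WORDS = [0x50000210, 0x0100007F]               # at 0x8700050
HOSTNAME_WORDS = [0x2E373231, 0x2E302E30, 0x00000031]   # at 0x8700084


def words_to_bytes(words):
    """Re-pack host-order 32-bit words into the raw bytes they came from."""
    return struct.pack("<%dI" % len(words), *words)


def decode_sockaddr_in(raw):
    """Decode the first 8 bytes of a BSD-style sockaddr_in.

    Assumed layout: sin_len (1 byte), sin_family (1 byte),
    sin_port (2 bytes, network order), sin_addr (4 bytes).
    """
    sin_len, sin_family, sin_port = struct.unpack("!BBH", raw[:4])
    address = socket.inet_ntoa(raw[4:8])
    return sin_len, sin_family, sin_port, address


def decode_c_string(raw):
    """Return the bytes up to the first NUL as ASCII text."""
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(decode_sockaddr_in(words_to_bytes(SOCKADDR_WORDS)))
    print(decode_c_string(words_to_bytes(HOSTNAME_WORDS)))
```

Under those assumptions the two words decode to length 16, family 2, port 80, address 127.0.0.1, and the three words at 0x8700084 decode to the string "127.0.0.1", which agrees with the description above.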

there's another repeating pattern at 0x8700010, 0x8700068, 0x87000f0,
and 0x8700138 which goes <0, 16, 4, 16, (ptr to a few bytes before
this), 0, 0, ptr>.  I don't know what this is.  We shouldn't need to
figure out what it is to solve this problem, but if you have any ideas,
please speak up.  It might come in handy in the future.
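
For anyone who wants to hunt for the pattern without eyeballing hex, here is a small Python sketch that scans a window of dump words for the <0, 16, 4, 16> prefix and then looks at what the following pointer word refers to. It only embeds the first 40 words of the dump (0x8700000 through 0x870009c), and the function names are invented for the example.

```python
BASE = 0x8700000

# First 40 words of the dump above, in address order.
WORDS = [
    0x00000000, 0x00000000, 0x00000000, 0x00000000,  # 0x8700000
    0x00000000, 0x00000010, 0x00000004, 0x00000010,  # 0x8700010
    0x086FFFFC, 0x00000000, 0x00000000, 0x086FFF6C,  # 0x8700020
    0x08085F2C, 0x08089D7C, 0x082B0964, 0x080C000C,  # 0x8700030
    0x08700084, 0x00000000, 0x00000050, 0x00000002,  # 0x8700040
    0x50000210, 0x0100007F, 0x00000000, 0x00000000,  # 0x8700050
    0x00000000, 0x00000000, 0x00000000, 0x00000010,  # 0x8700060
    0x00000004, 0x00000010, 0x08700054, 0x00000000,  # 0x8700070
    0x00000000, 0x2E373231, 0x2E302E30, 0x00000031,  # 0x8700080
    0x00000000, 0x080C000C, 0xFFFFFFFF, 0x00000001,  # 0x8700090
]

PATTERN = (0x0, 0x10, 0x4, 0x10)


def find_pattern(words, pattern, base=BASE):
    """Return the address of every position where `pattern` occurs."""
    n = len(pattern)
    return [base + 4 * i
            for i in range(len(words) - n + 1)
            if tuple(words[i:i + n]) == tuple(pattern)]


def read_word(words, address, base=BASE):
    """Return the word at `address`, or None if it is outside the window."""
    offset = address - base
    if offset < 0 or offset % 4 != 0:
        return None
    index = offset // 4
    return words[index] if index < len(words) else None


def pointer_targets(words, pattern, base=BASE):
    """For each match, return (match address, pointer word, pointed-to word)."""
    results = []
    for address in find_pattern(words, pattern, base):
        pointer = read_word(words, address + 4 * len(pattern), base)
        target = read_word(words, pointer, base) if pointer is not None else None
        results.append((address, pointer, target))
    return results


if __name__ == "__main__":
    for address, pointer, target in pointer_targets(WORDS, PATTERN):
        print(hex(address), hex(pointer), target if target is None else hex(target))
```

In this window the scan finds the pattern at 0x8700010 and 0x8700068. The pointer after the second match is 0x08700054, which is the address of the 0x0100007f word inside the apr_sockaddr_t at 0x870003c, so the pattern may simply be the trailing fields of that same structure. That is a guess from the addresses alone, not something checked against the APR headers.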