Posted to dev@mina.apache.org by Matthew Phillips <ma...@mattp.name> on 2007/11/21 08:48:37 UTC

Memory leak in MINA 1.1.4

Hi all,

it seems I've been hit by a fairly serious memory leak that appears
to be in MINA's heap byte buffer system. Running a MINA-based server
with "ByteBuffer.setUseDirectBuffers (false)" triggers it. This is
confirmed by running a client/server test case where the client sends
and receives 64KB messages in a tight loop (Mac OS X 10.4.11, JDK 1.5)
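
For reference, the switch in question is the global, static toggle on MINA's
own buffer class (org.apache.mina.common.ByteBuffer, not java.nio). Below is
only a minimal sketch of the two modes; the class name and the surrounding
setup are assumed, not taken from the actual test code:

    import org.apache.mina.common.ByteBuffer;

    public class BufferModeConfig {
        public static void main(String[] args) {
            // false = pooled heap buffers (the case that shows the growth),
            // true  = direct (off-heap) buffers.
            ByteBuffer.setUseDirectBuffers(false);

            // ... create and bind the SocketAcceptor as usual ...
        }
    }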

With ByteBuffer.setUseDirectBuffers (true), I see CPU usage between
116% - 123% and memory usage (RSS) varying rapidly between 18.27MB
and 19.05MB for a 30 minute run. Total messages sent in this mode =
1,479,415.

With ByteBuffer.setUseDirectBuffers (false) and the same test, I see
lower CPU (109%-110%) and memory gradually increasing, with almost no
fluctuation, e.g.

   RSS
   18.88
   19.09
   19.23
   19.31
   19.50

Total messages sent in this mode = 1,138,246 (i.e. 76% of the
performance).

In the real server (Linux, Fedora Core 6, Java(TM) SE Runtime
Environment build 1.6.0_03-b05), RSS usually starts at 24MB and
increases over 24 hours to 45MB. Using jmap in histogram mode to
profile memory usage, I see output like:

         Instances  Size      Class
    1:   2192677    35082832  java.util.concurrent.ConcurrentLinkedQueue$Node
    2:      8285    23427664  [B
    3:     51674     9060984  [C

(this is after 2 days of operational use and at 109MB RSS - I also
have a log showing the growth over 2 days)
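
(For anyone reproducing this: the histogram above comes from plain jmap
against the running server's pid, i.e. standard JDK 6 tooling; jps is shown
only as one way to find the pid.)

    jps -l              # find the server's pid
    jmap -histo <pid>   # instance/byte histogram, taken without forcing a GC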

So, a small number of byte arrays and a huge (and always increasing)
number of ConcurrentLinkedQueue$Nodes are sucking up about 78MB of heap.

I'm not wrapping any of my own buffers with ByteBuffer.wrap (), which
is the only related bug I've been able to find reported on the list. The
only buffer allocation code is:

    ByteBuffer buffer = ByteBuffer.allocate (4096);
    buffer.setAutoExpand (true);

in the server's ProtocolEncoder implementation.
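
For completeness, the encoder looks roughly like the sketch below. The class
name and the actual framing are made-up placeholders; only the allocate(4096)
and setAutoExpand(true) calls are from the real code (MINA 1.x API):

    import org.apache.mina.common.ByteBuffer;
    import org.apache.mina.common.IoSession;
    import org.apache.mina.filter.codec.ProtocolEncoderAdapter;
    import org.apache.mina.filter.codec.ProtocolEncoderOutput;

    // Hypothetical encoder: only the two buffer calls are taken from the
    // real implementation; the framing is placeholder code.
    public class FrameEncoder extends ProtocolEncoderAdapter {
        public void encode(IoSession session, Object message,
                           ProtocolEncoderOutput out) throws Exception {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            buffer.setAutoExpand(true);

            // Placeholder framing: write the message as UTF-8 bytes.
            buffer.put(message.toString().getBytes("UTF-8"));

            buffer.flip();
            out.write(buffer);
        }
    }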

Any ideas?

Cheers,

Matthew.


Re: Memory leak in MINA 1.1.4

Posted by Bogdan Ciprian Pistol <bo...@gmail.com>.
> Hmm.. I don't have much experience with the jmap histogram.  It might be
> including objects that are waiting to be GC'd, or that have already been
> GC'd.  I'm not sure.

There is an option "-histo:live" to count only live objects (added in Java 1.6)
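
That is, with a Java 6 jmap:

    jmap -histo:live <pid>   # forces a full GC, then counts only live objects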

Re: Memory leak in MINA 1.1.4

Posted by Matt Phillips <ma...@mattp.name>.
An update: it appears that this is not a memory leak, but a consequence of
the way the Java 6 VM GC works combined with the NIO buffering system.

Using "jmap -histo:live" was the way to go: it showed a fairly constant size
after a while because it seems to force a general GC, whereas the "-histo"
option doesn't, and thus showed the steady increase in buffers that I was
seeing. That, combined with using a large heap size (-Xms64M -Xmx256M), was
triggering the excessive memory usage. I can only assume the Java 6 GC
strategy avoids calling GC at all if it's nowhere near it's max heap size,
because forcing a GC would usually shed 20-30M off the RSS.

So, I estimated actual heap usage by watching the production server with
"jmap -heap", and reduced the allocated heap to "-Xms12M -Xmx96M" (which is
still erring on the large side), at which point RSS levelled out at ~40M,
with ~1MB of buffers.
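
Concretely, the only change was to the launch flags (the jar name below is a
placeholder; only the -Xms/-Xmx values are the real ones), plus checking the
result with jmap:

    java -Xms12M -Xmx96M -jar avis-server.jar   # placeholder launch command
    jmap -heap <pid>                            # compare used vs. committed heap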

Thanks for your help. Hopefully this will be of use to anyone else who runs
into this sort of behaviour and mis-diagnoses it.

Matthew.




Re: Memory leak in MINA 1.1.4

Posted by Trustin Lee <tr...@gmail.com>.
On Nov 22, 2007 3:08 PM, Matt Phillips <ma...@mattp.name> wrote:
> Trustin Lee wrote:
> >
> > I opened the hprof files you provided in the YourKit profiler, and it seems
> > like the actual amount of memory the JVM is using is very small:
> >
> >  http://www.nabble.com/file/p13891096/avis-profiler-report.zip
> > avis-profiler-report.zip
> >
> > I often observe the JVM increasing its total heap size too much even when
> > the application doesn't demand that much.  Could this be such a case?
> >
>
> OK, now I'm really confused. Am I reading the jmap -histo output wrong? The
> output I cited in the original mail seemed to be saying there were 2,192,677
> instances of java.util.concurrent.ConcurrentLinkedQueue$Node, taking up
> 35,082,832 bytes, and 8,285 byte arrays taking up 23,427,664 bytes, which
> seems excessive for a server with just 20 connections, to put it mildly. But,
> as you say, the reports that YourKit generated don't bear this out (although
> that output is not from the exact same instance as the hprofs; it was taken
> earlier in my testing).

Hmm.. I don't have much experience with the jmap histogram.  It might be
including objects that are waiting to be GC'd, or that have already been
GC'd.  I'm not sure.  Do you get an OOM, btw?

> Thanks for your help so far,

No problem.  Please keep us updated if you find any potential problem. :)

Thanks,
Trustin
-- 
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP Key ID: 0x0255ECA6

Re: Memory leak in MINA 1.1.4

Posted by Matt Phillips <ma...@mattp.name>.

Trustin Lee wrote:
> 
> I opened the hprof files you provided in the YourKit profiler, and it seems
> like the actual amount of memory the JVM is using is very small:
> 
>  http://www.nabble.com/file/p13891096/avis-profiler-report.zip
> avis-profiler-report.zip 
> 
> I often observe the JVM increasing its total heap size too much even when the
> application doesn't demand that much.  Could this be such a case?
> 

OK, now I'm really confused. Am I reading the jmap -histo output wrong? The
output I cited in the original mail seemed to be saying there were 2,192,677
instances of java.util.concurrent.ConcurrentLinkedQueue$Node, taking up
35,082,832 bytes, and 8,285 byte arrays taking up 23,427,664 bytes, which
seems excessive for a server with just 20 connections, to put it mildly. But,
as you say, the reports that YourKit generated don't bear this out (although
that output is not from the exact same instance as the hprofs; it was taken
earlier in my testing).

Probably the best thing is to re-run the server with heap allocation on, let
it run for a few days, and jmap it again to generate both a histogram and a
hprof.
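
The dump itself would just be the standard jmap invocations (Java 6 syntax;
the file names are only examples):

    jmap -dump:format=b,file=avis-heap.hprof <pid>   # binary hprof dump
    jmap -histo <pid> > avis-histo.txt               # histogram snapshot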

Thanks for your help so far,

Matthew.


Re: Memory leak in MINA 1.1.4

Posted by Trustin Lee <tr...@gmail.com>.

Matt Phillips wrote:
> 
> 
> Trustin Lee wrote:
>> 
>> On Nov 21, 2007 7:27 PM, Matt Phillips <ma...@mattp.name> wrote:
>>> Trustin Lee wrote:
>>> >
>>> > You might be writing too fast so that the write request queue piles
>>> > up.  Could you run jmap to dump your server's memory map?
>>> >
>>>
>>> Do you mean you'd like a hprof memory dump?
>> 
>> Yes.
>> 
> 
> It looks like I spoke too soon: in production both heap and direct buffers
> leak, and ironically the direct buffers seem far worse, in direct
> contradiction to what the tight loop test indicated. This is certainly a
> weird situation. I just wish I had access to a profiler in order to track
> the origin...
> 
> In the direct case the histogram looks like:
> 
>   8218080 513630  java.util.concurrent.ConcurrentLinkedQueue$Node
>   2622280 4955    int[]
>   1247320 12293   * ConstMethodKlass
>   1107480 15942   char[]
>   986880  12293   * MethodKlass
>   947200  29600   java.lang.ThreadLocal$ThreadLocalMap$Entry
>   658160  17045   * SymbolKlass
>   625560  26065   java.util.concurrent.locks.ReentrantReadWriteLock$Sync$HoldCounter
>   ...
> 
> I've dropped two hprof files at the following locations. The "direct" one
> is using direct buffers, the "heap" one is using heap buffers, both taken
> after at least 12 hours of light usage. These will be auto-deleted in 24
> hours, so please grab a copy asap.
> 
>   ftp://publicftp.dsto.defence.gov.au/avis-direct.hprof.bz2
>   ftp://publicftp.dsto.defence.gov.au/avis-heap.hprof.bz2
> 
> Hope this is enough. I might see if I can use the profiler in NetBeans to
> give me a better idea of what's going on.
> 
> Matthew.
> 

I opened the hprof files you provided in the YourKit profiler, and it seems
like the actual amount of memory the JVM is using is very small:

http://www.nabble.com/file/p13891096/avis-profiler-report.zip
avis-profiler-report.zip 

I often observe the JVM increasing its total heap size too much even when the
application doesn't demand that much.  Could this be such a case?

HTH
Trustin


Re: Memory leak in MINA 1.1.4

Posted by Matt Phillips <ma...@mattp.name>.

Trustin Lee wrote:
> 
> On Nov 21, 2007 7:27 PM, Matt Phillips <ma...@mattp.name> wrote:
>> Trustin Lee wrote:
>> >
>> > You might be writing too fast so that the write request queue piles
>> > up.  Could you run jmap to dump your server's memory map?
>> >
>>
>> Do you mean you'd like a hprof memory dump?
> 
> Yes.
> 

It looks like I spoke too soon: in production both heap and direct buffers
leak, and ironically the direct buffers seem far worse, in direct
contradiction to what the tight loop test indicated. This is certainly a
weird situation. I just wish I had access to a profiler in order to track
the origin...

In the direct case the histogram looks like:

  8218080 513630  java.util.concurrent.ConcurrentLinkedQueue$Node
  2622280 4955    int[]
  1247320 12293   * ConstMethodKlass
  1107480 15942   char[]
  986880  12293   * MethodKlass
  947200  29600   java.lang.ThreadLocal$ThreadLocalMap$Entry
  658160  17045   * SymbolKlass
  625560  26065   java.util.concurrent.locks.ReentrantReadWriteLock$Sync$HoldCounter
  ...

I've dropped two hprof files at the following locations. The "direct" one is
using direct buffers, the "heap" one is using heap buffers, both taken
after at least 12 hours of light usage. These will be auto-deleted in 24
hours, so please grab a copy asap.

  ftp://publicftp.dsto.defence.gov.au/avis-direct.hprof.bz2
  ftp://publicftp.dsto.defence.gov.au/avis-heap.hprof.bz2

Hope this is enough. I might see if I can use the profiler in NetBeans to
give me a better idea of what's going on.

Matthew.


Re: Memory leak in MINA 1.1.4

Posted by Trustin Lee <tr...@gmail.com>.
On Nov 21, 2007 7:27 PM, Matt Phillips <ma...@mattp.name> wrote:
> Trustin Lee wrote:
> >
> > You might be writing too fast so that the write request queue piles
> > up.  Could you run jmap to dump your server's memory map?
> >
>
> Do you mean you'd like a hprof memory dump?

Yes.

> The test client is certainly writing as fast as it can, but the operational
> server is actually very lightly loaded: I haven't looked at network usage,
> but CPU usage never goes above 15% and probably averages out at around 2%.

That's weird.  An hprof memory dump or a YourKit profiler snapshot would be appreciated.

Trustin
-- 
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP Key ID: 0x0255ECA6

Re: Memory leak in MINA 1.1.4

Posted by Matt Phillips <ma...@mattp.name>.

Trustin Lee wrote:
> 
> You might be writing too fast so that the write request queue piles
> up.  Could you run jmap to dump your server's memory map?
> 

Do you mean you'd like a hprof memory dump?

The test client is certainly writing as fast as it can, but the operational
server is actually very lightly loaded: I haven't looked at network usage,
but CPU usage never goes above 15% and probably averages out at around 2%.

Matthew.




Re: Memory leak in MINA 1.1.4

Posted by Trustin Lee <tr...@gmail.com>.
You might be writing too fast so that the write request queue piles
up.  Could you run jmap to dump your server's memory map?

Trustin

On Nov 21, 2007 4:48 PM, Matthew Phillips <ma...@mattp.name> wrote:
> Hi all,
>
> it seems I've been hit by a fairly serious memory leak that appears
> to be in MINA's heap byte buffer system. Running a MINA-based server
> with "ByteBuffer.setUseDirectBuffers (false)" triggers it. This is
> confirmed by running a client/server test case where the client sends
> and receives 64KB messages in a tight loop (Mac OS X 10.4.11, JDK 1.5)
>
> With ByteBuffer.setUseDirectBuffers (true), I see CPU usage between
> 116% - 123% and memory usage (RSS) varying rapidly between 18.27MB
> and 19.05MB for a 30 minute run. Total messages sent in this mode =
> 1,479,415.
>
> With ByteBuffer.setUseDirectBuffers (false) and the same test, I see
> lower CPU (109%-110%) and memory gradually increasing, with almost no
> fluctuation, e.g.
>
>    RSS
>    18.88
>    19.09
>    19.23
>    19.31
>    19.50
>
> Total messages sent in this mode = 1,138,246 (i.e. 76% of the
> performance).
>
> In the real server (Linux, Fedora Core 6, Java(TM) SE Runtime
> Environment build 1.6.0_03-b05), RSS usually starts at 24MB and
> increases over 24 hours to 45MB. Using jmap in histogram mode to
> profile memory usage, I see output like:
>
>          Instances  Size      Class
>     1:   2192677    35082832  java.util.concurrent.ConcurrentLinkedQueue$Node
>     2:      8285    23427664  [B
>     3:     51674     9060984  [C
>
> (this is after 2 days of operational use and at 109MB RSS - I also
> have a log showing the growth over 2 days)
>
> So, a small number of byte arrays and a huge (and always increasing)
> number of ConcurrentLinkedQueue$Nodes are sucking up about 78MB of heap.
>
> I'm not wrapping any of my own buffers with ByteBuffer.wrap (), which
> is the only related bug I've been able to find reported on the list. The
> only buffer allocation code is:
>
>     ByteBuffer buffer = ByteBuffer.allocate (4096);
>     buffer.setAutoExpand (true);
>
> in the server's ProtocolEncoder implementation.
>
> Any ideas?
>
> Cheers,
>
> Matthew.
>
>



-- 
what we call human nature is actually human habit
--
http://gleamynode.net/
--
PGP Key ID: 0x0255ECA6