Posted to dev@hbase.apache.org by Ryan Rawson <ry...@gmail.com> on 2009/03/27 21:37:43 UTC

direct buffer considered harmful

Hi all,

I ran into this on my TRUNK hbase setup:
java.io.IOException: java.lang.OutOfMemoryError: Direct buffer memory

The pertinent details of the stack trace are:
        at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
        at
org.apache.hadoop.io.compress.zlib.ZlibDecompressor.<init>(ZlibDecompressor.java:110)
        at
org.apache.hadoop.io.compress.GzipCodec.createDecompressor(GzipCodec.java:188)
        at
org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:120)
        at
org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getDecompressor(Compression.java:267)
        at
org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:871)

Ok, so what is this mysterious direct buffer and why am I dying?

This might be because I have 800 regions and 300+ gb of compressed hfiles.

So I looked at the ZlibDecompressor in hadoop, and it looks like there is
_no_ reason whatsoever to be using direct buffers.

A little background:
ByteBuffer offers two kinds of allocation: normal (backed by a byte[] on the
heap) and 'direct'.  The direct kind lives outside the normal Java heap and
can be handed to the underlying OS and native code via nio without an extra
copy.  But there is only so much direct buffer space available, and you
should only use it if you are _sure_ you need to.  Furthermore, there appear
to be GC bugs that don't let the JVM reclaim these buffers as promptly as it
should - you can go OOME on direct memory without the heap actually being
full.
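
For illustration, a minimal standalone sketch of the two allocation paths
(not HBase code; the 128 KB size just mirrors the codec buffers mentioned
below):

import java.nio.ByteBuffer;

public class BufferKinds {
  public static void main(String[] args) {
    // Normal buffer: backed by a byte[] in the Java heap, collected like
    // any other object.
    ByteBuffer heap = ByteBuffer.allocate(128 * 1024);

    // Direct buffer: allocated outside the heap so native code (e.g. zlib
    // via JNI) can read and write it without copying.  The pool of direct
    // memory is capped (see -XX:MaxDirectMemorySize) and a buffer is only
    // released once the owning ByteBuffer object is garbage collected.
    ByteBuffer direct = ByteBuffer.allocateDirect(128 * 1024);

    System.out.println(heap.hasArray());   // true  -> byte[]-backed
    System.out.println(direct.isDirect()); // true  -> off-heap
  }
}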

The hadoop compression library tries to keep this under control by reusing
codecs, and therefore their direct buffers.  But each codec holds about
128 KB of direct buffer, and once too many are open at the same time, you go
OOME.
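
To make that failure mode concrete, here is a hypothetical sketch against
the Hadoop codec API (the class name, loop count, and the "never return it"
pattern are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.Decompressor;
import org.apache.hadoop.io.compress.GzipCodec;

public class DirectBufferLeakSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    GzipCodec codec = new GzipCodec();
    codec.setConf(conf);

    // Whenever the pool comes up empty it constructs a new decompressor,
    // and (per the stack trace above) ZlibDecompressor allocates its direct
    // buffers in its constructor.  If decompressors are taken but never
    // handed back with CodecPool.returnDecompressor(), every open block
    // pays for a fresh ~128 KB of direct memory until the JVM throws
    // "OutOfMemoryError: Direct buffer memory".
    for (int i = 0; i < 10000; i++) {
      Decompressor d = CodecPool.getDecompressor(codec);
      // ... decompress a block here, but "forget" to return d ...
    }
  }
}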

I am not sure why the lib uses direct buffers.  We might be able to switch
it over to plain heap buffers...

In any case, I think we should try to procure our own fast zlib-like
compression library that is not tied to hadoop.

Re: direct buffer considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
I have discovered a few things since I wrote this:

- DirectBuffers allow native methods a no-copy way of sharing data with
JVM/Java code.
-- The codecs use this for that purpose.
- HFile wasn't returning codecs to the codec pool (see the sketch below).
- Java 1.7 has GC bugs with DirectBuffers - switching to 1.6.0_13 fixed my
OOME crashes.
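
For reference, the borrow/return discipline the pool expects looks roughly
like this (a sketch of the general pattern using the Hadoop codec API;
readBlock and its arguments are hypothetical, not the actual HFile change):

import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;

public class PooledBlockRead {
  // Decompress one block into 'dest', always handing the decompressor back
  // to the pool so its direct buffers get reused instead of piling up.
  static void readBlock(CompressionCodec codec, InputStream raw, byte[] dest)
      throws IOException {
    Decompressor decompressor = CodecPool.getDecompressor(codec);
    try {
      InputStream in = codec.createInputStream(raw, decompressor);
      IOUtils.readFully(in, dest, 0, dest.length);
    } finally {
      CodecPool.returnDecompressor(decompressor);
    }
  }
}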

Does anyone know someone working on Java 1.7?  My repro case isn't great:
"use hbase trunk and put at least 120 gb of data in compressed tables".

-ryan
