Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2006/09/27 20:46:00 UTC

[jira] Updated: (HADOOP-538) Implement a nio's 'direct buffer' based wrapper over zlib to improve performance of java.util.zip.{De|In}flater as a 'custom codec'

     [ http://issues.apache.org/jira/browse/HADOOP-538?page=all ]

Arun C Murthy updated HADOOP-538:
---------------------------------

    Attachment: HADOOP-538.patch
                HADOOP-538_benchmarks.tgz

Here's a patch incorporating two major features:
a) A framework for decoupling the '(de)compressor' from the 'stream', as discussed on this issue, i.e. the {Com|Decom}pressor interfaces and a concrete implementation of a Data{Com|Decom}pression{Out|In}putStream into which the (de)compressors can be plugged.
b) A straightforward wrapper over the popular zlib library (http://www.zlib.net) using a 'direct' java.nio.ByteBuffer. This gives us a 60%-70% improvement in performance over the existing java.util.zip.{De|In}flater (benchmarks attached).
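
To make the decoupling in (a) concrete, here is a minimal sketch of the idea, not the code in the attached patch. Every name in it (CodecSketch, Compressor, ZipCompressor, PluggableOutputStream) and the method set are assumptions made for illustration. The compressor here is backed by java.util.zip.Deflater rather than the native zlib wrapper, so that the example runs without libhadoop.so.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.InflaterInputStream;

public class CodecSketch {

    /** Hypothetical compressor contract; the method names are illustrative only. */
    public interface Compressor {
        void setInput(byte[] b, int off, int len);
        boolean needsInput();
        void finish();
        boolean finished();
        int compress(byte[] b, int off, int len);
        void end();
    }

    /** One possible implementation, delegating to java.util.zip.Deflater. */
    public static final class ZipCompressor implements Compressor {
        private final Deflater deflater = new Deflater();
        @Override public void setInput(byte[] b, int off, int len) { deflater.setInput(b, off, len); }
        @Override public boolean needsInput() { return deflater.needsInput(); }
        @Override public void finish() { deflater.finish(); }
        @Override public boolean finished() { return deflater.finished(); }
        @Override public int compress(byte[] b, int off, int len) { return deflater.deflate(b, off, len); }
        @Override public void end() { deflater.end(); }
    }

    /** A stream that knows nothing about the algorithm; it only drives a Compressor. */
    public static final class PluggableOutputStream extends FilterOutputStream {
        private final Compressor compressor;
        private final byte[] buffer = new byte[4096];
        private boolean closed;

        public PluggableOutputStream(OutputStream out, Compressor compressor) {
            super(out);
            this.compressor = compressor;
        }

        @Override
        public void write(int b) throws IOException {
            write(new byte[] {(byte) b}, 0, 1);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            if (len == 0) {
                return;
            }
            compressor.setInput(b, off, len);
            // Keep pulling compressed bytes until the compressor has consumed the input.
            while (!compressor.needsInput()) {
                drain();
            }
        }

        private void drain() throws IOException {
            int n = compressor.compress(buffer, 0, buffer.length);
            if (n > 0) {
                out.write(buffer, 0, n);
            }
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return;
            }
            closed = true;
            compressor.finish();
            while (!compressor.finished()) {
                drain();
            }
            compressor.end();
            out.close();
        }
    }

    /** Compresses data through the pluggable stream. */
    public static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new PluggableOutputStream(sink, new ZipCompressor())) {
            out.write(data);
        }
        return sink.toByteArray();
    }

    /** Decompresses with the stock JDK stream, to check that the output is valid zlib data. */
    public static byte[] inflate(byte[] packed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
                sink.write(chunk, 0, n);
            }
        }
        return sink.toByteArray();
    }
}
```

In this sketch, swapping in a different algorithm means passing a different Compressor to the stream; the stream code itself stays unchanged.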

I have also added some supporting framework (tweaks in build.xml to build the native libhadoop.so, Makefiles, the src/native directory etc.) so that any future 'native' code written for hadoop will fit into the native-code framework (src/native dir and libhadoop.so) fairly easily. I have built the infrastructure so that only one shared object (libhadoop.so) is created by the top-level Makefile (src/native/Makefile), and it should contain *all* of the necessary 'native' code for hadoop.
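
As a rough illustration of how Java code might pick up a single shared object such as libhadoop.so, here is a small sketch. The class NativeLoaderSketch, its methods, and the fall-back to java.util.zip are assumptions for illustration and are not taken from the patch. The only real API it relies on is System.loadLibrary, which resolves the short name "hadoop" to libhadoop.so on Linux.

```java
public class NativeLoaderSketch {

    /**
     * Tries to load a native library by its short name (hypothetical helper).
     * Returns false instead of throwing if the library cannot be found or loaded.
     */
    public static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName); // "hadoop" resolves to libhadoop.so on Linux
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    /** Assumed policy: use the native codec if it loaded, else the pure-Java one. */
    public static String chooseCodec(boolean nativeLoaded) {
        return nativeLoaded ? "native-zlib" : "java.util.zip";
    }

    public static void main(String[] args) {
        boolean loaded = tryLoad("hadoop");
        System.out.println("codec: " + chooseCodec(loaded));
    }
}
```

The library directory would need to be on java.library.path (for example via -Djava.library.path=...) for the load to succeed.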

I'd appreciate feedback/improvements to any of the above stuff...

Doug, could you please treat this patch as 99.9% ready and hold off committing it (assuming you are satisfied with it) so as to get as much feedback on it as possible? Thanks!


> Implement a nio's 'direct buffer' based wrapper over zlib to improve performance of java.util.zip.{De|In}flater as a 'custom codec'
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-538
>                 URL: http://issues.apache.org/jira/browse/HADOOP-538
>             Project: Hadoop
>          Issue Type: Improvement
>    Affects Versions: 0.6.1
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.7.0
>
>         Attachments: HADOOP-538.patch, HADOOP-538_benchmarks.tgz
>
>
> There has been more than one instance where java.util.zip's {De|In}flater classes performed unreliably. A simple wrapper over zlib-1.2.3 (latest stable) using java.nio.ByteBuffer (i.e. direct buffers) should go a long way toward alleviating these woes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira