Posted to common-issues@hadoop.apache.org by "Xiao Kang (JIRA)" <ji...@apache.org> on 2010/03/30 08:44:27 UTC

[jira] Updated: (HADOOP-6662) hadoop zlib compression does not fully utilize the buffer

     [ https://issues.apache.org/jira/browse/HADOOP-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Kang updated HADOOP-6662:
------------------------------

    Attachment: ZlibCompressor.patch

Patch attached.

needsInput() now checks uncompressedDirectBuf: if it is full, it returns false; otherwise it copies data from the saved userBuf and then rechecks.

One special case must be respected: zlib may not consume all of the input in uncompressedDirectBuf because the output buffer is too small. This may be why the original code simply returns false if uncompressedBufLen > 0.

After the JNI compress call, uncompressedBufLen is set back to the length of the remaining input data that zlib did not consume. So if uncompressedBufLen > 0 after deflateBytesDirect() is invoked, a flag keepUncompressedBuf is set to true to indicate that no input is needed and that compress() should be invoked again to compress the remaining input data.
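
For readers without the patch at hand, the logic above can be sketched roughly as follows. This is a toy model, not the attached patch: the class NeedsInputSketch, the nativeCalls counter, and the maxConsume parameter are hypothetical, and compress() only simulates deflateBytesDirect() by consuming a bounded number of bytes (standing in for a too-small output buffer) instead of calling zlib.

```java
import java.nio.ByteBuffer;

/** Toy model of the buffering logic; hypothetical, not the Hadoop class. */
final class NeedsInputSketch {
    private final ByteBuffer uncompressedBuf; // stands in for uncompressedDirectBuf
    private int uncompressedBufLen = 0;       // input left unconsumed by the last compress()
    private boolean keepUncompressedBuf = false;
    private byte[] userBuf = new byte[0];     // caller data that did not fit yet
    private int userBufOff = 0;
    private int userBufLen = 0;
    int nativeCalls = 0;                      // counts simulated JNI calls

    NeedsInputSketch(int bufferSize) {
        uncompressedBuf = ByteBuffer.allocate(bufferSize);
    }

    /** Saves the caller's data and copies as much as fits into the buffer. */
    void setInput(byte[] b, int off, int len) {
        userBuf = b;
        userBufOff = off;
        userBufLen = len;
        copyFromUserBuf();
    }

    private void copyFromUserBuf() {
        int n = Math.min(userBufLen, uncompressedBuf.remaining());
        uncompressedBuf.put(userBuf, userBufOff, n);
        userBufOff += n;
        userBufLen -= n;
    }

    boolean needsInput() {
        // Leftover input from a partial compress() must be compressed first.
        if (keepUncompressedBuf && uncompressedBufLen > 0) {
            return false;
        }
        // A full buffer cannot take more input.
        if (!uncompressedBuf.hasRemaining()) {
            return false;
        }
        // Top up from the saved user data, then recheck.
        if (userBufLen > 0) {
            copyFromUserBuf();
        }
        return uncompressedBuf.hasRemaining();
    }

    /** Simulated deflateBytesDirect(): consumes at most maxConsume buffered bytes. */
    int compress(int maxConsume) {
        nativeCalls++;
        int buffered = uncompressedBuf.position();
        int consumed = Math.min(buffered, maxConsume);
        uncompressedBuf.flip();
        uncompressedBuf.position(consumed);
        uncompressedBuf.compact();            // keep unconsumed bytes at the front
        uncompressedBufLen = buffered - consumed;
        keepUncompressedBuf = uncompressedBufLen > 0;
        return consumed;
    }

    public static void main(String[] args) {
        NeedsInputSketch c = new NeedsInputSketch(8);
        for (int i = 0; i < 3; i++) {
            c.setInput(new byte[3], 0, 3);
            System.out.println("write " + (i + 1) + ": needsInput=" + c.needsInput());
        }
        c.compress(8);
        System.out.println("native calls: " + c.nativeCalls);
    }
}
```

In this model, three 3-byte writes into an 8-byte buffer lead to a single simulated native call, because needsInput() keeps returning true until the buffer is full. If compress() consumes only part of the buffer, needsInput() returns false until a further compress() call drains the remainder.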

> hadoop zlib compression does not fully utilize the buffer
> ---------------------------------------------------------
>
>                 Key: HADOOP-6662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6662
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.20.2
>            Reporter: Xiao Kang
>         Attachments: ZlibCompressor.patch
>
>
> org.apache.hadoop.io.compress.ZlibCompressor does not fully utilize its buffer.
> Its needsInput() returns false when there is any data in its buffer (64K by default). Performance degrades greatly, since a JNI call is invoked each time the write() method of CompressorStream is called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.