You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-dev@hadoop.apache.org by "Chris Douglas (JIRA)" <ji...@apache.org> on 2008/09/12 01:15:44 UTC

[jira] Issue Comment Edited: (HADOOP-4162) CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor.

    [ https://issues.apache.org/jira/browse/HADOOP-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630413#action_12630413 ] 

chris.douglas edited comment on HADOOP-4162 at 9/11/08 4:13 PM:
----------------------------------------------------------------

bq. Using LzopCodec as anything but a stream doesn't make sense.

I should probably be clearer. Reusing the decompressor between streams makes sense, but using LzopDecompressor like LzoDecompressor or ZlibDecompressor to effect block compression for a structured file format is not going to work, or at least is unlikely to match the intent. I'm assuming this is related to HADOOP-3315, which- like SequenceFile- shouldn't use LzopCodec.

As written, an LzopDecompressor instance can't be reused between streams. The checksums aren't reset. LzopDecopressor should clear its checksum maps in initHeaderFlags before adding new ones.

      was (Author: chris.douglas):
    bq. Using LzopCodec as anything but a stream doesn't make sense.

I should probably be clearer. Reusing the decompressor between streams makes sense, but using LzopDecompressor like LzoDecompressor or ZlibDecompressor to effect block compression for a structured file format is not going to work, or at least is unlikely to match the intent. I'm assuming this is related to HADOOP-3315, which- like SequenceFile- shouldn't use LzopCodec.

The patch is good, but I'm concerned about possible (mis)uses.
  
> CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-4162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4162
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.0
>            Reporter: Hong Tang
>            Assignee: Arun C Murthy
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4162_0_20080911.patch
>
>
> CodecPool.getDecompressor(LzopCodec) always creates a brand-new decompressor. I investigated the code, the reason seems to be the following:
> LzopCodec inherits from LzoCodec. The getDecompressorType() method is supposed to return the concrete Decompressor class type the specific Codec class creates. In this case, LzopCodec creates LzopDecompressors and should return LzopDecompressor.class. But instead, it uses the getDecompressorType() method defined in the parent and returns LzoDecompressor.class.
> This leads to CodecPool unable to properly recycle the decompressors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.