
Posted to common-dev@hadoop.apache.org by "Johan Oskarsson (JIRA)" <ji...@apache.org> on 2007/05/25 19:14:16 UTC

[jira] Created: (HADOOP-1434) Let users add compression types

Let users add compression types
-------------------------------

                 Key: HADOOP-1434
                 URL: https://issues.apache.org/jira/browse/HADOOP-1434
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Johan Oskarsson
            Priority: Minor


This is probably a special case, but we're considering serving data from the generated sequence files to avoid having to convert them to another file format.

However, using block compression means we'd have to read up to almost 1 MB (the default) of data to find a single record. Our records are so small that compressing
them with record compression makes the file larger than it would be with no compression.
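
The size effect described above can be reproduced outside Hadoop. The following is a minimal sketch that uses java.util.zip.Deflater rather than Hadoop's codecs or the SequenceFile format; the class name, method names, and synthetic records are assumptions made up for illustration. It compares the raw size of many tiny records with the total size when each record is compressed on its own and when all records are compressed as a single block.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Standalone illustration; not Hadoop code. All names are hypothetical. */
public class TinyRecordCompression {

    /** Compresses the input with default zlib settings and returns the output size in bytes. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    /**
     * Returns {raw bytes, bytes when every record is compressed alone,
     * bytes when all records are compressed together as one block}.
     */
    static int[] sizes(int recordCount) {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        int perRecord = 0;
        for (int i = 0; i < recordCount; i++) {
            // Synthetic tiny record: a short key and a short value.
            byte[] record = ("k" + i + "\t" + (i * 31 % 97)).getBytes(StandardCharsets.UTF_8);
            perRecord += deflatedSize(record);
            all.write(record, 0, record.length);
        }
        return new int[] { all.size(), perRecord, deflatedSize(all.toByteArray()) };
    }

    public static void main(String[] args) {
        int[] s = sizes(1000);
        System.out.println("raw bytes:            " + s[0]);
        System.out.println("compressed per record: " + s[1]);
        System.out.println("compressed as a block: " + s[2]);
    }
}
```

Each individually compressed record pays the fixed zlib header and checksum overhead, which exceeds anything saved on a record of only a few bytes. Compressing the records together amortises that overhead, at the cost of having to decompress the whole block to reach one record.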

I'd like to make a modified version of the BlockCompressWriter that decides when to end a block based on properties of the key being appended.
There's currently no easy way to add this without modifying SequenceFile directly.
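
To make the idea concrete, here is a minimal standalone sketch of a writer whose block boundaries are driven by the key. It does not use or extend SequenceFile; the class KeyBoundaryBlockWriter, its methods, the tab-separated record layout, and the "first character of the key changes" policy are all hypothetical, chosen only to show the shape of a pluggable boundary decision.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.zip.Deflater;

/** Standalone sketch; not Hadoop code. All names are hypothetical. */
public class KeyBoundaryBlockWriter {
    // Policy: given (previousKey, nextKey), should nextKey start a new block?
    private final BiPredicate<String, String> startsNewBlock;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final List<byte[]> blocks = new ArrayList<>();
    private String previousKey;

    public KeyBoundaryBlockWriter(BiPredicate<String, String> startsNewBlock) {
        this.startsNewBlock = startsNewBlock;
    }

    /** Buffers a record, first closing the current block if the policy asks for it. */
    public void append(String key, String value) {
        if (previousKey != null && startsNewBlock.test(previousKey, key)) {
            flushBlock();
        }
        byte[] record = (key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8);
        pending.write(record, 0, record.length);
        previousKey = key;
    }

    /** Flushes any buffered records as a final block. */
    public void close() {
        flushBlock();
    }

    /** Compressed blocks written so far, in order. */
    public List<byte[]> blocks() {
        return blocks;
    }

    private void flushBlock() {
        if (pending.size() == 0) {
            return;
        }
        blocks.add(deflate(pending.toByteArray()));
        pending.reset();
    }

    private static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Example policy (an assumption): a new block starts when the key's first character changes.
        KeyBoundaryBlockWriter writer = new KeyBoundaryBlockWriter(
                (prev, next) -> prev.charAt(0) != next.charAt(0));
        for (String key : new String[] {"apple", "avocado", "banana", "blueberry", "cherry"}) {
            writer.append(key, "1");
        }
        writer.close();
        System.out.println(writer.blocks().size() + " blocks"); // prints "3 blocks"
    }
}
```

With a policy like this, a reader that knows which key range it wants only has to decompress the one small block covering that range, rather than a fixed-size block of arbitrary records.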

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.