Posted to common-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2011/03/01 05:23:38 UTC

[jira] Commented: (HADOOP-6835) Support concatenated gzip and bzip2 files

    [ https://issues.apache.org/jira/browse/HADOOP-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000704#comment-13000704 ] 

Hudson commented on HADOOP-6835:
--------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #616 (See [https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/616/])
    MAPREDUCE-1927. Unit test for HADOOP-6835 (concatenated gzip support). Contributed by Greg Roelofs.


> Support concatenated gzip and bzip2 files
> -----------------------------------------
>
>                 Key: HADOOP-6835
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6835
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.20.2
>            Reporter: Tom White
>            Assignee: Greg Roelofs
>             Fix For: 0.22.0
>
>         Attachments: C6835-9.patch, HADOOP-6835.v3.yahoo-0.20.2xx-branch.patch, HADOOP-6835.v4.trunk-hadoop-common.patch, HADOOP-6835.v4.trunk-hadoop-mapreduce.patch, HADOOP-6835.v4.yahoo-0.20.2xx-branch.patch, HADOOP-6835.v5.trunk-hadoop-common.patch, HADOOP-6835.v6.trunk-hadoop-common.patch, HADOOP-6835.v7.trunk-hadoop-common.patch, HADOOP-6835.v8.trunk-hadoop-common.patch, HADOOP-6835.v9.yahoo-0.20.2xx-branch.patch, MR-469.v2.yahoo-0.20.2xx-branch.patch, grr-hadoop-common.dif.20100614c, grr-hadoop-mapreduce.dif.20100614c
>
>
> When running MapReduce with concatenated gzip files as input, only the first part is read, which is confusing, to say the least. Concatenated gzip is described in http://www.gnu.org/software/gzip/manual/gzip.html#Advanced-usage and in http://www.ietf.org/rfc/rfc1952.txt. (See original report at http://www.nabble.com/Problem-with-Hadoop-and-concatenated-gzip-files-to21383097.html)
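
The symptom described above can be reproduced outside Hadoop. The following is a minimal sketch in Python, not the Hadoop implementation and not taken from any of the attached patches; the helper names read_first_member_only and read_all_members are hypothetical and exist only for illustration. It assumes nothing beyond the standard zlib and gzip modules. A reader that stops at the end of the first gzip member sees only the first part, while a reader that restarts decoding on the leftover bytes sees everything.

```python
import gzip
import zlib

# Two independently compressed gzip members, joined byte-for-byte.
# This is the same result as `cat a.gz b.gz > both.gz`.
part_a = gzip.compress(b"first part\n")
part_b = gzip.compress(b"second part\n")
concatenated = part_a + part_b

# 16 + MAX_WBITS tells zlib to expect gzip framing (header and trailer).
GZIP_WBITS = 16 + zlib.MAX_WBITS


def read_first_member_only(data: bytes) -> bytes:
    """Hypothetical helper: decode one member and ignore whatever follows."""
    decoder = zlib.decompressobj(GZIP_WBITS)
    return decoder.decompress(data)


def read_all_members(data: bytes) -> bytes:
    """Hypothetical helper: decode every member in a concatenated stream."""
    chunks = []
    while data:
        decoder = zlib.decompressobj(GZIP_WBITS)
        chunks.append(decoder.decompress(data))
        chunks.append(decoder.flush())
        # Bytes after the end of this member belong to the next member.
        data = decoder.unused_data
    return b"".join(chunks)


if __name__ == "__main__":
    print(read_first_member_only(concatenated))
    print(read_all_members(concatenated))
```

The second helper loops on the decoder's unused_data, which holds any bytes that follow the end of the current member. Treating those bytes as the start of a new member is what RFC 1952 allows for and what the gzip tool itself does when decompressing concatenated files.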

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira