Posted to common-issues@hadoop.apache.org by "Muddy Dixon (JIRA)" <ji...@apache.org> on 2012/09/20 07:14:07 UTC
[jira] [Commented] (HADOOP-8449) hadoop fs -text fails with
compressed sequence files with the codec file extension
[ https://issues.apache.org/jira/browse/HADOOP-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459352#comment-13459352 ]
Muddy Dixon commented on HADOOP-8449:
-------------------------------------
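Hi,
We found that the order of the switch statement and the codec guard block has changed in
{code}private InputStream forMagic(Path p, FileSystem srcFs) throws IOException{code}
Because of this change, the result of {code}codec.createInputStream(i){code} differs when a codec exists: in cdh3u5 the stream is handed to the codec only after i.readShort() has already consumed its first two bytes.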
h4. cdh3u3
{code}
private InputStream forMagic(Path p, FileSystem srcFs) throws IOException {
  FSDataInputStream i = srcFs.open(p);
  // check codecs
  CompressionCodecFactory cf = new CompressionCodecFactory(getConf());
  CompressionCodec codec = cf.getCodec(p);
  if (codec != null) {
    return codec.createInputStream(i);
  }
  switch(i.readShort()) {
    // cases
  }
{code}
h4. cdh3u5
{code}
private InputStream forMagic(Path p, FileSystem srcFs) throws IOException {
  FSDataInputStream i = srcFs.open(p);
  switch(i.readShort()) { // <=== advances the stream position by two bytes
    // cases
    default: {
      // Check the type of compression instead, depending on Codec class's
      // own detection methods, based on the provided path.
      CompressionCodecFactory cf = new CompressionCodecFactory(getConf());
      CompressionCodec codec = cf.getCodec(p);
      if (codec != null) {
        // i is already two bytes past the start of the file here
        return codec.createInputStream(i);
      }
      break;
    }
  }
  // File is non-compressed, or not a file container we know.
  i.seek(0);
  return i;
}
{code}
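The effect can be reproduced without Hadoop. The following is a minimal sketch, not the Hadoop code: it uses java.util.zip in place of a CompressionCodec and a ByteArrayInputStream in place of FSDataInputStream, and the class and method names (MagicReadDemo, gzip, decode) are hypothetical. It reads the two magic bytes first, as the cdh3u5 version does, and then either rewinds or does not rewind before wrapping the stream in the decompressor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-alone demo; java.util.zip stands in for a Hadoop codec.
final class MagicReadDemo {

    /** Compresses the given text with gzip and returns the raw bytes. */
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    /**
     * Reads the magic short first, then decompresses. When rewind is false
     * the decompressor starts two bytes into the gzip header and fails.
     */
    public static String decode(byte[] data, boolean rewind) throws IOException {
        ByteArrayInputStream raw = new ByteArrayInputStream(data);
        new DataInputStream(raw).readShort(); // consumes the first two bytes
        if (rewind) {
            raw.reset(); // back to position 0, the analogue of i.seek(0)
        }
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("hello");
        System.out.println("with rewind: " + decode(data, true));
        try {
            decode(data, false);
        } catch (IOException e) {
            System.out.println("without rewind: " + e.getMessage());
        }
    }
}
```

If the same holds for the Hadoop codecs, seeking back to 0 before calling codec.createInputStream(i) in the default branch would restore the cdh3u3 behaviour for plain compressed files.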
> hadoop fs -text fails with compressed sequence files with the codec file extension
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-8449
> URL: https://issues.apache.org/jira/browse/HADOOP-8449
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 1.0.3, 2.0.0-alpha
> Reporter: Joey Echeverria
> Assignee: Harsh J
> Priority: Minor
> Fix For: 2.0.2-alpha
>
> Attachments: HADOOP-8449.patch, HADOOP-8449.patch
>
>
> When the -text command is run on a file whose name ends in the default extension for a codec (e.g. snappy or gz), but which is actually a compressed sequence file, the command will fail.
> The issue is that it assumes that if the file matches the extension, then it's a plain compressed file. It might be more helpful to check if it's a sequence file first, and then check the file extension second.
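The detection order suggested in the description can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: ContainerSniffer, Kind and detect are hypothetical names, the hard-coded ".gz" and ".snappy" suffixes stand in for the lookup that CompressionCodecFactory.getCodec(p) performs, and the input stream is assumed to support mark/reset. It checks for the "SEQ" magic of a sequence file first, rewinds, and only then falls back to the file extension.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: magic bytes first, file extension second.
final class ContainerSniffer {

    enum Kind { SEQUENCE_FILE, COMPRESSED_BY_EXTENSION, PLAIN }

    /** Assumes in.markSupported() is true; leaves the stream at its start. */
    public static Kind detect(String fileName, InputStream in) throws IOException {
        in.mark(3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // file shorter than the magic
            }
            n += r;
        }
        in.reset(); // rewind so the caller sees the file from byte 0

        // 1. Sequence files start with the bytes 'S', 'E', 'Q'.
        if (n == 3 && head[0] == 'S' && head[1] == 'E' && head[2] == 'Q') {
            return Kind.SEQUENCE_FILE;
        }
        // 2. Only then trust the extension (stand-in for the codec factory).
        if (fileName.endsWith(".gz") || fileName.endsWith(".snappy")) {
            return Kind.COMPRESSED_BY_EXTENSION;
        }
        return Kind.PLAIN;
    }

    public static void main(String[] args) throws IOException {
        byte[] seq = "SEQ-rest-of-header".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect("part-00000.snappy", new ByteArrayInputStream(seq)));
    }
}
```

With this order, a sequence file named part-00000.snappy is classified as a sequence file rather than as a plain snappy stream, which is the case the description reports as failing.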
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira