Posted to common-issues@hadoop.apache.org by "Slavik Krassovsky (JIRA)" <ji...@apache.org> on 2012/10/09 00:14:02 UTC

[jira] [Commented] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

    [ https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471908#comment-13471908 ] 

Slavik Krassovsky commented on HADOOP-8900:
-------------------------------------------




It’s a quirk of Java that an int-typed mask stays an int even when applied to a long value: 0xffffffff is the int -1, which sign-extends to 0xffffffffffffffffL, so the mask has no effect. Legitimate per the language spec, yet easy to trip over.

/**
 * Repro for the HADOOP-8900 gzip issue: an int mask vs. a long mask on a long value.
 * @author viatk
 */
public class Repro {

    public static void main(String[] args) {

        long smallLongValue = 665615408L;  // 0x027AC7C30
        long largeLongValue = 9255550000L; // 0x227AC7C30
        // 0xffffffff is the int -1; it sign-extends to all ones, so the mask is a no-op
        long largeValueWithIntMask = (largeLongValue & 0xffffffff);
        // 0xffffffffL keeps only the low 32 bits, as intended
        long largeValueWithLongMask = (largeLongValue & 0xffffffffL);

        System.out.println("smallLongValue= " + smallLongValue);
        System.out.println("largeLongValue= " + largeLongValue);
        System.out.println("largeValueWithIntMask =" + largeValueWithIntMask);
        System.out.println("largeValueWithLongMask =" + largeValueWithLongMask);
        System.out.println();

        if (largeValueWithIntMask != largeValueWithLongMask) {
            System.out.println("Here is your repro - largeValueWithIntMask != largeValueWithLongMask");
        }
        if (smallLongValue != largeValueWithIntMask) {
            System.out.println("Thus smallLongValue != largeValueWithIntMask");
        }
        if (smallLongValue == largeValueWithLongMask) {
            System.out.println("The fix is to compare long values using long masks.");
        }
    }
}

smallLongValue= 665615408
largeLongValue= 9255550000
largeValueWithIntMask =9255550000
largeValueWithLongMask =665615408

Here is your repro - largeValueWithIntMask != largeValueWithLongMask
Thus smallLongValue != largeValueWithIntMask
The fix is to compare long values using long masks.




Chuan Liu added a comment - 24/Aug/12 11:32 AM - edited 


+1
We found this bug while working with an internal customer.
The bug exists on Linux as well.
The root cause is that we are comparing long values with an int mask.
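The corrected comparison can be sketched as follows (a hypothetical helper for illustration, not the actual Hadoop patch; gzip's ISIZE trailer field stores the uncompressed size modulo 2^32, so both sides must be reduced with a long mask before comparing):

```java
public class GzipSizeCheck {

    /**
     * Gzip's ISIZE trailer field holds the uncompressed size modulo 2^32.
     * Both operands must be masked with the long literal 0xffffffffL;
     * the int literal 0xffffffff is -1 and sign-extends, masking nothing.
     */
    static boolean sizesMatch(long decompressedBytes, long storedISize) {
        return (decompressedBytes & 0xffffffffL) == (storedISize & 0xffffffffL);
    }

    public static void main(String[] args) {
        // 9255550000 mod 2^32 == 665615408, so the sizes should match.
        System.out.println(sizesMatch(9255550000L, 665615408L)); // prints true
    }
}
```

With the int mask, the same comparison would wrongly report a mismatch for any stream larger than 4 GiB, which is exactly the failure seen with the large GZIP file below.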

                
> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8900
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: Encountered failure when processing large GZIP file
>            Reporter: Slavik Krassovsky
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  java.io.IOException: IO error in map input file hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed size
>  at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira