Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/09/05 20:51:23 UTC

[jira] Commented: (HADOOP-502) Summer buffer overflow exception

    [ http://issues.apache.org/jira/browse/HADOOP-502?page=comments#action_12432651 ] 
            
Doug Cutting commented on HADOOP-502:
-------------------------------------

To be clear, currently we ignore errors processing checksums (checksum file not found, file too short, timeouts while reading, etc.) so that the checksum system only throws user-visible exceptions when data is known to be corrupt.  You're proposing we change this so that, if the checksum file is there, we may throw user-visible exceptions for errors processing the checksum data (like an unexpected EOF).  Is that right, or are you proposing something else?
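
For readers without the 0.5 source handy, here is a minimal sketch of the two policies being contrasted.  The names and structure are hypothetical; the real logic lives in FSDataInputStream.Checker:

    import java.io.FileNotFoundException;
    import java.io.IOException;

    class ChecksumPolicySketch {
        private boolean summing = true;

        /** Current behavior: any trouble on the checksum side channel
         *  (missing file, short file, read timeout, unexpected EOF)
         *  just disables verification; only genuine data corruption
         *  ever surfaces to the user. */
        void readChecksumCurrent(ChecksumSource sums) {
            try {
                sums.nextChecksum();
            } catch (IOException ignored) {
                summing = false;          // silently stop verifying
            }
        }

        /** Proposed behavior: a missing checksum file still disables
         *  verification, but once the file exists, errors reading it
         *  (e.g. EOF from a truncated file) propagate to the caller. */
        void readChecksumProposed(ChecksumSource sums) throws IOException {
            try {
                sums.nextChecksum();
            } catch (FileNotFoundException fnf) {
                summing = false;          // no checksums at all: skip checking
            }                             // EOFException now propagates
        }

        interface ChecksumSource {
            void nextChecksum() throws IOException;
        }
    }

Under the proposed variant, a copy like the one in the stack trace below would fail fast with the EOF from the truncated checksum file, rather than crashing later inside CRC32.update.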

> Summer buffer overflow exception
> --------------------------------
>
>                 Key: HADOOP-502
>                 URL: http://issues.apache.org/jira/browse/HADOOP-502
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.5.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.6.0
>
>
> The extended error message with the offending values finally paid off, and I was able to capture the values that were causing the Summer buffer overflow exception:
> java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880, bytesPerSum=1, inSum=512
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         at java.io.DataInputStream.read(DataInputStream.java:80)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>         at java.util.zip.CRC32.update(CRC32.java:43)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
>         ... 9 more
> Tracing through the code, what happens is that inside FSDataInputStream.Checker.read(), verifySum gets an EOFException and turns off the summing. Among other things, this sets bytesPerSum to 1. Unfortunately, that leads to the ArrayIndexOutOfBoundsException (a standalone reproduction is sketched below).
> I think the problem is that the original EOFException was logged and ignored. I propose that we allow the original EOFException to propagate back to the caller. (So a missing checksum file will still disable checksum checking, but we will detect truncated checksum files.)
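
The "Caused by" frame above is reproducible in isolation: java.util.zip.CRC32.update(byte[], int, int) throws ArrayIndexOutOfBoundsException whenever off + len runs past the end of the buffer. A minimal standalone demo follows; the buffer sizing is hypothetical, and the point is only that a stale byte count paired with a shrunken sum window triggers exactly this exception:

    import java.util.zip.CRC32;

    public class Crc32OverflowDemo {
        public static void main(String[] args) {
            CRC32 sum = new CRC32();
            byte[] window = new byte[1];  // hypothetical: the tiny window left
                                          // once bytesPerSum is forced to 1
            sum.update(window, 0, 512);   // asks for inSum=512 bytes -> AIOOBE,
                                          // matching the CRC32.update frame
        }
    }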

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira