Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2006/09/01 20:06:23 UTC

[jira] Created: (HADOOP-502) Summer buffer overflow exception

Summer buffer overflow exception
--------------------------------

                 Key: HADOOP-502
                 URL: http://issues.apache.org/jira/browse/HADOOP-502
             Project: Hadoop
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.5.0
            Reporter: Owen O'Malley
         Assigned To: Owen O'Malley
             Fix For: 0.6.0


The extended error message with the offending values finally paid off, and I was able to capture the values that were causing the Summer buffer overflow exception.

java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880, bytesPerSum=1, inSum=512
        at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
        at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        at java.io.DataInputStream.read(DataInputStream.java:80)
        at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
        at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
Caused by: java.lang.ArrayIndexOutOfBoundsException
        at java.util.zip.CRC32.update(CRC32.java:43)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
        ... 9 more

Tracing through the code, what happens is that inside FSDataInputStream.Checker.read(), verifySum gets an EOFException and turns off summing. Among other things, this sets bytesPerSum to 1. Unfortunately, that leads to the ArrayIndexOutOfBoundsException.
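
For illustration, here is a minimal, self-contained Java sketch of that failure mode. It is not the real Hadoop source; the field names (bytesPerSum, inSum) are taken from the exception message only, sumBuffer is a hypothetical stand-in, and verifySum is stubbed to fail:

    import java.io.EOFException;
    import java.util.zip.CRC32;

    // Hypothetical, stripped-down model of the state described above.  The field
    // names (bytesPerSum, inSum) come from the exception message, not from the
    // actual FSDataInputStream.Checker source.
    public class SummerOverflowSketch {

      static int bytesPerSum = 512;              // bytes covered by one checksum
      static int inSum = 512;                    // bytes already summed in this chunk
      static byte[] sumBuffer = new byte[512];   // scratch buffer sized for the old chunk

      static void verifySum() throws EOFException {
        throw new EOFException("checksum file is truncated");   // stand-in for the real check
      }

      static void read(byte[] b) {
        try {
          verifySum();
        } catch (EOFException e) {
          // The problematic path: the EOF is logged and swallowed, and summing is
          // "disabled" by dropping bytesPerSum to 1, but inSum is not reset.
          System.err.println("ignoring " + e);
          bytesPerSum = 1;
        }
        CRC32 crc = new CRC32();
        // With bytesPerSum == 1 and inSum == 512, the offset no longer fits the
        // buffer, and CRC32.update throws ArrayIndexOutOfBoundsException.
        crc.update(sumBuffer, inSum, bytesPerSum);
      }

      public static void main(String[] args) {
        read(new byte[4096]);
      }
    }

Running this throws ArrayIndexOutOfBoundsException out of CRC32.update, matching the trace above, because inSum still reflects the old chunk size after bytesPerSum is shrunk.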

I think the problem is that the original EOFException was logged and ignored. I propose that we allow the original EOFException to propagate back to the caller. (That way, a missing checksum file will still disable checksum checking, but we will detect truncated checksum files.)
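
A rough sketch of the proposed behavior (again hypothetical; the method names below are illustrative and do not match the real FSDataInputStream.Checker API): catch FileNotFoundException so that a missing checksum file still just disables checking, but let EOFException escape to the caller:

    import java.io.EOFException;
    import java.io.FileNotFoundException;
    import java.io.IOException;

    // Sketch of the proposed policy only.
    public class ChecksumPolicySketch {

      private boolean summingEnabled = true;

      void openSums(String checksumFile) throws IOException {
        try {
          openChecksumStream(checksumFile);
        } catch (FileNotFoundException e) {
          // No checksum file at all: keep the current behavior and stop checking.
          summingEnabled = false;
        }
        // EOFException is deliberately not caught here, so a truncated checksum
        // file propagates back to the caller instead of silently disabling the
        // checker and leaving its internal state inconsistent.
      }

      private void openChecksumStream(String file) throws IOException {
        // Stand-in for reading the header of the checksum file.
        throw new EOFException("truncated checksum file: " + file);
      }

      public static void main(String[] args) {
        try {
          new ChecksumPolicySketch().openSums("part-00000.crc");   // hypothetical file name
        } catch (IOException e) {
          System.err.println("caller now sees: " + e);
        }
      }
    }

With this split, a missing .crc file behaves exactly as before, while a truncated one surfaces as an IOException at the read site instead of a later buffer overflow.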


[jira] Commented: (HADOOP-502) Summer buffer overflow exception

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-502?page=comments#action_12432651 ] 
            
Doug Cutting commented on HADOOP-502:
-------------------------------------

To be clear, currently we ignore errors processing checksums (checksum file not found, too short, timeouts while reading, etc.) so that the checksum system only throws user-visible exceptions when data is known to be corrupt. You're proposing we change this so that, if the checksum file is there, then we may throw user-visible exceptions for errors processing the checksum data (like an unexpected EOF). Is that right, or are you proposing something else?

