Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2006/09/14 18:39:22 UTC

[jira] Created: (HADOOP-532) Writable underrun in sort example

Writable underrun in sort example
---------------------------------

                 Key: HADOOP-532
                 URL: http://issues.apache.org/jira/browse/HADOOP-532
             Project: Hadoop
          Issue Type: Bug
          Components: io
    Affects Versions: 0.6.1
            Reporter: Owen O'Malley
         Assigned To: Owen O'Malley
             Fix For: 0.6.1


When running the sort benchmark, I get consistent failures of this sort:

java.lang.RuntimeException: java.io.IOException: org.apache.hadoop.io.BytesWritable@43d748ad read 2048 bytes, should read 2052
    at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:150)
    at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:39)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:271)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)
Caused by: java.io.IOException: org.apache.hadoop.io.BytesWritable@43d748ad read 2048 bytes, should read 2052
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1163)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1239)
    at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.getNext(ReduceTask.java:181)
    at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:147)
    ... 3 more

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (HADOOP-532) Writable underrun in sort example

Posted by "Bryan Pendleton (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-532?page=comments#action_12434776 ] 
            
Bryan Pendleton commented on HADOOP-532:
----------------------------------------

I had trouble reading old compressed SequenceFiles using the new block-compressing code, with similar kinds of problems. I haven't been able to characterize why yet, so I don't have anything useful to add to this bug except "I've seen this general sort of thing, too!" In my case, it's with a class that's similar to BytesWritable.


        

[jira] Updated: (HADOOP-532) Writable underrun in sort example

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-532?page=all ]

Doug Cutting updated HADOOP-532:
--------------------------------

    Fix Version/s: 0.6.2
                       (was: 0.6.1)


        

[jira] Updated: (HADOOP-532) Writable underrun in sort example

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-532?page=all ]

Owen O'Malley updated HADOOP-532:
---------------------------------

    Status: Patch Available  (was: Open)


        

[jira] Updated: (HADOOP-532) Writable underrun in sort example

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-532?page=all ]

Doug Cutting updated HADOOP-532:
--------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

I just committed this.  Thanks, Owen!


        

[jira] Updated: (HADOOP-532) Writable underrun in sort example

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-532?page=all ]

Owen O'Malley updated HADOOP-532:
---------------------------------

    Attachment: seqfile-underread-check.patch

The compression codec is not reading the entire value buffer, but it is getting the correct value. (I suspect the unread bytes are a CRC.) The error message comes from SequenceFile, which complains that the entire buffer was not consumed.

This patch:
  1. extends the unit test to use bigger values so that we detect the problem
  2. allows the user of the org.apache.hadoop.io.TestSequenceFile main program to control the random seed (and prints out the seed value, even if it is random).
  3. checks that the stream is done by trying to read the next byte on the input stream.
  4. removes some redundant buffering of the already buffered value stream.
  5. marks the start of the value in non-block compressed sequence files and does a reset at the front of getCurrentValue.
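
The check described in item 3 can be illustrated outside Hadoop. What follows is only a minimal sketch, not the code from seqfile-underread-check.patch. It assumes the JDK's java.util.zip streams as a stand-in for the Hadoop compression codec, and the names UnderreadCheck, compress, readValue and consumesWholeValue are hypothetical. The idea is that a reader which knows how many bytes a value should hold confirms it consumed everything by attempting one more read and expecting end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class UnderreadCheck {

    // Hypothetical helper: compress a byte array with the JDK's zlib stream.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads `length` bytes of value data, then checks that the stream is done
    // by trying to read one more byte; read() returns -1 only at end of stream.
    static byte[] readValue(InputStream in, int length) throws IOException {
        byte[] value = new byte[length];
        new DataInputStream(in).readFully(value);
        if (in.read() != -1) {
            throw new IOException(
                "value underrun: data left after reading " + length + " bytes");
        }
        return value;
    }

    // Returns true if reading `length` bytes consumes the whole compressed value.
    static boolean consumesWholeValue(byte[] compressed, int length) {
        InputStream in =
            new InflaterInputStream(new ByteArrayInputStream(compressed));
        try {
            readValue(in, length);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] raw = new byte[2048];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) (i * 31);
        }
        byte[] compressed = compress(raw);
        System.out.println(consumesWholeValue(compressed, 2048)); // true
        System.out.println(consumesWholeValue(compressed, 2044)); // false
    }
}
```

In this sketch, a reader that stops four bytes short of the value leaves data on the stream, so the extra read() returns a byte instead of -1 and the check fails. Probing the decompressed stream for its end, rather than counting bytes of the raw buffer, is one way to confirm that a value was fully consumed even when the codec leaves trailing bytes such as a checksum unread.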
