Posted to common-dev@hadoop.apache.org by "Big Jules (JIRA)" <ji...@apache.org> on 2007/06/01 04:44:15 UTC
[jira] Commented: (HADOOP-573) Checksum error during sorting in reducer
[ https://issues.apache.org/jira/browse/HADOOP-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500558 ]
Big Jules commented on HADOOP-573:
----------------------------------
Apologies again, I didn't see Dennis' reply. I will try to convince management to replace our memory with ECC memory. I am running tests to see whether we should invest in a cluster and move some of our recommendation system algorithms over to Hadoop.
> Checksum error during sorting in reducer
> ----------------------------------------
>
> Key: HADOOP-573
> URL: https://issues.apache.org/jira/browse/HADOOP-573
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Owen O'Malley
>
> Many reduce tasks were killed due to checksum errors. The strange thing is that the file was generated by the sort function and was on a local disk. Here is the stack trace:
> Checksum error: ../task_0011_r_000140_0/all.2.1 at 5342920704
> at org.apache.hadoop.fs.FSDataInputStream$Checker.verifySum(FSDataInputStream.java:134)
> at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:110)
> at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
> at java.io.DataInputStream.readFully(DataInputStream.java:176)
> at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
> at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
> at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1061)
> at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1126)
> at org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:1354)
> at org.apache.hadoop.io.SequenceFile$Sorter$MergeStream.next(SequenceFile.java:1880)
> at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:1938)
> at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:1802)
> at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:1749)
> at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:1494)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:240)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.