Posted to common-user@hadoop.apache.org by himanshu chandola <hi...@yahoo.com> on 2010/02/16 22:32:48 UTC
fs errors in reduce
Hi,
I'm struggling with an error while running Hadoop and haven't been able to find a solution. At the end of the map phase, all the reducers get stuck at 0% and fail. Some fail with this message:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_201002151946_0001_r_000001_2/intermediate.13
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:313)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
...
Reducers on some of the nodes fail with this error instead:
java.io.IOException: All datanodes 10.42.255.203:50010 are bad. Aborting...
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2168)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)
...
For the first error, I checked whether the 'attempt_*' directory existed. It did, but the file intermediate.13 did not (the intermediate files only went up to intermediate.12).
I also checked the fs health and it looks good.
I also tried restarting Hadoop, and it restarts without any errors. The nodes have sufficient free space, so that couldn't be the problem either.
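For completeness, this is roughly how I verified free space on each node — a minimal sketch, where the directory list is a placeholder for the actual mapred.local.dir value from my hadoop-site.xml:

```shell
#!/bin/sh
# Placeholder for the comma-separated mapred.local.dir value;
# substitute the real paths configured on each tasktracker.
LOCAL_DIRS="/tmp"

for d in $(echo "$LOCAL_DIRS" | tr ',' ' '); do
  echo "== $d =="
  df -h "$d"   # free blocks
  df -i "$d"   # free inodes; an exhausted inode table also fails writes
done
```

Both checks came back with plenty of headroom on every node.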
Please give me some suggestions if you have any ideas.
Thanks
H
Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.