Posted to common-user@hadoop.apache.org by himanshu chandola <hi...@yahoo.com> on 2010/02/18 21:54:42 UTC
reduce errors
Hi ,
I posted this earlier but didn't get a reply. I'm still stuck on this problem, so if anyone has ideas, please share them. It would help me a ton.
Thanks a lot
H
----
I'm struggling with an error while running Hadoop and haven't been able to find a solution. All the reduce tasks get stuck and fail at 0%. Some fail with this message:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory
for attempt_201002151946_0001_r_000001_2/intermediate.13
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:313)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
...
Reducers on some of the nodes give this error on failing:
java.io.IOException: All datanodes 10.42.255.203:50010 are bad. Aborting...
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2168)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)
...
For the first error, I checked whether the 'attempt_*' directory existed. It did, but the file intermediate.13 did not (the intermediate files only went up to intermediate.12).
I also checked the fs health and it looks good.
I also tried restarting Hadoop, and it restarts without any errors. The nodes have sufficient free space, so that shouldn't be the problem either.
Please give me some suggestions if you have any ideas.
Thanks
Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.
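[Editorial note for readers of the archive: the first stack trace comes from Hadoop's LocalDirAllocator, which walks the directories configured in mapred.local.dir and throws DiskChecker$DiskErrorException when none of them can accept the write. The sketch below is a simplified illustration of that kind of check, not Hadoop's actual code; the class and method names are invented.]

```java
import java.io.File;
import java.util.List;

public class LocalDirCheck {
    // Return the first configured local dir that exists, is writable,
    // and has at least the requested free space. Returning null mimics
    // the case where Hadoop throws DiskErrorException: no dir qualifies.
    static File pickLocalDir(List<File> dirs, long bytesNeeded) {
        for (File d : dirs) {
            if (d.isDirectory() && d.canWrite() && d.getUsableSpace() >= bytesNeeded) {
                return d;
            }
        }
        return null; // "Could not find any valid local directory"
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        File bogus = new File("/nonexistent-mapred-local");
        // A writable dir with space is picked over a missing one...
        System.out.println(pickLocalDir(List.of(bogus, tmp), 1024L));
        // ...but an impossible space request leaves no valid dir.
        System.out.println(pickLocalDir(List.of(bogus, tmp), Long.MAX_VALUE));
    }
}
```

The practical upshot: the error fires when every entry in mapred.local.dir is missing, unwritable, or out of space at the moment of the write, which is why the replies below focus on that setting.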
Re: reduce errors
Posted by himanshu chandola <hi...@yahoo.com>.
Yes, that's a valid directory with enough spare space.
----- Original Message ----
From: Utkarsh Agarwal <un...@gmail.com>
To: common-user@hadoop.apache.org
Sent: Thu, February 18, 2010 4:14:40 PM
Subject: Re: reduce errors
Can you check the mapred-site.xml for

<property>
  <name>mapred.local.dir</name>
  <value>Give a proper dir HERE</value>
</property>
On Thu, Feb 18, 2010 at 1:54 PM, himanshu chandola <himanshu_coolguy@yahoo.com> wrote:
> [original message quoted above; snipped]
Re: reduce errors
Posted by Utkarsh Agarwal <un...@gmail.com>.
Can you check the mapred-site.xml for

<property>
  <name>mapred.local.dir</name>
  <value>Give a proper dir HERE</value>
</property>
On Thu, Feb 18, 2010 at 1:54 PM, himanshu chandola <himanshu_coolguy@yahoo.com> wrote:
> [original message quoted above; snipped]
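[Editorial note for readers of the archive: a common resolution for the DiskErrorException discussed above is to point mapred.local.dir at one or more existing, writable directories with ample space, ideally one per physical disk. A sketch of such a configuration; the paths are illustrative, not taken from this thread:]

```xml
<!-- mapred-site.xml: mapred.local.dir takes a comma-separated list of
     local directories; intermediate data is spread across them, and
     directories that fill up or fail health checks are skipped -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>
```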