Posted to common-user@hadoop.apache.org by André Martin <ma...@andremartin.de> on 2008/05/16 15:02:58 UTC

java.io.IOException: Could not obtain block / java.io.IOException: Could not get block locations

Hi Hadoopers,
we are experiencing a lot of "Could not obtain block / Could not get 
block locations" IOExceptions when processing a 400 GB Map/Reduce job 
on our 6-node DFS & MapRed (v0.16.4) cluster. Each node is equipped 
with a 400 GB SATA HDD and runs SUSE Linux Enterprise Edition. While 
processing this "huge" MapRed job, the namenode doesn't seem to 
receive heartbeats from the datanodes for up to a couple of minutes 
and thus marks those nodes as dead, even though they are still alive 
and serving blocks according to their logs. We first suspected network 
congestion and measured the inter-node bandwidth using scp, getting 
throughputs of 30 MB/s. CPU utilization is about 100% while the job is 
running; however, should the tasktracker instances really cause such 
datanode drop-outs?
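In case it helps others diagnose the same thing, the intervals in play are configurable in hadoop-site.xml. A sketch, assuming the 0.16-era property names and defaults (please double-check against your own hadoop-default.xml):

```xml
<!-- hadoop-site.xml: heartbeat-related settings (0.16-era names assumed) -->
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value> <!-- seconds between datanode heartbeats -->
</property>
<property>
  <name>heartbeat.recheck.interval</name>
  <value>300000</value> <!-- ms between namenode checks for dead datanodes -->
</property>
```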
In the datanode logs, we see a lot of "java.io.IOException: Block 
blk_-7943096461180653598 is valid, and cannot be written to." errors...
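To get a feel for how often this hits and which blocks are affected, a quick sketch of a log scan (the log line format here is assumed from the message quoted above; any scripting would do):

```python
import re

# Matches the datanode error quoted above; the exact log layout is assumed.
PATTERN = re.compile(r"Block (blk_-?\d+) is valid, and cannot be written to")

def count_block_errors(log_text):
    """Return a dict mapping block id -> number of occurrences in the log."""
    counts = {}
    for match in PATTERN.finditer(log_text):
        block_id = match.group(1)
        counts[block_id] = counts.get(block_id, 0) + 1
    return counts

sample = (
    "java.io.IOException: Block blk_-7943096461180653598 is valid, "
    "and cannot be written to.\n"
    "java.io.IOException: Block blk_-7943096461180653598 is valid, "
    "and cannot be written to.\n"
)
print(count_block_errors(sample))  # {'blk_-7943096461180653598': 2}
```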

Any ideas? Thanks in advance.

Cu on the 'net,
                        Bye - bye,

                                   <<<<< André <<<< >>>> èrbnA >>>>>



Re: java.io.IOException: Could not obtain block / java.io.IOException: Could not get block locations

Posted by André Martin <ma...@andremartin.de>.
Hi dhruba,
we are running the latest Sun Java 6u10-beta, and the namenode runs with 
25 threads on a quad-core machine.
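(For anyone searching the archives later: the namenode RPC handler pool size is set in hadoop-site.xml; the property name below is our understanding of the 0.16-era config, where I believe the default is 10.)

```xml
<property>
  <name>dfs.namenode.handler.count</name>
  <value>25</value> <!-- number of namenode RPC handler threads -->
</property>
```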

Cu on the 'net,
                      Bye - bye,

                                 <<<<< André <<<< >>>> èrbnA >>>>>


Dhruba Borthakur wrote:
> What version of Java are you using? How many threads are you running on
> the namenode? How many cores do your machines have?
>
> thanks,
> dhruba
>
> On Fri, May 16, 2008 at 6:02 AM, André Martin <ma...@andremartin.de> wrote:
>   
>> Hi Hadoopers,
>> we are experiencing a lot of "Could not obtain block / Could not get block
>> locations" IOExceptions when processing a 400 GB Map/Reduce job on our
>> 6-node DFS & MapRed (v0.16.4) cluster. Each node is equipped with a 400 GB
>> SATA HDD and runs SUSE Linux Enterprise Edition. While processing this
>> "huge" MapRed job, the namenode doesn't seem to receive heartbeats from
>> the datanodes for up to a couple of minutes and thus marks those nodes as
>> dead, even though they are still alive and serving blocks according to
>> their logs. We first suspected network congestion and measured the
>> inter-node bandwidth using scp, getting throughputs of 30 MB/s. CPU
>> utilization is about 100% while the job is running; however, should the
>> tasktracker instances really cause such datanode drop-outs?
>> In the datanode logs, we see a lot of "java.io.IOException: Block
>> blk_-7943096461180653598 is valid, and cannot be written to." errors...
>>
>> Any ideas? Thanks in advance.
>>
>> Cu on the 'net,
>>                       Bye - bye,
>>
>>                                  <<<<< André <<<< >>>> èrbnA >>>>>
>>
>>
>>
>>     



Re: java.io.IOException: Could not obtain block / java.io.IOException: Could not get block locations

Posted by Dhruba Borthakur <dh...@gmail.com>.
What version of Java are you using? How many threads are you running on
the namenode? How many cores do your machines have?

thanks,
dhruba

On Fri, May 16, 2008 at 6:02 AM, André Martin <ma...@andremartin.de> wrote:
> Hi Hadoopers,
> we are experiencing a lot of "Could not obtain block / Could not get block
> locations" IOExceptions when processing a 400 GB Map/Reduce job on our
> 6-node DFS & MapRed (v0.16.4) cluster. Each node is equipped with a 400 GB
> SATA HDD and runs SUSE Linux Enterprise Edition. While processing this
> "huge" MapRed job, the namenode doesn't seem to receive heartbeats from
> the datanodes for up to a couple of minutes and thus marks those nodes as
> dead, even though they are still alive and serving blocks according to
> their logs. We first suspected network congestion and measured the
> inter-node bandwidth using scp, getting throughputs of 30 MB/s. CPU
> utilization is about 100% while the job is running; however, should the
> tasktracker instances really cause such datanode drop-outs?
> In the datanode logs, we see a lot of "java.io.IOException: Block
> blk_-7943096461180653598 is valid, and cannot be written to." errors...
>
> Any ideas? Thanks in advance.
>
> Cu on the 'net,
>                       Bye - bye,
>
>                                  <<<<< André <<<< >>>> èrbnA >>>>>
>
>
>