Posted to common-user@hadoop.apache.org by Kumar Pandey <ku...@gmail.com> on 2009/01/16 07:02:10 UTC

hadoop 0.19.0 and data node failure

To test Hadoop's fault tolerance I tried the following setup:

nodeA  - namenode and secondary namenode
nodeB  - datanode
nodeC  - datanode

Replication set to 2.
When A, B, and C are running I'm able to make a round trip for a wav file.

Now to test fault tolerance I brought nodeB down and tried to write a file.
The write failed even though nodeC was up and running, with the following
msg. More interestingly, the file was still listed in the namenode, with
size zero. I would have expected Hadoop to write the file to nodeC.

##############error msg###################
[hadoop@cancunvm1 testfiles]$ hadoop fs -copyFromLocal
9979_D4FE01E0-DD119BDE-3000CB83-EB857348.wav
jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav

09/01/16 01:47:09 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException
09/01/16 01:47:09 INFO hdfs.DFSClient: Abandoning block
blk_4025795281260753088_1216
09/01/16 01:47:09 INFO hdfs.DFSClient: Waiting to find target node:
10.0.3.136:50010
09/01/16 01:47:18 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:18 INFO hdfs.DFSClient: Abandoning block
blk_-2076345051085316536_1216
09/01/16 01:47:27 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:27 INFO hdfs.DFSClient: Abandoning block
blk_2666380449580768625_1216
09/01/16 01:47:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:36 INFO hdfs.DFSClient: Abandoning block
blk_742770163755453348_1216
09/01/16 01:47:42 WARN hdfs.DFSClient: DataStreamer Exception:
java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)

09/01/16 01:47:42 WARN hdfs.DFSClient: Error Recovery for block
blk_742770163755453348_1216 bad datanode[0] nodes == null
09/01/16 01:47:42 WARN hdfs.DFSClient: Could not get block locations.
Aborting...
copyFromLocal: No route to host
Exception closing file
/user/hadoop/jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
        at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
        at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
        at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)

Re: hadoop 0.19.0 and data node failure

Posted by Kumar Pandey <ku...@gmail.com>.
Played with it a bit more and made a few observations which I thought I'd
share:

1) If replication is set to 2 and you have a minimum of 2 datanodes
running, then other datanodes going down doesn't affect writes or reads.
2) If the number of live datanodes is less than the replication factor,
the write still goes through, provided the namenode has recognized that
the missing datanodes are dead. That can take up to 10 minutes by default,
but can be tweaked with heartbeat.recheck.interval; see the sketch below.
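
For anyone who wants to shorten that window, a minimal hadoop-site.xml
sketch. The 15000 ms value here is just an illustration, not a
recommendation; if I recall correctly the 0.19-era default is 300000 ms
(5 minutes), and the actual dead-node timeout is derived from it together
with dfs.heartbeat.interval:

  <property>
    <name>heartbeat.recheck.interval</name>
    <!-- milliseconds; lower values make the namenode declare dead
         datanodes sooner, at the cost of more false positives -->
    <value>15000</value>
  </property>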



On Fri, Jan 16, 2009 at 6:56 AM, Kumar Pandey <ku...@gmail.com> wrote:

> Thanks Brian. I'll try with 3 datanodes, bringing one down, with
> replication set to 2.
> I should probably go ahead and file a bug for the fact that although the
> write failed, the file was listed under the directory listing with size
> zero, and a subsequent write attempt with both nodes up failed with the
> following error:
>
>  Target jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav already
> exists
>
>
>
> On Fri, Jan 16, 2009 at 6:10 AM, Brian Bockelman <bb...@cse.unl.edu> wrote:
>
>> Hey Kumar,
>>
>> Hadoop won't let you write new blocks if it can't write them at the
>> requested replication level.
>>
>> You've requested to write a block with two replicas on a system where
>> there's only one datanode alive.  I'd hope that it wouldn't let you create a
>> new file!
>>
>> Brian



-- 
Kumar Pandey
http://www.linkedin.com/in/kumarpandey

Re: hadoop 0.19.0 and data node failure

Posted by Kumar Pandey <ku...@gmail.com>.
Thanks Brian. I'll try with 3 datanodes, bringing one down, with
replication set to 2.
I should probably go ahead and file a bug for the fact that although the
write failed, the file was listed under the directory listing with size
zero, and a subsequent write attempt with both nodes up failed with the
following error:

 Target jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav already
exists
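
Presumably the stale zero-length entry has to be removed by hand before
the copy can be retried; an untested sketch, using the same paths as
above:

  hadoop fs -rm jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
  hadoop fs -copyFromLocal 9979_D4FE01E0-DD119BDE-3000CB83-EB857348.wav \
      jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav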



On Fri, Jan 16, 2009 at 6:10 AM, Brian Bockelman <bb...@cse.unl.edu> wrote:

> Hey Kumar,
>
> Hadoop won't let you write new blocks if it can't write them at the
> requested replication level.
>
> You've requested to write a block with two replicas on a system where
> there's only one datanode alive.  I'd hope that it wouldn't let you create a
> new file!
>
> Brian
>


-- 
Kumar Pandey
http://www.linkedin.com/in/kumarpandey

Re: hadoop 0.19.0 and data node failure

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Kumar,

Hadoop won't let you write new blocks if it can't write them at the
requested replication level.

You've requested to write a block with two replicas on a system where
there's only one datanode alive.  I'd hope that it wouldn't let you
create a new file!
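
If you really do want the write to succeed with only one datanode alive,
you could ask for a lower replication on that particular copy. An
untested sketch (FsShell should accept the generic -D option here):

  hadoop fs -D dfs.replication=1 -copyFromLocal \
      9979_D4FE01E0-DD119BDE-3000CB83-EB857348.wav \
      jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav

and then raise it again once the second node is back, with something
like "hadoop fs -setrep -w 2 <path>".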

Brian
