Posted to common-user@hadoop.apache.org by Kumar Pandey <ku...@gmail.com> on 2009/01/16 07:02:10 UTC
hadoop 0.19.0 and data node failure
To test hadoop's fault tolerance I tried the following:
nodeA -- namenode and secondary namenode
nodeB -- datanode
nodeC -- datanode
replication set to 2.
When A, B, and C are running I'm able to make a round trip for a wav file.
Now, to test fault tolerance, I brought nodeB down and tried to write a file.
The write failed even though nodeC was up and running, with the following msg.
More interestingly, the file was listed in the namenode with size zero.
I would have expected hadoop to write the file to nodeC.
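For reference, the replication setting above lives in hdfs-site.xml. This is a sketch of the relevant fragment, not copied from my actual config:

```xml
<!-- hdfs-site.xml (sketch): two-datanode cluster, replication factor 2 -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```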
##############error msg###################
[hadoop@cancunvm1 testfiles]$ hadoop fs -copyFromLocal
9979_D4FE01E0-DD119BDE-3000CB83-EB857348.wav
jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
09/01/16 01:47:09 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException
09/01/16 01:47:09 INFO hdfs.DFSClient: Abandoning block
blk_4025795281260753088_1216
09/01/16 01:47:09 INFO hdfs.DFSClient: Waiting to find target node:
10.0.3.136:50010
09/01/16 01:47:18 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:18 INFO hdfs.DFSClient: Abandoning block
blk_-2076345051085316536_1216
09/01/16 01:47:27 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:27 INFO hdfs.DFSClient: Abandoning block
blk_2666380449580768625_1216
09/01/16 01:47:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.NoRouteToHostException: No route to host
09/01/16 01:47:36 INFO hdfs.DFSClient: Abandoning block
blk_742770163755453348_1216
09/01/16 01:47:42 WARN hdfs.DFSClient: DataStreamer Exception:
java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
09/01/16 01:47:42 WARN hdfs.DFSClient: Error Recovery for block
blk_742770163755453348_1216 bad datanode[0] nodes == null
09/01/16 01:47:42 WARN hdfs.DFSClient: Could not get block locations.
Aborting...
copyFromLocal: No route to host
Exception closing file
/user/hadoop/jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
        at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)
Re: hadoop 0.19.0 and data node failure
Posted by Kumar Pandey <ku...@gmail.com>.
Played with it a bit more and made a few observations which I thought I'd
share:
1) If replication is set to 2 and you have a minimum of 2 datanodes running,
then other datanodes going down doesn't affect writes or reads.
2) If the number of live datanodes is less than the replication factor, the
write still goes through, provided the namenode recognizes that the missing
datanodes are dead. This can take up to 10 minutes by default, but can be
tweaked with heartbeat.recheck.interval.
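If I'm reading the namenode code right, a datanode is declared dead after
roughly 2 * heartbeat.recheck.interval + 10 * the heartbeat interval, which
with the defaults (5 minutes and 3 seconds) works out to about 10.5 minutes --
consistent with the ~10 minute delay I saw. A quick sanity check (the formula
is my reading of the source, so treat it as an assumption):

```python
# Sketch: default time before the namenode declares a datanode dead.
# Assumed formula: expiry = 2 * heartbeat.recheck.interval + 10 * heartbeat interval
heartbeat_recheck_interval_ms = 5 * 60 * 1000   # default: 5 minutes
heartbeat_interval_ms = 3 * 1000                # default: 3 seconds

expiry_ms = 2 * heartbeat_recheck_interval_ms + 10 * heartbeat_interval_ms
print(expiry_ms / 1000.0 / 60.0)  # minutes until a silent datanode is marked dead
```

Lowering heartbeat.recheck.interval shortens this window, at the cost of more
aggressive dead-node declarations.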
--
Kumar Pandey
http://www.linkedin.com/in/kumarpandey
Re: hadoop 0.19.0 and data node failure
Posted by Kumar Pandey <ku...@gmail.com>.
Thanks Brian. I'll try with 3 datanodes, bringing down one, with replication
set to 2.
I should probably go ahead and file a bug for the fact that although the write
failed, the file was listed in the directory listing with size zero, and a
subsequent write attempt with both nodes up failed with the following error:
Target jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav already
exists
On Fri, Jan 16, 2009 at 6:10 AM, Brian Bockelman <bb...@cse.unl.edu>wrote:
> Hey Kumar,
>
> Hadoop won't let you write new blocks if it can't write them at the right
> replica level.
>
> You've requested to write a block with two replicas on a system where
> there's only one datanode alive. I'd hope that it wouldn't let you create a
> new file!
>
> Brian
>
>
--
Kumar Pandey
http://www.linkedin.com/in/kumarpandey
Re: hadoop 0.19.0 and data node failure
Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Kumar,
Hadoop won't let you write new blocks if it can't write them at the
right replica level.
You've requested to write a block with two replicas on a system where
there's only one datanode alive. I'd hope that it wouldn't let you
create a new file!
Brian
On Jan 16, 2009, at 12:02 AM, Kumar Pandey wrote:
> To test hadoop's fault tolerence I tried the following node A --
> name node
> and secondaryname node
> nodeB - datanode
> nodeC - datanode
>
> replica set to 2.
> When A, B and C are running I'm able to make a round trip for a wav
> file.
>
> Now to test fault tolerence I brought nodeB down and tried to write
> a file.
> Writing failed even though nodeC was up and running with following
> msg.
> More interestingly the file of size was listed in the name node.
> I would have expected hadoop to write the file to NodeB
>
> ##############error msg###################
> [hadoop@cancunvm1 testfiles]$ hadoop fs -copyFromLocal
> 9979_D4FE01E0-DD119BDE-3000CB83-EB857348.wav
> jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
>
> 09/01/16 01:47:09 INFO hdfs.DFSClient: Exception in
> createBlockOutputStream
> java.net.SocketTimeoutException
> 09/01/16 01:47:09 INFO hdfs.DFSClient: Abandoning block
> blk_4025795281260753088_1216
> 09/01/16 01:47:09 INFO hdfs.DFSClient: Waiting to find target node:
> 10.0.3.136:50010
> 09/01/16 01:47:18 INFO hdfs.DFSClient: Exception in
> createBlockOutputStream
> java.net.NoRouteToHostException: No route to host
> 09/01/16 01:47:18 INFO hdfs.DFSClient: Abandoning block
> blk_-2076345051085316536_1216
> 09/01/16 01:47:27 INFO hdfs.DFSClient: Exception in
> createBlockOutputStream
> java.net.NoRouteToHostException: No route to host
> 09/01/16 01:47:27 INFO hdfs.DFSClient: Abandoning block
> blk_2666380449580768625_1216
> 09/01/16 01:47:36 INFO hdfs.DFSClient: Exception in
> createBlockOutputStream
> java.net.NoRouteToHostException: No route to host
> 09/01/16 01:47:36 INFO hdfs.DFSClient: Abandoning block
> blk_742770163755453348_1216
> 09/01/16 01:47:42 WARN hdfs.DFSClient: DataStreamer Exception:
> java.io.IOException: Unable to create new block.
> at
> org.apache.hadoop.hdfs.DFSClient
> $DFSOutputStream.nextBlockOutputStream(DFSClient.java:2723)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access
> $2000(DFSClient.java:1997)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream
> $DataStreamer.run(DFSClient.java:2183)
>
> 09/01/16 01:47:42 WARN hdfs.DFSClient: Error Recovery for block
> blk_742770163755453348_1216 bad datanode[0] nodes == null
> 09/01/16 01:47:42 WARN hdfs.DFSClient: Could not get block locations.
> Aborting...
> copyFromLocal: No route to host
> Exception closing file
> /user/hadoop/jukebox/9979_D4FE01E0-DD119BDE-3000CB83-EB857348_21.wav
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:
> 198)
> at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:
> 65)
> at
> org.apache.hadoop.hdfs.DFSClient
> $DFSOutputStream.closeInternal(DFSClient.java:3084)
> at
> org.apache.hadoop.hdfs.DFSClient
> $DFSOutputStream.close(DFSClient.java:3053)
> at
> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:
> 942)
> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)