Posted to common-dev@hadoop.apache.org by Samuel Guo <si...@gmail.com> on 2008/05/07 04:16:51 UTC

Replication failover of HDFS

Hi all,

I am reading the Hadoop source code to study the design of the Hadoop
distributed filesystem, and I have some questions about file
replication in HDFS.

I know the replication factor of HDFS is configurable in a configuration
file such as "hadoop-default.xml", and that the default is 3. Hadoop
uses *ReplicationTargetChooser* to try to choose the best datanodes to
store the replicas.
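
As a concrete example, my understanding is that the replication factor
can also be controlled through the public Java API; the calls below are
the ones I believe exist in org.apache.hadoop.conf and
org.apache.hadoop.fs (the file path is just a placeholder):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Override the cluster-wide default that comes from hadoop-default.xml.
            conf.setInt("dfs.replication", 3);

            FileSystem fs = FileSystem.get(conf);
            // Change the replication factor of an existing file to 2.
            fs.setReplication(new Path("/user/samuel/data.txt"), (short) 2);
        }
    }

Please correct me if this is not the intended way to control the
replication factor.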

But the replication factor of a block can drop below the configured
amount (for example, due to an extended datanode outage). I guess there
must be a daemon thread or something similar running in the background
to do *re-replication*: the namenode forces the block to be
re-replicated on the remaining datanodes. However, I cannot find any
source code that does this in the Hadoop source archive. Can anybody
tell me how Hadoop deals with re-replication?
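
To make my question concrete, below is roughly the background logic I
imagine, written as a standalone sketch with made-up names (none of
this is taken from the Hadoop source); I would like to know where the
real equivalent of this lives:

    import java.util.Map;
    import java.util.Set;

    // Hypothetical sketch of a namenode-side re-replication monitor.
    // The names are invented purely to illustrate the behaviour I am
    // asking about.
    class ReplicationMonitorSketch implements Runnable {
        // block id -> datanodes currently holding a live replica (assumed bookkeeping)
        private final Map<Long, Set<String>> liveReplicas;
        private final int targetReplication;   // from dfs.replication, default 3

        ReplicationMonitorSketch(Map<Long, Set<String>> liveReplicas, int targetReplication) {
            this.liveReplicas = liveReplicas;
            this.targetReplication = targetReplication;
        }

        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                for (Map.Entry<Long, Set<String>> e : liveReplicas.entrySet()) {
                    int live = e.getValue().size();
                    if (live < targetReplication) {
                        // Pick a source among the remaining replicas and ask a
                        // new datanode to copy the block.
                        scheduleReplication(e.getKey(), targetReplication - live);
                    }
                }
                try {
                    Thread.sleep(3000);        // periodic scan
                } catch (InterruptedException ie) {
                    return;
                }
            }
        }

        private void scheduleReplication(long blockId, int missing) {
            System.out.println("re-replicate block " + blockId
                    + ", " + missing + " more cop(ies) needed");
        }
    }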

In addition, I am confused about how Hadoop keeps the replicas
consistent. I know Hadoop uses a pipelined write to try to keep the
replicas consistent, but a datanode crash may leave the replicas
inconsistent.
For example, suppose a block has 3 replicas on 3 datanodes: datanode1,
datanode2, and datanode3.
Datanode3 crashes at some point and becomes unavailable.
Before datanode3 recovers, if we want to write the block mentioned
above, what will Hadoop do?
Write only the replicas on datanode1 and datanode2, or wait for
datanode3 to recover? Waiting seems inadvisable; it costs too much.
But if we write only the replicas on datanode1 and datanode2, the
replica on datanode3 becomes inconsistent with them. Does Hadoop deal
with this kind of inconsistency? And if so, where can I find this part
of the source code in the Hadoop source archive?
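
Again, to make the scenario concrete, this is the client-side behaviour
I would expect, sketched with invented names rather than the real HDFS
classes:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch of a degraded pipeline write: if one datanode in
    // the pipeline is down, the client writes to the remaining ones, and the
    // replica on the crashed node has to be repaired later.
    class PipelineWriteSketch {
        static void writeBlock(byte[] data, List<String> pipeline) {
            List<String> alive = new ArrayList<String>();
            for (String datanode : pipeline) {
                if (isReachable(datanode)) {
                    alive.add(datanode);
                    send(datanode, data);   // forward the packet down the pipeline
                } else {
                    System.out.println(datanode + " is down; continuing with the rest");
                }
            }
            // Only 'alive' now holds the new data; the stale replica on the
            // crashed datanode is inconsistent -- which is exactly my question.
        }

        static boolean isReachable(String datanode) { return !datanode.equals("datanode3"); }
        static void send(String datanode, byte[] data) { /* network write elided */ }
    }

Is this roughly what happens, and if so, how is the stale replica
detected and repaired afterwards?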

I look forward to your reply.

regards,

Samuel