Posted to common-user@hadoop.apache.org by Foss User <fo...@gmail.com> on 2009/05/07 17:51:42 UTC

Is "/room1" in the rack name "/room1/rack1" significant during replication?

I have written a rack awareness script which maps IP addresses to
rack names as follows:

10.31.1.* -> /room1/rack1
10.31.2.* -> /room1/rack2
10.31.3.* -> /room1/rack3
10.31.100.* -> /room2/rack1
10.31.200.* -> /room2/rack2
10.31.200.* -> /room2/rack3
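
For illustration, a mapping like this could be implemented along the
following lines. This is only a minimal sketch, not the actual script:
it assumes Hadoop invokes the script with one or more IP addresses as
arguments and reads one rack name per line from standard output. The
names RACKS and rack_for are hypothetical. The 10.31.200.* prefix
appears twice in the list above, so the sketch keeps only its first
entry.

```python
#!/usr/bin/env python3
import sys

# Prefix -> rack table copied from the list above. Each prefix ends with
# a dot so that "10.31.1." does not also match "10.31.100.x".
# 10.31.200.* is listed twice in the post; only its first entry is kept.
RACKS = [
    ("10.31.1.", "/room1/rack1"),
    ("10.31.2.", "/room1/rack2"),
    ("10.31.3.", "/room1/rack3"),
    ("10.31.100.", "/room2/rack1"),
    ("10.31.200.", "/room2/rack2"),
]

# Fallback for addresses that match no prefix (Hadoop's default rack name).
DEFAULT_RACK = "/default-rack"


def rack_for(ip):
    """Return the rack path for an IP address; the first matching prefix wins."""
    for prefix, rack in RACKS:
        if ip.startswith(prefix):
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # Print one rack name per argument, in the order the arguments were given.
    for ip in sys.argv[1:]:
        print(rack_for(ip))
```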

I understand that DFS will try to place replicas in such a way that,
even if /room1/rack1 goes down, the data is still available in other
racks. I want to understand whether the hierarchy of racks (for
example, rack1 being inside room1 here) is given any importance.

What I mean is, in addition to taking care that the data is unaffected
if /room1/rack1 goes down, will it also try to take care that almost
all data is replicated in the racks within /room2 so that if /room1
goes down as a whole (say there is a power cut in room1), we still
have all the data in racks of /room2?
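
To make the property in question concrete, here is a small sketch of
the check being asked about. It says nothing about what DFS actually
does; it only expresses "a block survives the loss of a whole room" in
terms of the rack names above. The helper names room_of and
survives_room_loss are hypothetical.

```python
def room_of(rack_path):
    """Return the top-level component of a rack path: "/room1/rack1" -> "room1"."""
    return rack_path.strip("/").split("/")[0]


def survives_room_loss(replica_racks, lost_room):
    """True if at least one replica sits in a rack outside the lost room."""
    return any(room_of(rack) != lost_room for rack in replica_racks)
```

For example, replicas on /room1/rack1, /room1/rack2 and /room2/rack1
would survive a power cut in room1, while replicas on /room1/rack1 and
/room1/rack2 alone would not, even though they are on different racks.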