Posted to common-user@hadoop.apache.org by Edmon Begoli <eb...@gmail.com> on 2011/11/22 22:04:34 UTC

How is network distance for nodes calculated

I am reading Hadoop: The Definitive Guide, 2nd Edition, and I am struggling
to figure out the exact formula Hadoop uses to calculate network distance
(pages 64-65). I have my guesses, but I would like to know the exact formula.

There is an example showing the following distances:

For example, imagine a node n1 on rack r1 in data center d1.
This can be represented as /d1/r1/n1.

Using this notation, here are the distances for the four scenarios:
•	distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)
•	distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)
•	distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the
same data center)
•	distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers)

There is an illustration there as well.

Here is the link to the illustration:
http://books.google.com/books?id=Nff49D7vnJcC&lpg=PA65&ots=IidrYuayXs&dq=hadoop%20network%20distance%20calculation&pg=PA65#v=onepage&q=hadoop%20network%20distance%20calculation&f=false

If the distance to a node on a different rack is 4 and to a node on the
same rack is 2, what would the distance be to the other nodes on the same
rack? 2 as well? Can a distance be 1?

Thank you,
Edmon
http://it.toolbox.com/blogs/lim

Re: How is network distance for nodes calculated

Posted by Steve Loughran <st...@apache.org>.
On 22/11/11 21:04, Edmon Begoli wrote:
> I am reading Hadoop: The Definitive Guide, 2nd Edition, and I am struggling
> to figure out the exact formula Hadoop uses to calculate network distance
> (pages 64-65). I have my guesses, but I would like to know the exact formula.

It's implemented in org.apache.hadoop.net.NetworkTopology

It measures the number of network hops to get there: "2" = n1 -> switch1 -> n2,
and so on for the other cases.
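
The hop counting can be sketched in a few lines of Java. This is an illustration only, not the code in org.apache.hadoop.net.NetworkTopology: the class name TopologyDistanceSketch and its methods are hypothetical, and it assumes the topology is a tree in which every link between a node and its parent counts as one hop. Under that assumption, the distance between two nodes is the sum of their distances to their closest common ancestor.

```java
import java.util.Arrays;

// Illustrative sketch only (hypothetical class, not Hadoop's implementation).
// Assumes a tree topology where each parent-child link costs one hop.
final class TopologyDistanceSketch {

    // Splits a path such as "/d1/r1/n1" into ["d1", "r1", "n1"].
    static String[] components(String path) {
        return Arrays.stream(path.split("/"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    // Counts the hops from each node up to the closest common ancestor
    // and adds the two counts together.
    static int distance(String a, String b) {
        String[] pa = components(a);
        String[] pb = components(b);

        // Length of the shared path prefix = depth of the common ancestor.
        int common = 0;
        int limit = Math.min(pa.length, pb.length);
        while (common < limit && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    public static void main(String[] args) {
        // The four scenarios quoted from the book.
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n1")); // 0
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n2")); // 2
        System.out.println(distance("/d1/r1/n1", "/d1/r2/n3")); // 4
        System.out.println(distance("/d1/r1/n1", "/d2/r3/n4")); // 6
    }
}
```

In this model, any two distinct nodes on the same rack are 2 apart, and two hosts at the same depth always get an even distance. A distance of 1 only appears between a node and its direct parent, for example /d1/r1 and /d1/r1/n1.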