Posted to common-user@hadoop.apache.org by 田超 <ti...@software.ict.ac.cn> on 2008/06/06 04:43:08 UTC

Hadoop topology.script.file.name Form

hi,

I want to set up a Hadoop cluster and make it rack-aware, but I can't find any documentation on the format of topology.script.file.name.

Could anybody give me an example of the format of topology.script.file.name?

thanks a lot.

iver

2008-06-06 

Re: Hadoop topology.script.file.name Form

Posted by Vadim Zaliva <kr...@gmail.com>.
I just got around to configuring this in my hadoop-0.18.3 install, so I
can share my working topology script.
The documentation is a bit confusing on this matter, so I hope this is helpful.

The script is called by the namenode as datanodes first connect to it. It
is passed the IP address of a datanode as a
parameter. I do not particularly like hardcoded data in Python code;
perhaps the script could read this information
from a separate configuration file.

Here is my script:

#!/usr/bin/env python

'''
This script is used by Hadoop to determine network/rack topology. It
should be specified in hadoop-site.xml via the topology.script.file.name
property.

<property>
 <name>topology.script.file.name</name>
 <value>/home/hadoop/topology.py</value>
</property>
'''

import sys

# Rack reported for any address that is not listed in RACK_MAP.
DEFAULT_RACK = '/default/rack0'

# Maps each datanode IP address to its /datacenter/rack path.
RACK_MAP = { '208.94.2.10' : '/datacenter1/rack0',
             '1.2.3.4' : '/datacenter1/rack0',
             '1.2.3.5' : '/datacenter1/rack0',
             '1.2.3.6' : '/datacenter1/rack0',

             '10.2.3.4' : '/datacenter2/rack0'
    }

if len(sys.argv) == 1:
    print(DEFAULT_RACK)
else:
    # Print one rack per argument, in the order the addresses were passed.
    print(" ".join(RACK_MAP.get(i, DEFAULT_RACK) for i in sys.argv[1:]))
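
The idea of reading the mapping from a separate configuration file could be sketched as follows. This is only an illustration, not part of my tested setup: the file path /home/hadoop/topology.data, its "address rack" line format, and the helper names load_rack_map and resolve are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Topology script sketch that loads the address-to-rack map from a text file."""
import sys

DEFAULT_RACK = '/default/rack0'
# Hypothetical location; each non-comment line holds "<ip-or-host> <rack>".
MAP_FILE = '/home/hadoop/topology.data'


def load_rack_map(path):
    """Return a dict of address -> rack, or an empty dict if the file is unreadable."""
    rack_map = {}
    try:
        with open(path) as handle:
            for line in handle:
                # Drop anything after '#', then expect exactly two fields.
                fields = line.split('#', 1)[0].split()
                if len(fields) == 2:
                    rack_map[fields[0]] = fields[1]
    except OSError:
        pass  # every node then falls back to DEFAULT_RACK
    return rack_map


def resolve(hosts, rack_map):
    """Map each argument to a rack, preserving order (one output per input)."""
    return [rack_map.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == '__main__':
    racks = resolve(sys.argv[1:], load_rack_map(MAP_FILE))
    print(' '.join(racks) if racks else DEFAULT_RACK)
```

With this layout, adding a node only means adding a line to the data file; unknown addresses and a missing file both fall back to the default rack rather than failing.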

RE: Hadoop topology.script.file.name Form

Posted by Yunhong Gu1 <yg...@bert.cs.uic.edu>.
I guess what we need is an example of the "script", where to put it,
and exactly what to fill in as the value of the "topology.script.file.name"
entry.

So, I wrote a program called "mydns". I can run the program like this:

./mydns node1.rack1.yahoo.com

It prints "/rack1" to the screen.

Is this correct? Where should I put this program? What should go in the
configuration file?
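
For reference, a program with the behavior described above might look like the sketch below. It assumes host names of the form node.rack.domain, as in the example, and treats everything else (IP addresses, short names) as belonging to a fallback rack; the rack_for helper and the '/default-rack' fallback are assumptions for illustration only.

```python
#!/usr/bin/env python3
"""Sketch: derive a rack ID from the second label of each host name."""
import sys

DEFAULT_RACK = '/default-rack'  # assumed fallback for names that do not fit


def rack_for(name):
    """Return '/<rack>' for names like node1.rack1.yahoo.com."""
    labels = name.split('.')
    # IP addresses (numeric second label) and short names use the fallback.
    if len(labels) >= 3 and not labels[1].isdigit():
        return '/' + labels[1]
    return DEFAULT_RACK


if __name__ == '__main__':
    # One rack per argument, space separated, in the order given.
    print(' '.join(rack_for(arg) for arg in sys.argv[1:]))
```

Run as ./mydns node1.rack1.yahoo.com node7.rack2.yahoo.com, this prints "/rack1 /rack2". The full path of the executable would then be the value of the topology.script.file.name property.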

Thanks
Yunhong


On Mon, 9 Jun 2008, Devaraj Das wrote:

> This documentation is for the earlier versions. In 0.17 the way in which
> racks are dealt with has changed.

RE: Hadoop topology.script.file.name Form

Posted by Devaraj Das <dd...@yahoo-inc.com>.
This documentation is for the earlier versions. In 0.17 the way in which
racks are dealt with has changed.

> -----Original Message-----
> From: Yang Chen [mailto:chenyangyinpeng@gmail.com] 
> Sent: Sunday, June 08, 2008 8:06 PM
> To: core-user@hadoop.apache.org
> Subject: Re: Hadoop topology.script.file.name Form
> 
> Rack Awareness
> 
> Typically large Hadoop clusters are arranged in *racks*, and
> network traffic between different nodes within the same rack
> is much more desirable than network traffic across the racks.
> In addition, the Namenode tries to place replicas of blocks on
> multiple racks for improved fault tolerance. Hadoop lets the
> cluster administrators decide which *rack* a node belongs to
> through the configuration variable dfs.network.script. When this
> script is configured, each node runs the script to determine
> its *rackid*. A default installation assumes all the nodes
> belong to the same rack. This feature and its configuration are
> further described in the PDF
> <http://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf>
> attached to HADOOP-692 <http://issues.apache.org/jira/browse/HADOOP-692>.
> 
> Hope this will be helpful.
> 
> 
> 
> YC


Re: Hadoop topology.script.file.name Form

Posted by Yang Chen <ch...@gmail.com>.
Rack Awareness

Typically large Hadoop clusters are arranged in *racks*, and network traffic
between different nodes within the same rack is much more desirable than
network traffic across the racks. In addition, the Namenode tries to place
replicas of blocks on multiple racks for improved fault tolerance. Hadoop
lets the cluster administrators decide which *rack* a node belongs to
through the configuration variable dfs.network.script. When this script is
configured, each node runs the script to determine its *rackid*. A default
installation assumes all the nodes belong to the same rack. This feature and
its configuration are further described in the PDF
<http://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf>
attached to HADOOP-692 <http://issues.apache.org/jira/browse/HADOOP-692>.

Hope this will be helpful.



YC

On Sun, Jun 8, 2008 at 9:53 PM, Devaraj Das <dd...@yahoo-inc.com> wrote:

> Hi Iver,
> The implementation of the script depends on your setup. The main thing is
> that it should accept a list of IP addresses and DNS names and
> return the rack ID for each. There is a one-to-one correspondence
> between what you pass in and what you get back. To get the rack ID, the
> script could deduce it from the IP address, query some service (similar
> to the way DNS works, or some similar mechanism), or, in the extreme case,
> read a file that maps IP addresses to rack IDs.
> Thanks,
> Devaraj.

RE: Hadoop topology.script.file.name Form

Posted by Devaraj Das <dd...@yahoo-inc.com>.
Hi Iver,
The implementation of the script depends on your setup. The main thing is
that it should accept a list of IP addresses and DNS names and
return the rack ID for each. There is a one-to-one correspondence
between what you pass in and what you get back. To get the rack ID, the
script could deduce it from the IP address, query some service (similar
to the way DNS works, or some similar mechanism), or, in the extreme case,
read a file that maps IP addresses to rack IDs.
Thanks,
Devaraj.
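
As a rough sketch of the "deduce it from the IP address" option, assuming a cluster where each rack has its own subnet: the subnets, rack names, and the rack_for helper below are made up for illustration, and the script returns one rack per argument to keep the one-to-one correspondence described above.

```python
#!/usr/bin/env python3
"""Sketch: deduce a rack ID from the subnet that contains each IP address."""
import ipaddress
import sys

DEFAULT_RACK = '/default/rack0'
# Hypothetical layout: one subnet per rack.
SUBNET_RACKS = [
    (ipaddress.ip_network('10.1.0.0/24'), '/datacenter1/rack0'),
    (ipaddress.ip_network('10.1.1.0/24'), '/datacenter1/rack1'),
    (ipaddress.ip_network('10.2.0.0/16'), '/datacenter2/rack0'),
]


def rack_for(arg):
    """Return the rack whose subnet contains arg, else DEFAULT_RACK."""
    try:
        address = ipaddress.ip_address(arg)
    except ValueError:
        return DEFAULT_RACK  # a DNS name or malformed input
    for network, rack in SUBNET_RACKS:
        if address in network:
            return rack
    return DEFAULT_RACK


if __name__ == '__main__':
    # Exactly one output token per input argument, in the same order.
    print(' '.join(rack_for(arg) for arg in sys.argv[1:]))
```

DNS names are not resolved here and simply get the default rack; a real script would need either a lookup or a name-based rule for them.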
