Posted to common-user@hadoop.apache.org by Yunhong Gu1 <yg...@bert.cs.uic.edu> on 2008/07/03 06:18:52 UTC

topology.script.file.name


Hello,

I have been trying to figure out how to configure rack awareness. I have
written a script that reads a list of IPs or host names and returns a list
of rack IDs of the same length.

This is my script running:

$./mydns 192.168.1.1 192.168.2.1
/rack0 /rack1

I set topology.script.file.name to the path of this script. I
verified that the script was called by Hadoop, and I could see the input
(up to 21 IPs in my case).

However, it seems the output of my script is not correct and Hadoop cannot
use it to get the correct topology (only 1 rack is found by Hadoop no 
matter how I change the format of the output).

Please advise if you know how to do this.

Thanks
Yunhong

RE: topology.script.file.name

Posted by Yunhong Gu1 <yg...@bert.cs.uic.edu>.

This is my "script", which is actually a C++ program:

#include <iostream>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
    for (int i = 1; i < argc; i ++ )
    {
       string dn = argv[i];

       if (dn.substr(0, 5) == "rack1")
          cout << "/rack1";
       else if (dn.substr(0, 5) == "rack2")
          cout << "/rack2";
       else if (dn.substr(0, 3) == "192")
          cout << "/rack1";
       else if (dn.substr(0, 2) == "10")
          cout << "/rack2";
       else
          cout << "/rack0";

       cout << " ";
    }

    return 1;
}

I compiled the program as mydns. It accepts any number of IPs or host
names and prints /rack0, /rack1, or /rack2 for each one, all on one line.

e.g.,
./mydns 192.168.0.1 10.0.0.1
/rack1 /rack2

(I tried other output formats, such as one rack ID per line, which
didn't help.)
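
For comparison, here is a minimal sketch of the same mapping written as
testable functions. It assumes (not confirmed against the Hadoop 0.17.0
source) that Hadoop reads whitespace-separated rack IDs from the program's
standard output and, like most process launchers, treats a non-zero exit
status as a failure. Under that assumption, the "return 1" at the end of
the program above is worth changing to "return 0". The names startsWith,
rackFor, and resolveAll are hypothetical, and the trailing dot in the IP
prefixes is a deliberate change so that "10." does not also match
100.x.x.x.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: true if `name` begins with `prefix`.
static bool startsWith(const std::string& name, const std::string& prefix)
{
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Same rules as the original program, except that the IP prefixes include
// the trailing dot (an assumption about what the author intended).
std::string rackFor(const std::string& name)
{
    if (startsWith(name, "rack1") || startsWith(name, "192."))
        return "/rack1";
    if (startsWith(name, "rack2") || startsWith(name, "10."))
        return "/rack2";
    return "/rack0";
}

// Writes one rack ID per argument (argv[0] is skipped), separated by single
// spaces and terminated by a newline. Returns 0, the conventional exit
// status for success.
int resolveAll(int argc, const char* const* argv, std::ostream& out)
{
    assert(argv != nullptr);
    for (int i = 1; i < argc; ++i)
    {
        if (i > 1)
            out << ' ';
        out << rackFor(argv[i]);
    }
    out << '\n';
    return 0;
}
```

A main function that only does "return resolveAll(argc, argv, std::cout);"
would make this a drop-in replacement for mydns, with the exit status being
the one behavioral difference besides the stricter prefixes.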

I edited hadoop-site.xml and added this:
<property>
   <name>topology.script.file.name</name>
   <value>/home/my/hadoop-0.17.0/mydns</value>
</property>

The program is located at /home/my/hadoop-0.17.0.

My understanding is that "mydns" should be called by 
ScriptBasedMapping.java.

I added some logging to a file in the mydns program, and I can verify that
it is actually being called with input parameters such as
"192.168.0.1 192.168.0.10 10.0.0.5".
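
The kind of file logging described here could look like the sketch below.
The function name logInvocation and the log path passed to it are
hypothetical. It writes to a separate file rather than to standard output
because, assuming Hadoop parses the program's standard output as the rack
list, any debug text printed there would end up mixed into the result.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Appends the arguments of one invocation (argv[0] is skipped) to `logPath`
// as a single space-separated line, so each call shows up as one record.
// Returns false if the log file could not be opened.
bool logInvocation(const std::string& logPath, int argc, const char* const* argv)
{
    assert(argv != nullptr);
    std::ofstream log(logPath.c_str(), std::ios::app);
    if (!log)
        return false;
    for (int i = 1; i < argc; ++i)
        log << (i > 1 ? " " : "") << argv[i];
    log << '\n';
    return true;
}
```

Calling it at the top of main, before any rack IDs are printed, records
exactly what the caller passed in without affecting the output.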

However, when I run ./bin/hadoop fsck, it still tells me that there is
only one rack in the system, and MapReduce programs fail immediately
because of some "topology initialization error" (I couldn't find the exact
text any more).

Thanks
Yunhong


On Thu, 3 Jul 2008, Devaraj Das wrote:

> This is strange. If you don't mind, pls send the script to me.

RE: topology.script.file.name

Posted by Devaraj Das <dd...@yahoo-inc.com>.
This is strange. If you don't mind, pls send the script to me.
