Posted to common-dev@hadoop.apache.org by nitesh bhatia <ni...@gmail.com> on 2009/01/25 23:02:43 UTC

Zeroconf for hadoop

Hi
Apple provides an open-source discovery service called Bonjour (Zeroconf). Is
it possible to integrate Zeroconf with Hadoop so that discovery of nodes
becomes automatic? Presently, for setting up a multi-node cluster, we need to
add IPs manually. Integrating it with Bonjour could make this process
automatic.
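For concreteness, this is roughly what zeroconf advertisement and discovery
look like from the command line with Avahi, a Linux implementation of the same
protocol family as Bonjour. A hedged sketch: the `_hadoop-dn._tcp` service
type is invented for illustration, and it assumes an avahi-daemon is running
on the LAN, so treat it as a transcript of the idea rather than a runnable
script.

```shell
# On each worker: advertise a (hypothetical) Hadoop service over mDNS.
# "-s" publishes a service with the given name, type, and port.
avahi-publish -s "$(hostname)" _hadoop-dn._tcp 50010 &

# On the master: browse the LAN for all instances of that service type,
# resolving each to a hostname/IP (-r) and exiting after one dump (-t).
avahi-browse -rt _hadoop-dn._tcp
```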

--nitesh


-- 
Nitesh Bhatia
Dhirubhai Ambani Institute of Information & Communication Technology
Gandhinagar
Gujarat

"Life is never perfect. It just depends where you draw the line."

visit:
http://www.awaaaz.com - connecting through music
http://www.volstreet.com - lets volunteer for better tomorrow
http://www.instibuzz.com - Voice opinions, Transact easily, Have fun

Re: Zeroconf for hadoop

Posted by Steve Loughran <st...@apache.org>.
Doug Cutting wrote:
> Owen O'Malley wrote:
>> allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker
>>
>> and it will use ssh in parallel to connect to every node between 
>> node1000 and node3000. Ours is a mess, but it would be great if 
>> someone contributed a script like that. *smile*
> 
> It would be a one-line change to bin/slaves.sh to have it filter hosts 
> by a regex.
> 
> Note that bin/slaves.sh can have problems with larger clusters (>~100 
> nodes) since a single shell has trouble handling the i/o from 100 
> sub-processes, and ssh connections will start timing out.  That's the 
> point of the HADOOP_SLAVE_SLEEP parameter, to meter the rate that 
> sub-processes are spawned.  A better solution might be to sleep if the 
> number of sub-processes exceeds some limit, e.g.:
> 
>   while [[ `jobs | wc -l` -gt 10 ]]; do sleep 1 ; done
> 
> Doug

The trick there is for your script to pick the first couple of nodes and 
give them half the work each; they do the same thing down the tree, and 
you end up with the cluster booting itself at a rate with log2(N) 
somewhere in the equation.
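A minimal bash sketch of that divide-and-delegate idea. Nothing here is an
existing Hadoop script: `delegate` is an invented name, and the real `ssh`
calls are stubbed out as `echo` so the fan-out plan can be inspected locally.

```shell
#!/usr/bin/env bash
# Log2(N) fan-out: each level splits its host list in half and hands each
# half to that half's first host, which repeats the split.
delegate() {                        # delegate HOST...
  local hosts=("$@")
  local n=${#hosts[@]}
  (( n <= 1 )) && return 0          # a single host just starts its own daemon
  local mid=$(( n / 2 ))
  local left=("${hosts[@]:0:mid}")
  local right=("${hosts[@]:mid}")
  # In a real script these two lines would be ssh calls, e.g.:
  #   ssh "${left[0]}"  boot_tree.sh "${left[@]}"  &
  #   ssh "${right[0]}" boot_tree.sh "${right[@]}" &
  echo "${left[0]} <- ${left[*]}"
  echo "${right[0]} <- ${right[*]}"
  delegate "${left[@]}"
  delegate "${right[@]}"
}

delegate node1 node2 node3 node4    # prints the six delegation steps
```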


Re: Zeroconf for hadoop

Posted by Doug Cutting <cu...@apache.org>.
Owen O'Malley wrote:
> allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker
> 
> and it will use ssh in parallel to connect to every node between 
node1000 and node3000. Ours is a mess, but it would be great if someone 
> contributed a script like that. *smile*

It would be a one-line change to bin/slaves.sh to have it filter hosts 
by a regex.

Note that bin/slaves.sh can have problems with larger clusters (>~100 
nodes) since a single shell has trouble handling the i/o from 100 
sub-processes, and ssh connections will start timing out.  That's the 
point of the HADOOP_SLAVE_SLEEP parameter, to meter the rate that 
sub-processes are spawned.  A better solution might be to sleep if the 
number of sub-processes exceeds some limit, e.g.:

   while [[ `jobs | wc -l` -gt 10 ]]; do sleep 1 ; done

Doug
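Fleshed out slightly, the throttle might look like this in bash. Illustrative
only: `MAX_JOBS` and `throttled_spawn` are invented names, a local `sleep`
stands in for the real `ssh`, and a numeric `-gt` test is used because a plain
`>` inside `[[ ]]` compares strings, not numbers.

```shell
#!/usr/bin/env bash
# Spawn many background sub-processes, sleeping whenever the number of
# running jobs exceeds a limit, so the parent shell is never overwhelmed.
MAX_JOBS=10

throttled_spawn() {          # throttled_spawn CMD [ARGS...]
  # jobs -pr lists the PIDs of currently *running* background jobs.
  while [ "$(jobs -pr | wc -l)" -gt "$MAX_JOBS" ]; do
    sleep 1
  done
  "$@" &
}

# Stand-in for: ssh "$host" bin/hadoop-daemon.sh start tasktracker
for host in node{1..25}; do
  throttled_spawn sleep 0.1
done
wait    # block until every sub-process has finished
```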

Re: Zeroconf for hadoop

Posted by Ted Dunning <te...@gmail.com>.
A big positive vote for Zookeeper.

The most salient aspect of my experience using ZooKeeper is that
coordination, heartbeats, discovery, and failure notifications all
become nearly trivial.  The most amazing thing about ZK isn't the code
that you write; it is all the code that you never have to write.

On Wed, Jan 28, 2009 at 11:01 AM, Patrick Hunt <ph...@apache.org> wrote:

> Owen O'Malley wrote:
>
>> On Jan 25, 2009, at 2:02 PM, nitesh bhatia wrote:
>>
>>> Apple provides an open-source discovery service called Bonjour (Zeroconf).
>>
>> I don't know enough about Zeroconf to be able to answer definitively, but
>> I suspect the hardest bit would be figuring out the approach. Of course
>> Hadoop has to continue to work on other platforms, so cross-platform
>> strategies are better.
>>
>
> Take a look at ZooKeeper (a sub-project of Hadoop):
> http://hadoop.apache.org/zookeeper/
>
> Among other features ZooKeeper provides group membership and dynamic
> configuration support -- you could modify the various Hadoop processes to
> query & register with ZooKeeper when they come up. This could be used for
> node/service discovery as well as auto configuration of the processes.
>
> Patrick
>



-- 
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)

Re: Zeroconf for hadoop

Posted by Patrick Hunt <ph...@apache.org>.
Owen O'Malley wrote:
> On Jan 25, 2009, at 2:02 PM, nitesh bhatia wrote:
> 
>> Apple provides an open-source discovery service called Bonjour (Zeroconf).
> 
> I don't know enough about Zeroconf to be able to answer definitively, 
> but I suspect the hardest bit would be figuring out the approach. Of 
> course Hadoop has to continue to work on other platforms, so 
> cross-platform strategies are better.

Take a look at ZooKeeper (a sub-project of Hadoop):
http://hadoop.apache.org/zookeeper/

Among other features ZooKeeper provides group membership and dynamic 
configuration support -- you could modify the various Hadoop processes 
to query & register with ZooKeeper when they come up. This could be used 
for node/service discovery as well as auto configuration of the processes.

Patrick
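To make that concrete, here is a hedged sketch of the registration pattern
using ZooKeeper's bundled zkCli.sh. The `/hadoop/nodes` path and the
`zk1:2181` address are made up for illustration, and a running ZooKeeper
ensemble is assumed, so read it as a transcript of the idea rather than a
runnable script.

```shell
# One-time setup: create the parent znode that holds the membership list.
zkCli.sh -server zk1:2181 create /hadoop/nodes ""

# On each node as it starts: register an ephemeral znode (-e). Ephemeral
# znodes are deleted automatically when the creating session dies, which
# turns this into failure detection as well as discovery.
zkCli.sh -server zk1:2181 create -e /hadoop/nodes/$(hostname) ""

# Anywhere: the current live membership is just the children of the parent.
zkCli.sh -server zk1:2181 ls /hadoop/nodes
```

In practice the daemon itself would hold the session open through the
ZooKeeper client API (and set a watch on the parent znode for change
notifications); a one-shot CLI session like the `create -e` above drops its
ephemeral znode the moment the command exits, so it only demonstrates the
commands.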

Re: Zeroconf for hadoop

Posted by Owen O'Malley <om...@apache.org>.
On Jan 25, 2009, at 2:02 PM, nitesh bhatia wrote:

> Apple provides an open-source discovery service called Bonjour (Zeroconf).

I don't know enough about Zeroconf to be able to answer definitively,
but I suspect the hardest bit would be figuring out the approach. Of
course Hadoop has to continue to work on other platforms, so
cross-platform strategies are better.

> Presently, for setting up a multi-node cluster, we need to add IPs
> manually.

To the slaves list? The slaves list is only used to ssh to the hosts.  
At Yahoo, we use a parallel ssh perl script that takes ranges of  
hosts, so you issue commands like:

allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker

and it will use ssh in parallel to connect to every node between  
node1000 and node3000. Ours is a mess, but it would be great if 
someone contributed a script like that. *smile*

-- Owen
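A minimal bash sketch of that kind of range-driven parallel ssh. The `allssh`
script itself is internal to Yahoo, so everything below, including the
`node<start>-<end>` parsing, is an illustrative reconstruction rather than the
real tool.

```shell
#!/usr/bin/env bash
# Expand a spec like "node1000-3000" into individual hostnames.
expand_range() {                 # expand_range node1000-3000
  local spec=$1
  local prefix=${spec%%[0-9]*}   # leading non-digit part, e.g. "node"
  local range=${spec#"$prefix"}  # numeric part, e.g. "1000-3000"
  local start=${range%-*}
  local end=${range#*-}
  local i
  for (( i = start; i <= end; i++ )); do
    echo "${prefix}${i}"
  done
}

# Usage (commented out -- needs real hosts and passwordless ssh):
# for host in $(expand_range node1000-3000); do
#   ssh "$host" bin/hadoop-daemon.sh start tasktracker &
# done
# wait
```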