Posted to common-user@hadoop.apache.org by Bin YANG <ya...@gmail.com> on 2008/01/15 15:26:57 UTC

how to deploy hadoop on many PCs quickly?

Dear colleagues,

Right now, I have to deploy ubuntu 7.10 + hadoop 0.15 on 16 PCs.
One PC will be set as master, the others will be set as slaves.
The PCs have similar hardware, or even the same hardware.

Is there a quick and easy way to deploy hadoop on these PCs?

Do you think that

1. ghost the whole hard disk of a working ubuntu 7.10 + hadoop 0.15 install
2. and then copy the image to the other PCs

is the best way?

Thank you very much.

Best wishes,
Bin YANG


-- 
Bin YANG
Department of Computer Science and Engineering
Fudan University
Shanghai, P. R. China
EMail: yangbinisme82@gmail.com

RE: how to deploy hadoop on many PCs quickly?

Posted by Xavier Stevens <Xa...@fox.com>.
You can also set up an NFS mount for the hadoop directory.  In your case,
that could be on the master node.

-Xavier 

-----Original Message-----
From: Benjamin Reed [mailto:breed@yahoo-inc.com] 
Sent: Tuesday, January 15, 2008 7:27 AM
To: hadoop-user@lucene.apache.org
Cc: Miles Osborne
Subject: Re: how to deploy hadoop on many PCs quickly?

On my cluster I use the patch in
https://issues.apache.org/jira/browse/HADOOP-435
to build a single jar file and zip my configuration into that jar file.

Installation is just a matter of copying the one jar file to all the
machines in my cluster.

To start up I have a script that runs through all the machines in the
cluster and runs the start.sh script. The start.sh script assumes the
jobtracker and namenode run on the same machine. If this is not the
case, you need to tweak the script a bit. (It's a rather trivial 20-line
script.)

The patch will never go in since the issue was closed, but I still find
it useful for situations where I don't want to do a lot of tarring and
configuring to get things set up. (I'm probably just lazy since it
doesn't seem to bother most people :)

ben

Re: how to deploy hadoop on many PCs quickly?

Posted by Benjamin Reed <br...@yahoo-inc.com>.
On my cluster I use the patch in 
https://issues.apache.org/jira/browse/HADOOP-435
to build a single jar file and zip my configuration into that jar file.

Installation is just a matter of copying the one jar file to all the machines in 
my cluster.

To start up I have a script that runs through all the machines in the cluster 
and runs the start.sh script. The start.sh script assumes the jobtracker and 
namenode run on the same machine. If this is not the case, you need to tweak 
the script a bit. (It's a rather trivial 20-line script.)

The patch will never go in since the issue was closed, but I still find it 
useful for situations where I don't want to do a lot of tarring and 
configuring to get things set up. (I'm probably just lazy since it doesn't 
seem to bother most people :)
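
For illustration only, the start-up loop described above might look something
like the sketch below. The actual script is not shown in this thread, so the
one-host-per-line list format, the ./start.sh path on each machine, and every
function name here are assumptions. The code only prints the ssh commands
unless dry_run is set to False.

```python
import subprocess


def read_hosts(text):
    """Return host names from a one-per-line list, skipping blanks and # comments."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def start_commands(hosts, script="./start.sh"):
    """Build one ssh command per host that runs the (assumed) start script."""
    return [["ssh", host, script] for host in hosts]


def start_cluster(hosts, dry_run=True):
    """Print each command; run it over ssh only when dry_run is False."""
    for cmd in start_commands(hosts):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling start_cluster(read_hosts(open("hosts.txt").read())) would list the
commands for a hypothetical hosts.txt file without touching any machine.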

ben

On Tuesday 15 January 2008 06:41:04 Miles Osborne wrote:
> I have been through this very recently.  My approach was to:
>
> --manually set up the master (i.e. specify the conf files etc.)
> --tar up java and hadoop so that unpacking them puts them in the desired
> location
> --create the ssh keys on the master.
>
> now, create a shell script which does the following:
>
> --open the necessary ports
> --copy across the ssh keys from the master and install them in the correct
> location
> --copy across and untar java and hadoop
> --assign the correct permissions to the distributed file system directory
> on the current node
> --create user accounts as necessary
>
> copy this script across to each slave in turn and run it;  adding a new
> slave node will take a minute or two.
>
> (this assumes each node already has linux installed on it and the
> filesystem is identical)
>
> Miles

Re: how to deploy hadoop on many PCs quickly?

Posted by Miles Osborne <mi...@inf.ed.ac.uk>.
I have been through this very recently.  My approach was to:

--manually set up the master (i.e. specify the conf files etc.)
--tar up java and hadoop so that unpacking them puts them in the desired
location
--create the ssh keys on the master.

now, create a shell script which does the following:

--open the necessary ports
--copy across the ssh keys from the master and install them in the correct
location
--copy across and untar java and hadoop
--assign the correct permissions to the distributed file system directory on
the current node
--create user accounts as necessary

copy this script across to each slave in turn and run it;  adding a new
slave node will take a minute or two.

(this assumes each node already has linux installed on it and the filesystem
is identical)
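
As a rough sketch of the copy-and-untar part of the steps above (not the
actual script, which is not shown in this thread): the user name hadoop, the
tarball names, and the /tmp and /opt paths are all assumptions. Opening ports,
installing the ssh keys, setting permissions, and creating accounts are left
out because they depend on the site. Nothing is executed unless dry_run is
False.

```python
import subprocess


def slave_commands(host, user="hadoop",
                   tarballs=("java.tar.gz", "hadoop.tar.gz"), dest="/opt"):
    """Build the scp/ssh commands that push and unpack the tarballs on one slave.

    Every default value is a placeholder, not a name taken from the thread.
    """
    target = f"{user}@{host}"
    commands = []
    for tarball in tarballs:
        # Copy the archive to the slave, then unpack it under dest.
        commands.append(["scp", tarball, f"{target}:/tmp/{tarball}"])
        commands.append(["ssh", target, f"tar -xzf /tmp/{tarball} -C {dest}"])
    return commands


def deploy(hosts, dry_run=True):
    """Print the commands for each slave in turn; run them only if dry_run is False."""
    for host in hosts:
        for cmd in slave_commands(host):
            print(" ".join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=True)
```

For example, deploy(["slave01", "slave02"]) prints four commands per slave,
which makes it easy to check the plan before running it for real.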

Miles
On 15/01/2008, Bin YANG <ya...@gmail.com> wrote:
>
> Dear colleagues,
>
> Right now, I have to deploy ubuntu 7.10 + hadoop 0.15 on 16 PCs.
> One PC will be set as master, the others will be set as slaves.
> The PCs have similar hardware, or even the same hardware.
>
> Is there a quick and easy way to deploy hadoop on these PCs?
>
> Do you think that
>
> 1. ghost a whole successful ubuntu 7.10 + hadoop 0.15 hard disk
> 2. and then copy the image to other PCs
>
> is the best way?
>
> Thank you very much.
>
> Best wishes,
> Bin YANG
>
>
> --
> Bin YANG
> Department of Computer Science and Engineering
> Fudan University
> Shanghai, P. R. China
> EMail: yangbinisme82@gmail.com
>

Re: how to deploy hadoop on many PCs quickly?

Posted by Bin YANG <ya...@gmail.com>.
Thanks, Russell Smith.

I fixed GRUB, but the ext3 file system restored from the Norton Ghost image
does not seem to have been preserved correctly.

In the end, I used G4L (Ghost for Linux) to copy the whole hard disk from the
source drive to the destination drive, and it works very well. Both GRUB and
the ext3 file system work correctly.

I think dd is similar to G4L, is that right?

Thank you very much again!

Bin YANG

On Jan 18, 2008 2:52 AM, Russell Smith <ru...@ukd1.co.uk> wrote:

> Bin,
>
> Did you try using dd from the source to the destination drive? That
> should preserve grub.
>
>
> Russell Smith
> UKD1 Limited


-- 
Bin YANG
Department of Computer Science and Engineering
Fudan University
Shanghai, P. R. China
EMail: yangbinisme82@gmail.com

Re: how to deploy hadoop on many PCs quickly?

Posted by Russell Smith <ru...@ukd1.co.uk>.
Bin,

Did you try using dd from the source to the destination drive? That 
should preserve grub.
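
A minimal sketch of such a whole-disk copy, assuming both drives are attached
to the same machine. The device names /dev/sda and /dev/sdb and the 4M block
size are placeholders, and the target disk is overwritten completely, so the
function below only builds the command rather than running it.

```python
def dd_clone_command(source="/dev/sda", target="/dev/sdb", block_size="4M"):
    """Build a dd command that copies every block of source onto target.

    The copy starts at the first sector, so the MBR (where GRUB's first
    stage lives) and the partition table are copied along with the file
    systems. Device names are placeholders; the target is overwritten.
    """
    if source == target:
        raise ValueError("source and target must be different devices")
    return ["dd", f"if={source}", f"of={target}", f"bs={block_size}"]
```

Printing " ".join(dd_clone_command()) shows the command line to review before
running it as root.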


Russell Smith
UKD1 Limited


Bin YANG wrote:
> I used Norton Ghost 8.0 to ghost a whole ubuntu hard disk to an image and
> restored another hard disk from the image, but the restored hard disk cannot
> start up ubuntu successfully.
> GRUB reported error 22.
>
> Does somebody know how to fix the problem?
>
> Thanks.
>
> Bin YANG


Re: how to deploy hadoop on many PCs quickly?

Posted by Ted Dunning <td...@veoh.com>.
This isn't really a question about Hadoop, but is about system
administration basics.

You are probably missing a master boot record (MBR) on the disk.  Ask a
local linux expert to help you or look at the Norton documentation.


On 1/16/08 4:59 AM, "Bin YANG" <ya...@gmail.com> wrote:

> I used Norton Ghost 8.0 to ghost a whole ubuntu hard disk to an image and
> restored another hard disk from the image, but the restored hard disk cannot
> start up ubuntu successfully.
> GRUB reported error 22.
> 
> Does somebody know how to fix the problem?
> 
> Thanks.
> 
> Bin YANG


Re: how to deploy hadoop on many PCs quickly?

Posted by Bin YANG <ya...@gmail.com>.
I used Norton Ghost 8.0 to ghost a whole ubuntu hard disk to an image and
restored another hard disk from the image, but the restored hard disk cannot
start up ubuntu successfully.
GRUB reported error 22.

Does somebody know how to fix the problem?

Thanks.

Bin YANG

On Jan 16, 2008 4:54 AM, Sagar Naik <sa...@visvo.com> wrote:

> Hi
>
> We at Visvo have developed a small script
> for command processing on a cluster.
> We would like to share it with you and have it reviewed.
> It is available under APL.
>
> We would like to make a project so that we all
> can contribute to this script.
>
> For now,  you can download this script
> from http://visvo.com/dev_scripts/co.tar.gz
> <http://visvo.com/dev_scripts/co.tar.gz>
>
> - Sagar Naik


-- 
Bin YANG
Department of Computer Science and Engineering
Fudan University
Shanghai, P. R. China
EMail: yangbinisme82@gmail.com

Re: how to deploy hadoop on many PCs quickly?

Posted by Sagar Naik <sa...@visvo.com>.
Hi

We at Visvo have developed a small script
for command processing on a cluster.
We would like to share it with you and have it reviewed.
It is available under APL.

We would like to make a project so that we all
can contribute to this script.

For now,  you can download this script
from http://visvo.com/dev_scripts/co.tar.gz 
<http://visvo.com/dev_scripts/co.tar.gz>

- Sagar Naik


Bin YANG wrote:
> Dear colleagues,
>
> Right now, I have to deploy ubuntu 7.10 + hadoop 0.15 on 16 PCs.
> One PC will be set as master, the others will be set as slaves.
> The PCs have similar hardware, or even the same hardware.
>
> Is there a quick and easy way to deploy hadoop on these PCs?
>
> Do you think that
>
> 1. ghost a whole successful ubuntu 7.10 + hadoop 0.15 hard disk
> 2. and then copy the image to other PCs
>
> is the best way?
>
> Thank you very much.
>
> Best wishes,
> Bin YANG
>
>
>   


-- 
This message has been scanned for viruses and
dangerous content and is believed to be clean.


Re: how to deploy hadoop on many PCs quickly?

Posted by Ted Dunning <td...@veoh.com>.
That's a fine way.

If you already have a Linux master distribution, then rsync can distribute
the hadoop software very quickly.
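
As a hedged example of the rsync route: the sketch below builds one rsync
command per slave. The /opt/hadoop/ paths and the hadoop user name are
assumptions, not details from this thread.

```python
def rsync_command(host, src="/opt/hadoop/", dest="/opt/hadoop/", user="hadoop"):
    """Build an rsync invocation that mirrors src on the master to dest on host.

    -a preserves permissions and timestamps, -z compresses data in transit.
    The paths and the user name are placeholders.
    """
    return ["rsync", "-az", src, f"{user}@{host}:{dest}"]


def rsync_plan(hosts):
    """Return the rsync command for every slave, in order."""
    return [rsync_command(host) for host in hosts]
```

Each command can be printed for review or passed to subprocess.run once
passwordless ssh to the slaves is in place.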


On 1/15/08 6:26 AM, "Bin YANG" <ya...@gmail.com> wrote:

> Dear colleagues,
> 
> Right now, I have to deploy ubuntu 7.10 + hadoop 0.15 on 16 PCs.
> One PC will be set as master, the others will be set as slaves.
> The PCs have similar hardware, or even the same hardware.
> 
> Is there a quick and easy way to deploy hadoop on these PCs?
> 
> Do you think that
> 
> 1. ghost a whole successful ubuntu 7.10 + hadoop 0.15 hard disk
> 2. and then copy the image to other PCs
> 
> is the best way?
> 
> Thank you very much.
> 
> Best wishes,
> Bin YANG
>