Posted to common-user@hadoop.apache.org by Chris Anderson <jc...@grabb.it> on 2008/06/07 20:31:30 UTC

contrib EC2 with hadoop 0.17

First of all, thanks to whoever maintains the hadoop-ec2 scripts.
They've saved us untold time and frustration getting started with a
small testing cluster (5 instances).

A question: when we log into the newly created cluster, and run jobs
from the example jar (pi, etc) everything works great. We expect our
custom jobs will run just as smoothly.

However, when we try to restart the daemons (namenode, jobtracker,
datanodes, tasktrackers) by running bin/stop-all.sh on the master, it
only tries to stop activity on localhost. Running start-all.sh
afterwards then boots up a localhost-only cluster (on which jobs run
just fine).

The only way we've been able to recover from this situation is to use
bin/terminate-hadoop-cluster and bin/destroy-hadoop-cluster and then
start again from scratch with a new cluster.

There must be a simple way to restart the namenode, jobtracker,
datanodes, and tasktrackers across all machines from the master. Also,
I think understanding the answer to this question would put a lot of
other things into perspective for me, so I can go on to do more
advanced things on my own.
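
For what it's worth, if I'm reading bin/slaves.sh right, stop-all.sh
and start-all.sh just loop over conf/slaves and ssh the daemon command
to each host listed there, which would explain why only localhost gets
touched (see the conf listing below). Roughly, paraphrasing from memory
rather than quoting the actual script:

# paraphrase of what bin/slaves.sh does, not the exact source
for slave in $(cat "${HADOOP_CONF_DIR:-conf}/slaves"); do
  ssh $HADOOP_SSH_OPTS "$slave" "$@" 2>&1 | sed "s/^/$slave: /" &
done
wait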

Thanks for any assistance / insight!

Chris


output from stop-all.sh
==

stopping jobtracker
localhost: Warning: Permanently added 'localhost' (RSA) to the list of
known hosts.
localhost: no tasktracker to stop
stopping namenode
localhost: no datanode to stop
localhost: no secondarynamenode to stop


conf files in /usr/local/hadoop-0.17.0
==

# cat conf/slaves
localhost
# cat conf/masters
localhost




-- 
Chris Anderson
http://jchris.mfdz.com

Re: contrib EC2 with hadoop 0.17

Posted by Chris Anderson <jc...@grabb.it>.
On Mon, Jun 9, 2008 at 9:01 AM, Chris K Wensel <ch...@wensel.net> wrote:
>
> configuration values should be set in conf/hadoop-site.xml. Those particular
> values you are referring to probably should be set per job and generally
> don't have anything to do with instance sizes but more to do with cluster
> size and the job being run.
>
> different instance sizes have mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum set accordingly (see hadoop-init),
> but again might/should be tuned to your application (cpu or io bound).
>

Thanks for clearing all this up, Chris. We're actually already doing
just that, and having your recommendation to do it this way makes me
more confident we're doing it right.

So far, Hadoop has been treating us well!

-- 
Chris Anderson
http://jchris.mfdz.com

Re: contrib EC2 with hadoop 0.17

Posted by Chris K Wensel <ch...@wensel.net>.
> Thanks for the description, Chris. Now that I understand the basic
> model, I'm starting to see how the configuration is passed to the
> slaves using the -d option of ec2-run-instances.
>
> One config question: on our cluster (hadoop 0.17 with
> INSTANCE_TYPE="m1.small") the conf/hadoop-default.xml has
> mapred.reduce.tasks set to 1, and mapred.map.tasks set to 2.
>
> From experimenting and reading the FAQ, it looks like those numbers
> should be higher, unless you have a single-machine cluster. Maybe
> there's something I'm missing, but by upping mapred.map.tasks and
> mapred.reduce.tasks to 5 and 15 (in our job jar) we're getting much
> better performance. Is there a reason hadoop-init doesn't build a
> hadoop-site.xml file with higher or configurable values for these
> fields?
>

configuration values should be set in conf/hadoop-site.xml. Those
particular values you are referring to probably should be set per job
and generally don't have anything to do with instance sizes but more
to do with cluster size and the job being run.

different instance sizes have mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum set accordingly (see
hadoop-init), but again might/should be tuned to your application (cpu
or io bound).
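
for example, a hadoop-site.xml on the nodes might carry the
per-tasktracker limits like this (the values are only illustrative,
tune them for your instance type and workload):

# cat conf/hadoop-site.xml
<?xml version="1.0"?>
<configuration>
  <!-- hard per-node slot limits; illustrative values only -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>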

ckw

Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: contrib EC2 with hadoop 0.17

Posted by Chris Anderson <jc...@grabb.it>.
On Sat, Jun 7, 2008 at 5:25 PM, Chris K Wensel <ch...@wensel.net> wrote:
> The new scripts do not use the start/stop-all.sh scripts, and thus do not
> maintain the slaves file. This is so cluster startup is much faster and a
> bit more reliable (keys do not need to be pushed to the slaves). Also we can
> grow the cluster lazily just by starting slave nodes.

Thanks for the description, Chris. Now that I understand the basic
model, I'm starting to see how the configuration is passed to the
slaves using the -d option of ec2-run-instances.
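
For anyone else digging through this, the call looks roughly like the
following. The flags are the standard ec2-api-tools ones, but the AMI
id, keypair, group, and user-data payload here are made-up
placeholders; the real user-data string is built by the launch scripts
and consumed by hadoop-init, so check those for the actual format:

# launch a slave and hand it the master's address via EC2 user data (-d)
# AMI id, keypair, group, and the user-data string are all placeholders
ec2-run-instances ami-xxxxxxxx -t m1.small -k my-keypair \
  -g my-hadoop-group -d "MASTER_HOST=ec2-xx-xx-xx-xx.compute-1.amazonaws.com"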

One config question: on our cluster (hadoop 0.17 with
INSTANCE_TYPE="m1.small") the conf/hadoop-default.xml has
mapred.reduce.tasks set to 1, and mapred.map.tasks set to 2.

From experimenting and reading the FAQ, it looks like those numbers
should be higher, unless you have a single-machine cluster. Maybe
there's something I'm missing, but by upping mapred.map.tasks and
mapred.reduce.tasks to 5 and 15 (in our job jar) we're getting much
better performance. Is there a reason hadoop-init doesn't build a
hadoop-site.xml file with higher or configurable values for these
fields?
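
In case it's useful to anyone else, the same per-job override can also
be done from the command line instead of baking the numbers into the
jar, provided the driver goes through ToolRunner/GenericOptionsParser
and the Hadoop version supports the -D generic option (otherwise
JobConf.setNumMapTasks()/setNumReduceTasks() in the driver does the
same thing). The jar name, class name, and numbers below are just
placeholders:

# per-job override; only works if the main class uses ToolRunner
bin/hadoop jar our-job.jar com.example.OurJob \
  -D mapred.map.tasks=20 -D mapred.reduce.tasks=5 input output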

> But it probably would be wise to provide scripts to build/refresh the slaves
> file, and push keys to slaves, so the cluster can be traditionally
> maintained, instead of just re-instantiated with new parameters etc.

I'm still getting the hang of best practices as far as deploying /
managing clusters. But for EC2 the all-or-nothing cluster approach
seems right. Maybe the slave scripts aren't needed.

>
> I wonder if these scripts would make sense in general, instead of being ec2
> specific?

There's so much functionality being handled by the ec2 script suite
that, rather than generalizing it, it might make more sense to use
Eucalyptus (http://eucalyptus.cs.ucsb.edu/), which lets any data center
be managed like EC2, so the EC2-specific scripts would work as-is.

Thanks again for the response. I think I'm starting to get the hang of this.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: contrib EC2 with hadoop 0.17

Posted by Chris K Wensel <ch...@wensel.net>.
The new scripts do not use the start/stop-all.sh scripts, and thus do  
not maintain the slaves file. This is so cluster startup is much  
faster and a bit more reliable (keys do not need to be pushed to the  
slaves). Also we can grow the cluster lazily just by starting slave  
nodes. That is, they are mostly optimized for booting a large cluster  
fast, doing work, then shutting down (allowing for huge short lived  
clusters, vs a smaller/cheaper long lived one).

But it probably would be wise to provide scripts to build/refresh the  
slaves file, and push keys to slaves, so the cluster can be  
traditionally maintained, instead of just re-instantiated with new  
parameters etc.
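
a rough sketch of what such a script might look like (untested, the
awk column for the private DNS name depends on the ec2-api-tools
version, and the paths and key names are placeholders, so treat it as
an outline rather than something to run as-is):

#!/bin/sh
# Sketch: rebuild conf/slaves from the currently running instances and
# push the master's public key out so start/stop-all.sh work again.
# The INSTANCE column holding the private DNS name varies by tools
# version, and you would normally also filter on the slave group.
HADOOP_HOME=/usr/local/hadoop-0.17.0

ec2-describe-instances | \
  awk '$1 == "INSTANCE" && /running/ { print $5 }' \
  > "$HADOOP_HOME/conf/slaves"

# bootstrap ssh with the EC2 keypair, then append the master's key
for slave in $(cat "$HADOOP_HOME/conf/slaves"); do
  cat ~/.ssh/id_rsa.pub | \
    ssh -i ~/.ssh/my-ec2-keypair.pem "root@$slave" \
      'cat >> ~/.ssh/authorized_keys'
done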

I wonder if these scripts would make sense in general, instead of  
being ec2 specific?

ckw

On Jun 7, 2008, at 11:31 AM, Chris Anderson wrote:

> First of all, thanks to whoever maintains the hadoop-ec2 scripts.
> They've saved us untold time and frustration getting started with a
> small testing cluster (5 instances).
>
> A question: when we log into the newly created cluster, and run jobs
> from the example jar (pi, etc) everything works great. We expect our
> custom jobs will run just as smoothly.
>
> However, when we try to restart the daemons (namenode, jobtracker,
> datanodes, tasktrackers) by running bin/stop-all.sh on the master, it
> only tries to stop activity on localhost. Running start-all.sh
> afterwards then boots up a localhost-only cluster (on which jobs run
> just fine).
>
> The only way we've been able to recover from this situation is to use
> bin/terminate-hadoop-cluster and bin/destroy-hadoop-cluster and then
> start again from scratch with a new cluster.
>
> There must be a simple way to restart the namenode, jobtracker,
> datanodes, and tasktrackers across all machines from the master. Also,
> I think understanding the answer to this question would put a lot of
> other things into perspective for me, so I can go on to do more
> advanced things on my own.
>
> Thanks for any assistance / insight!
>
> Chris
>
>
> output from stop-all.sh
> ==
>
> stopping jobtracker
> localhost: Warning: Permanently added 'localhost' (RSA) to the list of
> known hosts.
> localhost: no tasktracker to stop
> stopping namenode
> localhost: no datanode to stop
> localhost: no secondarynamenode to stop
>
>
> conf files in /usr/local/hadoop-0.17.0
> ==
>
> # cat conf/slaves
> localhost
> # cat conf/masters
> localhost
>
>
>
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com

Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: contrib EC2 with hadoop 0.17

Posted by Tom White <to...@cloudera.com>.
I haven't used Eucalyptus, but you could start by trying out the
Hadoop EC2 scripts (http://wiki.apache.org/hadoop/AmazonEC2) with your
Eucalyptus installation.
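
Since the scripts shell out to the regular EC2 command line tools, the
main thing to change should be pointing those tools at your Eucalyptus
front end rather than AWS. Something like the following, with the
hostname, service path and credential files as placeholders to be
taken from your Eucalyptus install:

# point the EC2 API tools at a Eucalyptus endpoint (values are placeholders)
export EC2_URL=http://your-eucalyptus-host:8773/services/Eucalyptus
export EC2_PRIVATE_KEY=$HOME/.euca/pk-XXXXXXXXXX.pem
export EC2_CERT=$HOME/.euca/cert-XXXXXXXXXX.pem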

Cheers,
Tom

On Tue, Mar 3, 2009 at 2:51 PM, falcon164 <mu...@gmail.com> wrote:
>
> I am new to hadoop. I want to run hadoop on eucalyptus. Please let me know
> how to do this.
> --
> View this message in context: http://www.nabble.com/contrib-EC2-with-hadoop-0.17-tp17711758p22310068.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>

Re: contrib EC2 with hadoop 0.17

Posted by falcon164 <mu...@gmail.com>.
I am new to hadoop. I want to run hadoop on eucalyptus. Please let me know
how to do this.
-- 
View this message in context: http://www.nabble.com/contrib-EC2-with-hadoop-0.17-tp17711758p22310068.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.