Posted to dev@whirr.apache.org by Adrian Cole <ad...@opscode.com> on 2010/08/14 22:04:28 UTC

pipeline a bit better?

Hi, team.

As you know, a large part of the startup time in a cloud is spent spinning
up "blank" machines, then installing the base software. I've noticed that
we start the Hadoop master and configure it, then the slaves. I'm pretty
sure we can eliminate a few minutes from this by starting the masters and
slaves simultaneously, then configuring them once the software is installed.
Initially, this can be done via runScriptOnNodes... later, this could be
deferred to another system like Chef.
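
To make the ordering concrete, here is a rough sketch in Python. It is not
Whirr or jclouds code: launch_node, configure_node and start_cluster are
hypothetical names, and the sleep only simulates the time spent booting a
machine and installing software. The point is just that all launches overlap,
and configuration runs only after every node is up.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def launch_node(role):
    """Hypothetical stand-in for booting a blank machine and installing base software."""
    time.sleep(0.2)  # simulated boot + install time
    return {"role": role, "configured": False}


def configure_node(node):
    """Hypothetical stand-in for the post-install configure script."""
    node["configured"] = True
    return node


def start_cluster(num_slaves):
    roles = ["master"] + ["slave"] * num_slaves
    # Launch every node at once instead of master first, then slaves.
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        nodes = list(pool.map(launch_node, roles))
    # Configure only after all nodes have their software installed.
    return [configure_node(node) for node in nodes]


if __name__ == "__main__":
    started = time.perf_counter()
    cluster = start_cluster(num_slaves=3)
    elapsed = time.perf_counter() - started
    print(f"{len(cluster)} nodes ready in {elapsed:.2f}s")
```

With the launches overlapping, the wait should be roughly one boot time
rather than one for the master plus one for the slaves.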

What do you think?
-Adrian

Re: pipeline a bit better?

Posted by Tom White <to...@cloudera.com>.

This is a hangover from the Python scripts, but I agree that it would
make sense to start all the nodes simultaneously and run a post
configure script (which, incidentally, is what ZooKeeper and Cassandra
do).

Do you want to open a JIRA?

Cheers,
Tom

On Sat, Aug 14, 2010 at 1:04 PM, Adrian Cole <ad...@opscode.com> wrote:
> Hi, team.
>
> As you know, a large part of the startup time in a cloud is spent spinning
> up "blank" machines, then installing the base software.  I've noticed that
> we start the hadoop master and configure it, then the slaves.  I'm pretty
> sure we can eliminate a few minutes from this by starting the masters and
> slaves up simultaneously, then configuring them after the software is on.
> Initially, this can be done via runScriptOnNodes... later, this could be
> deferred to another system like chef.
>
> What do you think?
> -Adrian
>