Posted to common-dev@hadoop.apache.org by Andrew Klochkov <di...@gmail.com> on 2010/04/05 13:56:29 UTC

a patch for src/contrib/cloud scripts

Hi,

I'm using the new src/contrib/cloud scripts that Tom White created recently.
Great work: now we can easily deploy 0.20.x onto EC2. Thanks, Tom!

I've just fixed several things in the scripts:

1. A typo in src/py/hadoop/cloud/cli.py: the "push" command actually
invokes proxy creation.
2. Modified src/py/hadoop/cloud/service.py to copy the AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY variables into the user data script environment, so
that the script can later fill them into the Hadoop configs.
3. Fixed the user data script to populate /home/hadoop/.bashrc in addition to
root's .bashrc. With the new scripts it seems better to submit jobs as the
"hadoop" user, since some HDFS permissions prevent running jobs as "root".

Should I open a JIRA issue for this? Or should I post the patch here, on the
mailing list?
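
For illustration only, the idea behind item 2 in the list above might look
roughly like the sketch below. This is not the actual patch and not the real
service.py code: the helper name with_aws_credentials and the NAME=value
string format are assumptions made for the example.

```python
import os

# Names of the credentials mentioned in item 2 of the list.
AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def with_aws_credentials(env_strings, environ=None):
    """Return a copy of env_strings with AWS credentials appended.

    Hypothetical helper: env_strings is assumed to be a list of
    "NAME=value" strings destined for the user data script environment.
    Variables that are not set in environ are skipped.
    """
    environ = os.environ if environ is None else environ
    result = list(env_strings)  # do not mutate the caller's list
    for name in AWS_VARS:
        value = environ.get(name)
        if value is not None:
            result.append("%s=%s" % (name, value))
    return result
```

Passing the environment as a parameter keeps the helper easy to check
without touching the real process environment.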

I'm also wondering whether it would be a good idea to create a "hardware"
provider, so that the cloud scripts could be used for hardware (non-cloud)
cluster deployment. What do you think? Is there a better way to automate
hardware cluster deployment?

-- 
Andrew Klochkov

Re: a patch for src/contrib/cloud scripts

Posted by Tom White <to...@cloudera.com>.
Hi Chris,

Thanks for the suggestion. I've been having similar thoughts, so I've
created a proposal [1] for a new incubator project for cloud services
(not just Hadoop). It's still in draft, and I'd welcome any feedback
you or others may have.

Cheers,
Tom

[1] http://www.mail-archive.com/general@incubator.apache.org/msg24373.html

On Sat, Apr 10, 2010 at 11:40 AM, Chris K Wensel <ch...@wensel.net> wrote:
> I personally would love to see more of the orthogonal stuff (like these scripts) show up in github and out of contrib so the friction to contribution is much lower.
>
> i recognize this is somewhat heretical, but this does make non-critical things more democratic, if you will.
>
> fyi http://github.com/clj-sys/crane is already a clojure/java version of these scripts. adrian on the jclouds project is contributing to it to get vendor independence to some degree.
>
> ckw
>
> On Apr 9, 2010, at 5:11 AM, Andrew Klochkov wrote:
>
>> On Mon, Apr 5, 2010 at 6:04 PM, Tom White <to...@cloudera.com> wrote:
>>
>>> On Mon, Apr 5, 2010 at 4:56 AM, Andrew Klochkov <di...@gmail.com> wrote:
>>>>
>>>> And also I'm thinking if it's a good idea to create a "hardware" provider to
>>>> use cloud scripts for hardware (non-cloud) cluster deployment. What do you
>>>> think? Is there a better way to automate hardware cluster deployment?
>>>
>>> That would be a great addition.
>>>
>>
>> Submitted it here: https://issues.apache.org/jira/browse/HADOOP-6696
>>
>> But after I discovered Hadoop On Demand, I'm not sure it worth using
>> contrib/cloud for hardware clusters deployment. So now I'm going to try HOD,
>> and probably some configuration management tools like Chef or SmartFrog.
>>
>> --
>> Andrew Klochkov
>
> --
> Chris K Wensel
> chris@concurrentinc.com
> http://www.concurrentinc.com
>
>

Re: a patch for src/contrib/cloud scripts

Posted by Chris K Wensel <ch...@wensel.net>.
I personally would love to see more of the orthogonal stuff (like these scripts) show up on GitHub and move out of contrib, so the friction to contribution is much lower.

I recognize this is somewhat heretical, but it does make non-critical things more democratic, if you will.

FYI, http://github.com/clj-sys/crane is already a Clojure/Java version of these scripts. Adrian on the jclouds project is contributing to it to get vendor independence to some degree.

ckw

On Apr 9, 2010, at 5:11 AM, Andrew Klochkov wrote:

> On Mon, Apr 5, 2010 at 6:04 PM, Tom White <to...@cloudera.com> wrote:
> 
>> On Mon, Apr 5, 2010 at 4:56 AM, Andrew Klochkov <di...@gmail.com> wrote:
>>> 
>>> And also I'm thinking if it's a good idea to create a "hardware" provider to
>>> use cloud scripts for hardware (non-cloud) cluster deployment. What do you
>>> think? Is there a better way to automate hardware cluster deployment?
>> 
>> That would be a great addition.
>> 
> 
> Submitted it here: https://issues.apache.org/jira/browse/HADOOP-6696
> 
> But after I discovered Hadoop On Demand, I'm not sure it worth using
> contrib/cloud for hardware clusters deployment. So now I'm going to try HOD,
> and probably some configuration management tools like Chef or SmartFrog.
> 
> -- 
> Andrew Klochkov

--
Chris K Wensel
chris@concurrentinc.com
http://www.concurrentinc.com


Re: a patch for src/contrib/cloud scripts

Posted by Andrew Klochkov <di...@gmail.com>.
On Mon, Apr 5, 2010 at 6:04 PM, Tom White <to...@cloudera.com> wrote:

> On Mon, Apr 5, 2010 at 4:56 AM, Andrew Klochkov <di...@gmail.com> wrote:
> >
> > And also I'm thinking if it's a good idea to create a "hardware" provider to
> > use cloud scripts for hardware (non-cloud) cluster deployment. What do you
> > think? Is there a better way to automate hardware cluster deployment?
>
> That would be a great addition.
>

Submitted it here: https://issues.apache.org/jira/browse/HADOOP-6696

But having since discovered Hadoop On Demand (HOD), I'm not sure it's worth
using contrib/cloud for hardware cluster deployment. So now I'm going to try
HOD, and probably some configuration management tools like Chef or SmartFrog.

-- 
Andrew Klochkov

Re: a patch for src/contrib/cloud scripts

Posted by Tom White <to...@cloudera.com>.
On Mon, Apr 5, 2010 at 4:56 AM, Andrew Klochkov <di...@gmail.com> wrote:
> Hi,
>
> I'm using new src/contrib/cloud scripts created by Tom White recently. Great
> work, now we can easily deploy 0.20.x onto EC2. Thanks, Tom!
>
> I've just fixed several things in the scripts:
>
> 1. A typo in src/py/hadoop/cloud/cli.py: on "push" command the script
> actually invokes proxy creation
> 2. Modified src/py/hadoop/cloud/service.py, to copy AWS_ACCESS_KEY_ID
> and AWS_SECRET_ACCESS_KEY variables into a user data script environment, so
> later it can fill them in hadoop configs.
> 3. Fixed user data script to fill /home/hadoop/.bashrc file additionally to
> root's .bashrc. Seems like with new scripts, it's better to issue jobs from
> "hadoop" user. Some HDFS permissions don't let to run jobs from "root" user.
>
> Should I open a JIRA issue on it? Or should I post the patch here, in the
> mailing list?

Please open JIRAs and attach patches there.

>
> And also I'm thinking if it's a good idea to create a "hardware" provider to
> use cloud scripts for hardware (non-cloud) cluster deployment. What do you
> think? Is there a better way to automate hardware  cluster deployment?

That would be a great addition.

Cheers,
Tom

>
> --
> Andrew Klochkov
>