Posted to user@whirr.apache.org by Marco Didonna <m....@gmail.com> on 2011/10/30 09:28:20 UTC

CDH behaves differently in the cloud?

Hello everyone,
I am pretty new to Whirr and would like to share some doubts with you.
On my laptop I am running the latest version of CDH available from the
repositories, and I expected it to behave the same way in the cloud,
since I used the "cloudera recipe" to launch a CDH cluster on EC2 with
the m1.large instance type. Here's the Whirr config file:
http://pastebin.com/JXHYvMNb
First of all, it seems that Whirr installs a different version of CDH in
the cloud, according to what "hadoop version" reports. I don't think this
is a big deal, but still...
Secondly, some options have absolutely no effect:

- hadoop-mapreduce.mapred.child.java.opts=-Xmx1600m is basically
ignored, even though it is present in mapred-site (on my laptop it
works). According to ps aux, the child JVMs still run with -Xmx200m,
the default. I've also tried manually editing mapred-site to add
mapred.map.child.java.opts=-Xmx1000m, and I got an error saying there
was not enough space (4GB were free).

- hadoop-hdfs.dfs.replication=1 is ignored as well, since I see a
replication factor of 3 when I move my data from S3 to HDFS.

- hadoop-hdfs.dfs.block.size=134217728: I actually don't know how to
check whether this has affected the HDFS config.
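
One way to check what actually ended up in a site file is to parse it and
print the properties of interest. The following is only a minimal sketch:
it assumes the usual Hadoop site-file layout (a <configuration> element
containing <property> entries with <name> and <value>), and the sample XML
is hypothetical, standing in for the hdfs-site.xml found on a cluster node.
Note that a value present in the file shows what was written to disk, not
necessarily what a given client or daemon uses at runtime.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample standing in for the hdfs-site.xml on a cluster node.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
"""

props = read_site_properties(SAMPLE_HDFS_SITE)

# dfs.block.size is expressed in bytes; convert it to MiB for readability.
block_size_mib = int(props["dfs.block.size"]) // (1024 * 1024)

print("dfs.replication =", props["dfs.replication"])
print("dfs.block.size  =", props["dfs.block.size"], "bytes =", block_size_mib, "MiB")
```

To use it against a real node, the sample string would be replaced by the
contents of the site file copied from that node.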

I hope someone can shed some light on this and maybe share the
configuration tweaks they are using for m1.large.

Thank you

Marco Didonna

Re: CDH behaves differently in the cloud?

Posted by Marco Didonna <m....@gmail.com>.
What do you mean by "check the remote config files"? I did check them to
understand why some options were completely ignored... the config files
seem OK, with only some minor differences compared to my own
(pseudo-distributed mode) config files.

Thanks for your answer,

Marco

On 1 November 2011 13:31, Andrei Savu <sa...@gmail.com> wrote:
> Right now Whirr is using the latest release from Cloudera, in this case
> cdh3u2, which was released recently. It's possible that our setup scripts are affected
> by this upgrade. We are tracking progress on this in:

Re: CDH behaves differently in the cloud?

Posted by Andrei Savu <sa...@gmail.com>.
Marco -

Can you check the remote config files and versions for CDH Hadoop?

Right now Whirr is using the latest release from Cloudera, in this case
cdh3u2, which was released recently. It's possible that our setup scripts are affected
by this upgrade. We are tracking progress on this in:
https://issues.apache.org/jira/browse/WHIRR-415
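
As a rough illustration of the kind of check meant here, the sketch below
compares the overrides from a Whirr properties file with the values found
in the remote site files. It rests on assumptions: the prefix-to-file
mapping is inferred from the property names in this thread, the helper
names are made up for this example, and the OBSERVED values are
hypothetical placeholders, not output from a real cluster.

```python
# Assumed mapping from Whirr property prefixes to Hadoop site files.
PREFIX_TO_FILE = {
    "hadoop-common.": "core-site.xml",
    "hadoop-hdfs.": "hdfs-site.xml",
    "hadoop-mapreduce.": "mapred-site.xml",
}


def expected_site_values(whirr_properties_text):
    """Group prefixed Whirr overrides by the site file they should land in."""
    expected = {}
    for line in whirr_properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        for prefix, site_file in PREFIX_TO_FILE.items():
            if key.startswith(prefix):
                expected.setdefault(site_file, {})[key[len(prefix):]] = value.strip()
    return expected


def find_mismatches(expected, observed):
    """Return (site_file, name, wanted, found) for every differing value."""
    mismatches = []
    for site_file, props in expected.items():
        for name, want in props.items():
            got = observed.get(site_file, {}).get(name)
            if got != want:
                mismatches.append((site_file, name, want, got))
    return mismatches


# The three overrides mentioned in the original message.
WHIRR_OVERRIDES = """
hadoop-mapreduce.mapred.child.java.opts=-Xmx1600m
hadoop-hdfs.dfs.replication=1
hadoop-hdfs.dfs.block.size=134217728
"""

# Hypothetical values, as if read from the site files on a cluster node.
OBSERVED = {
    "mapred-site.xml": {"mapred.child.java.opts": "-Xmx1600m"},
    "hdfs-site.xml": {"dfs.replication": "3", "dfs.block.size": "134217728"},
}

expected = expected_site_values(WHIRR_OVERRIDES)
mismatches = find_mismatches(expected, OBSERVED)
for site_file, name, want, got in mismatches:
    print(f"{site_file}: {name} expected {want!r}, found {got!r}")
```

An empty result would suggest the overrides reached the site files, in
which case the difference in behaviour would have to come from somewhere
else, for example from the configuration of the client submitting the job.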

Thanks,

-- Andrei Savu
