Posted to user@whirr.apache.org by Bo Shi <bs...@gmail.com> on 2010/11/25 09:23:19 UTC

Whirr and CDH3

Hello,

I have created a custom AMI derived from the Canonical Ubuntu 10.04 image
which has some large EBS volumes attached and mounted at /mnt/hadoop and
/mnt/hadoop2 by default.  I've been trying to use whirr to set up a cluster
using this custom AMI and using CDH3beta without much success; I've run into
two problems which I think I can characterize but I'm at a loss about how to
solve them.


Issue 1
=======

I'm using the whirr source release with WHIRR-137 as the last commit and I have
run into the same issue encountered below where whirr is not able to log into
the nodes instantiated by jclouds:

https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/73735107db0cbc05/7f3858dc7abfb80f?lnk=raot&fwc=1



The root of the issue is that jclouds appears not to recognize my Ubuntu
derived image as part of the Ubuntu OS family (see "operatingSystem=[...,
family=unrecognized,...]" in the log sample below).

DEBUG [jclouds.compute] (main) <<   matched
image([id=us-east-1/ami-f74eb99e, name=null,
operatingSystem=[name=null, family=unrecognized, version=,
arch=paravirtual, is64Bit=false,
description=098149836275/custom-ubuntu-node-small],
description=Small/Medium Ubuntu 10.04 Cluster Node 2x200 GB EBS
Attached, version=small, location=[id=us-east-1, scope=REGION,
description=us-east-1, parent=ec2]])
INFO  [org.apache.whirr.service.hadoop.HadoopService] (main) Starting
master node

...

java.util.concurrent.ExecutionException: org.jclouds.ssh.SshException:
root@50.16.75.60:22: Error connecting to sftp.


If I boot up the canonical Ubuntu 10.04 EC2 image, jclouds appears to recognize
it as an Ubuntu image and knows to log into the instance as the "ubuntu" user
and to run scripts with sudo from there (see "family=ubuntu" in the log sample
below).  Unfortunately, I haven't been able to find anything in the jclouds or
Amazon EC2 documentation that hints at what I need to do to get jclouds to
recognize my Ubuntu-derived AMI as an Ubuntu image.  As a result, jclouds tries
to log in to the custom image as root and subsequently fails.


DEBUG [jclouds.compute] (main) <<   matched
image([id=us-east-1/ami-480df921, name=null,
operatingSystem=[name=null, family=ubuntu, version=10.04,
arch=paravirtual, is64Bit=false,
description=099720109477/ebs/ubuntu-images/ubuntu-lucid-10.04-i386-server-20101020],
description=099720109477/ebs/ubuntu-images/ubuntu-lucid-10.04-i386-server-20101020,
version=20101020, location=[id=us-east-1, scope=REGION,
description=us-east-1, parent=ec2]])
2010-11-24 23:34:14,771 INFO
[org.apache.whirr.service.hadoop.HadoopService] (main) Starting master
node
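
The difference between the two log samples comes down to the family field.
As a minimal sketch (not jclouds code), the snippet below pulls that field out
of a "matched image" log entry and maps it to the login user observed in the
logs above: "ubuntu" for family=ubuntu, root otherwise. The function names and
the mapping table are hypothetical and only encode the behaviour seen here.

```python
import re

# Login users as observed in the logs above; this table is an assumption
# made for illustration, not the actual jclouds lookup.
OBSERVED_LOGIN_USER = {"ubuntu": "ubuntu"}


def os_family(log_entry):
    """Return the value of the family= field in a log entry, or None."""
    match = re.search(r"family=([^,\]\s]+)", log_entry)
    return match.group(1) if match else None


def observed_login_user(family):
    """Map an OS family to the login user seen in the logs (root as fallback)."""
    return OBSERVED_LOGIN_USER.get(family, "root")


custom = "operatingSystem=[name=null, family=unrecognized, version=, arch=paravirtual,"
stock = "operatingSystem=[name=null, family=ubuntu, version=10.04, arch=paravirtual,"

for entry in (custom, stock):
    family = os_family(entry)
    print(family, "->", observed_login_user(family))
```

Running the same extraction over a full Whirr debug log would show quickly
which images end up in the root fallback.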


Issue 2
=======

For Ubuntu AMIs which can properly instantiate nodes, the installation script
"cloudera/cdh/install" seems to fail because the hadoop user is not created.
I was under the impression that the scripts handled user creation:


DEBUG [jclouds.compute] (user thread 1) << stdout from runscript as
ubuntu@184.73.17.198
^M
Setting up hadoop-0.20 (0.20.2+737-1~lucid-cdh3b3) ...^M
update-alternatives: using /etc/hadoop-0.20/conf.empty to provide
/etc/hadoop-0.20/conf (hadoop-0.20-conf) in auto mode.^M
update-alternatives: using /usr/bin/hadoop-0.20 to provide
/usr/bin/hadoop (hadoop-default) in auto mode.^M
^M
Setting up hadoop-0.20-native (0.20.2+737-1~lucid-cdh3b3) ...^M
^M
Processing triggers for libc-bin ...^M
ldconfig deferred processing now taking place^M
update-alternatives: using /etc/hadoop-0.20/conf.dist to provide
/etc/hadoop-0.20/conf (hadoop-0.20-conf) in auto mode.^M

2010-11-24 23:36:57,352 DEBUG [jclouds.compute] (user thread 1) <<
stderr from runscript as ubuntu@184.73.17.198
+ DFS_DATA_DIR=/mnt/hadoop/hdfs/data,/mnt/hadoop2^M
+ MAPRED_LOCAL_DIR=/mnt/hadoop/mapred/local^M
+ MAX_MAP_TASKS=2^M
+ MAX_REDUCE_TASKS=1^M
+ CHILD_OPTS=-Xmx550m^M
+ CHILD_ULIMIT=1126400^M
+ TMP_DIR='/mnt/tmp/hadoop-${user.name}'^M
+ mkdir -p /mnt/hadoop^M
+ chown hadoop:hadoop /mnt/hadoop^M
chown: invalid user: `hadoop:hadoop'^M
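
The trace stops at chown because no hadoop account exists on the node. One way
to confirm that on a running instance is a small preflight check. The sketch
below uses Python's standard pwd and grp modules (Unix only); it is a
hypothetical diagnostic, not part of the Whirr or CDH install scripts.

```python
import grp
import pwd


def user_exists(name):
    """True if the system knows a local account with this name."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def group_exists(name):
    """True if the system knows a group with this name."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        return False


def can_chown_to(owner_spec):
    """Check a 'user:group' spec like the one passed to chown in the trace."""
    user, _, group = owner_spec.partition(":")
    return user_exists(user) and (not group or group_exists(group))


spec = "hadoop:hadoop"
if can_chown_to(spec):
    print(spec, "is usable")
else:
    print(spec, "is not usable: user or group missing")
```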







Here is my test configuration:


   whirr.service-name=hadoop
   whirr.cluster-name=testcluster
   whirr.instance-templates=1 jt+nn,1 dn+tt

   whirr.provider=ec2
   whirr.identity=***
   whirr.credential=***
   whirr.private-key-file=${sys:user.home}/.ssh/id_rsa

   # Uploaded to the following

   # FFFFFFFffffffffffuuuuuuuuuuu
   # Full path is the combination of the two below
   whirr.hadoop-install-runurl=cloudera/cdh/install
   whirr.run-url-base=http://production.tinycorp.clusters.s3.amazonaws.com/

   # Confirm the Lucid (10.04) kernel is up to date wrt below:
   # https://bugs.launchpad.net/ubuntu-on-ec2/+bug/574910

   # Custom image with a bunch of EBS
   # whirr.image-id=us-east-1/ami-f74eb99e

   # Default image is Amazon Linux AMI which is based on RHEL 5
   # Canonical Ubuntu 10.04 32-bit EBS AMI
   whirr.image-id=us-east-1/ami-480df921
   # CDH Ubuntu 8.10
   # whirr.image-id=us-east-1/ami-ed59bf84
   # https://issues.apache.org/jira/browse/WHIRR-137
   jclouds.ec2.ami-owners=098149836275,726089167552,099720109477
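
The configuration comment says the full script location is the combination of
whirr.run-url-base and whirr.hadoop-install-runurl. As a minimal sketch of that
combination, the snippet below parses the two properties and joins them with
urllib.parse.urljoin. That Whirr joins them in exactly this way is an
assumption, and parse_properties is a hypothetical helper that only handles
simple key=value lines.

```python
from urllib.parse import urljoin

# A subset of the configuration above, copied verbatim.
CONFIG = """
whirr.hadoop-install-runurl=cloudera/cdh/install
whirr.run-url-base=http://production.tinycorp.clusters.s3.amazonaws.com/
# whirr.image-id=us-east-1/ami-f74eb99e
whirr.image-id=us-east-1/ami-480df921
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(CONFIG)
install_url = urljoin(props["whirr.run-url-base"],
                      props["whirr.hadoop-install-runurl"])
print(install_url)
```

Fetching the printed URL by hand is a quick way to check that the uploaded
install script is reachable from the nodes.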

Re: Whirr and CDH3

Posted by Tom White <to...@cloudera.com>.
On Thu, Nov 25, 2010 at 12:23 AM, Bo Shi <bs...@gmail.com> wrote:
> Hello,
>
> I have created a custom AMI derived from the Canonical Ubuntu 10.04 image
> which has some large EBS volumes attached and mounted at /mnt/hadoop and
> /mnt/hadoop2 by default.  I've been trying to use whirr to set up a cluster
> using this custom AMI and using CDH3beta without much success; I've run into
> two problems which I think I can characterize but I'm at a loss about how to
> solve them.
>
>
> Issue 1
> =======
>
> I'm using the whirr source release with WHIRR-137 as the last commit and I have
> run into the same issue encountered below where whirr is not able to log into
> the nodes instantiated by jclouds:
>
> https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/73735107db0cbc05/7f3858dc7abfb80f?lnk=raot&fwc=1
>
>
>
> The root of the issue is that jclouds appears not to recognize my Ubuntu
> derived image as part of the Ubuntu OS family (see "operatingSystem=[...,
> family=unrecognized,...]" in the log sample below).
>
> DEBUG [jclouds.compute] (main) <<   matched
> image([id=us-east-1/ami-f74eb99e, name=null,
> operatingSystem=[name=null, family=unrecognized, version=,
> arch=paravirtual, is64Bit=false,
> description=098149836275/custom-ubuntu-node-small],
> description=Small/Medium Ubuntu 10.04 Cluster Node 2x200 GB EBS
> Attached, version=small, location=[id=us-east-1, scope=REGION,
> description=us-east-1, parent=ec2]])
> INFO  [org.apache.whirr.service.hadoop.HadoopService] (main) Starting
> master node
>
> ...
>
> java.util.concurrent.ExecutionException: org.jclouds.ssh.SshException:
> root@50.16.75.60:22: Error connecting to sftp.
>
>
> If I boot up the canonical Ubuntu 10.04 EC2 image, jclouds appears to recognize
> it as an Ubuntu image and knows to log into the instance as the "ubuntu" user
> and to run scripts with sudo from there (see "family=ubuntu" in the log sample
> below).  Unfortunately, I haven't been able to find anything in the jclouds or
> Amazon EC2 documentation that hints at what I need to do to get jclouds to
> recognize my Ubuntu-derived AMI as an Ubuntu image.  As a result, jclouds tries
> to log in to the custom image as root and subsequently fails.

I believe this happens in jclouds code. Adrian, do you know how to control this?

>
>
> DEBUG [jclouds.compute] (main) <<   matched
> image([id=us-east-1/ami-480df921, name=null,
> operatingSystem=[name=null, family=ubuntu, version=10.04,
> arch=paravirtual, is64Bit=false,
> description=099720109477/ebs/ubuntu-images/ubuntu-lucid-10.04-i386-server-20101020],
> description=099720109477/ebs/ubuntu-images/ubuntu-lucid-10.04-i386-server-20101020,
> version=20101020, location=[id=us-east-1, scope=REGION,
> description=us-east-1, parent=ec2]])
> 2010-11-24 23:34:14,771 INFO
> [org.apache.whirr.service.hadoop.HadoopService] (main) Starting master
> node
>
>
> Issue 2
> =======
>
> For Ubuntu AMIs which can properly instantiate nodes, the installation script
> "cloudera/cdh/install" seems to fail because the hadoop user is not created.
> I was under the impression that the scripts handled user creation:
>
>
> DEBUG [jclouds.compute] (user thread 1) << stdout from runscript as
> ubuntu@184.73.17.198
> ^M
> Setting up hadoop-0.20 (0.20.2+737-1~lucid-cdh3b3) ...^M
> update-alternatives: using /etc/hadoop-0.20/conf.empty to provide
> /etc/hadoop-0.20/conf (hadoop-0.20-conf) in auto mode.^M
> update-alternatives: using /usr/bin/hadoop-0.20 to provide
> /usr/bin/hadoop (hadoop-default) in auto mode.^M
> ^M
> Setting up hadoop-0.20-native (0.20.2+737-1~lucid-cdh3b3) ...^M
> ^M
> Processing triggers for libc-bin ...^M
> ldconfig deferred processing now taking place^M
> update-alternatives: using /etc/hadoop-0.20/conf.dist to provide
> /etc/hadoop-0.20/conf (hadoop-0.20-conf) in auto mode.^M
>
> 2010-11-24 23:36:57,352 DEBUG [jclouds.compute] (user thread 1) <<
> stderr from runscript as ubuntu@184.73.17.198
> + DFS_DATA_DIR=/mnt/hadoop/hdfs/data,/mnt/hadoop2^M
> + MAPRED_LOCAL_DIR=/mnt/hadoop/mapred/local^M
> + MAX_MAP_TASKS=2^M
> + MAX_REDUCE_TASKS=1^M
> + CHILD_OPTS=-Xmx550m^M
> + CHILD_ULIMIT=1126400^M
> + TMP_DIR='/mnt/tmp/hadoop-${user.name}'^M
> + mkdir -p /mnt/hadoop^M
> + chown hadoop:hadoop /mnt/hadoop^M
> chown: invalid user: `hadoop:hadoop'^M

I think the scripts for CDH in trunk are not working correctly. I just
noticed the same thing when working on
https://issues.apache.org/jira/browse/WHIRR-87, and will post a fix as
a part of that issue.

Cheers
Tom
