Posted to user@whirr.apache.org by Benjamin Clark <be...@daltonclark.com> on 2011/03/16 17:54:58 UTC
aws 64-bit c1.xlarge problems
I have been using whirr 0.4 branch to launch clusters of c1.medium amazon linux machines (whirr.image-id=us-east-1/ami-d59d6bbc, which was the default for new amazon linux instances, a few days ago) with good success. I took the default hadoop-ec2.properties recipe and modified it slightly to suit my needs. I'm now trying with basically the same properties file, but when I use
whirr.hardware-id=c1.xlarge
and then either this (from the recipe)
# Ubuntu 10.04 LTS Lucid. See http://alestic.com/
whirr.image-id=us-east-1/ami-da0cf8b3
or this:
# Amazon linux 64-bit, default as of 3/11:
whirr.image-id=us-east-1/ami-8e1fece7
I get a failure to install the right public key, so that I can't log into the name node (or any other nodes, for that matter).
My whole config file is this:
whirr.cluster-name=bhcL4
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
whirr.provider=aws-ec2
whirr.identity=...
whirr.credential=...
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop.pub
whirr.hardware-id=c1.xlarge
#whirr.hardware-id=c1.medium
# Ubuntu 10.04 LTS Lucid. See http://alestic.com/
whirr.image-id=us-east-1/ami-da0cf8b3
# Amazon linux as of 3/11:
#whirr.image-id=us-east-1/ami-8e1fece7
# If you choose a different location, make sure whirr.image-id is updated too
whirr.location-id=us-east-1d
hadoop-hdfs.dfs.permissions=false
hadoop-hdfs.dfs.replication=2
Am I doing something wrong here? I tried with both whirr.location-id=us-east-1d and whirr.location-id=us-east-1.
Re: aws 64-bit c1.xlarge problems
Posted by Andrei Savu <sa...@gmail.com>.
Thanks for sharing. I'm thinking about defining a set of supported
AMIs / OSes for Whirr and testing them all when building a new release.
Re: aws 64-bit c1.xlarge problems
Posted by Benjamin Clark <be...@daltonclark.com>.
I read the manual and figured a bit more of this out. Amazon may change the defaults in their console without an announcement, but they document what they're doing here: http://aws.amazon.com/amazon-linux-ami/
The /media/ephemeral0 mount is for their Amazon linux instances that have S3-backed non-durable storage. It seems the ebs-backed ones have no non-durable storage by default, while the S3-backed ones do, but in that eccentric location (eccentric relative to what everybody else does on Amazon). So if we like the S3-backed instances, we can hack the install_cdh_hadoop.sh script by adding
rm -rf /mnt
ln -s /media/ephemeral0 /mnt
or we can write a whole thing that spins up an ebs volume per node and attaches it, for the ebs-backed ones.
Is there any experience among the users as to which will be more stable and perform better? I've got the S3-backed one working, so I'll use that and just bake it off against the Alestic/ubuntu system that now also works for me, unless there's a compelling case for the ebs-backed thing.
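For reference, a hedged sketch of the same hack with a guard, in case /media/ephemeral0 is absent on a given AMI. The function name and the existence check are illustrative additions; the paths are the ones from the message above.

```shell
# Guarded form of the /mnt symlink hack for install_cdh_hadoop.sh.
# link_ephemeral and the [ -d ] guard are illustrative additions;
# /media/ephemeral0 and /mnt are the paths from the message above.
link_ephemeral() {
  local src="$1" dst="$2"
  if [ -d "$src" ]; then
    rm -rf "$dst"          # drop the stock mount point
    ln -s "$src" "$dst"    # repoint it at the ephemeral store
  fi
}

# In install_cdh_hadoop.sh one would then call:
# link_ephemeral /media/ephemeral0 /mnt
```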
--Ben
>> On Mar 17, 2011, at 7:01 PM, Andrei Savu wrote:
>>
>>> Strange! I will try your properties file tomorrow.
>>>
>>> If you want to try again you can find the artifacts for 0.4.0 RC1 here:
>>> http://people.apache.org/~asavu/whirr-0.4.0-incubating-candidate-1
>>>
>>> On Thu, Mar 17, 2011 at 8:41 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>>>> Andrei,
>>>>
>>>> Thanks for looking at this. Unfortunately it does not seem to work.
>>>>
>>>> Using the Amazon linux 64-bit ami with no whirr.cluster-user, or if I set it to 'ben' or whatever else, I get this.
>>>>
>>>> 1) SshException on node us-east-1/i-62de280d:
>>>> org.jclouds.ssh.SshException: ec2-user@72.44.35.254:22: Error connecting to session.
>>>> at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
>>>> at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:206)
>>>> at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:90)
>>>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:70)
>>>>
>>>> So it doesn't seem to be honoring that property, and it's definitely not allowing me to log in to any nodes, 'ben', 'ec2-user' or 'root'.
>>>>
>>>> The ubuntu ami from the recipes continues to work fine.
>>>>
>>>> Here's the full config file I'm using. I grabbed the recipe from trunk and put my stuff back in, to make sure I'm not missing a new setting:
>>>>
>>>> whirr.cluster-name=bhcTL
>>>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
>>>> whirr.hadoop-install-function=install_cdh_hadoop
>>>> whirr.hadoop-configure-function=configure_cdh_hadoop
>>>> whirr.provider=aws-ec2
>>>> whirr.identity=${env:AWS_ACCESS_KEY_ID}
>>>> whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
>>>> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-hkey
>>>> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-hkey.pub
>>>> whirr.cluster-user=ben
>>>> # Amazon linux 32-bit--works
>>>> #whirr.hardware-id=c1.medium
>>>> #whirr.image-id=us-east-1/ami-d59d6bbc
>>>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/ -- works
>>>> #whirr.hardware-id=c1.xlarge
>>>> #whirr.image-id=us-east-1/ami-da0cf8b3
>>>> # Amazon linux 64-bit as of 3/11:--doesn't work
>>>> whirr.hardware-id=c1.xlarge
>>>> whirr.image-id=us-east-1/ami-8e1fece7
>>>> #Cluster compute --doesn't work
>>>> #whirr.hardward-id=cc1.4xlarge
>>>> #whirr.image-id=us-east-1/ami-321eed5b
>>>> whirr.location-id=us-east-1d
>>>> hadoop-hdfs.dfs.permissions=false
>>>> hadoop-hdfs.dfs.replication=2
>>>>
>>>>
>>>> --Ben
>>>>
>>>>
>>>>
>>>>
>>>> On Mar 17, 2011, at 1:08 PM, Andrei Savu wrote:
>>>>
>>>>> Ben, could you give it one more try using the current trunk?
>>>>>
>>>>> You can specify the user by setting the option whirr.cluster-user
>>>>> (defaults to current system user).
>>>>>
>>>>> On Wed, Mar 16, 2011 at 11:23 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>>>>>> Andrei,
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> After patching with 158, it launches fine as me on that Ubuntu image from the recipe (i.e. on my client machine I am 'ben', so the aws user that has sudo, and that I can log in as, is also 'ben'), so that looks good.
>>>>>>
>>>>>> But it's now doing this with amazon linux (ami-da0cf8b3, which was the default 64-bit ami a few days ago, and may still be) during launch:
>>>>>>
>>>>>> 1) SshException on node us-east-1/i-b2678ddd:
>>>>>> org.jclouds.ssh.SshException: ben@50.16.96.211:22: Error connecting to session.
>>>>>> at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
>>>>>> at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:206)
>>>>>> at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:90)
>>>>>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:70)
>>>>>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:45)
>>>>>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>
>>>>>> So it seems as if the key part of jclouds authentication setup is still failing for the amazon linux/ec2-user scenario, i.e. trying to set up as the local user, but failing.
>>>>>>
>>>>>> Is there a property for the user it launches as? Or does it just do whichever user you are locally, instead of ec2-user/ubuntu/root, depending on the default, as before?
>>>>>>
>>>>>> I can switch to ubuntu, but I have a fair amount of native code setup in my custom scripts and would prefer to stick with a redhattish version if possible.
>>>>>>
>>>>>> Looking ahead, I want to benchmark plain old 64-bit instances against cluster instances, to see if the allegedly improved networking gives us a boost, and the available ones I see are Suse and Amazon linux. When I switch to the amazon linux one, like so:
>>>>>>
>>>>>> whirr.hardward-id=cc1.4xlarge
>>>>>> whirr.image-id=us-east-1/ami-321eed5b
>>>>>>
>>>>>> I get a different problem:
>>>>>>
>>>>>> Exception in thread "main" java.util.NoSuchElementException: hardwares don't support any images: [biggest=false, fastest=false, imageName=null, imageDescription=Amazon Linux AMI x86_64 HVM EBS EXT4, imageId=us-east-1/ami-321eed5b, imageVersion=ext4, location=[id=us-east-1, scope=REGION, description=us-east-1, parent=aws-ec2, iso3166Codes=[US-VA], metadata={}], minCores=0.0, minRam=0, osFamily=unrecognized, osName=null, osDescription=amazon/amzn-hvm-ami-2011.02.1-beta.x86_64-ext4, osVersion=, osArch=hvm, os64Bit=true, hardwareId=m1.small]
>>>>>> [[id=cc1.4xlarge, providerId=cc1.4xlarge, name=null, processors=[[cores=4.0, speed=4.0], [cores=4.0, speed=4.0]], ram=23552, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=840.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=840.0, device=/dev/sdc, durable=false, isBootDevice=false]], supportsI
>>>>>>
>>>>>> but I imagine that if using cluster instances is going to be possible, support for amazon linux will be needed.
>>>>>>
>>>>>> --Ben
>>>>>>
>>>>>>
>>>>>> On Mar 16, 2011, at 4:07 PM, Andrei Savu wrote:
>>>>>>
>>>>>>> I've seen something similar while testing Whirr: WHIRR-264 [0]. We are
>>>>>>> going to commit WHIRR-158 [1] tomorrow and it should fix the problem
>>>>>>> you are seeing. We should be able to restart the vote for the 0.4.0
>>>>>>> release after fixing this issue.
>>>>>>>
>>>>>>> [0] https://issues.apache.org/jira/browse/WHIRR-264
>>>>>>> [1] https://issues.apache.org/jira/browse/WHIRR-158
>>>>>>>
>>>>>>> -- Andrei Savu / andreisavu.ro
Re: aws 64-bit c1.xlarge problems
Posted by Benjamin Clark <be...@daltonclark.com>.
Andrei,
The release candidate code does work. Perhaps something was different relative to the patched frankenstein I was using, or perhaps I had some local corruption or a config problem.
It sets up everything as whatever my local user is, by default, and the override as whirr.cluster-user works as well.
In any case, at the rate AWS seems to be changing the configuration of 'amazon linux', perhaps it's less useful than I thought. Last week the default AMIs in the console had a bunch of spare disk space on the /media/ephemeral0 partition, which I could symlink /mnt to in the install_cdh_hadoop.sh script, and then hdfs would have a decent amount of space. Now there is no such thing, so I suppose I would have to launch an EBS volume per node and mount that. This is now tipping over into the "too much trouble" zone for me. And in the meantime I got all my native stuff (hadoop-lzo and R/Rhipe) working on Ubuntu, so I think I'm going to use the Alestic image from the recipe for a while. If there's an obvious candidate up there for a "reasonably-modern redhat-derivative AMI from a source on the good lists that behaves well," I'd like to know what it is. By 'reasonably modern' I mean having default python >= 2.5.
I liked the old custom of having /mnt be a separate partition of a decent size. I hope this is just a glitch with AWS. I suspect it may be because jclouds/whirr is showing (e.g.) in the output:
volumes=[[id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false]
So theoretically the disk space is still there on those non-boot, non-durable devices, but I cannot mount them.
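Those unmounted LOCAL devices could presumably be claimed by hand. Here is a hedged sketch: the device names come from the jclouds log above, but the mkfs/mount recipe is an assumption, not something verified in this thread; DRY_RUN=1 only prints the commands.

```shell
# Sketch: format and mount one of the instance-store devices that
# jclouds lists (/dev/sdb, /dev/sdc). The recipe is an assumption,
# not verified on these AMIs; it would be run as root on the node.
# DRY_RUN=1 prints the commands instead of executing them.
prepare_ephemeral() {
  local dev="$1" mnt="$2"
  run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }
  run mkfs.ext3 -F -q "$dev"   # destroys any data on $dev
  run mkdir -p "$mnt"
  run mount "$dev" "$mnt"
}

DRY_RUN=1 prepare_ephemeral /dev/sdb /mnt
```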
I also tried the cluster ami, because I am intrigued by the possibilities for good performance. Sounds great for hadoop, doesn't it? But it won't even start the nodes, giving this:
Configuring template
Unexpected error while starting 1 nodes, minimum 1 nodes for [hadoop-namenode, hadoop-jobtracker] of cluster bhcLA
java.util.concurrent.ExecutionException: org.jclouds.http.HttpResponseException: command: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [Non-Windows AMIs with a virtualization type of 'hvm' currently may only be used with Cluster Compute instance types.]
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.whirr.cluster.actions.BootstrapClusterAction$StartupProcess.waitForOutcomes(BootstrapClusterAction.java:307)
at org.apache.whirr.cluster.actions.BootstrapClusterAction$StartupProcess.call(BootstrapClusterAction.java:260)
at org.apache.whirr.cluster.actions.BootstrapClusterAction$StartupProcess.call(BootstrapClusterAction.java:221)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: org.jclouds.http.HttpResponseException: command: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [Non-Windows AMIs with a virtualization type of 'hvm' currently may only be used with Cluster Compute instance types.]
at org.jclouds.aws.handlers.ParseAWSErrorFromXmlContent.handleError(ParseAWSErrorFromXmlContent.java:75)
There must be something a bit more involved about specifying cluster instances in the Amazon API, perhaps not (yet) supported by jclouds? I'm afraid I don't need this enough right now to justify digging further.
Anyway, thanks for all your help and advice on this.
--Ben
On Mar 17, 2011, at 7:01 PM, Andrei Savu wrote:
> Strange! I will try your properties file tomorrow.
>
> If you want to try again you can find the artifacts for 0.4.0 RC1 here:
> http://people.apache.org/~asavu/whirr-0.4.0-incubating-candidate-1
>
> On Thu, Mar 17, 2011 at 8:41 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>> Andrei,
>>
>> Thanks for looking at this. Unfortunately it does not seem to work.
>>
>> Using the Amazon linux 64-bit ami with no whirr.cluster-user, or with it set to 'ben' or anything else, I get this:
>>
>> 1) SshException on node us-east-1/i-62de280d:
>> org.jclouds.ssh.SshException: ec2-user@72.44.35.254:22: Error connecting to session.
>> at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
>> at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:206)
>> at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:90)
>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:70)
>>
>> So it doesn't seem to be honoring that property, and it's definitely not allowing me to log in to any nodes, 'ben', 'ec2-user' or 'root'.
>>
>> The ubuntu ami from the recipes continues to work fine.
>>
>> Here's the full config file I'm using. I grabbed the recipe from trunk and put my stuff back in, to make sure I'm not missing a new setting:
>>
>> whirr.cluster-name=bhcTL
>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
>> whirr.hadoop-install-function=install_cdh_hadoop
>> whirr.hadoop-configure-function=configure_cdh_hadoop
>> whirr.provider=aws-ec2
>> whirr.identity=${env:AWS_ACCESS_KEY_ID}
>> whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
>> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-hkey
>> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-hkey.pub
>> whirr.cluster-user=ben
>> # Amazon linux 32-bit--works
>> #whirr.hardware-id=c1.medium
>> #whirr.image-id=us-east-1/ami-d59d6bbc
>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/ -- works
>> #whirr.hardware-id=c1.xlarge
>> #whirr.image-id=us-east-1/ami-da0cf8b3
>> # Amazon linux 64-bit as of 3/11:--doesn't work
>> whirr.hardware-id=c1.xlarge
>> whirr.image-id=us-east-1/ami-8e1fece7
>> #Cluster compute --doesn't work
>> #whirr.hardward-id=cc1.4xlarge
>> #whirr.image-id=us-east-1/ami-321eed5b
>> whirr.location-id=us-east-1d
>> hadoop-hdfs.dfs.permissions=false
>> hadoop-hdfs.dfs.replication=2
>>
>>
>> --Ben
>>
>>
>>
>>
>> On Mar 17, 2011, at 1:08 PM, Andrei Savu wrote:
>>
>>> Ben, could you give it one more try using the current trunk?
>>>
>>> You can specify the user by setting the option whirr.cluster-user
>>> (defaults to current system user).
>>>
>>> On Wed, Mar 16, 2011 at 11:23 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>>>> Andrei,
>>>>
>>>> Thanks.
>>>>
>>>> After patching with 158, it launches fine as me on that Ubuntu image from the recipe (i.e. on my client machine I am 'ben', so now the aws user that has sudo, and as whom I can log in is also 'ben'), so that looks good.
>>>>
>>>> But it's now doing this with amazon linux (ami-da0cf8b3, which was the default 64-bit ami a few days ago, and may still be) during launch:
>>>>
>>>> 1) SshException on node us-east-1/i-b2678ddd:
>>>> org.jclouds.ssh.SshException: ben@50.16.96.211:22: Error connecting to session.
>>>> at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
>>>> at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:206)
>>>> at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:90)
>>>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:70)
>>>> at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:45)
>>>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>
>>>> So it seems as if the key part of jclouds authentication setup is still failing for the amazon linux/ec2-user scenario, i.e. trying to set up as the local user, but failing.
>>>>
>>>> Is there a property for the user it launches as? Or does it just do whichever user you are locally, instead of ec2-user/ubuntu/root, depending on the default, as before?
>>>>
>>>> I can switch to ubuntu, but I have a fair amount of native code setup in my custom scripts and would prefer to stick with a redhattish version if possible.
>>>>
>>>> Looking ahead, I want to benchmark plain old 64-bit instances against cluster instances, to see if the allegedly improved networking gives us a boost, and the available ones I see are Suse and Amazon linux. When I switch to the amazon linux one, like so:
>>>>
>>>> whirr.hardward-id=cc1.4xlarge
>>>> whirr.image-id=us-east-1/ami-321eed5b
>>>>
>>>> I get a different problem:
>>>>
>>>> Exception in thread "main" java.util.NoSuchElementException: hardwares don't support any images: [biggest=false, fastest=false, imageName=null, imageDescription=Amazon Linux AMI x86_64 HVM EBS EXT4, imageId=us-east-1/ami-321eed5b, imageVersion=ext4, location=[id=us-east-1, scope=REGION, description=us-east-1, parent=aws-ec2, iso3166Codes=[US-VA], metadata={}], minCores=0.0, minRam=0, osFamily=unrecognized, osName=null, osDescription=amazon/amzn-hvm-ami-2011.02.1-beta.x86_64-ext4, osVersion=, osArch=hvm, os64Bit=true, hardwareId=m1.small]
>>>> [[id=cc1.4xlarge, providerId=cc1.4xlarge, name=null, processors=[[cores=4.0, speed=4.0], [cores=4.0, speed=4.0]], ram=23552, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=840.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=840.0, device=/dev/sdc, durable=false, isBootDevice=false]], supportsI
>>>>
>>>> but I imagine that if using cluster instances is going to be possible, support for amazon linux will be needed.
>>>>
>>>> --Ben
>>>>
>>>>
>>>> On Mar 16, 2011, at 4:07 PM, Andrei Savu wrote:
>>>>
>>>>> I've seen something similar while testing Whirr: WHIRR-264 [0]. We are
>>>>> going to commit WHIRR-158 [1] tomorrow and it should fix the problem
>>>>> you are seeing. We should be able to restart the vote for the 0.4.0
>>>>> release after fixing this issue.
>>>>>
>>>>> [0] https://issues.apache.org/jira/browse/WHIRR-264
>>>>> [1] https://issues.apache.org/jira/browse/WHIRR-158
>>>>>
>>>>> -- Andrei Savu / andreisavu.ro
>>>>>
>>>>> On Wed, Mar 16, 2011 at 6:54 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>>>>>> I have been using whirr 0.4 branch to launch clusters of c1.medium amazon linux machines (whirr.image-id=us-east-1/ami-d59d6bbc, which was the default for new amazon linux instances, a few days ago) with good success. I took the default hadoop-ec2.properties recipe and modified it slightly to suit my needs. I'm now trying with basically the same properties file, but when I use
>>>>>>
>>>>>> whirr.hardware-id=c1.xlarge
>>>>>>
>>>>>> and then either this (from the recipe)
>>>>>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
>>>>>> whirr.image-id=us-east-1/ami-da0cf8b3
>>>>>>
>>>>>> or this:
>>>>>> # Amazon linux 64-bit, default as of 3/11:
>>>>>> whirr.image-id=us-east-1/ami-8e1fece7
>>>>>>
>>>>>> I get a failure to install the right public key, so that I can't log into the name node (or any other nodes, for that matter).
>>>>>>
>>>>>>
>>>>>> My whole config file is this:
>>>>>>
>>>>>> whirr.cluster-name=bhcL4
>>>>>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
>>>>>> whirr.hadoop-install-function=install_cdh_hadoop
>>>>>> whirr.hadoop-configure-function=configure_cdh_hadoop
>>>>>> whirr.provider=aws-ec2
>>>>>> whirr.identity=...
>>>>>> whirr.credential=...
>>>>>> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop
>>>>>> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop.pub
>>>>>> whirr.hardware-id=c1.xlarge
>>>>>> #whirr.hardware-id=c1.medium
>>>>>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
>>>>>> whirr.image-id=us-east-1/ami-da0cf8b3
>>>>>> # Amazon linux as of 3/11:
>>>>>> #whirr.image-id=us-east-1/ami-8e1fece7
>>>>>> # If you choose a different location, make sure whirr.image-id is updated too
>>>>>> whirr.location-id=us-east-1d
>>>>>> hadoop-hdfs.dfs.permissions=false
>>>>>> hadoop-hdfs.dfs.replication=2
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am I doing something wrong here? I tried with whirr.location-id=us-east-1d and whirr.location-id=us-east-1
>>>>
>>>>
>>
>>
Re: aws 64-bit c1.xlarge problems
Posted by Benjamin Clark <be...@daltonclark.com>.
Andrei,
Thanks.
After patching with 158, it launches fine as me on that Ubuntu image from the recipe (i.e. on my client machine I am 'ben', so the AWS user that has sudo, and that I can log in as, is now also 'ben'), so that looks good.
But it's now doing this with amazon linux (ami-da0cf8b3, which was the default 64-bit ami a few days ago, and may still be) during launch:
1) SshException on node us-east-1/i-b2678ddd:
org.jclouds.ssh.SshException: ben@50.16.96.211:22: Error connecting to session.
at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:206)
at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:90)
at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:70)
at org.jclouds.compute.strategy.RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(RunScriptOnNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:45)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
So it seems that the key-installation part of the jclouds authentication setup is still failing in the Amazon Linux/ec2-user scenario, i.e. it tries to set things up as the local user, but fails.
Is there a property for the user it launches as? Or does it just do whichever user you are locally, instead of ec2-user/ubuntu/root, depending on the default, as before?
I can switch to ubuntu, but I have a fair amount of native code setup in my custom scripts and would prefer to stick with a redhattish version if possible.
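For anyone hitting the same question, a sketch of what such an override might look like in the properties file. Note the property name `whirr.cluster-user` is an assumption here, not something confirmed in this thread; whether it exists, and what it is called, depends on your Whirr version and on the WHIRR-158 patch, so check your build before relying on it:

```
# Hypothetical: override the user that Whirr creates on the nodes and logs in as,
# instead of defaulting to the local username or the image's stock user
# (ec2-user/ubuntu/root). Verify this property exists in your Whirr version.
whirr.cluster-user=ec2-user
```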
Looking ahead, I want to benchmark plain old 64-bit instances against cluster instances, to see if the allegedly improved networking gives us a boost, and the available cluster AMIs I see are SUSE and Amazon linux. When I switch to the amazon linux one, like so:
whirr.hardware-id=cc1.4xlarge
whirr.image-id=us-east-1/ami-321eed5b
I get a different problem:
Exception in thread "main" java.util.NoSuchElementException: hardwares don't support any images: [biggest=false, fastest=false, imageName=null, imageDescription=Amazon Linux AMI x86_64 HVM EBS EXT4, imageId=us-east-1/ami-321eed5b, imageVersion=ext4, location=[id=us-east-1, scope=REGION, description=us-east-1, parent=aws-ec2, iso3166Codes=[US-VA], metadata={}], minCores=0.0, minRam=0, osFamily=unrecognized, osName=null, osDescription=amazon/amzn-hvm-ami-2011.02.1-beta.x86_64-ext4, osVersion=, osArch=hvm, os64Bit=true, hardwareId=m1.small]
[[id=cc1.4xlarge, providerId=cc1.4xlarge, name=null, processors=[[cores=4.0, speed=4.0], [cores=4.0, speed=4.0]], ram=23552, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=840.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=840.0, device=/dev/sdc, durable=false, isBootDevice=false]], supportsI
but I imagine that if using cluster instances is going to be possible, support for amazon linux will be needed.
--Ben
On Mar 16, 2011, at 4:07 PM, Andrei Savu wrote:
> I've seen something similar while testing Whirr: WHIRR-264 [0]. We are
> going to commit WHIRR-158 [1] tomorrow and it should fix the problem
> you are seeing. We should be able to restart the vote for the 0.4.0
> release after fixing this issue.
>
> [0] https://issues.apache.org/jira/browse/WHIRR-264
> [1] https://issues.apache.org/jira/browse/WHIRR-158
>
> -- Andrei Savu / andreisavu.ro
>
> On Wed, Mar 16, 2011 at 6:54 PM, Benjamin Clark <be...@daltonclark.com> wrote:
>> I have been using whirr 0.4 branch to launch clusters of c1.medium amazon linux machines (whirr.image-id=us-east-1/ami-d59d6bbc, which was the default for new amazon linux instances, a few days ago) with good success. I took the default hadoop-ec2.properties recipe and modified it slightly to suit my needs. I'm now trying with basically the same properties file, but when I use
>>
>> whirr.hardware-id=c1.xlarge
>>
>> and then either this (from the recipe)
>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
>> whirr.image-id=us-east-1/ami-da0cf8b3
>>
>> or this:
>> # Amazon linux 64-bit, default as of 3/11:
>> whirr.image-id=us-east-1/ami-8e1fece7
>>
>> I get a failure to install the right public key, so that I can't log into the name node (or any other nodes, for that matter).
>>
>>
>> My whole config file is this:
>>
>> whirr.cluster-name=bhcL4
>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
>> whirr.hadoop-install-function=install_cdh_hadoop
>> whirr.hadoop-configure-function=configure_cdh_hadoop
>> whirr.provider=aws-ec2
>> whirr.identity=...
>> whirr.credential=...
>> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop
>> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop.pub
>> whirr.hardware-id=c1.xlarge
>> #whirr.hardware-id=c1.medium
>> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
>> whirr.image-id=us-east-1/ami-da0cf8b3
>> # Amazon linux as of 3/11:
>> #whirr.image-id=us-east-1/ami-8e1fece7
>> # If you choose a different location, make sure whirr.image-id is updated too
>> whirr.location-id=us-east-1d
>> hadoop-hdfs.dfs.permissions=false
>> hadoop-hdfs.dfs.replication=2
>>
>>
>>
>> Am I doing something wrong here? I tried with whirr.location-id=us-east-1d and whirr.location-id=us-east-1
Re: aws 64-bit c1.xlarge problems
Posted by Andrei Savu <sa...@gmail.com>.
I've seen something similar while testing Whirr: WHIRR-264 [0]. We are
going to commit WHIRR-158 [1] tomorrow and it should fix the problem
you are seeing. We should be able to restart the vote for the 0.4.0
release after fixing this issue.
[0] https://issues.apache.org/jira/browse/WHIRR-264
[1] https://issues.apache.org/jira/browse/WHIRR-158
-- Andrei Savu / andreisavu.ro
On Wed, Mar 16, 2011 at 6:54 PM, Benjamin Clark <be...@daltonclark.com> wrote:
> I have been using whirr 0.4 branch to launch clusters of c1.medium amazon linux machines (whirr.image-id=us-east-1/ami-d59d6bbc, which was the default for new amazon linux instances, a few days ago) with good success. I took the default hadoop-ec2.properties recipe and modified it slightly to suit my needs. I'm now trying with basically the same properties file, but when I use
>
> whirr.hardware-id=c1.xlarge
>
> and then either this (from the recipe)
> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
> whirr.image-id=us-east-1/ami-da0cf8b3
>
> or this:
> # Amazon linux 64-bit, default as of 3/11:
> whirr.image-id=us-east-1/ami-8e1fece7
>
> I get a failure to install the right public key, so that I can't log into the name node (or any other nodes, for that matter).
>
>
> My whole config file is this:
>
> whirr.cluster-name=bhcL4
> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
> whirr.hadoop-install-function=install_cdh_hadoop
> whirr.hadoop-configure-function=configure_cdh_hadoop
> whirr.provider=aws-ec2
> whirr.identity=...
> whirr.credential=...
> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop
> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa-formyhadoop.pub
> whirr.hardware-id=c1.xlarge
> #whirr.hardware-id=c1.medium
> # Ubuntu 10.04 LTS Lucid. See http://alestic.com/
> whirr.image-id=us-east-1/ami-da0cf8b3
> # Amazon linux as of 3/11:
> #whirr.image-id=us-east-1/ami-8e1fece7
> # If you choose a different location, make sure whirr.image-id is updated too
> whirr.location-id=us-east-1d
> hadoop-hdfs.dfs.permissions=false
> hadoop-hdfs.dfs.replication=2
>
>
>
> Am I doing something wrong here? I tried with whirr.location-id=us-east-1d and whirr.location-id=us-east-1