Posted to user@whirr.apache.org by Madhu Ramanna <ma...@buysight.com> on 2012/01/10 23:55:49 UTC

[newbie] Unable to setup hadoop cluster on ec2

Hello everyone,

I wanted to set up a Hadoop cluster on EC2 and downloaded version 0.7.0.

I set up the relevant files as per the quick start guide. However, the cluster setup failed with the following exceptions. What could be going wrong? I'm (almost) sure my AWS keys are correct.

I'd appreciate your response!

2012-01-06 10:38:06,690 DEBUG [jclouds.compute] (user thread 2) << images(3758)
2012-01-06 10:38:07,210 DEBUG [jclouds.compute] (pool-4-thread-2) << images(3758)
2012-01-06 10:38:07,212 DEBUG [jclouds.compute] (pool-4-thread-1) >> blocking on socket [address=107.21.80.86, port=22] for 600000 MILLISECONDS
2012-01-06 10:38:07,212 DEBUG [jclouds.compute] (pool-4-thread-2) >> blocking on socket [address=107.22.64.12, port=22] for 600000 MILLISECONDS
2012-01-06 10:38:07,292 DEBUG [jclouds.compute] (pool-4-thread-2) << socket [address=107.22.64.12, port=22] opened
2012-01-06 10:38:10,293 DEBUG [jclouds.compute] (pool-4-thread-1) << socket [address=107.21.80.86, port=22] opened
2012-01-06 10:38:13,012 INFO  [org.apache.whirr.actions.ScriptBasedClusterAction] (pool-4-thread-2) configure phase script run completed on: us-east-1/i-94bccef6
2012-01-06 10:38:16,003 INFO  [org.apache.whirr.actions.ScriptBasedClusterAction] (pool-4-thread-1) configure phase script run completed on: us-east-1/i-9abccef8
2012-01-06 10:38:16,004 ERROR [org.apache.whirr.ClusterController] (main) Unable to start the cluster. Terminating all nodes.
java.io.IOException: org.jclouds.rest.AuthorizationException: (root: <snip> ) error acquiring SSHClient(timeout=60000): Exhausted available authentication methods
        at org.apache.whirr.actions.ScriptBasedClusterAction.runScripts(ScriptBasedClusterAction.java:215)
        at org.apache.whirr.actions.ScriptBasedClusterAction.doAction(ScriptBasedClusterAction.java:128)
        at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:107)
        at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
        at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
        at org.apache.whirr.cli.Main.run(Main.java:64)
        at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: org.jclouds.rest.AuthorizationException: (root:rsa[<snip>]@107.21.80.86:22) error acquiring SSHClient(timeout=60000): Exhausted available authentication methods
        at org.jclouds.sshj.SshjSshClient.propagate(SshjSshClient.java:413)
        at org.jclouds.sshj.SshjSshClient.acquire(SshjSshClient.java:244)
        at org.jclouds.sshj.SshjSshClient.connect(SshjSshClient.java:255)
        at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSsh.call(RunScriptOnNodeAsInitScriptUsingSsh.java:89)
        at org.jclouds.compute.internal.BaseComputeService.runScriptOnNode(BaseComputeService.java:612)
        at org.apache.whirr.actions.ScriptBasedClusterAction$2.call(ScriptBasedClusterAction.java:190)
        at org.apache.whirr.actions.ScriptBasedClusterAction$2.call(ScriptBasedClusterAction.java:178)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods
        at net.schmizz.sshj.userauth.UserAuthImpl.authenticate(UserAuthImpl.java:114)
        at net.schmizz.sshj.SSHClient.auth(SSHClient.java:204)
        at net.schmizz.sshj.SSHClient.authPublickey(SSHClient.java:304)
        at net.schmizz.sshj.SSHClient.authPublickey(SSHClient.java:323)
        at org.jclouds.sshj.SshjSshClient$1.create(SshjSshClient.java:199)
        at org.jclouds.sshj.SshjSshClient$1.create(SshjSshClient.java:171)
        at org.jclouds.sshj.SshjSshClient.acquire(SshjSshClient.java:220)
        ... 10 more
Caused by: net.schmizz.sshj.userauth.UserAuthException: publickey auth failed
        at net.schmizz.sshj.userauth.UserAuthImpl.handle(UserAuthImpl.java:157)
        at net.schmizz.sshj.transport.TransportImpl.handle(TransportImpl.java:474)
        at net.schmizz.sshj.transport.Decoder.decode(Decoder.java:127)
        at net.schmizz.sshj.transport.Decoder.received(Decoder.java:195)
        at net.schmizz.sshj.transport.Reader.run(Reader.java:72)
2012-01-06 10:38:16,007 INFO  [org.apache.whirr.state.ClusterStateStore] (main) Unable to load cluster state, assuming it has no running nodes.
java.io.FileNotFoundException: /root/.whirr/test_bs_hadoop_cluster/instances (No such file or directory)

Thanks,
Madhu


Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Andrei Savu <sa...@gmail.com>.
whirr.instance-templates should be 1 hadoop-namenode+hadoop-jobtracker, 1
hadoop-datanode+hadoop-tasktracker.

This is important because roles are started in order per instance template
group.
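
In other words, keeping the rest of the posted properties file unchanged, that line would read:

whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 hadoop-datanode+hadoop-tasktracker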

-- Andrei Savu / andreisavu.ro

On Wed, Jan 11, 2012 at 1:10 AM, Madhu Ramanna <ma...@buysight.com> wrote:

> Here you go:
>
> whirr.cluster-name=test_bs_hadoop_cluster
> whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1
> hadoop-datanode+hadoop-tasktracker
> whirr.provider=aws-ec2
> whirr.identity=${env:AWS_ACCESS_KEY_ID}
> whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
> whirr.private-key-file=/root/.ssh/id_rsa_whirr
> whirr.public-key-file=/root/.ssh/id_rsa_whirr.pub
> whirr.cluster-user=tuser
>
> Thanks,
> Madhu
>
> From: Andrei Savu <sa...@gmail.com>
> Reply-To: "user@whirr.apache.org" <us...@whirr.apache.org>
> Date: Tue, 10 Jan 2012 14:59:17 -0800
> To: "user@whirr.apache.org" <us...@whirr.apache.org>
> Subject: Re: [newbie] Unable to setup hadoop cluster on ec2
>
> Can you share the .properties file you are using? I want to give it a try
> on my machine.
>
> Thanks,
>
> -- Andrei Savu / andreisavu.ro
>
> On Wed, Jan 11, 2012 at 12:55 AM, Madhu Ramanna <ma...@buysight.com>wrote:
>
>> <snip>
>

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Akash Ashok <th...@gmail.com>.
Are you trying to access the UI from the same machine from which you
launched the cluster? Please try to telnet to port 50070 on the machine you
are trying to connect to, then log in to the instance where the namenode is
running and telnet to localhost 50070. If it works on the instance but not
from your system, have a look at your security configuration or the
security group in the AWS console.
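
For example, something along these lines (the hostname is a placeholder for
the namenode's public DNS name; the key and user are the ones from the
properties file posted earlier):

# from your workstation
telnet <namenode-public-dns> 50070

# from the namenode instance itself
ssh -i ~/.ssh/id_rsa_whirr tuser@<namenode-public-dns>
telnet localhost 50070

# if it answers on the instance but not from outside, the security group is
# the likely culprit; with the EC2 API tools you could open the port, e.g.
# ec2-authorize <whirr-security-group> -p 50070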

Cheers,
Akash A

On Wed, Jan 11, 2012 at 7:22 AM, Madhu Ramanna <ma...@buysight.com> wrote:

> So, launching the cluster with non-root user was successful. Namenode UI
> url was printed to the logs.
>
> However, namenode was shutdown
>
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> , error=************************************************************/
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: supergroup=supergroup
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 12/01/11 01:20:52 INFO common.Storage: Image file of size 96 saved in 0
> seconds.
> 12/01/11 01:20:52 INFO common.Storage: Storage directory
> /data/hadoop/hdfs/name has been successfully formatted.
> 12/01/11 01:20:52 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at ip-10-8-39-133.ec2.internal/
> 10.8.39.133
> ************************************************************/
> , exitCode=0]
>
>
> Not able to access the UI from browser.
>
> Any ideas ?
>
> I want to setup a long running hadoop cluster that is never down. What is
> the recommended whirr setup for this configuration ?
>
> Thanks,
> Madhu
>
>
>
> From: Madhu Ramanna <ma...@buysight.com>
> Date: Tue, 10 Jan 2012 15:19:27 -0800
> To: Alex Heneveld <Al...@CloudsoftCorp.com>, Madhu Ramanna <
> madhu@buysight.com>
>
> Cc: "user@whirr.apache.org" <us...@whirr.apache.org>
> Subject: Re: [newbie] Unable to setup hadoop cluster on ec2
>
> Will give it a shot then
>
> Thanks,
> Madhu
>
> From: Alex Heneveld <Al...@CloudsoftCorp.com>
> Date: Tue, 10 Jan 2012 15:15:37 -0800
> To: Madhu Ramanna <ma...@buysight.com>
> Cc: "user@whirr.apache.org" <us...@whirr.apache.org>
> Subject: Re: [newbie] Unable to setup hadoop cluster on ec2
>
>
> Hi Madhu,
>
> Looks like you might be running as root -- there are known issues with
> this.  Does it work if you are a different user?
>
> Best,
> Alex
>
>
>
> On 10/01/2012 23:10, Madhu Ramanna wrote:
>
> <snip>
>
>

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.

From: Andrei Savu <sa...@gmail.com>>
Reply-To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Date: Thu, 12 Jan 2012 05:53:57 -0800
To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2


Yes we're going to be running jobs on a continuous basis


I understand. Managing long running Hadoop clusters in Amazon is tricky due to namenode availability issues and inconsistent network & disk performance. Have you looked at this from a cost perspective? Maybe it's cheaper to buy a bunch of servers for this cluster that needs to be on all the time.

We have a running cluster with a provider, but adding nodes to the cluster takes a long time (provisioning, contracts, etc.). Hence the move.




Also, how can I specify EBS volumes for these machines?

Unfortunately there is no easy way to do this with the current implementation. Do you want to take the lead on this?

See https://issues.apache.org/jira/browse/WHIRR-290


I may not have the bandwidth to ramp up, but I would appreciate it if you could send me some pointers on getting started!

We have a wiki page that describes how to build Whirr and contribute changes:
https://cwiki.apache.org/confluence/display/WHIRR/How+To+Contribute

-- Andrei

Will take a look.

Thanks,
Madhu

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Andrei Savu <sa...@gmail.com>.
> Yes we're going to be running jobs on a continuous basis
>
>
I understand. Managing long running Hadoop clusters in Amazon is tricky due
to namenode availability issues and inconsistent network & disk
performance. Have you looked at this from a cost perspective? Maybe it's
cheaper to buy a bunch of servers for this cluster that needs to be on all
the time.


>
>> Also, how can I specify ebs volumes for these machines ?
>>
>
> Unfortunately there is no easy way to do this with the current
> implementation. Do you want to take the lead on this?
>
> See https://issues.apache.org/jira/browse/WHIRR-290
>
>
>
> I may not have the bandwidth to ramp up but would appreciate if you could
> send me some pointers on getting started !
>

We have a wiki page that describes how to build Whirr and contribute
changes:
https://cwiki.apache.org/confluence/display/WHIRR/How+To+Contribute
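
Roughly, it is a standard Maven build (the exact repository URL is listed on that wiki page; the placeholder below stands in for it):

svn checkout <whirr-trunk-url> whirr
cd whirr
mvn clean install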

-- Andrei

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.
Hey Andrei,

From: Andrei Savu <sa...@gmail.com>>
Reply-To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Date: Wed, 11 Jan 2012 02:40:04 -0800
To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2



Have you tried to login on the remote machines? Are the daemons running as expected? (check with jps)

Maybe I need to RTFM, but should the Hadoop processes be started by the user? If so, how do I do this?

The Hadoop daemons are started by Whirr at the end of the configuration scripts. Unfortunately due to timing issues they fail to start sometimes (e.g. datanode trying to start before namenode).

BTW for 0.7.1 / 0.8.0 we are working on adding the ability to restart a specific service through Whirr:
https://issues.apache.org/jira/browse/WHIRR-421


Not a full-blown warehouse at this point, but it might contain a week's worth of data. I could use Amazon EMR, but I'm thinking that using Whirr would minimize the changes to our jobs running on the Hadoop cluster. What are your thoughts?

I think you should use Apache Whirr the same way you would use Amazon EMR. Store data in S3 and start the Hadoop cluster only when you need to process things. Are you running jobs on a continuous basis?


Yes we're going to be running jobs on a continuous basis


Also, how can I specify EBS volumes for these machines?

Unfortunately there is no easy way to do this with the current implementation. Do you want to take the lead on this?

See https://issues.apache.org/jira/browse/WHIRR-290


I may not have the bandwidth to ramp up, but I would appreciate it if you could send me some pointers on getting started!

Thanks,
Madhu


Thanks,

-- Andrei Savu / andreisavu.ro


Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Andrei Savu <sa...@gmail.com>.
>
> Have you tried to login on the remote machines? Are the daemons running as
> expected? (check with jps)
>
>
> May be I need to RTFM, but should hadoop processes be started by user ? If
> so, how to do this ?
>

The Hadoop daemons are started by Whirr at the end of the configuration
scripts. Unfortunately due to timing issues they fail to start sometimes
(e.g. datanode trying to start before namenode).
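
If that happens, one workaround is to restart the failed daemon by hand on the node in question. A rough sketch, assuming the tarball-based layout Whirr installs (the install path below is an assumption; adjust it to where Hadoop actually lives on the node):

# on the affected node, e.g. a datanode that came up before the namenode
sudo -u hadoop /usr/local/hadoop/bin/hadoop-daemon.sh start datanode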

BTW for 0.7.1 / 0.8.0 we are working on adding the ability to restart a
specific service through Whirr:
https://issues.apache.org/jira/browse/WHIRR-421


>
> Not a full-blown warehouse at this point; but might contain a week's worth
> of data. I could use Amazon EMR but I'm thinking using whirr would minimize
> the changes to our jobs running on hadoop cluster. What are your thoughts
>

I think you should use Apache Whirr the same way you would use Amazon EMR.
Store data in S3 and start the Hadoop cluster only when you need to process
things. Are you running jobs on a continuous basis?
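
As a rough sketch of that pattern, assuming the properties file from earlier in the thread is saved as hadoop.properties and the input already lives in S3 (the bucket and jar names are placeholders, and the s3n:// filesystem is assumed to have credentials configured):

whirr launch-cluster --config hadoop.properties
# run the job against the S3 data, writing intermediate output to HDFS
hadoop jar my-job.jar s3n://my-bucket/input hdfs:///tmp/output
# copy the results back to S3, then tear the cluster down
hadoop distcp hdfs:///tmp/output s3n://my-bucket/output
whirr destroy-cluster --config hadoop.properties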


>
> Also, how can I specify ebs volumes for these machines ?
>

Unfortunately there is no easy way to do this with the current
implementation. Do you want to take the lead on this?

See https://issues.apache.org/jira/browse/WHIRR-290

Thanks,

-- Andrei Savu / andreisavu.ro

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.
Thanks Andrei; my replies inline

From: Andrei Savu <sa...@gmail.com>>
Reply-To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Date: Tue, 10 Jan 2012 17:58:04 -0800
To: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2

Inline.

On Wed, Jan 11, 2012 at 3:52 AM, Madhu Ramanna <ma...@buysight.com>> wrote:
So, launching the cluster with a non-root user was successful. The namenode UI URL was printed to the logs.

However, the namenode was shut down:

No directory, logging in with HOME=/
No directory, logging in with HOME=/
No directory, logging in with HOME=/
No directory, logging in with HOME=/
, error=************************************************************/
12/01/11 01:20:52 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/01/11 01:20:52 INFO namenode.FSNamesystem: supergroup=supergroup
12/01/11 01:20:52 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/01/11 01:20:52 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/01/11 01:20:52 INFO common.Storage: Storage directory /data/hadoop/hdfs/name has been successfully formatted.
12/01/11 01:20:52 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-10-8-39-133.ec2.internal/10.8.39.133
************************************************************/
, exitCode=0]

This is the expected message you get when formatting HDFS (first time only).



I'm not able to access the UI from a browser.

Any ideas?

Have you tried to login on the remote machines? Are the daemons running as expected? (check with jps)

Maybe I need to RTFM, but should the Hadoop processes be started by the user? If so, how do I do this?



I want to set up a long-running Hadoop cluster that is never down. What is the recommended Whirr setup for this configuration?

Right now Whirr is best for short-lived clusters for periodical data processing. Are you building something like a warehouse in Amazon with Hadoop? Why not push data in S3 and only start Hadoop for processing?

Not a full-blown warehouse at this point, but it might contain a week's worth of data. I could use Amazon EMR, but I'm thinking that using Whirr would minimize the changes to our jobs running on the Hadoop cluster. What are your thoughts?

Also, how can I specify EBS volumes for these machines?



Thanks,
Madhu



From: Madhu Ramanna <ma...@buysight.com>>
Date: Tue, 10 Jan 2012 15:19:27 -0800
To: Alex Heneveld <Al...@CloudsoftCorp.com>>, Madhu Ramanna <ma...@buysight.com>>

Cc: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2

Will give it a shot then

Thanks,
Madhu

From: Alex Heneveld <Al...@CloudsoftCorp.com>>
Date: Tue, 10 Jan 2012 15:15:37 -0800
To: Madhu Ramanna <ma...@buysight.com>>
Cc: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2


Hi Madhu,

Looks like you might be running as root -- there are known issues with this.  Does it work if you are a different user?

Best,
Alex



On 10/01/2012 23:10, Madhu Ramanna wrote:
<snip>





Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Andrei Savu <sa...@gmail.com>.
Inline.

On Wed, Jan 11, 2012 at 3:52 AM, Madhu Ramanna <ma...@buysight.com> wrote:

> So, launching the cluster with non-root user was successful. Namenode UI
> url was printed to the logs.
>
> However, namenode was shutdown
>
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> No directory, logging in with HOME=/
> , error=************************************************************/
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: supergroup=supergroup
> 12/01/11 01:20:52 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 12/01/11 01:20:52 INFO common.Storage: Image file of size 96 saved in 0
> seconds.
> 12/01/11 01:20:52 INFO common.Storage: Storage directory
> /data/hadoop/hdfs/name has been successfully formatted.
> 12/01/11 01:20:52 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at ip-10-8-39-133.ec2.internal/
> 10.8.39.133
> ************************************************************/
> , exitCode=0]
>

This is the expected message you get when formatting HDFS (first time
only).


>
>
> Not able to access the UI from browser.
>
> Any ideas ?
>

Have you tried to login on the remote machines? Are the daemons running as
expected? (check with jps)
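
For instance (the hostname is a placeholder; the key and user are the ones from your properties file):

ssh -i ~/.ssh/id_rsa_whirr tuser@<namenode-public-dns>
jps
# expect NameNode and JobTracker on the master, DataNode and TaskTracker on the workers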


>
> I want to setup a long running hadoop cluster that is never down. What is
> the recommended whirr setup for this configuration ?
>

Right now Whirr is best for short-lived clusters for periodical data
processing. Are you building something like a warehouse in Amazon with
Hadoop? Why not push data in S3 and only start Hadoop for processing?


>
> Thanks,
> Madhu
>
>
>
> <snip>
>
>

Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.
So, launching the cluster with a non-root user was successful. The namenode UI URL was printed to the logs.

However, the namenode was shut down:

No directory, logging in with HOME=/
No directory, logging in with HOME=/
No directory, logging in with HOME=/
No directory, logging in with HOME=/
, error=************************************************************/
12/01/11 01:20:52 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/01/11 01:20:52 INFO namenode.FSNamesystem: supergroup=supergroup
12/01/11 01:20:52 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/01/11 01:20:52 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/01/11 01:20:52 INFO common.Storage: Storage directory /data/hadoop/hdfs/name has been successfully formatted.
12/01/11 01:20:52 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-10-8-39-133.ec2.internal/10.8.39.133
************************************************************/
, exitCode=0]


I'm not able to access the UI from a browser.

Any ideas?

I want to set up a long-running Hadoop cluster that is never down. What is the recommended Whirr setup for this configuration?

Thanks,
Madhu



From: Madhu Ramanna <ma...@buysight.com>>
Date: Tue, 10 Jan 2012 15:19:27 -0800
To: Alex Heneveld <Al...@CloudsoftCorp.com>>, Madhu Ramanna <ma...@buysight.com>>
Cc: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2

Will give it a shot then

Thanks,
Madhu

From: Alex Heneveld <Al...@CloudsoftCorp.com>>
Date: Tue, 10 Jan 2012 15:15:37 -0800
To: Madhu Ramanna <ma...@buysight.com>>
Cc: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2


Hi Madhu,

Looks like you might be running as root -- there are known issues with this.  Does it work if you are a different user?

Best,
Alex



On 10/01/2012 23:10, Madhu Ramanna wrote:
<snip>




Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.
Will give it a shot then

Thanks,
Madhu

From: Alex Heneveld <Al...@CloudsoftCorp.com>>
Date: Tue, 10 Jan 2012 15:15:37 -0800
To: Madhu Ramanna <ma...@buysight.com>>
Cc: "user@whirr.apache.org<ma...@whirr.apache.org>" <us...@whirr.apache.org>>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2


Hi Madhu,

Looks like you might be running as root -- there are known issues with this.  Does it work if you are a different user?

Best,
Alex



On 10/01/2012 23:10, Madhu Ramanna wrote:
<snip>




Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Alex Heneveld <Al...@CloudsoftCorp.com>.
Hi Madhu,

Looks like you might be running Whirr as root -- there are known issues with
that.  Does it work if you run the launch as a different (non-root) user?
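
For example (untested -- the user name and paths below are only illustrative),
running the launch from a regular account would look roughly like this:

    # create a regular user and a passwordless key pair for Whirr to use
    sudo adduser whirr
    sudo su - whirr
    ssh-keygen -q -t rsa -P '' -f ~/.ssh/id_rsa_whirr

    # and in the recipe, point at that key pair instead of /root/.ssh
    whirr.private-key-file=/home/whirr/.ssh/id_rsa_whirr
    whirr.public-key-file=/home/whirr/.ssh/id_rsa_whirr.pub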

Best,
Alex



On 10/01/2012 23:10, Madhu Ramanna wrote:
> Here you go:
>
> whirr.cluster-name=test_bs_hadoop_cluster
> whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
> whirr.provider=aws-ec2
> whirr.identity=${env:AWS_ACCESS_KEY_ID}
> whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
> whirr.private-key-file=/root/.ssh/id_rsa_whirr
> whirr.public-key-file=/root/.ssh/id_rsa_whirr.pub
> whirr.cluster-user=tuser
>
> Thanks,
> Madhu
>
> [snip: Andrei's request for the .properties file and the original message with the full log, quoted in full; see the separate messages below]


Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Madhu Ramanna <ma...@buysight.com>.
Here you go:

whirr.cluster-name=test_bs_hadoop_cluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=/root/.ssh/id_rsa_whirr
whirr.public-key-file=/root/.ssh/id_rsa_whirr.pub
whirr.cluster-user=tuser

Thanks,
Madhu

From: Andrei Savu <sa...@gmail.com>
Reply-To: "user@whirr.apache.org" <user@whirr.apache.org>
Date: Tue, 10 Jan 2012 14:59:17 -0800
To: "user@whirr.apache.org" <user@whirr.apache.org>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2

Can you share the .properties file you are using? I want to give it a try on my machine.

Thanks,

-- Andrei Savu / andreisavu.ro

On Wed, Jan 11, 2012 at 12:55 AM, Madhu Ramanna <ma...@buysight.com> wrote:
[snip: original message and full log, quoted in full at the top of this thread]



Re: [newbie] Unable to setup hadoop cluster on ec2

Posted by Andrei Savu <sa...@gmail.com>.
Can you share the .properties file you are using? I want to give it a try
on my machine.

Thanks,

-- Andrei Savu / andreisavu.ro

On Wed, Jan 11, 2012 at 12:55 AM, Madhu Ramanna <ma...@buysight.com> wrote:

> [snip: original message and full log, quoted in full at the top of this thread]