Posted to dev@whirr.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2010/06/29 05:16:54 UTC

[jira] Created: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
--------------------------------------------------------------------------------------------------

                 Key: WHIRR-52
                 URL: https://issues.apache.org/jira/browse/WHIRR-52
             Project: Whirr
          Issue Type: Improvement
          Components: service/hadoop
            Reporter: Jeff Hammerbacher




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883396#action_12883396 ] 

Jeff Hammerbacher commented on WHIRR-52:
----------------------------------------

When running http://github.com/hammer/whirr-demo with a {{whirr.provider}} set to {{cloudservers}}, I get:

{noformat}
Starting the cluster.
Launching cluster.
Cluster launched.
Configuring Proxy.
Starting Proxy.
Proxy Started.
Cluster started.
Running MapReduce job.
Warning: Permanently added '184-106-200-188.static.cloud-ips.com,184.106.200.188' (RSA) to the list of known hosts.
10/06/28 20:15:37 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 0 time(s).
10/06/28 20:15:38 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 1 time(s).
10/06/28 20:15:39 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 2 time(s).
10/06/28 20:15:40 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 3 time(s).
10/06/28 20:15:41 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 4 time(s).
10/06/28 20:15:42 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 5 time(s).
10/06/28 20:15:43 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 6 time(s).
10/06/28 20:15:44 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 7 time(s).
10/06/28 20:15:45 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 8 time(s).
10/06/28 20:15:46 INFO ipc.Client: Retrying connect to server: 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021. Already tried 9 time(s).
Could not run job: Call to 184-106-200-188.static.cloud-ips.com/184.106.200.188:8021 failed on local exception: java.net.SocketException: Connection refused
Finished MapReduce job.
Bringing down the cluster.
Cluster stopped.
{noformat}
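
The log above ends in "Connection refused" on port 8021 after ten retries, meaning the TCP connection was actively rejected rather than timing out. A small probe run from the client machine can confirm whether anything is accepting connections on that host and port before Hadoop gets involved. This is only a sketch: the class name PortProbe and the timeout are made up for illustration and are not part of Whirr or Hadoop.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Whirr: checks whether a TCP port accepts connections.
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are taken from the command line, e.g. the JobTracker address and 8021.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8021;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If the probe fails for the public hostname but succeeds when run on the node itself against its private address, the daemon is probably bound to a different interface than the one the client is using.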


[jira] Issue Comment Edited: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883419#action_12883419 ] 

Jeff Hammerbacher edited comment on WHIRR-52 at 6/29/10 1:47 AM:
-----------------------------------------------------------------

I've confirmed that {{mapred.job.tracker}} is set to {{184-106-196-148.static.cloud-ips.com:8021}} on the configuration object used to submit the MapReduce job to the cluster. Might the JobClient be picking up the bad address somewhere else, particularly from the instances set in the HadoopCluster object? That guy is using the {{.getPublicAddresses()}} URL instead of the {{.getHostName()}} URL, which would result in the error I'm seeing above.

      was (Author: hammer):
    I've confirmed that {[mapred.job.tracker}} is set to {{184-106-196-148.static.cloud-ips.com:8021}} on the configuration object used to submit the MapReduce job to the cluster. Might the JobClient be picking up the bad address somewhere else, particularly from the instances set in the HadoopCluster object? That guy is using the {{.getPublicAddresses()}} URL instead of the {{.getHostName()}} URL, which would result in the error I'm seeing above.
  

[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883560#action_12883560 ] 

Tom White commented on WHIRR-52:
--------------------------------

This looks like a DNS problem. See the Python contrib scripts to see how DNS is set up for Rackspace: I fear we'll need a similar approach here unless anyone has a better idea.


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883409#action_12883409 ] 

Jeff Hammerbacher commented on WHIRR-52:
----------------------------------------

Wow, seeing some potentially weird behavior from Rackspace.

In {{HadoopService.java}}, I've instrumented the system with my highly complex debugging tool:
{code}
    NodeMetadata node = Iterables.getOnlyElement(nodes);
    InetAddress namenodePublicAddress = Iterables.getOnlyElement(node.getPublicAddresses());
    System.out.println("NN: " + namenodePublicAddress);
    System.out.println("NN hostname: " + namenodePublicAddress.getHostName());
    InetAddress jobtrackerPublicAddress = Iterables.getOnlyElement(node.getPublicAddresses());
    System.out.println("JT: " + jobtrackerPublicAddress);
    System.out.println("JT hostname: " + jobtrackerPublicAddress.getHostName());
{code}

Here's the output:
{noformat}
NN: /184.106.196.148
NN hostname: 184-106-196-148.static.cloud-ips.com
JT: 184-106-196-148.static.cloud-ips.com/184.106.196.148
JT hostname: 184-106-196-148.static.cloud-ips.com
{noformat}

It seems that a call to {{.getHostName()}} is altering the state of the {{node}} object such that the second call to {{.getPublicAddresses()}} is different from the first. Perhaps that's known behavior, but I'm just recording my investigations here for posterity.

On the other hand, {{mapred.job.tracker}} is set using the {{.getHostName()}} call, and the result of that call seems constant and reasonable, so I'm not quite sure where things are going wrong. More sophisticated debugging awaits...
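
A likely explanation for the differing output, offered as an assumption rather than something confirmed in this thread: {{InetAddress.toString()}} prints "hostname/literal IP" and leaves the hostname part empty when no name has been resolved yet, while {{getHostName()}} performs the reverse lookup and keeps the result on that same {{InetAddress}} instance. If {{getPublicAddresses()}} hands back the same instance both times, the second print would include the hostname even though the node itself did not change. The sketch below reproduces the two string forms without any DNS lookup; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class: shows the two forms InetAddress.toString() can take.
final class InetAddressToStringDemo {

    /**
     * Builds an IPv4 InetAddress without touching DNS and returns its toString().
     * Pass null as host to model an address whose hostname has not been resolved.
     */
    static String render(String host, int a, int b, int c, int d) {
        byte[] raw = {(byte) a, (byte) b, (byte) c, (byte) d};
        try {
            InetAddress address = (host == null)
                    ? InetAddress.getByAddress(raw)         // no hostname attached
                    : InetAddress.getByAddress(host, raw);  // hostname attached, no lookup
            return address.toString();
        } catch (UnknownHostException e) {
            // Only thrown for an illegal address length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("NN: " + render(null, 184, 106, 196, 148));
        System.out.println("JT: "
                + render("184-106-196-148.static.cloud-ips.com", 184, 106, 196, 148));
    }
}
```

Comparing addresses with {{getHostAddress()}} instead of relying on the string form would sidestep the difference.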


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Adrian Cole (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924128#action_12924128 ] 

Adrian Cole commented on WHIRR-52:
----------------------------------

Looks like there's a bad configuration bubbling up.  jclouds should translate this error into an IllegalArgumentException, since even though rackspace returns a 404, that status is not really accurate:

2010-10-23 00:09:57,123 DEBUG [jclouds.wire] (i/o thread 0) >> "{"server":{"name":"hadoopclustertest-1ad","imageId":31,"flavorId":1}}"
2010-10-23 00:09:57,124 DEBUG [jclouds.headers] (i/o thread 0) >> POST https://servers.api.rackspacecloud.com/v1.0/413274/servers?format=json&now=1287810597122 HTTP/1.1
2010-10-23 00:09:57,124 DEBUG [jclouds.headers] (i/o thread 0) >> Accept: application/json
2010-10-23 00:09:57,124 DEBUG [jclouds.headers] (i/o thread 0) >> X-Auth-Token: XXXXXXXXXXXXXXXXXX
2010-10-23 00:09:57,124 DEBUG [jclouds.headers] (i/o thread 0) >> Content-Type: application/json
2010-10-23 00:09:57,124 DEBUG [jclouds.headers] (i/o thread 0) >> Content-Length: 69
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << HTTP/1.1 404 Not Found
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << X-Varnish: 382298258
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << Age: 0
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << Date: Sat, 23 Oct 2010 05:09:57 GMT
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << Via: 1.1 varnish
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << Connection: keep-alive
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << Server: Apache-Coyote/1.1
2010-10-23 00:09:57,548 DEBUG [jclouds.headers] (i/o thread 0) << vary: Accept, Accept-Encoding, X-Auth-Token
2010-10-23 00:09:57,549 DEBUG [jclouds.headers] (i/o thread 0) << Cache-Control: no-cache
2010-10-23 00:09:57,549 DEBUG [jclouds.headers] (i/o thread 0) << Content-Type: application/json
2010-10-23 00:09:57,549 DEBUG [jclouds.headers] (i/o thread 0) << Content-Length: 166
2010-10-23 00:09:57,549 DEBUG [jclouds.wire] (i/o thread 0) << "{"itemNotFound":{"message":"No offering found for flavor 1 and option 4","details":"com.rackspace.cloud.service.servers.ItemNotFoundFault: Fault occured","code":404}}"


> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>             Fix For: 0.2.0
>
>         Attachments: WHIRR-52.patch, WHIRR-52.patch
>
>



[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Attachment: WHIRR-52.patch

With this patch all the integration tests run on Rackspace (and they continue to pass on EC2).

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>             Fix For: 0.2.0
>
>         Attachments: WHIRR-52.patch, WHIRR-52.patch, WHIRR-52.patch
>
>



[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Adrian Cole (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924133#action_12924133 ] 

Adrian Cole commented on WHIRR-52:
----------------------------------

The following line is resetting the template:

 slaveTemplateBuilder.locationId(masterTemplate.getLocation().getId());

Setting a locationId can invalidate image choices, since images are not always available in every location (e.g. EC2).  As a precaution, when someone explicitly specifies a locationId, jclouds resets the rest of the template to default.  Now, we can argue to change this behaviour, but as of beta-7 this is the case.

A workaround is to comment out the above line.  Then, we should figure out what logic jclouds should follow and request a corresponding change in the project.  We can also patch whatever that is in whirr to make it immediately available.

-Adrian
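
To make the reset behaviour described above concrete, here is a toy model of it. This is not the jclouds API: ToyTemplateBuilder and all of its methods are invented for illustration, and the only thing it borrows from the thread is the rule that setting a location ID clears previously chosen criteria. It shows why copying the location onto an already configured builder loses the image constraints, and why setting the location first (or building a fresh template) avoids that.

```java
// Hypothetical stand-in for a template builder; NOT the jclouds TemplateBuilder.
final class ToyTemplateBuilder {
    private String locationId;
    private String osFamily;
    private String imageNamePattern;

    ToyTemplateBuilder osFamily(String value) {
        this.osFamily = value;
        return this;
    }

    ToyTemplateBuilder imageNameMatches(String pattern) {
        this.imageNamePattern = pattern;
        return this;
    }

    /** Models the behaviour described in the thread: choosing a location resets the rest. */
    ToyTemplateBuilder locationId(String id) {
        this.osFamily = null;
        this.imageNamePattern = null;
        this.locationId = id;
        return this;
    }

    String describe() {
        return "location=" + locationId + ", osFamily=" + osFamily
                + ", imageName=" + imageNamePattern;
    }

    public static void main(String[] args) {
        // Location set last: the earlier image criteria are wiped.
        System.out.println(new ToyTemplateBuilder()
                .osFamily("ubuntu").imageNameMatches(".*10\\.?04.*")
                .locationId("DFW1").describe());
        // Location set first: the image criteria survive.
        System.out.println(new ToyTemplateBuilder()
                .locationId("DFW1")
                .osFamily("ubuntu").imageNameMatches(".*10\\.?04.*").describe());
    }
}
```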


[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Adrian Cole (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Cole updated WHIRR-52:
-----------------------------


some thoughts:

1. rackspace returns ip addresses, not dns hostnames
2. rackspace returns multiple ip addresses
ex.
            "public" : [ 
                "67.23.10.132", 
                "67.23.10.131" 
            ], 

I'd recommend changing our fields that refer to public/private ip addresses to lists (if not already), and using predictable tools for determining whether something is a hostname or not.
   ex. http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/net/InternetDomainName.html

mind isn't free enough to debug further at the moment!
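
As a rough illustration of the "is this a hostname or an IP address" check suggested above, the sketch below classifies a string as a dotted-quad IPv4 literal using only the JDK. It is a hand-written stand-in for the Guava class linked above, not that class itself, and the name AddressKind is hypothetical. IPv6 literals are deliberately out of scope.

```java
// Hypothetical helper: distinguishes IPv4 literals from hostnames without any DNS lookup.
final class AddressKind {

    /** True if s is a dotted-quad IPv4 literal such as "67.23.10.132". */
    static boolean isIpv4Literal(String s) {
        String[] parts = s.split("\\.", -1); // -1 keeps empty trailing parts like "1.2.3."
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "67.23.10.132",
            "67.23.10.131",
            "184-106-200-188.static.cloud-ips.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> "
                    + (isIpv4Literal(sample) ? "IP address" : "hostname"));
        }
    }
}
```

A check like this avoids the implicit reverse lookups that come with passing everything through InetAddress.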


> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>         Attachments: WHIRR-52.patch
>
>



[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Status: Patch Available  (was: Open)

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>            Assignee: Tom White
>             Fix For: 0.2.0
>
>         Attachments: WHIRR-52.patch, WHIRR-52.patch, WHIRR-52.patch
>
>



[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I've just committed this.


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Adrian Cole (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924130#action_12924130 ] 

Adrian Cole commented on WHIRR-52:
----------------------------------

yeah.. looking at jclouds-compute.log, the second runNodesWithTag is getting the wrong image as something is zeroing out the default template:

template search from master:
2010-10-23 00:09:05,370 DEBUG [jclouds.compute] (main) >> searching params([biggest=false, fastest=false, imageName=.*10\.?04.*, imageDescription=null, imageId=null, imageVersion=null, location=[id=DFW1, scope=ZONE, description=Dallas, TX, parent=cloudservers], minCores=0.0, minRam=0, osFamily=ubuntu, osName=null, osDescription=null, osVersion=null, osArch=null, os64Bit=null, hardwareId=null])

template search from slave:

2010-10-23 00:09:52,353 DEBUG [jclouds.compute] (main) >> searching params([biggest=false, fastest=false, imageName=null, imageDescription=null, imageId=null, imageVersion=null, location=[id=DFW1, scope=ZONE, description=Dallas, TX, parent=cloudservers], minCores=0.0, minRam=0, osFamily=null, osName=null, osDescription=null, osVersion=null, osArch=null, os64Bit=null, hardwareId=null])


Notice that imageName (along with osFamily) is nulled on the second search, which is making it grab something a bit too random.


[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Attachment: WHIRR-52.patch

I finally managed to get this working. The problem was a private key with a non-empty passphrase, which was fine on EC2, but not Rackspace (not sure why). I've added some documentation warning about this.

The ZooKeeper test works, but Cassandra doesn't yet (it's listening on the wrong interface, I think). Hadoop also fails, repeatedly, with the following error:

{code}
2010-10-22 14:46:28,422 INFO  [org.apache.whirr.service.hadoop.HadoopService] (main) Starting 1 worker node(s)
2010-10-22 14:46:30,127 ERROR [jclouds.compute] (user thread 1) starting nodes, completed: 0/1, errors: 1, rate: 625ms/op
java.util.concurrent.ExecutionException: org.jclouds.rest.ResourceNotFoundException: /v1.0/418003/servers
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.jclouds.concurrent.FutureIterables$1.run(FutureIterables.java:121)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:637)
Caused by: org.jclouds.rest.ResourceNotFoundException: /v1.0/418003/servers
	at org.jclouds.rackspace.cloudservers.handlers.ParseCloudServersErrorFromHttpResponse.handleError(ParseCloudServersErrorFromHttpResponse.java:69)
	at org.jclouds.http.handlers.DelegatingErrorHandler.handleError(DelegatingErrorHandler.java:70)
	at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.shouldContinue(BaseHttpCommandExecutorService.java:193)
	at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.call(BaseHttpCommandExecutorService.java:163)
	at org.jclouds.http.internal.BaseHttpCommandExecutorService$HttpResponseCallable.call(BaseHttpCommandExecutorService.java:132)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	... 3 more
2010-10-22 14:46:30,128 ERROR [jclouds.compute] (main) starting nodes, completed: 0/1, errors: 1, rate: 627ms/op
java.lang.RuntimeException: starting nodes, completed: 0/1, errors: 1, rate: 627ms/op
	at org.jclouds.concurrent.FutureIterables.awaitCompletion(FutureIterables.java:139)
	at org.jclouds.compute.internal.BaseComputeService.runNodesWithTag(BaseComputeService.java:160)
	at org.apache.whirr.service.hadoop.HadoopService.launchCluster(HadoopService.java:168)
	at org.apache.whirr.service.hadoop.integration.HadoopServiceTest.setUp(HadoopServiceTest.java:89)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
	at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
	at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:115)
	at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:102)
	at org.apache.maven.surefire.Surefire.run(Surefire.java:180)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:350)
	at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1021)
{code}

Adrian, any idea what this might be? 


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924220#action_12924220 ] 

Tom White commented on WHIRR-52:
--------------------------------

Thanks for tracking this down Adrian!

> As a precaution, when someone explicitly specifies a locationId, jclouds resets the rest of the template to default. Now, we can argue to change this behaviour, but as of beta-7 this is the case.

Yes, this was unexpected. Does EC2 have different behaviour?

> A workaround is to comment out the above line. Then, we should figure out what logic jclouds should follow and request a corresponding change in the project. We can also patch whatever that is in whirr to make it immediately available.

Setting the location ID was introduced in WHIRR-113. We could also just build a new template for the moment. I'll produce a new patch.


[jira] Assigned: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White reassigned WHIRR-52:
------------------------------

    Assignee: Tom White


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883419#action_12883419 ] 

Jeff Hammerbacher commented on WHIRR-52:
----------------------------------------

I've confirmed that {{mapred.job.tracker}} is set to {{184-106-196-148.static.cloud-ips.com:8021}} on the configuration object used to submit the MapReduce job to the cluster. Might the JobClient be picking up the bad address somewhere else, particularly from the instances set in the HadoopCluster object? That guy is using the {{.getPublicAddresses()}} URL instead of the {{.getHostName()}} URL, which would result in the error I'm seeing above.


[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Adrian Cole (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924122#action_12924122 ] 

Adrian Cole commented on WHIRR-52:
----------------------------------

The first thing I'm noticing is that the same "tag" is used for the master node and the slave nodes. In jclouds the tag implies identical configuration, although I concede that it will still probably work ;) Looking more closely now.

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>             Fix For: 0.2.0
>
>         Attachments: WHIRR-52.patch, WHIRR-52.patch
>
>



[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912337#action_12912337 ] 

Tom White commented on WHIRR-52:
--------------------------------

Thanks for the ideas, Adrian. I'm not sure the IP address is the problem since I can manually ssh to the node using the IP address. (But I agree that at some point we should change to something that gives more control over resolution than InetAddress.)
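
As an illustration of what "more control over resolution" could look like, here is a minimal sketch using only the JDK. It is an assumption about one possible approach, not what Whirr or jclouds actually does: the class name {{DeferredResolutionDemo}} and the helper {{unresolved}} are hypothetical, and the host name is the one quoted elsewhere in this issue. {{InetSocketAddress.createUnresolved}} keeps the host name as a plain string and performs no DNS lookup, so the caller decides when (and whether) to resolve it.

```java
import java.net.InetSocketAddress;

public class DeferredResolutionDemo {

    // Hypothetical helper: wraps host and port without triggering a DNS lookup.
    static InetSocketAddress unresolved(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress jt = unresolved("184-106-196-148.static.cloud-ips.com", 8021);

        // No lookup has happened, so the address is flagged as unresolved
        // and there is no InetAddress attached to it yet.
        System.out.println("unresolved? " + jt.isUnresolved());
        System.out.println("address:    " + jt.getAddress());

        // getHostString() returns the name as given, again without a lookup.
        System.out.println("endpoint:   " + jt.getHostString() + ":" + jt.getPort());
    }
}
```

With this shape the host name travels unchanged until something explicitly constructs a resolved address from it, which is the point at which a custom resolution policy could be applied.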

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>         Attachments: WHIRR-52.patch
>
>



[jira] Commented: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883401#action_12883401 ] 

Jeff Hammerbacher commented on WHIRR-52:
----------------------------------------

From the above error message, I think something funky is happening when we grab the public name of the JT ({{184-106-200-188.static.cloud-ips.com/184.106.200.188:8021}} sure doesn't look right). Will debug a bit further, but if someone knows how the jclouds Rackspace Cloud Servers API works, that knowledge would be useful here.
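
For reference, the {{host/ip:port}} shape of that string matches how {{java.net.InetSocketAddress}} prints a resolved address, so the format by itself may not indicate a malformed name. The following is a minimal sketch using only the JDK; it assumes nothing about Whirr, Hadoop, or jclouds internals. The class name {{AddressFormatDemo}} and the helper {{jobTracker}} are hypothetical, and the host name and IP are copied from the error message above.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class AddressFormatDemo {

    // Hypothetical helper: pairs the host name with its raw IP bytes.
    // InetAddress.getByAddress(host, bytes) does not contact a name service.
    static InetSocketAddress jobTracker() {
        byte[] ip = {(byte) 184, (byte) 106, (byte) 200, (byte) 188};
        try {
            InetAddress addr =
                InetAddress.getByAddress("184-106-200-188.static.cloud-ips.com", ip);
            return new InetSocketAddress(addr, 8021);
        } catch (UnknownHostException e) {
            // Only thrown for an illegal byte-array length; not expected here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress jt = jobTracker();

        // toString() renders a resolved IPv4 address as "hostname/ip:port".
        System.out.println(jt);

        // The individual parts are available separately.
        System.out.println(jt.getHostName() + ":" + jt.getPort());
    }
}
```

If the first line printed looks like the string in the log, the log line is just the default rendering of a resolved socket address, and the connection failure would need another explanation.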

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>



[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Attachment: WHIRR-52.patch

I had a look at this and hit another problem (this is with ZooKeeper, which is a simpler service to launch, so I started with that one):

{noformat}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 134.141 sec <<< FAILURE!
test(org.apache.whirr.service.zookeeper.integration.ZooKeeperServiceTest)  Time elapsed: 134.007 sec  <<< ERROR!
java.io.IOException: org.jclouds.compute.RunScriptOnNodesException: error runScript on filtered nodes options(RunScriptOptions [overridingCredentials=true, runAsRoot=true])
Execution failures:

0 error[s]
Node failures:

1) SshException on node 337685:
org.jclouds.ssh.SshException: root@173.203.210.13:22: Error connecting to session.
        at org.jclouds.ssh.jsch.JschSshClient.propagate(JschSshClient.java:252)
        at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:199)
        at org.jclouds.compute.internal.BaseComputeService$3.call(BaseComputeService.java:357)
        at org.jclouds.compute.internal.BaseComputeService$3.call(BaseComputeService.java:346)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:637)
Caused by: com.jcraft.jsch.JSchException: Auth fail
        at com.jcraft.jsch.Session.connect(Session.java:452)
        at com.jcraft.jsch.Session.connect(Session.java:150)
        at org.jclouds.ssh.jsch.JschSshClient.newSession(JschSshClient.java:247)
        at org.jclouds.ssh.jsch.JschSshClient.connect(JschSshClient.java:184)
        ... 7 more
{noformat}

Rackspace doesn't return the credentials for nodes after the initial launch, so I tried to override the credentials with the private key.
I'm not sure why there is an authentication error, since I was able to ssh in manually using the same private key.

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>         Attachments: WHIRR-52.patch
>
>



[jira] Updated: (WHIRR-52) Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers

Posted by "Tom White (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-52:
---------------------------

    Fix Version/s: 0.2.0

> Allow a Hadoop MapReduce job to be run against a Hadoop Service running on Rackspace Cloud Servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: WHIRR-52
>                 URL: https://issues.apache.org/jira/browse/WHIRR-52
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Jeff Hammerbacher
>             Fix For: 0.2.0
>
>         Attachments: WHIRR-52.patch
>
>

