Posted to dev@whirr.apache.org by "Akash Ashok (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 02:43:39 UTC

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184651#comment-13184651 ] 

Akash Ashok commented on WHIRR-459:
-----------------------------------

I was testing why this happens. The failing call is ReverseMap.fromAddress(hostIp). As it turns out, the problem is not so much the value being passed for resolution as the network from which the call is made: I ran the same function from my system as standalone code and it threw the same exception.

When I re-ran it on a system with a different IP address, it ran fine. So I am guessing this API somehow makes a reverse connection.
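
For reference, here is a minimal sketch of the name-building half of a reverse lookup, using only the JDK. It assumes that ReverseMap.fromAddress does nothing more than turn an address into its in-addr.arpa query name, and that any network traffic happens later, when the resolver actually sends the PTR query (the stack trace shows the refused connection inside the resolver's TCP client). The class and method names below are hypothetical and are not part of Whirr or dnsjava.

```java
public class ReverseNameSketch {

    // Hypothetical helper: builds the PTR query name for a dotted-quad
    // IPv4 address by reversing the octets and appending "in-addr.arpa.".
    // This is pure string manipulation; no DNS server is contacted here.
    static String reverseName(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        StringBuilder name = new StringBuilder();
        for (int i = 3; i >= 0; i--) {
            int value = Integer.parseInt(octets[i]);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("octet out of range: " + octets[i]);
            }
            name.append(value).append('.');
        }
        return name.append("in-addr.arpa.").toString();
    }

    public static void main(String[] args) {
        // Public address of the first node in the log quoted below.
        System.out.println(reverseName("204.236.208.250"));
        // prints 250.208.236.204.in-addr.arpa.
    }
}
```

If that assumption holds, the same input produces the same query name on any machine, which is consistent with the failure depending on the network the query is sent from rather than on the address being resolved.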

I will post further on this issue.
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system that runs behind a NAT, I get the following exception. The cluster is spawned and then destroyed. The same command runs fine from an EC2 instance.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
