Posted to dev@whirr.apache.org by "Akash Ashok (Created) (JIRA)" <ji...@apache.org> on 2011/12/31 01:07:30 UTC

[jira] [Created] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

DNS Failure when trying to spawn HBase cluster
----------------------------------------------

                 Key: WHIRR-459
                 URL: https://issues.apache.org/jira/browse/WHIRR-459
             Project: Whirr
          Issue Type: Bug
    Affects Versions: 0.7.0
         Environment: Trying to use Whirr from behind a NAT
            Reporter: Akash Ashok


While trying to launch an HBase cluster from a machine that sits behind a NAT, I get the following exception. The cluster is spawned and then immediately destroyed. The same launch works fine when run from another EC2 instance.

bin/whirr launch-cluster --config hbase-ec2.properties
Bootstrapping cluster
Configuring template
Configuring template
Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
[id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
Unable to start the cluster. Terminating all nodes.
org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
    at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
    at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
    at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
    at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
    at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
    at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
    at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
    at org.apache.whirr.cli.Main.run(Main.java:64)
    at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
    at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
    at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
    at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
    at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
    ... 11 more
Unable to load cluster state, assuming it has no running nodes.
java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:137)
    at com.google.common.io.Files$1.getInput(Files.java:100)
    at com.google.common.io.Files$1.getInput(Files.java:97)
    at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
    at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
    at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
    at com.google.common.io.Files.readLines(Files.java:580)
    at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
    at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
    at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
    at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
    at org.apache.whirr.cli.Main.run(Main.java:64)
    at org.apache.whirr.cli.Main.main(Main.java:97)
Starting to run scripts on cluster for phase destroyinstances: 
Starting to run scripts on cluster for phase destroyinstances: 
Finished running destroy phase scripts on all cluster instances
Destroying hbase cluster
Cluster hbase destroyed
Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
    at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
    at org.apache.whirr.cli.Main.run(Main.java:64)
    at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
    at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
    at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
    at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
    at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
    at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
    at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
    ... 3 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
    at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
    at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
    at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
    at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
    at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
    ... 11 more


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Alex Heneveld (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Heneveld updated WHIRR-459:
--------------------------------

    Attachment: WHIRR-459-fallback-to-jclouds-hostname.patch

Patch which not only catches the error but also falls back to the jclouds hostname.
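
For readers without the attachment at hand, the fallback idea can be sketched roughly as follows. This is not the attached patch: the `ReverseLookup` interface, the `HostNameFallback` class, and the `publicHostName` method are hypothetical names invented for illustration, and the order of fallbacks (provider hostname first, then the raw IP) is an assumption.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical stand-in for a reverse-DNS lookup that may fail with an I/O error. */
interface ReverseLookup {
    String lookup(String ip) throws IOException;
}

final class HostNameFallback {
    /**
     * Tries the reverse lookup first. On any I/O failure it prefers the
     * hostname reported by the cloud provider and, failing that, the raw IP.
     */
    static String publicHostName(ReverseLookup dns, String publicIp, String providerHostName) {
        try {
            return dns.lookup(publicIp);
        } catch (IOException e) {
            if (providerHostName != null && !providerHostName.isEmpty()) {
                return providerHostName;
            }
            return publicIp;
        }
    }

    public static void main(String[] args) {
        // Simulates the "Connection refused" failure from the report.
        ReverseLookup refused = ip -> { throw new ConnectException("Connection refused"); };
        System.out.println(publicHostName(refused, "204.236.208.250", "domU-12-31-39-0F-94-D1"));
    }
}
```

With a lookup that always refuses the connection, the sketch returns the provider-supplied hostname instead of aborting the launch.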
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>             Fix For: 0.8.0
>
>         Attachments: WHIRR-459-fallback-to-jclouds-hostname.patch, WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Ashish Paliwal (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225188#comment-13225188 ] 

Ashish Paliwal commented on WHIRR-459:
--------------------------------------

Not sure if this will be of help, but here is a recent thread on the HBase mailing list about the DNS issue. It also mentions a utility:

http://markmail.org/thread/d3l46ejly5kr63g5
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224481#comment-13224481 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

I am looking at FastDnsResolver.java:

{code:java}
  @Override
  public String apply(String hostIp) {
    try {
      ...
    } catch (SocketTimeoutException e) {
      return hostIp;  /* same response as standard Java on timeout */
    } catch (IOException e) {
      throw new DnsException(e);
    }
  }
{code}

Maybe it is possible to catch java.net.ConnectException as well and, as with SocketTimeoutException, return hostIp. If not, why not?
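
A minimal sketch of that suggestion, for illustration only. It is not the actual FastDnsResolver source or the attached patch: the ReverseLookup interface stands in for the elided body of the try block, and the nested DnsException stands in for org.apache.whirr.net.DnsException. Both names are assumptions made so the example compiles on its own.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

public class FallbackDnsResolverSketch {

  /** Hypothetical stand-in for the elided reverse-lookup code inside apply(). */
  public interface ReverseLookup {
    String lookup(String hostIp) throws IOException;
  }

  /** Hypothetical stand-in for org.apache.whirr.net.DnsException. */
  public static class DnsException extends RuntimeException {
    public DnsException(Throwable cause) {
      super(cause);
    }
  }

  private final ReverseLookup lookup;

  public FallbackDnsResolverSketch(ReverseLookup lookup) {
    this.lookup = lookup;
  }

  public String apply(String hostIp) {
    try {
      return lookup.lookup(hostIp);
    } catch (SocketTimeoutException | ConnectException e) {
      // Treat a refused connection like a timeout: fall back to the raw IP.
      return hostIp;
    } catch (IOException e) {
      // Any other I/O failure is still reported to the caller.
      throw new DnsException(e);
    }
  }

  public static void main(String[] args) {
    // Simulate a DNS server that refuses TCP connections.
    FallbackDnsResolverSketch resolver = new FallbackDnsResolverSketch(ip -> {
      throw new ConnectException("Connection refused");
    });
    System.out.println(resolver.apply("204.236.208.250")); // prints 204.236.208.250
  }
}
```

With this shape, the launch would no longer abort on "Connection refused", but the caller gets an IP address instead of a host name, so whether that is acceptable on EC2 still needs to be checked.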
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224562#comment-13224562 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

> Do you know why reverse DNS queries over TCP are not supported from your network? 

No, I don't. I have never had any issue related to reverse DNS queries before.

> Can you try to switch to UDP? 

I'll try that.

> or should we always fallback to standard Java on exception? 

Maybe, but I am not sure... if returning the public IP is not going to work on VMs running in Amazon, what is the point?
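
For reference, a sketch of what "standard Java" could mean here, assuming it refers to java.net.InetAddress. The class and method names below are made up for illustration and are not part of Whirr. InetAddress.getCanonicalHostName() returns the textual IP address when the reverse lookup does not succeed, so the caller ends up with the same raw IP discussed above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class StandardReverseLookupSketch {

  /** Hypothetical helper: reverse-resolve an IP literal using only the JDK. */
  public static String reverse(String hostIp) {
    try {
      // For an IP literal, getByName only parses the address; the reverse
      // lookup happens in getCanonicalHostName, which falls back to the
      // textual IP if no name can be found.
      return InetAddress.getByName(hostIp).getCanonicalHostName();
    } catch (UnknownHostException e) {
      // Not a valid address: hand the input back unchanged.
      return hostIp;
    }
  }

  public static void main(String[] args) {
    // The result depends on the resolver available on the local network.
    System.out.println(reverse("127.0.0.1"));
  }
}
```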
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184971#comment-13184971 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Thanks Akash for looking into this. Can we at least make the error message more friendly? Is there a better way of handling this failure scenario (e.g. returning the raw IP in place of the reverse DNS name)?
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
[id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224488#comment-13224488 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

bq. Maybe it is possible to catch java.net.ConnectException and, as for SocketTimeoutException, return hostIp.

Sounds reasonable to me. Would that fix the problem for you? 
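
The fallback suggested above could look roughly like the sketch below. It is not the actual FastDnsResolver code: the class name, the ReverseLookup interface and the resolveOrFallback method are hypothetical stand-ins that keep the example self-contained, and the host name used in main is only illustrative. The one assumption taken from the quoted suggestion is that a timeout already falls back to the IP address, so a refused connection is handled the same way.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

final class DnsFallbackSketch {

    /** Hypothetical stand-in for the reverse DNS query; not a Whirr or dnsjava type. */
    interface ReverseLookup {
        String lookup(String hostIp) throws IOException;
    }

    /** Returns the looked-up host name, or hostIp itself when the DNS server cannot be reached. */
    static String resolveOrFallback(String hostIp, ReverseLookup dns) {
        try {
            return dns.lookup(hostIp);
        } catch (SocketTimeoutException e) {
            // Behaviour described in the comment: on timeout, fall back to the IP.
            return hostIp;
        } catch (ConnectException e) {
            // Proposed addition: treat "Connection refused" the same way.
            return hostIp;
        } catch (IOException e) {
            // Any other I/O failure is still reported to the caller.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String ip = "204.236.208.250";
        // DNS server reachable: the looked-up name is returned (illustrative name).
        System.out.println(resolveOrFallback(ip, h -> "ec2-204-236-208-250.compute-1.amazonaws.com"));
        // DNS server refuses the connection, as in the stack trace: the IP is returned.
        System.out.println(resolveOrFallback(ip, h -> {
            throw new ConnectException("Connection refused");
        }));
    }
}
```

Only the two "server unreachable" cases fall back to the IP in this sketch; other I/O errors still propagate, so a genuinely broken lookup would not be silently masked.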
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224390#comment-13224390 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

Possible: yes. Ideal: no. I normally use my laptop or desktop to develop and test and, when I am ready, I run Whirr. On my laptop or desktop I have everything set up properly. An additional VM just to launch stuff is really a 'PITA'. I know: "if you are behind NAT you are not on the net".

It would be good to describe the change in behaviour between Whirr 0.6.0-incubating (which AFAIK did not have this problem) and Whirr 0.7.1 and understand what functionality/benefit that change brought to the users.

By the way, this isn't limited to HBase clusters: I was firing up a Hadoop cluster.
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Grant Ingersoll (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240694#comment-13240694 ] 

Grant Ingersoll commented on WHIRR-459:
---------------------------------------

I'm pretty sure I'm getting this, too, when running the 5 minute quick start. I can confirm FastDnsResolverTest fails as well. This is for both 0.7.1 and trunk.
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
[id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Akash Ashok (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184690#comment-13184690 ] 

Akash Ashok commented on WHIRR-459:
-----------------------------------

I figured out that this is not an issue with the code per se, but with the network configuration and the accessibility of the DNS servers. I have opened a mail conversation with bwelling@xbill.org.

Can we resolve this issue?
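
For anyone who wants to check the network side before retrying: the `Connection refused` in the trace is raised in `org.xbill.DNS.TCPClient.connect`, so the TCP connection to a DNS server was rejected. Below is a minimal sketch of a probe for that, using only the JDK. The class name `DnsTcpProbe`, the method `tcpReachable`, and the default address are hypothetical, and this is not code from Whirr or dnsjava.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Whirr or dnsjava.
class DnsTcpProbe {

    /**
     * Returns true if a TCP connection to host:port can be opened within
     * timeoutMillis, false on any I/O failure (refused, timed out, unreachable).
     */
    static boolean tcpReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the resolver address is passed as the first argument.
        String resolver = args.length > 0 ? args[0] : "127.0.0.1";
        boolean ok = tcpReachable(resolver, 53, 2000);
        System.out.println(resolver + ":53 reachable over TCP: " + ok);
    }
}
```

Run it with the address of the resolver configured on the machine that launches the cluster (for example, a `nameserver` entry in `/etc/resolv.conf` on Linux). A `false` result suggests that DNS over TCP is blocked or not offered on that path, which would be consistent with the trace in the issue description.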
                


[jira] [Updated] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Castagna updated WHIRR-459:
---------------------------------

    Attachment: WHIRR-459.patch

This is my attempt at fixing this issue.

Running FastDnsResolverTest.java locally from Eclipse, I saw exactly the same exceptions.

With the patch applied, I saw no exceptions. I tried to provision a Hadoop cluster with a patched version of Whirr. The cluster started; however, there might still be issues. Previously, I saw public DNS names for connecting to the NameNode UI and the JobTracker UI. This time, I saw IP addresses.

I failed to connect to them.

I was able to ssh into the instances of the cluster, but on the master I saw errors:

2012-03-07 17:30:34,173 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /50.16.125.61:8021 : Cannot assign requested address
	at org.apache.hadoop.ipc.Server.bind(Server.java:227)
	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:301)
	at org.apache.hadoop.ipc.Server.<init>(Server.java:1483)
	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:545)
	at org.apache.hadoop.ipc.RPC.getServer(RPC.java:506)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2306)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2192)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2186)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:291)
	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4978)
Caused by: java.net.BindException: Cannot assign requested address
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:137)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
	at org.apache.hadoop.ipc.Server.bind(Server.java:225)
	... 10 more

Similar exception for the NameNode.

Any idea of what's going badly wrong here?
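
One likely explanation, stated as an assumption rather than a confirmed diagnosis: an EC2 instance's network interface carries only its private address, and the public address is mapped to it by NAT, so a process on the instance cannot bind to the public IP. The sketch below reproduces that class of failure with plain JDK sockets. The class name `BindCheck` and the method `canBind` are hypothetical, and `192.0.2.1` is a documentation-only address standing in for an IP that is not assigned to the local machine.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Whirr or Hadoop.
class BindCheck {

    /**
     * Returns true if a server socket can be bound to the given IP address
     * on an ephemeral port, false if the bind fails (for example with
     * java.net.BindException: Cannot assign requested address).
     */
    static boolean canBind(String ip) {
        try (ServerSocket socket = new ServerSocket()) {
            // Port 0 asks the OS for any free port; only the address matters here.
            socket.bind(new InetSocketAddress(InetAddress.getByName(ip), 0));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Loopback is assigned to a local interface, so the bind should succeed.
        System.out.println("127.0.0.1 -> " + canBind("127.0.0.1"));
        // An address that no local interface owns is normally rejected by the OS.
        System.out.println("192.0.2.1 -> " + canBind("192.0.2.1"));
    }
}
```

If `canBind` returns false on the master for the address written into the Hadoop configuration, the daemons need a host name or address that resolves to a local interface on the instance.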
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Joris Poort (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220676#comment-13220676 ] 

Joris Poort commented on WHIRR-459:
-----------------------------------

I'm having issues with this, but pardon my ignorance - how does a NAT work and is there any way to get around it?

Thanks... Joris
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Akash Ashok (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184651#comment-13184651 ] 

Akash Ashok commented on WHIRR-459:
-----------------------------------

I was testing why this happens. The stack trace points to the ReverseMap.fromAddress(hostIp) lookup. As it turned out, the problem is not so much the value being passed for resolution as the network from which the call is made. I ran the same function from my system as standalone code and it threw the same exception.

I then re-ran it on a system with another IP address and it ran fine. So I am guessing this API somehow makes a reverse connection.
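
For reference, here is a rough sketch of the name that a reverse lookup such as ReverseMap.fromAddress(hostIp) queries for an IPv4 address. It uses only the JDK, not dnsjava, and the class and method names (ReverseName, reverseName) are made up for illustration. It shows only that the query name depends purely on the IP address, so a failure at this stage points at the network rather than at the value.

```java
class ReverseName {
    /**
     * Builds the PTR query name for a dotted-quad IPv4 address:
     * the octets in reverse order, followed by "in-addr.arpa.".
     */
    static String reverseName(String ipv4) {
        String[] octets = ipv4.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        StringBuilder name = new StringBuilder();
        for (int i = octets.length - 1; i >= 0; i--) {
            int value = Integer.parseInt(octets[i]);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range: " + octets[i]);
            }
            name.append(value).append('.');
        }
        return name.append("in-addr.arpa.").toString();
    }

    public static void main(String[] args) {
        // Public address of the first node in the log above.
        System.out.println(reverseName("204.236.208.250"));
        // prints "250.208.236.204.in-addr.arpa."
    }
}
```

Building this name needs no network access; only sending the query to a DNS server does.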

Will post further on this issue.
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224789#comment-13224789 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Do you think it's better if we implement a fail fast mechanism? 
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Akash Ashok (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184678#comment-13184678 ] 

Akash Ashok commented on WHIRR-459:
-----------------------------------

My mistake. It has nothing to do with a reverse connection, because no reverse connection is being made.

It's the resolver.send call below that is failing: my system is not able to connect to the DNS servers, so it gets "Connection refused".

{code}
Message response = resolver.send(newQuery(record));
{code}
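
One possible way to handle this, sketched below under stated assumptions: treat an unreachable DNS server as non-fatal and fall back to the IP address instead of aborting the launch. This is only an illustration, not Whirr's actual behaviour and not taken from any patch. The ReverseLookup interface and the hostNameOrIp method are hypothetical stand-ins for the code around resolver.send(...).

```java
import java.io.IOException;
import java.net.ConnectException;

class DnsFallbackSketch {
    /** Hypothetical stand-in for the reverse lookup that ends in resolver.send(...). */
    interface ReverseLookup {
        String hostNameFor(String ip) throws IOException;
    }

    /**
     * Returns the resolved host name, or the IP address itself when the
     * lookup fails with an I/O error (for example "Connection refused")
     * or yields no name.
     */
    static String hostNameOrIp(String ip, ReverseLookup lookup) {
        try {
            String name = lookup.hostNameFor(ip);
            return (name == null || name.isEmpty()) ? ip : name;
        } catch (IOException e) {
            System.err.println("Reverse DNS lookup failed for " + ip
                + " (" + e.getMessage() + "), using the IP address instead");
            return ip;
        }
    }

    public static void main(String[] args) {
        // Simulates a client that cannot reach its DNS servers.
        ReverseLookup unreachable = ip -> {
            throw new ConnectException("Connection refused");
        };
        System.out.println(hostNameOrIp("204.236.208.250", unreachable));
    }
}
```

Whether falling back to the IP address is acceptable depends on how the host name is used later in the cluster configuration; the alternative is to fail before any nodes are started.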


                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Grant Ingersoll (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240701#comment-13240701 ] 

Grant Ingersoll commented on WHIRR-459:
---------------------------------------

Also, from what I can tell, it's getting through the install phase (creating the nodes and installing ZooKeeper) but then failing in the configure phase.
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.

[jira] [Updated] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-459:
------------------------------

    Fix Version/s:     (was: 0.7.2)
    
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>             Fix For: 0.8.0
>
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224763#comment-13224763 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

> Can you try to switch to UDP? ( resolver.setTCP(false) )

This didn't help.

However, I think this is a client problem:

 1. check your router/broadband modem
 2. check your DNS configuration settings (i.e. /etc/resolv.conf, or the equivalent network settings on Windows)
 3. run FastDnsResolverTest.java to quickly check if reverse DNS queries with Whirr are working
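
Checks 2 and 3 can also be approximated without a Whirr checkout. The stack trace in this issue ends in org.xbill.DNS.TCPClient.connect with "Connection refused", so the useful question is whether the configured nameservers accept TCP connections on port 53. Below is a minimal sketch using only the JDK. It is not part of Whirr: the class name DnsPreflight, its helper methods, and the 2-second timeout are assumptions made for illustration. It reads /etc/resolv.conf (Unix-like systems only), tries a TCP connection to port 53 on each listed nameserver, and prints the PTR name that a reverse lookup for one of the public addresses from the log would query.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Whirr or dnsjava.
final class DnsPreflight {

  /** Builds the PTR query name for a dotted-quad IPv4 address. */
  static String reverseName(String ipv4) {
    String[] octets = ipv4.trim().split("\\.");
    if (octets.length != 4) {
      throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
    }
    StringBuilder name = new StringBuilder();
    // Octets appear in reverse order under in-addr.arpa.
    for (int i = 3; i >= 0; i--) {
      int value = Integer.parseInt(octets[i]);
      if (value < 0 || value > 255) {
        throw new IllegalArgumentException("Octet out of range: " + ipv4);
      }
      name.append(value).append('.');
    }
    return name.append("in-addr.arpa.").toString();
  }

  /** Extracts the addresses of "nameserver" entries from resolv.conf-style lines. */
  static List<String> nameservers(List<String> lines) {
    List<String> servers = new ArrayList<>();
    for (String line : lines) {
      String[] parts = line.trim().split("\\s+");
      if (parts.length >= 2 && parts[0].equals("nameserver")) {
        servers.add(parts[1]);
      }
    }
    return servers;
  }

  /** Returns true if a TCP connection to host:53 succeeds within the timeout. */
  static boolean tcpPort53Open(String host, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, 53), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Covers "Connection refused" as well as timeouts.
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Path conf = Paths.get("/etc/resolv.conf");
    if (Files.isReadable(conf)) {
      List<String> lines = Files.readAllLines(conf, StandardCharsets.UTF_8);
      for (String server : nameservers(lines)) {
        System.out.println(server + " tcp/53 reachable: " + tcpPort53Open(server, 2000));
      }
    }
    // Public address of the first node in the log above.
    System.out.println("PTR name: " + reverseName("204.236.208.250"));
  }
}
```

If a nameserver reports false here, the refusal probably comes from the local resolver or router rather than from Whirr, which would be consistent with the same launch working from an EC2 instance.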
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-459:
------------------------------

    Fix Version/s: 0.7.2
                   0.8.0

Putting this on the roadmap for 0.7.2 and 0.8.0. Unfortunately, handling the ConnectException is not going to make things work for Hadoop & HBase. What we need is the ability to fetch the hostname using the API (this works for Amazon).
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>             Fix For: 0.8.0, 0.7.2
>
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Alex Heneveld (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272199#comment-13272199 ] 

Alex Heneveld commented on WHIRR-459:
-------------------------------------

I ran into this. Catching the error helps sometimes, but the launch then failed with Hadoop 1.0.2 (not sure whether it's a problem with 0.2x) because Hadoop was attempting to bind to the _public IP_, which is not available.

The patch I'm attaching adds one extra fallback: checking whether nodeMetadata.getHostname() is suitable, and using that in preference to the purely numeric IP address. The good thing about this is that internally (EC2 definitely, other clouds I think) the hostname resolves to the private IP, but externally it resolves to the public address.

(This may address a few of the other DNS-woe issues.)
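
For readers following along, here is a minimal sketch of the fallback order described above: try reverse DNS first, then the hostname from the node metadata, and only then the numeric IP. It is not the attached patch. HostnameFallback, choose and isSuitable are hypothetical names, the resolver is passed in as a plain java.util.function.Function rather than Whirr's FastDnsResolver, and the "suitable" test (non-empty and not a dotted-quad IP) is an assumption about what the patch checks.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

/** Sketch only: the names below are hypothetical and are not Whirr's API. */
final class HostnameFallback {

    // Matches a dotted-quad IPv4 literal such as 204.236.208.250.
    private static final Pattern NUMERIC_IP =
        Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    /** Assumed notion of "suitable": non-empty and not just a numeric IP. */
    static boolean isSuitable(String hostname) {
        return hostname != null
            && !hostname.trim().isEmpty()
            && !NUMERIC_IP.matcher(hostname.trim()).matches();
    }

    /**
     * Picks a public host name: reverse DNS first, then the hostname from
     * the node metadata, and only then the numeric IP address.
     */
    static String choose(String publicIp, String metadataHostname,
                         Function<String, String> reverseDns) {
        try {
            String resolved = reverseDns.apply(publicIp);
            if (isSuitable(resolved)) {
                return resolved.trim();
            }
        } catch (RuntimeException e) {
            // For example, an unchecked DNS failure such as the
            // "Connection refused" seen from behind a NAT in this issue.
        }
        if (isSuitable(metadataHostname)) {
            return metadataHostname.trim();
        }
        return publicIp;
    }

    public static void main(String[] args) {
        // Simulate the failure from this issue: the resolver always throws.
        Function<String, String> failing = ip -> {
            throw new IllegalStateException("Connection refused");
        };
        // Metadata hostname is usable, so it wins over the numeric IP.
        System.out.println(
            choose("204.236.208.250", "domU-12-31-39-0F-94-D1", failing));
        // No usable hostname: fall back to the numeric IP.
        System.out.println(choose("204.236.208.250", null, failing));
    }
}
```

The addresses and hostname in main are taken from the log in the issue description. Whether a short name like domU-12-31-39-0F-94-D1 would pass the real patch's suitability check is not something this thread confirms.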

                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>             Fix For: 0.8.0
>
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Grant Ingersoll (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240724#comment-13240724 ] 

Grant Ingersoll commented on WHIRR-459:
---------------------------------------

I can confirm that the patch included here fixes the problem for me when running the 5-minute quick start and the FastDnsResolverTest. What a colossal waste of a day tracking that one down!
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224559#comment-13224559 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Or should we always fall back to the standard Java DNS lookup on exception?
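
To make the question concrete, here is a minimal sketch of what such a fallback could look like. It is not the attached patch and not Whirr's actual code: the class name FallbackDnsResolver and its methods are hypothetical, and java.util.function.Function stands in for the Guava Function that FastDnsResolver implements. It assumes the fast resolver signals failure with an unchecked exception, which matches the DnsException in the trace being thrown from apply().

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Function;

/** Hypothetical wrapper, for illustration only; not part of Whirr. */
public class FallbackDnsResolver {

    /**
     * Tries the fast resolver first. If it throws an unchecked exception
     * (for example a DnsException wrapping "Connection refused"), falls
     * back to the reverse lookup built into the JDK.
     */
    public static String resolve(Function<String, String> fastResolver, String ipAddress) {
        try {
            return fastResolver.apply(ipAddress);
        } catch (RuntimeException e) {
            return jdkReverseLookup(ipAddress);
        }
    }

    /**
     * Reverse lookup through java.net.InetAddress. getCanonicalHostName()
     * returns the textual IP address when no host name can be found, so the
     * caller always gets a usable string back.
     */
    static String jdkReverseLookup(String ipAddress) {
        try {
            return InetAddress.getByName(ipAddress).getCanonicalHostName();
        } catch (UnknownHostException e) {
            // The input was not a valid address or resolvable name; return it unchanged.
            return ipAddress;
        }
    }
}
```

The trade-off is that the JDK lookup may be slower, or may return the bare IP address, but the launch would no longer abort and destroy the cluster just because the launcher cannot reach a DNS server directly.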
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224366#comment-13224366 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

I am having the same issue and, like Joris, I am interested in any workaround. Thanks.
                


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224369#comment-13224369 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Paolo, would it be possible for you to use a VM in Amazon as a launcher?
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225112#comment-13225112 ] 

Paolo Castagna commented on WHIRR-459:
--------------------------------------

> Do you think it's better if we implement a fail fast mechanism? 

Andrei, I am not sure... I am clearly not an expert on reverse DNS lookups, and I am not completely sure how they are performed, used, and needed in the context of a tool such as Whirr.

Certainly, from a user's point of view, you do not want to wait (and pay!) to provision a cluster only to find out that something goes wrong towards the end (when you have already paid... and on EC2 you pay by the hour even if you use only 2 minutes). Failing fast is good in general, and even more so, IMHO, in this case.

If reverse DNS lookup is necessary in order to provision a service with Whirr, Whirr should test for it before doing anything that will make a user pay, and it should deliver a clear error message. I also searched for similar errors and suggestions online, but I did not find anything useful other than this JIRA issue. I think others might hit this problem: it isn't a Whirr problem, but Whirr could help with the diagnosis and, as you suggested, fail fast/sooner.

My 2 cents.
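
A rough illustration of the fail-fast idea above, using only the JDK. This is a sketch, not Whirr code: the class name ReverseDnsPreflight and both method names are hypothetical, and the check goes through the JDK's default resolver rather than the dnsjava-over-TCP path shown in the stack trace, so a passing result is only a hint that reverse lookups work from the local network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical pre-flight check; not part of Whirr. */
final class ReverseDnsPreflight {

    /** Builds the PTR query name for a dotted-quad IPv4 address. */
    static String reverseName(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        StringBuilder name = new StringBuilder();
        // Octets appear in reverse order under the in-addr.arpa. zone.
        for (int i = octets.length - 1; i >= 0; i--) {
            int value = Integer.parseInt(octets[i]);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range: " + ipv4);
            }
            name.append(value).append('.');
        }
        return name.append("in-addr.arpa.").toString();
    }

    /**
     * Returns true if the JDK resolver maps the address back to a host name.
     * getCanonicalHostName() returns the textual IP when the lookup fails,
     * so an unchanged string is treated as "no reverse DNS".
     */
    static boolean canReverseResolve(String ipv4) {
        try {
            InetAddress address = InetAddress.getByName(ipv4);
            String host = address.getCanonicalHostName();
            return !host.equals(address.getHostAddress());
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

A launcher could call canReverseResolve with an address that is known to have a PTR record before starting any nodes, and abort with a message that names the record it tried (via reverseName) if the call returns false.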
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224558#comment-13224558 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Thanks, Paolo. This is failing because the public IP is not exposed as a NIC on VMs running on Amazon. Do you know why reverse DNS queries over TCP are not supported on your network? Can you try switching to UDP (resolver.setTCP(false))?
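
To check independently whether a network answers reverse DNS queries over UDP, a small JDK-only probe along these lines may help. It is a sketch under assumptions: the class UdpPtrProbe and its methods are hypothetical, it does not use dnsjava or the setTCP(false) call mentioned above, and it only confirms that some reply with a matching query ID came back, without parsing the answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical probe; builds and sends one PTR query over UDP. */
final class UdpPtrProbe {

    /** Encodes a standard DNS query for the PTR record of an IPv4 address. */
    static byte[] buildPtrQuery(String ipv4, int id) {
        String[] o = ipv4.trim().split("\\.");
        if (o.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        String[] labels = {o[3], o[2], o[1], o[0], "in-addr", "arpa"};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(id >> 8);          // query ID, high byte
        out.write(id);               // query ID, low byte
        out.write(0x01);             // flags: recursion desired
        out.write(0x00);
        out.write(0);                // QDCOUNT = 1
        out.write(1);
        for (int i = 0; i < 6; i++) {
            out.write(0);            // ANCOUNT, NSCOUNT, ARCOUNT = 0
        }
        for (String label : labels) {
            byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
            out.write(bytes.length); // each label is length-prefixed
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);                // root label terminates the name
        out.write(0);                // QTYPE = 12 (PTR)
        out.write(12);
        out.write(0);                // QCLASS = 1 (IN)
        out.write(1);
        return out.toByteArray();
    }

    /** Returns true if the server sends back a reply carrying the same query ID. */
    static boolean answersOverUdp(String dnsServer, String ipv4, int timeoutMillis) {
        byte[] query = buildPtrQuery(ipv4, 0x1234);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis);
            InetAddress server = InetAddress.getByName(dnsServer);
            socket.send(new DatagramPacket(query, query.length, server, 53));
            byte[] buffer = new byte[512];
            DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
            socket.receive(reply);
            return reply.getLength() >= 12
                    && buffer[0] == query[0]
                    && buffer[1] == query[1];
        } catch (IOException e) {
            return false;            // timeout, unreachable server, or similar
        }
    }
}
```

Calling answersOverUdp with the address of the DNS server in use and one of the public addresses from the log would show whether UDP gets through where TCP was refused.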
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224430#comment-13224430 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

Thanks, Paolo. I will look into this more later today. There are some known issues with reverse DNS resolution (WHIRR-511) that we are working on for 0.8.0.
                
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use WHirr from behind a NAT
>            Reporter: Akash Ashok
>
> While trying to launch an HBase cluster from a system which runs behind a NAT I get the following Exception. The cluster is spawned and then it gets destroyed. The same when run from another EC2 instance runs fine.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Paolo Castagna (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224763#comment-13224763 ] 

Paolo Castagna edited comment on WHIRR-459 at 3/7/12 10:23 PM:
---------------------------------------------------------------

> Can you try to switch to UDP? ( resolver.setTCP(false) )

This didn't help.

However, I think this is a client problem:

 1. check your router/broadband modem
 2. check your DNS configuration settings (e.g. /etc/resolv.conf, or the equivalent on Windows)
 3. check your firewall configuration if you are running one
 4. run FastDnsResolverTest.java to quickly check if reverse DNS queries with Whirr are working

In my case, the problem was with 1 (the router).
I can confirm Apache Whirr 0.7.1 works with Apache Hadoop 1.0.1.

You might decide to apply the patch anyway, but it will not spare trouble for others whose reverse DNS requests, for whatever reason, do not work properly.
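
For anyone who wants a quick check of reverse DNS outside of Whirr, here is a minimal sketch. It is not FastDnsResolverTest.java and it does not use the dnsjava resolver seen in the stack trace; it assumes only the JDK, and the names ReverseDnsCheck, ptrName and reverseLookup are made up for illustration. It builds the PTR name that a reverse query asks for, then tries a reverse lookup through the JVM's default resolver, which may behave differently from Whirr's FastDnsResolver.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Whirr.
final class ReverseDnsCheck {

    /** Builds the PTR query name for a dotted-quad IPv4 address. */
    static String ptrName(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        StringBuilder name = new StringBuilder();
        // Reverse lookups list the octets in reverse order under in-addr.arpa.
        for (int i = octets.length - 1; i >= 0; i--) {
            int value = Integer.parseInt(octets[i]);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range: " + octets[i]);
            }
            name.append(value).append('.');
        }
        return name.append("in-addr.arpa.").toString();
    }

    /** Reverse lookup via the JVM's resolver; returns null when no name comes back. */
    static String reverseLookup(String ipv4) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(ipv4);
        // getCanonicalHostName() falls back to the textual IP when the lookup fails.
        String host = address.getCanonicalHostName();
        return host.equals(address.getHostAddress()) ? null : host;
    }

    public static void main(String[] args) throws UnknownHostException {
        String ip = args.length > 0 ? args[0] : "204.236.208.250";
        System.out.println("PTR name: " + ptrName(ip));
        String host = reverseLookup(ip);
        System.out.println(host == null
                ? "No reverse DNS entry resolved"
                : "Resolved to: " + host);
    }
}
```

For the namenode address in the log above, ptrName("204.236.208.250") gives 250.208.236.204.in-addr.arpa. If reverseLookup returns null from behind the NAT but a name from elsewhere, that would point at the local router, DNS settings or firewall rather than at Whirr.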
                
      was (Author: castagna):
    > Can you try to switch to UDP? ( resolver.setTCP(false) )

This didn't help.

However, I think this is a client problem:

 1. check your router/broadband modem
 2. check your DNS configuration settings (e.g. /etc/resolv.conf, or the equivalent on Windows)
 3. run FastDnsResolverTest.java to quickly check if reverse DNS queries with Whirr are working
                  
> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
>                 Key: WHIRR-459
>                 URL: https://issues.apache.org/jira/browse/WHIRR-459
>             Project: Whirr
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Trying to use Whirr from behind a NAT
>            Reporter: Akash Ashok
>         Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system that runs behind a NAT, I get the following exception. The cluster is spawned and then destroyed. The same command runs fine from another EC2 instance.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], 
> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances: 
> Starting to run scripts on cluster for phase destroyinstances: 
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more


[jira] [Commented] (WHIRR-459) DNS Failure when trying to spawn HBase cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224576#comment-13224576 ] 

Andrei Savu commented on WHIRR-459:
-----------------------------------

You are right. At least for Hadoop and HBase I think we need to be able to find the public hostname. The good news is that on Amazon we can retrieve that information using the API. 
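
To make the "public hostname" concrete: the sketch below does not call the EC2 API and is not how Whirr resolves names. It only illustrates the naming convention Amazon documents for public DNS names of IPv4 instances, under the assumption that the convention holds for the region in use. The class and method names (Ec2PublicDnsName, publicDnsName) are made up for illustration; a real fix would read the hostname from the API response instead of constructing it.

```java
// Hypothetical helper, not part of Whirr or the AWS SDK.
final class Ec2PublicDnsName {

    /**
     * Formats the conventional EC2 public DNS name for a public IPv4 address.
     * Assumption: us-east-1 uses the compute-1.amazonaws.com domain and other
     * regions use REGION.compute.amazonaws.com.
     */
    static String publicDnsName(String publicIpv4, String region) {
        String dashed = publicIpv4.trim().replace('.', '-');
        String domain = "us-east-1".equals(region)
                ? "compute-1.amazonaws.com"
                : region + ".compute.amazonaws.com";
        return "ec2-" + dashed + "." + domain;
    }

    public static void main(String[] args) {
        // Public address of the namenode instance from the launch log in this issue.
        System.out.println(publicDnsName("204.236.208.250", "us-east-1"));
    }
}
```

Reading the name from the API would avoid depending on a reverse DNS query from the client machine, which is the step that fails in the stack trace.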
                
