Posted to user@cassandra.apache.org by Huiliang Zhang <zh...@gmail.com> on 2014/05/12 20:31:27 UTC

Can Cassandra client programs use hostnames instead of IPs?

Hi,

Cassandra returns the IPs of the nodes in the Cassandra cluster for further
communication between a Hadoop program and the Cassandra cluster. Is there a
way to configure the Cassandra cluster to return hostnames instead of IPs?
My Cassandra cluster is on AWS and has no Elastic IPs that can be reached
from outside AWS.

Thanks,
Huiliang

Re: Can Cassandra client programs use hostnames instead of IPs?

Posted by Huiliang Zhang <zh...@gmail.com>.
Thanks. In my case there is no public IP and a VPN cannot be set up. It
seems that I have to run an EMR job to operate on the AWS Cassandra cluster.

I got some timeout errors while running the EMR job:
java.lang.RuntimeException: Could not retrieve endpoint ranges:
    at org.apache.cassandra.hadoop.BulkRecordWriter$ExternalClient.init(BulkRecordWriter.java:333)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:144)
    at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:228)
    at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:213)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:658)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.hadoop.BulkRecordWriter$ExternalClient.createThriftClient(BulkRecordWriter.java:348)
    at org.apache.cassandra.hadoop.BulkRecordWriter$ExternalClient.init(BulkRecordWriter.java:293)
    ... 12 more
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
    ... 15 more

I would appreciate any suggestions.
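
For reference, the root cause in the trace is a java.net.ConnectException (Connection timed out) raised from TSocket.open, so the TCP connection to a node's Thrift port never completed. A minimal sketch of a reachability check that could be run from the machine executing the job. It assumes Cassandra's default rpc_port of 9160; the class name PortCheck and the address 10.0.0.1 are hypothetical placeholders, not taken from the job itself.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolvable hostnames.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical address; replace with an IP that Cassandra returned to the job.
        String host = args.length > 0 ? args[0] : "10.0.0.1";
        int port = 9160; // assumption: the default Thrift rpc_port is in use
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 5000));
    }
}
```

If this prints false for the addresses the cluster hands out, the problem is likely network reachability (security groups, routing, private-only addressing) rather than the Hadoop job itself.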


On Tue, May 13, 2014 at 7:45 AM, Ben Bromhead <be...@instaclustr.com> wrote:

> You can set listen_address in cassandra.yaml to a hostname (
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html
> ).
>
> Cassandra will use the IP address returned by a DNS query for that
> hostname. On AWS you don't have to assign an Elastic IP; every instance
> comes with a public IP that lasts for its lifetime (if you use EC2-Classic or
> your VPC is set up to assign them).
>
> Note that whatever hostname you set in a node's listen_address, it will
> need to resolve to the private IP, as AWS instances only have network access via
> their private address. Traffic to an instance's public IP is NATed and
> forwarded to the private address. So you may as well just use the node's IP
> address.
>
> If you run Hadoop on instances in the same AWS region, it will be able to
> access your Cassandra cluster via private IP. If you run Hadoop externally,
> just use the public IPs.
>
> If you run in a VPC without public addressing and want to connect from
> external hosts you will want to look at a VPN (
> http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html).
>
> Ben Bromhead
> Instaclustr | www.instaclustr.com | @instaclustr<http://twitter.com/instaclustr> |
> +61 415 936 359

Re: Can Cassandra client programs use hostnames instead of IPs?

Posted by Ben Bromhead <be...@instaclustr.com>.
You can set listen_address in cassandra.yaml to a hostname (http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html). 

Cassandra will use the IP address returned by a DNS query for that hostname. On AWS you don't have to assign an Elastic IP; every instance comes with a public IP that lasts for its lifetime (if you use EC2-Classic or your VPC is set up to assign them).
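
To see which IP a given hostname would map to, the lookup can be reproduced on the node itself. This is a minimal sketch using the JDK's standard resolver, not Cassandra's own startup code; the class name ResolveListenAddress is hypothetical, and "localhost" is only a stand-in for whatever hostname you put in listen_address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ResolveListenAddress {
    // Resolves a hostname (or IP literal) to the textual IP the JVM would use.
    static String resolve(String hostname) throws UnknownHostException {
        return InetAddress.getByName(hostname).getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Placeholder default; pass the node's configured hostname as the first argument.
        String hostname = args.length > 0 ? args[0] : "localhost";
        System.out.println(hostname + " resolves to " + resolve(hostname));
    }
}
```

Running it on the node and from the client machine shows whether both sides see the same address for the name.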

Note that whatever hostname you set in a node's listen_address, it will need to resolve to the private IP, as AWS instances only have network access via their private address. Traffic to an instance's public IP is NATed and forwarded to the private address. So you may as well just use the node's IP address.
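
A quick way to confirm that a name resolves to a private address is sketched below. It relies on the JDK's InetAddress.isSiteLocalAddress(), which for IPv4 covers the 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 ranges; the class name PrivateAddressCheck and the sample addresses are hypothetical, chosen only for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class PrivateAddressCheck {
    // True if the hostname or IP literal resolves to a site-local (private) address.
    static boolean resolvesToPrivate(String hostOrIp) throws UnknownHostException {
        return InetAddress.getByName(hostOrIp).isSiteLocalAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Sample inputs: one private-range literal and one documentation-range public literal.
        String[] samples = args.length > 0 ? args : new String[] {"10.0.0.5", "203.0.113.10"};
        for (String sample : samples) {
            System.out.println(sample + " private: " + resolvesToPrivate(sample));
        }
    }
}
```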

If you run Hadoop on instances in the same AWS region, it will be able to access your Cassandra cluster via private IP. If you run Hadoop externally, just use the public IPs.

If you run in a VPC without public addressing and want to connect from external hosts you will want to look at a VPN (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html).

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359



