Posted to user@cassandra.apache.org by Brian Tarbox <ta...@cabotresearch.com> on 2014/02/12 19:13:19 UTC

in AWS is it worth trying to talk to a server in the same zone as your client?

We're running a C* cluster with 6 servers spread across the four us-east-1
zones.

We also spread our clients (hundreds of them) across the four zones.

Currently we give our clients a connection string listing all six servers
and let C* do its thing.

This is all working just fine...and we're paying a fair bit in AWS transfer
costs.  There is a suspicion that this transfer cost is driven by us
passing data around between our C* servers and clients.

Would there be any value to trying to get a client to talk to one of the C*
servers in its own zone?

I understand (at least partially!) about coordinator nodes and replication
and know that no matter which server is the coordinator for an operation
replication may cause bits to get transferred to/from servers in other
zones.  Having said that...is there a chance that trying to encourage a
client to initially contact a server in its own zone would help?

Thank you,

Brian Tarbox

Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Feb 12, 2014 at 1:14 PM, Ben Bromhead <be...@instaclustr.com> wrote:

> An alternate method would be to define the zones as data centres and then
> you could leverage existing DC aware policies (We've never tried this
> though).
>

https://issues.apache.org/jira/browse/CASSANDRA-3810

=Rob

Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Brian Tarbox <ta...@cabotresearch.com>.
We're definitely using all private IPs.

I guess my question really is: with repl=3 and quorum operations I know
we're going to push/pull bits across the various AZs within us-east-1.  So,
does having the client start the conversation with a server in the same AZ
save us anything?
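
One rough way to size this question, as a sketch rather than a measurement, is to count how often the first hop (client to coordinator) leaves the client's zone. The zone names and the 2/2/1/1 node split below are assumptions; the thread only says six servers across four zones. Clients are assumed to be spread evenly over the zones and, without a zone-aware policy, to pick any of the six nodes with equal probability. Replica traffic between servers is deliberately left out.

```python
from fractions import Fraction

# Hypothetical layout: the thread does not say how the six nodes are split.
NODES_PER_ZONE = {"us-east-1a": 2, "us-east-1b": 2, "us-east-1c": 1, "us-east-1d": 1}


def cross_zone_fraction(nodes_per_zone):
    """Fraction of client-to-coordinator hops that leave the client's zone.

    Assumes clients are spread evenly over the zones and each request picks
    a coordinator uniformly at random from all nodes.
    """
    total_nodes = sum(nodes_per_zone.values())
    zone_count = len(nodes_per_zone)
    # P(same zone) = sum over zones of P(client in zone) * P(coordinator in zone)
    same_zone = sum(
        Fraction(1, zone_count) * Fraction(n, total_nodes)
        for n in nodes_per_zone.values()
    )
    return 1 - same_zone


print(cross_zone_fraction(NODES_PER_ZONE))  # 3/4
```

Under those assumptions three out of four requests cross a zone boundary before the coordinator has contacted any replica. Picking a same-zone coordinator would remove that hop's cross-zone traffic, but not the replica reads and writes behind it.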



Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Ben Bromhead <be...@instaclustr.com>.
$0.01/GB between zones, irrespective of IP, is correct.

As for your original question, depending on the driver you are using you could write a custom co-ordinator node selection policy.

For example if you are using the Datastax driver you would extend http://www.datastax.com/drivers/java/2.0/apidocs/com/datastax/driver/core/policies/LoadBalancingPolicy.html

… and set the distance based on which zone the node is in.

An alternate method would be to define the zones as data centres and then you could leverage existing DC aware policies (We've never tried this though). 
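
The zone-based selection idea can be sketched without any driver at all. The code below is not the DataStax API; it only illustrates the ordering such a policy might produce, with same-zone nodes tried first and the rest kept as fallbacks. The addresses, zone names, and the function name order_coordinators are all hypothetical.

```python
import random


def order_coordinators(client_zone, node_zones, rng=random):
    """Return node addresses with same-zone nodes first.

    node_zones maps a node address to its availability zone. Nodes in other
    zones stay in the plan as fallbacks, so losing the local nodes does not
    strand the client.
    """
    local = [node for node, zone in node_zones.items() if zone == client_zone]
    remote = [node for node, zone in node_zones.items() if zone != client_zone]
    rng.shuffle(local)   # spread load across the local nodes
    rng.shuffle(remote)  # and across the fallbacks
    return local + remote


# Hypothetical addresses and zone assignments.
NODE_ZONES = {
    "10.0.1.10": "us-east-1a",
    "10.0.1.11": "us-east-1a",
    "10.0.2.10": "us-east-1b",
    "10.0.2.11": "us-east-1b",
    "10.0.3.10": "us-east-1c",
    "10.0.4.10": "us-east-1d",
}

plan = order_coordinators("us-east-1a", NODE_ZONES)
print(plan[:2])  # the two us-east-1a nodes, in random order
```

In a real driver the same decision would live inside the load balancing policy linked above; how a client learns its own zone is left open here.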


Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359






Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Andrey Ilinykh <ai...@gmail.com>.
I think you are mistaken. It is true within the same zone. Between zones it is $0.01/GB.



Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Russell Bradberry <rb...@gmail.com>.
Not when using private IP addresses.  That pricing ONLY applies if you are using the public interface or EIP/ENI.  If you use the private IP addresses there is no cost associated.











Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by William Oberman <ob...@civicscience.com>.
Same region, cross zone transfer is $0.01 / GB (see
http://aws.amazon.com/ec2/pricing/, Data Transfer section).
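
To turn that rate into a monthly figure, a minimal sketch follows. The rate is the one quoted above; the 10,000 GB volume is a made-up number for illustration, not a figure from this thread, and the function name is hypothetical.

```python
from decimal import Decimal

CROSS_ZONE_USD_PER_GB = Decimal("0.01")  # rate quoted in this thread


def monthly_cross_zone_cost(gb_per_month, rate=CROSS_ZONE_USD_PER_GB):
    """Cost in USD of moving gb_per_month gigabytes between zones."""
    return Decimal(gb_per_month) * rate


# Hypothetical volume: 10,000 GB of cross-zone traffic in a month.
print(monthly_cross_zone_cost(10_000))  # 100.00
```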



Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Russell Bradberry <rb...@gmail.com>.
Cross zone data transfer does not cost any extra money. 

LOCAL_QUORUM = QUORUM if all 6 servers are located in the same logical datacenter.  
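
The arithmetic behind that statement, as a small sketch: a quorum is a majority of the replicas counted, and LOCAL_QUORUM counts only the replicas in the coordinator's datacenter. The datacenter names below are placeholders, and the replication factor of 3 is the one mentioned elsewhere in this thread.

```python
def quorum(replica_count):
    """Majority of replica_count replicas."""
    return replica_count // 2 + 1


def global_quorum(rf_by_dc):
    """QUORUM: majority over the replicas in every datacenter."""
    return quorum(sum(rf_by_dc.values()))


def local_quorum(rf_by_dc, local_dc):
    """LOCAL_QUORUM: majority over the replicas in one datacenter."""
    return quorum(rf_by_dc[local_dc])


single_dc = {"dc1": 3}          # one logical datacenter, RF 3
two_dcs = {"dc1": 3, "dc2": 3}  # hypothetical two-datacenter contrast

print(global_quorum(single_dc), local_quorum(single_dc, "dc1"))  # 2 2
print(global_quorum(two_dcs), local_quorum(two_dcs, "dc1"))      # 4 2
```

With a single logical datacenter both levels need the same two replicas, so switching between them changes nothing; the two only diverge once more than one datacenter is defined.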

Ensure your clients are connecting to either the local IP or the AWS hostname that is a CNAME to the local ip from within AWS.  If you connect to the public IP you will get charged for outbound data transfer.
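
A quick way to check a client's contact points, sketched with the Python standard library. The addresses are placeholders, and the check only works on IP literals; hostnames would need to be resolved first.

```python
import ipaddress


def uses_private_addresses(contact_points):
    """True if every contact point is a private (non-public) IP address."""
    return all(ipaddress.ip_address(addr).is_private for addr in contact_points)


# Placeholder addresses: 10.x.x.x is private, 8.8.8.8 stands in for a public IP.
print(uses_private_addresses(["10.0.1.10", "10.0.2.10"]))  # True
print(uses_private_addresses(["10.0.1.10", "8.8.8.8"]))    # False
```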







Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Yogi Nerella <yn...@gmail.com>.
Also, maybe you need to check that the read consistency is set to
local_quorum, otherwise the servers still try to read the data from all
other data centers.

I can understand the latency, but I can't understand how it would save
money. The amount of data transferred from the AWS server to the client
should be the same no matter where the client is connected?




Re: in AWS is it worth trying to talk to a server in the same zone as your client?

Posted by Andrey Ilinykh <ai...@gmail.com>.
Yes, sure. Taking data from the same zone will reduce latency and save you
some money.

