Posted to user@cassandra.apache.org by maneela a <ma...@yahoo.com> on 2010/07/09 19:36:40 UTC

RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Are there any known performance issues when a Cassandra cluster is launched with RackAwareStrategy? I see a huge performance difference between RackAwareStrategy and RackUnAwareStrategy. Here are the details:

We have a cluster of 4 EC2 extra-large nodes. Three of them run in the East region and the fourth runs in the West region. They all communicate with each other through a VPN tunnel interface, which is the only way we found to build a ring across Amazon cloud regions.

We are able to process about 3.5K write operations per second when we use RackUnAwareStrategy:


:/home/ubuntu/cassandra/contrib/py_stress# ./stress.py -o insert -n 80000 -y regular -d ec2-xxx-xxx-xxx-xx.compute-1.amazonaws.com --threads 100 --keep-going
total,interval_op_rate,avg_latency,elapsed_time
35935,3593,0.0289930914479,10
70531,3459,0.0289145907593,20
80000,946,0.0267288666213,30


With RackAwareStrategy, however, we are able to process only about 250 write operations per second:


:/home/ubuntu/cassandra/contrib/py_stress# ./stress.py -o insert -n 80000 -y regular -d ec2-xxx-xxx-xxx-xx.compute-1.amazonaws.com --threads 100 --keep-going
total,interval_op_rate,avg_latency,elapsed_time
2327,232,0.434396038355,10
4772,244,0.40946514036,20
7383,261,0.384504625415,30
9924,254,0.392919449861,40
12525,260,0.383832110482,50
15158,263,0.378838069983,60
17784,262,0.383219807364,70
20416,263,0.381646275973,80
23030,261,0.382550528602,90
25644,261,0.384442176815,100
28268,262,0.380935921084,110
30910,264,0.377376309224,120
33541,263,0.385158945698,130
36119,257,0.387976026517,140
38735,261,0.382333525368,150
41342,260,0.38413751514,160
43925,258,0.387684800391,170
46642,271,0.36899637237,180
49291,264,0.378489510164,190
51931,264,0.3793784538,200
54573,264,0.378474057217,210
57253,268,0.374258003573,220
59884,263,0.380020038658,230
62484,260,0.387267011954,240
64728,224,0.439328571054,250
67340,261,0.389221810455,260
69920,258,0.386144905127,270
72531,261,0.384242234948,280
75202,267,0.372129596605,290
77843,264,0.354621512291,300
80000,215,0.183918378283,310
Thanks in advance
Niru
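
One way to read the two runs above, as a rough sketch rather than a diagnosis: if each of the 100 stress.py threads issues one blocking request at a time (an assumption about how the tool behaves), throughput should be close to threads divided by average latency. The figures below are copied from the first interval of the RackUnAwareStrategy run and the 30-second interval of the RackAwareStrategy run; the names `runs` and `predicted_rate` are made up for this illustration.

```python
# Figures copied from the stress.py output in the post above.
THREADS = 100  # from --threads 100

runs = {
    "RackUnAwareStrategy": {"op_rate": 3593, "avg_latency": 0.0289930914479},
    "RackAwareStrategy": {"op_rate": 261, "avg_latency": 0.384504625415},
}


def predicted_rate(threads, avg_latency):
    """Ops/sec if every thread waits for one request before sending the next.

    This is an assumed client model, not something stated in the thread.
    """
    return threads / avg_latency


for name, run in runs.items():
    estimate = predicted_rate(THREADS, run["avg_latency"])
    print("%s: observed %d ops/s, latency-bound estimate %.0f ops/s"
          % (name, run["op_rate"], estimate))
```

Under that assumption the estimates land within a few percent of the observed rates in both runs, which suggests the drop in throughput is explained by the higher per-request latency rather than by the nodes running out of capacity.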

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by maneela a <ma...@yahoo.com>.

ConsistencyLevel.ONE is the default option in stress.py, so I am using the default.

--- On Fri, 7/9/10, Bill de hÓra <bi...@dehora.net> wrote:

From: Bill de hÓra <bi...@dehora.net>
Subject: Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud
To: user@cassandra.apache.org
Date: Friday, July 9, 2010, 2:12 PM

  east: A B C 

  west: D

Perhaps you are blocking on a write to D - what's your quorum/rf set up
as?


Bill


Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by Bill de hÓra <bi...@dehora.net>.

  east: A B C 

  west: D

Perhaps you are blocking on a write to D - what's your quorum/rf set up
as?


Bill
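
The question about quorum and replication factor can be illustrated with a small model. This is only a sketch: the consistency level names are Cassandra's, but `acks_required`, `write_latency`, and the two latency constants are hypothetical and were not measured on the cluster in this thread. The model assumes the client sees the latency of the k-th fastest replica acknowledgement, where k depends on the consistency level.

```python
def acks_required(consistency_level, replication_factor):
    """Number of replica acknowledgements a write waits for."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unknown consistency level: %s" % consistency_level)


def write_latency(replica_latencies, consistency_level):
    """Latency seen by the client: the time of the k-th fastest ack."""
    needed = acks_required(consistency_level, len(replica_latencies))
    return sorted(replica_latencies)[needed - 1]


# Hypothetical ack times in seconds, chosen only for illustration.
EAST_ACK = 0.002  # replica in the same region as the coordinator
WEST_ACK = 0.080  # replica reached over the cross-region VPN

for level in ("ONE", "QUORUM", "ALL"):
    # Replication factor 2 with one replica in each region.
    print(level, write_latency([EAST_ACK, WEST_ACK], level))
```

With a replication factor of 2, QUORUM needs both replicas, so in this model any write with a replica on D waits for the cross-region round trip unless the consistency level is ONE.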


Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by Joe Stump <jo...@joestump.net>.

On Jul 9, 2010, at 1:16 PM, maneela a wrote:

> Is there any way to mark cassandra node to keep it as just for replication purpose and not to be as Primary for any data range in the ring? 

I believe there is. This is what we're doing, but we do all of our writes via a queue. Derek or Mike from SimpleGeo (both on the list) can probably chime in with a better explanation than I can.

--Joe

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by maneela a <ma...@yahoo.com>.

Thanks for your quick reply, Joe.

I forgot to mention that we are using PropertyFileEndPointSnitch to tell Cassandra about our network topology. Below is the property file used by that class:

cat rack.properties
10.9.0.6=east:r1b
10.9.0.18=east:r1c
10.9.0.14=east:r1d
10.9.0.10=west:r1a
default=east:rdef

At first glance, VPNCubed uses OpenVPN software as part of its transport layer, so I did not look into it much deeper. I will work on it to see if it helps in our setup.

We are looking for something like the Oracle Primary/Standby solution, where write operations happen only on a primary set of nodes (in our case, the nodes running in different AZs in the East region) and one copy of each data block is replicated to the node running in the West region. That way read operations can be served locally in either region, because the applications that consume the Cassandra data run in both East and West. We are inclined to accept write latency between regions because readers won't look for data immediately; there will be a gap of around 5-10 minutes between write and read operations.

Is there any way to mark a Cassandra node as being just for replication, so that it is not primary for any data range in the ring?
Niru
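
The rack.properties entries in this message appear to follow an ip=datacenter:rack layout with a default fallback. A minimal sketch of reading them, assuming that layout, is shown here. It is not the snitch's own parser; `parse_topology` and `locate` are names invented for this example.

```python
from collections import Counter

# Entries as given in the message above.
RACK_PROPERTIES = """\
10.9.0.6=east:r1b
10.9.0.18=east:r1c
10.9.0.14=east:r1d
10.9.0.10=west:r1a
default=east:rdef
"""


def parse_topology(text):
    """Map each key (an IP or 'default') to a (datacenter, rack) tuple."""
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host, _, location = line.partition("=")
        datacenter, _, rack = location.partition(":")
        topology[host.strip()] = (datacenter.strip(), rack.strip())
    return topology


def locate(topology, ip):
    """Look up a node, falling back to the 'default' entry."""
    return topology.get(ip, topology["default"])


topology = parse_topology(RACK_PROPERTIES)
per_datacenter = Counter(dc for host, (dc, rack) in topology.items()
                         if host != "default")
print(dict(per_datacenter))
print(locate(topology, "10.9.0.10"))
```

Read this way, the file describes three nodes in the east datacenter, each on its own rack, and a single node in west.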

--- On Fri, 7/9/10, Joe Stump <jo...@joestump.net> wrote:

From: Joe Stump <jo...@joestump.net>
Subject: Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud
To: user@cassandra.apache.org
Date: Friday, July 9, 2010, 1:41 PM

We had similar issues when we started running Cassandra on EC2 between multiple AZ's (not regions; we're working up to that shortly). We ended up building a rack aware strategy specific to AWS, which is posted somewhere in JIRA. Basically it uses the AWS API to ensure that replicants are stored in each AZ. We then ensure that our clients are only reading from nodes in a given AZ. What I'm guessing is that you're seeing latency issues between regions combined with a higher consistency level than what we use.
Also, SSL tunnels are hard to scale from the management side. The Amazon folks have told us that VPNCubed is a better solution for such things.
--Joe


Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by Joe Stump <jo...@joestump.net>.

We had similar issues when we started running Cassandra on EC2 between multiple AZ's (not regions; we're working up to that shortly). We ended up building a rack aware strategy specific to AWS, which is posted somewhere in JIRA. Basically it uses the AWS API to ensure that replicants are stored in each AZ. We then ensure that our clients are only reading from nodes in a given AZ. What I'm guessing is that you're seeing latency issues between regions combined with a higher consistency level than what we use.

Also, SSL tunnels are hard to scale from the management side. The Amazon folks have told us that VPNCubed is a better solution for such things.

--Joe
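
To make the cross-region effect concrete, here is a simplified model of rack-aware placement: the first replica goes to the primary node for the key, the second to the first node clockwise in a different datacenter, the third to a different rack in the primary's datacenter, and any further replicas to the next unused nodes on the ring. This is a sketch and not Cassandra's code. The node letters follow the east: A B C / west: D picture used elsewhere in this thread, the racks come from the rack.properties file posted in it, and the ring order, the pairing of letters with racks, and the replication factor are assumptions, since none of them are stated.

```python
# (name, datacenter, rack) in an assumed ring order; token assignment and the
# mapping of letters to racks are not given in the thread.
NODES = [
    ("A", "east", "r1b"),
    ("B", "east", "r1c"),
    ("C", "east", "r1d"),
    ("D", "west", "r1a"),
]


def rack_aware_replicas(ring, primary_index, replication_factor):
    """Simplified rack-aware placement for the key range owned by one node."""
    count = len(ring)
    primary = ring[primary_index]
    # Remaining nodes in clockwise order starting after the primary.
    clockwise = [ring[(primary_index + i) % count] for i in range(1, count)]
    replicas = [primary]

    # Second replica: first node found in a different datacenter.
    if len(replicas) < replication_factor:
        for node in clockwise:
            if node[1] != primary[1]:
                replicas.append(node)
                break

    # Third replica: different rack in the primary's datacenter.
    if len(replicas) < replication_factor:
        for node in clockwise:
            if (node not in replicas and node[1] == primary[1]
                    and node[2] != primary[2]):
                replicas.append(node)
                break

    # Any remaining replicas: next unused nodes on the ring.
    for node in clockwise:
        if len(replicas) >= replication_factor:
            break
        if node not in replicas:
            replicas.append(node)
    return replicas


for index, node in enumerate(NODES):
    names = [n[0] for n in rack_aware_replicas(NODES, index, 2)]
    print("primary %s -> replicas %s" % (node[0], names))
```

In this model, with three nodes in one datacenter and one in the other, node D ends up in the replica set of every range, so every write has a copy that must cross the inter-region link.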



Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

Posted by Dave Viner <da...@pobox.com>.

Hi,

Can you post the stress test code and storage.conf used?

I have a cluster in EC2 using RackAware.  However, I am in one region
(us-east-1) but two Availability Zones.  Amazon helps ensure that AZs are
isolated from each other, which makes for a failure-resistant cluster, while
staying in the same region allows for higher throughput.

Dave Viner

