Posted to user@cassandra.apache.org by Deno Vichas <de...@syncopated.net> on 2012/03/26 21:12:52 UTC

multi region EC2

all,

we're just about ready to push our app live and just have some cassandra
tuning left.  i've been running a 4 node cluster (rep factor 3,
simple) in EC2 using the datastax AMIs (thanks datastax).  so after
reading through a bunch of docs i have a few questions.

- what is the min and recommended number of nodes to use in a multiple
region cluster?  we only have a single app server right now.
- can i migrate the replication strategy one node at a time or do i need
to shut down the whole cluster to do this?
- what type of performance hit am i going to take having my app server
cross regions to get to a node?  coming from the SQL world, this is
usually not a good thing.

if i was to stick with a single region are there any best practices for
backing up a whole cluster?  from the docs it looks like i need to
snapshot each node one by one and then copy the snapshot off to
somewhere offsite.


thanks,
deno

Re: multi region EC2

Posted by Rob Coli <rc...@palominodb.com>.
On Mon, Mar 26, 2012 at 3:31 PM, Deno Vichas wrote:
> but what if i already have a bunch (8g per node) data that i need and i
> don't have a way to re-create it.

Note that the below may have unintended consequences if using Counter
column families. It can actually be done with the cluster running, but
below is the least tricky version of this process.

a) stop writing to your cluster
b) do a major compaction and then stop the cluster
c) ensure globally unique filenames for all sstable files for all cfs
for all nodes
d) copy all sstables to all new nodes
e) start cluster, join new nodes, run cleanup compactions
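
For step c), a minimal sketch of how one might check for clashing sstable
filenames before copying anything. It assumes you have already collected a
directory listing per node; the node names and filenames below are made up
for illustration, and the helper is hypothetical, not part of Cassandra.

```python
from collections import defaultdict


def find_filename_collisions(listings):
    """Return {filename: [nodes]} for every filename present on 2+ nodes.

    `listings` is a hypothetical dict of node name -> iterable of sstable
    filenames taken from one column family directory on that node.
    """
    seen = defaultdict(set)
    for node, filenames in listings.items():
        for name in filenames:
            seen[name].add(node)
    return {name: sorted(nodes) for name, nodes in seen.items() if len(nodes) > 1}


# Illustrative listings only; real names depend on your keyspace and version.
listings = {
    "node1": ["Users-hc-1-Data.db", "Users-hc-2-Data.db"],
    "node2": ["Users-hc-1-Data.db", "Users-hc-7-Data.db"],
}
collisions = find_filename_collisions(listings)
print(collisions)
```

Any filename reported here would overwrite another node's file if copied
as-is, so it needs renaming first (all components of the same sstable
together).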

=Rob

-- 
=Robert Coli
AIM&GTALK - rcoli@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb

Re: multi region EC2

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
I've switched from SS to NTS on 1.0.x on a single-az cluster with RF3 (which obviously created a single-dc, single-rack NTS cluster). Worked without a hitch. Also switched from SimpleSnitch to Ec2Snitch on-the-fly. I had about 12GB of data per node.

Of course, your mileage may vary, so while I can report that it has been done successfully, I'd still recommend testing it out first...

/Janne

On Mar 31, 2012, at 22:45 , aaron morton wrote:

> I'm kind of guessing here because it's not something I've done before. Obviously test things first…
> 
> The NTS with a single DC and a single Rack will place data in the same location as the Simple Strategy. You *should* be able to change the replication strategy from, say, SS with RF 3 to NTS with RF 3 in a single DC with a single Rack. 
> 
> I think you can also migrate to using multiple racks under NTS, but I would need to double check the code. 
> 
> Cheers
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 27/03/2012, at 11:31 AM, Deno Vichas wrote:
> 
>> On 3/26/2012 2:15 PM, aaron morton wrote:
>>>> - can i migrate the replication strategy one node at a time or do i need to shut down the whole cluster to do this?
>>> Just use the NTS from the start.
>> but what if i already have a bunch (8g per node) of data that i need and i don't have a way to re-create it.
>> 
>> 
>> thanks,
>> deno
> 


Re: multi region EC2

Posted by aaron morton <aa...@thelastpickle.com>.
I'm kind of guessing here because it's not something I've done before. Obviously test things first…

The NTS with a single DC and a single Rack will place data in the same location as the Simple Strategy. You *should* be able to change the replication strategy from, say, SS with RF 3 to NTS with RF 3 in a single DC with a single Rack. 
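
To illustrate the point about placement, here is a rough sketch that models
both strategies on a toy ring. It is a simplified model written for this
thread, not the actual Cassandra code: the tokens, node names and rack name
are invented, and the NTS function only covers the single-DC case.

```python
from bisect import bisect_left


def ring_walk(ring, key_token):
    """Nodes in clockwise order, starting at the first token >= key_token."""
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(len(ring))]


def simple_strategy(ring, key_token, rf):
    """Take the next `rf` nodes clockwise on the ring."""
    return ring_walk(ring, key_token)[:rf]


def network_topology_single_dc(ring, key_token, rf, rack_of):
    """Simplified single-DC NTS: prefer unseen racks, then fill in ring order."""
    walk = ring_walk(ring, key_token)
    replicas, racks_seen = [], set()
    for node in walk:
        if len(replicas) < rf and rack_of[node] not in racks_seen:
            replicas.append(node)
            racks_seen.add(rack_of[node])
    for node in walk:
        if len(replicas) < rf and node not in replicas:
            replicas.append(node)
    return replicas


# Hypothetical 4 node ring, everything in one rack.
ring = [(0, "a"), (25, "b"), (50, "c"), (75, "d")]
rack_of = {node: "rack1" for _, node in ring}

same_placement = all(
    simple_strategy(ring, key, 3) == network_topology_single_dc(ring, key, 3, rack_of)
    for key in range(100)
)
print(same_placement)
```

With a single rack the rack-aware pass only ever picks the first node, and
the fill pass then adds the rest in ring order, which is why the two come out
the same in this model.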

I think you can also migrate to using multiple racks under NTS, but I would need to double check the code. 

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/03/2012, at 11:31 AM, Deno Vichas wrote:

> On 3/26/2012 2:15 PM, aaron morton wrote:
>>> - can i migrate the replication strategy one node at a time or do i need to shut down the whole cluster to do this?
>> Just use the NTS from the start.
> but what if i already have a bunch (8g per node) of data that i need and i don't have a way to re-create it.
> 
> 
> thanks,
> deno


Re: multi region EC2

Posted by Deno Vichas <de...@syncopated.net>.
On 3/26/2012 2:15 PM, aaron morton wrote:
>> - can i migrate the replication strategy one node at a time or do i need to shut down the whole cluster to do this?
> Just use the NTS from the start.
but what if i already have a bunch (8g per node) of data that i need and i
don't have a way to re-create it.


thanks,
deno

Re: multi region EC2

Posted by aaron morton <aa...@thelastpickle.com>.
> (rep factor 3, simple)
If this means you are using the SimpleStrategy, I would recommend using the NetworkTopologyStrategy instead.

> - what is the min and recommended number of nodes to use in a multiple region cluster?  we only have a single app server right now.
It depends on how exciting you want your life to be. 
You probably want at least 3 nodes in each Cassandra DC / EC2 region. These could be spread across 3 AZs in an EC2 region.

Some background on availability http://thelastpickle.com/2011/06/13/Down-For-Me/
 
> - can i migrate the replication strategy one node at a time or do i need to shut down the whole cluster to do this?
Just use the NTS from the start. 

> - what type of performance hit am i going to take having my app server cross regions to get to a node?  coming from the SQL world, this is usually not a good thing.
You will need to do some tests. Using LOCAL_QUORUM requests will only block on nodes in the local DC. 
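
As a rough illustration of why LOCAL_QUORUM helps here, the sketch below
counts how many replicas a request has to hear back from at each level. The
DC names and replication factors are assumptions for the example, and the
function is a toy model, not a client API.

```python
def quorum(rf):
    """A quorum is a strict majority of `rf` replicas."""
    return rf // 2 + 1


def replicas_to_wait_for(consistency_level, rf_by_dc, local_dc):
    """How many replica responses a request blocks on (toy model)."""
    if consistency_level == "QUORUM":
        # Majority of all replicas across every DC.
        return quorum(sum(rf_by_dc.values()))
    if consistency_level == "LOCAL_QUORUM":
        # Majority of the replicas in the coordinator's own DC only.
        return quorum(rf_by_dc[local_dc])
    raise ValueError("unsupported consistency level: %s" % consistency_level)


# Hypothetical two region setup with RF 3 in each DC.
rf_by_dc = {"us-east": 3, "us-west": 3}
print(replicas_to_wait_for("QUORUM", rf_by_dc, "us-east"))
print(replicas_to_wait_for("LOCAL_QUORUM", rf_by_dc, "us-east"))
```

In this setup QUORUM needs 4 of the 6 replicas, so at least one response
has to cross regions, while LOCAL_QUORUM needs only 2 of the 3 local ones.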

> if i was to stick with a single region are there any best practices for backing up a whole cluster?  from the docs it looks like i need to snapshot each node one by one and then copy the snapshot off to somewhere offsite.
yes. 
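
A minimal sketch of the node by node part, assuming nodetool is on the path
of whatever machine drives the backup. The host list and keyspace name are
placeholders, and the function only builds the commands; running them and
copying the snapshot directories offsite is left out.

```python
def snapshot_commands(hosts, keyspace):
    """Build one `nodetool snapshot` command per node (nothing is executed)."""
    return [["nodetool", "-h", host, "snapshot", keyspace] for host in hosts]


# Placeholder hosts and keyspace, for illustration only.
hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
commands = snapshot_commands(hosts, "MyKeyspace")
for command in commands:
    print(" ".join(command))
```

Each command could then be passed to subprocess.check_call, one node at a
time, before shipping the resulting snapshot files somewhere offsite.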

Good luck. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
