Posted to user@cassandra.apache.org by Jason Axelson <ja...@engagestage.com> on 2012/08/24 12:25:48 UTC

two-node cassandra cluster

Hi, I have an application that will be very dormant most of the time
but will need to handle heavy bursts a few days out of the month. Since we are
deploying on EC2 I would like to keep only one Cassandra server up
most of the time and then on burst days I want to bring one more
server up (with more RAM and CPU than the first) to help serve the
load. What is the best way to do this? Should I take a different
approach?

Some notes about what I plan to do:
* Bring the node up and repair it immediately
* After the burst time is over decommission the powerful node
* Use the always-on server as the seed node
* My main question is how to get the nodes to share all the data since
I want a replication factor of 2 (so both nodes have all the data) but
that won't work while there is only one server. Should I bring up 2
extra servers instead of just one?

Thanks,
Jason

Re: two-node cassandra cluster

Posted by Franc Carter <fr...@sirca.org.au>.
Caveat: I haven't tried what I am about to suggest

Could you run the cluster on smaller instances most of the time and
then, when you need more performance, increase the instance size to get more
CPU/memory? If you use EBS with provisioned IOPS you should be able to make
the transition reasonably quickly.

cheers

-- 

*Franc Carter* | Systems architect | Sirca Ltd

franc.carter@sirca.org.au | www.sirca.org.au

Tel: +61 2 9236 9118

Level 9, 80 Clarence St, Sydney NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215

Re: two-node cassandra cluster

Posted by aaron morton <aa...@thelastpickle.com>.
> most of the time and then on burst days I want to bring one more
> server up (with more RAM and CPU than the first) to help serve the
> load.
Unless you are using virtual nodes (coming in 1.2) and a higher RF, I would recommend using machines that have the same HW spec. Otherwise you need to capacity plan against a combined machine built from the lowest-spec components of all the machines. If you are scaling for throughput you will need to keep this in mind. With 2 nodes and RF 2, both nodes have to do the same amount of work.
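
To put rough numbers on the point about RF and mismatched hardware, here is a small sketch. It assumes balanced tokens and that each write is applied on every replica; the function names and the ops/sec figures are invented for illustration and are not measurements.

```python
from typing import Sequence


def data_fraction_per_node(node_count: int, replication_factor: int) -> float:
    """Share of the full data set each node stores, assuming balanced tokens."""
    # A node can never hold more than one full copy of the data.
    return min(1.0, replication_factor / node_count)


def write_ceiling(node_capacities: Sequence[float], replication_factor: int) -> float:
    """Crude cluster-wide write ceiling (ops/sec) under the assumptions above."""
    nodes = len(node_capacities)
    replicas = min(replication_factor, nodes)
    # Every write lands on `replicas` nodes, so each node sees replicas/nodes
    # of all writes, and the slowest node saturates first.
    return min(node_capacities) * nodes / replicas


# Hypothetical capacities: a small always-on node and a larger burst node.
print(data_fraction_per_node(2, 2))         # share of data held by each node
print(write_ceiling([1000.0, 4000.0], 2))   # ceiling set by the smaller node
```

Under this model the extra RAM and CPU on the second node do not raise the write ceiling while RF equals the node count, which is the reason for matching the hardware.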

In _general_ the assumption is that cluster membership is reasonably stable. Routinely scaling up and down will be fighting things a little. 
 
To bring the node up I would:
* start it with auto_bootstrap off. 
* copy over all the data from the first node
* run repair. 
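
The three steps above could be written down as a dry-run checklist like the sketch below. It only builds strings and executes nothing. The host names, the data directory, and the use of rsync for the copy are assumptions for illustration; `auto_bootstrap` is the cassandra.yaml setting and `nodetool -h <host> repair` is the standard repair command.

```python
from typing import List, Tuple


def bring_up_steps(seed_host: str, new_host: str, data_dir: str) -> List[Tuple[str, str]]:
    """Return (label, action) pairs mirroring the list above; nothing is run."""
    return [
        # Set in cassandra.yaml on the new node so it does not bootstrap.
        ("config", f"{new_host}: auto_bootstrap: false"),
        # Copy tool and data path are assumptions; adjust to your setup.
        ("copy", f"rsync -a {seed_host}:{data_dir}/ {data_dir}/"),
        # Repair the new node once it holds the copied data.
        ("repair", f"nodetool -h {new_host} repair"),
    ]


# Hypothetical addresses and the common default data directory.
for label, action in bring_up_steps("10.0.0.1", "10.0.0.2", "/var/lib/cassandra/data"):
    print(f"{label:>6}: {action}")
```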

To decommission I would:
* run repair on the always-on node.
* turn off the additional node
* run nodetool removetoken on the always-on node to remove the additional node. 
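
A similar dry-run sketch for the removal steps above follows. Again nothing is executed. The host name and token value are placeholders; the token of the additional node would be read from `nodetool ring` output, and how the node is stopped depends on your environment.

```python
from typing import List


def removal_steps(always_on_host: str, extra_node_token: str) -> List[str]:
    """Return the removal steps as strings, in the order given above."""
    return [
        # Make sure the always-on node has all the data before the other one goes.
        f"nodetool -h {always_on_host} repair",
        # Environment-specific: stop the process or terminate the instance.
        "stop Cassandra on the additional node",
        # Tell the remaining node to drop the dead node's token from the ring.
        f"nodetool -h {always_on_host} removetoken {extra_node_token}",
    ]


# Hypothetical host and token, for illustration only.
for step in removal_steps("10.0.0.1", "85070591730234615865843651857942052864"):
    print(step)
```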

If you want to go down this path make sure you do a lot of testing to get the process ironed out. I would be thinking about: 

Does the cluster have to be continuously available?
How much data are we talking about? How long will it take to transfer?
What will the network latency be like between the nodes? Latency between new nodes can be a lottery.
If you are storing the data on a single node that uses RAID 0, how will you handle disk failure?
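
For the transfer-time question, a back-of-the-envelope estimate is enough to start with. The sketch below assumes a steady sustained throughput, which real EC2 links may not give you; the data size and throughput are made-up inputs.

```python
def transfer_time_seconds(data_gib: float, throughput_mib_per_s: float) -> float:
    """Estimated seconds to copy data_gib at a steady throughput_mib_per_s."""
    if throughput_mib_per_s <= 0:
        raise ValueError("throughput must be positive")
    # 1 GiB = 1024 MiB
    return data_gib * 1024 / throughput_mib_per_s


# Hypothetical figures: 100 GiB of data at a sustained 50 MiB/s.
seconds = transfer_time_seconds(100, 50)
print(f"about {seconds / 60:.0f} minutes")
```

Measuring the real throughput between the two instances during testing would replace the guessed figure.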

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
