Posted to user@cassandra.apache.org by Subscriber <su...@zfabrik.de> on 2011/07/04 11:54:20 UTC

How to scale Cassandra?

Hi there, 

I have read a lot about Cassandra's high-scalability features: seamless addition of nodes, no downtime, etc.
But I wonder how one does this in practice in an operational system.

In the system we're going to implement, we're expecting a huge number of writes with uniformly distributed keys
(the keys are given and cannot be generated). That means using RandomPartitioner will (more or less) result in
the same work-load per node as an OrderPreservingPartitioner - right?

But how do you scale a (more or less) balanced Cassandra cluster? I think that in the end
you always have to double the number of nodes (adding just a handful of nodes relieves only the split regions;
the work-load of the untouched regions keeps growing at the same rate).
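
To make that concrete, here is a rough sketch of the token arithmetic
(Python; it assumes RandomPartitioner's 0..2**127 token space, which is my
understanding of the ring - corrections welcome):

    # Balanced tokens on RandomPartitioner's 0..2**127 ring.
    RING = 2 ** 127

    def balanced_tokens(n):
        """Ideal tokens for n evenly spaced nodes."""
        return [i * RING // n for i in range(n)]

    old = balanced_tokens(4)    # 4 nodes, each owning 1/4 of the ring

    # Adding a single node only bisects ONE range; the other three nodes
    # keep their full 1/4 shares, so their load keeps growing unchanged.
    one_more = sorted(old + [(old[0] + old[1]) // 2])

    # To halve the load of EVERY node, you have to bisect every range,
    # i.e. double the node count.
    doubled = balanced_tokens(8)
    assert set(old) <= set(doubled)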

This seems to be OK for small clusters. But what do you do when you have several hundred nodes in your cluster?
It seems to me that a balanced cluster is a blessing for performance but a curse for scalability...

What are the alternatives? One could redistribute the token ranges, but this would cause
downtime (AFAIK); not an option!

Is there anything I didn't understand, or am I missing something else? Is the only remaining strategy to make sure
that the cluster grows unbalanced, so one can add nodes at the hotspots? However, in this case you have to make sure
that this strategy keeps working. Could be too optimistic...

Best Regards
Udo

Re: How to scale Cassandra?

Posted by Paul Loy <ke...@gmail.com>.
Do you mean the ring does not change until the move has completed?

On Mon, Jul 4, 2011 at 4:49 PM, Edward Capriolo <ed...@gmail.com> wrote:

> No. If you are using nodetool move (or any of the nodetool operations),
> quorum and replication factor are properly maintained.


-- 
---------------------------------------------
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy

Re: How to scale Cassandra?

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Jul 4, 2011 at 10:21 AM, Paul Loy <ke...@gmail.com> wrote:

> Well, by issuing a nodetool move when a node is under high load, you
> basically make that node unresponsive. That's fine, but a nodetool move on
> one node also means that that node's replica data needs to move around the
> ring and possibly some replica data from the next (or previous) node in the
> ring. So how does this affect other nodes wrt RF and quorum? Will quorum
> fail until the replicas have moved also?

No. If you are using nodetool move (or any of the nodetool operations),
quorum and replication factor are properly maintained.

Re: How to scale Cassandra?

Posted by Sebastien Coutu <sc...@openplaces.org>.
Hi Priyanka,

We're also using Hadoop, on a separate pair of 1TB SATA disks in RAID0 on
the RAID card.

Regards,

SC

On Tue, Jul 5, 2011 at 10:39 AM, Priyanka <pr...@gmail.com> wrote:

> Thanks a lot Sebastien.
> Did you use Hadoop MapReduce or bulk-loading techniques for loading the data?

Re: How to scale Cassandra?

Posted by Priyanka <pr...@gmail.com>.
Thanks a lot Sebastien.
Did you use Hadoop MapReduce or bulk-loading techniques for loading the data?

regards,
Priyanka



Re: How to scale Cassandra?

Posted by Sebastien Coutu <sc...@openplaces.org>.
Hi,

Our hardware configuration is very simple:

1 x 8 Core Processor
32 GB of memory
1 x 250GB SATA disk for the OS/swap, plugged into the motherboard
1 x 250GB SATA disk for Cassandra's commit log, plugged into the motherboard
1 x RAID card
2 x 1TB SATA disks for Cassandra in RAID0, attached to the RAID card

We're always evolving our system standards, but this handles the load. Our
idea is to build these servers with "commodity" hardware, so nothing fancy
here.

Hope this helps.

SC

On Tue, Jul 5, 2011 at 9:46 AM, Priyanka <pr...@gmail.com> wrote:

> I am trying to load around 50GB of data on a single node. Could you please
> let me know the hardware configuration of your system?

Re: How to scale Cassandra?

Posted by Priyanka <pr...@gmail.com>.
Hello Sebastien,

I am trying to load around 50GB of data on a single node, but I am facing a
lot of issues doing so.
In your previous post you mentioned that you were able to handle 1TB per
node. Could you please let me know the hardware configuration of your
system?

Thanks,
Priyanka


Re: How to scale Cassandra?

Posted by ZFabrik Subscriber <su...@zfabrik.de>.
Hi SC, 

I'm just talking about workload in general. The point is that sooner or later you reach the point where you need to scale out. And the question is: what's the best strategy here, especially when your cluster is almost balanced?

500 GB seems to be a good ballpark figure; I think I read this number somewhere else recently.

Regards
Udo

On 04.07.2011, at 21:10, Sebastien Coutu wrote:

> On our cluster, we currently have more than 1TB per node and it holds, but
> we find that our sweet spot should be around 400-500GB per node.


Re: How to scale Cassandra?

Posted by Sebastien Coutu <sc...@openplaces.org>.
Hi Udo,

I didn't read the whole thread, but can you define the type of workload
you're looking at? Do you have jobs that require reading all the data stored
in your database - for example, one big column family that needs to be read
entirely by a job? The time required to read a whole 1TB disk (SATA II)
sequentially is roughly 2-2.5 hours. RAID changes that number, but your
bottleneck will pretty much always be your disks. On our cluster, we
currently have more than 1TB per node and it holds, but we find that our
sweet spot should be around 400-500GB per node.
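
That figure is just sequential-throughput arithmetic. As a quick sanity
check (Python; the 110-140 MB/s sustained rates are my assumption for a
2011-era SATA II disk, not a measured number):

    # Back-of-the-envelope: time to stream a full 1TB disk sequentially.
    disk_bytes = 1e12                    # 1 TB
    for mb_per_s in (110, 140):          # assumed sustained SATA II rates
        hours = disk_bytes / (mb_per_s * 1e6) / 3600
        print("%3d MB/s -> %.1f hours" % (mb_per_s, hours))
    # -> 110 MB/s ~ 2.5 hours, 140 MB/s ~ 2.0 hours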

Regards,

SC


On Mon, Jul 4, 2011 at 3:01 PM, ZFabrik Subscriber <su...@zfabrik.de> wrote:

> Let's assume you have 50 nodes and their work-load grows simultaneously.
> You discover that the nodes are about to reach their limits (btw. what is
> the actual limit of a Cassandra node? 100GB? 500GB? 1TB?)
>

Re: How to scale Cassandra?

Posted by ZFabrik Subscriber <su...@zfabrik.de>.
Let's assume you have 50 nodes and their work-load grows simultaneously. You discover that the nodes are about to reach their limits (btw. what is the actual limit of a Cassandra node? 100GB? 500GB? 1TB?)
You decide to add another 50 nodes. Do you do this in one step? Or one after the other? Or in several rounds, adding at every RF-th position?
Or you add 20 nodes and move the token ranges. Again, in one step? 20 steps? 4 rounds of 5 nodes each?
This could take a while (days, if not weeks) in larger clusters!
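
For the pure doubling case, at least, the token arithmetic seems scriptable:
each new node bootstraps with an initial_token at the midpoint of an existing
range, so no existing node has to move at all (a sketch of my understanding,
Python, RandomPartitioner's 0..2**127 token space assumed):

    # Sketch: doubling a balanced ring by bootstrapping the new nodes at
    # the midpoints of the existing ranges; existing nodes keep their
    # tokens and only stream out half of their data.
    RING = 2 ** 127

    def doubling_tokens(n_old):
        old = [i * RING // n_old for i in range(n_old)]
        new = []
        for i, t in enumerate(old):
            nxt = old[i + 1] if i + 1 < n_old else RING  # wrap around
            new.append((t + nxt) // 2)                   # midpoint of range i
        return new

    # 50 -> 100 nodes: one initial_token per new node.
    for token in doubling_tokens(50):
        print(token)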

Does anybody have experience with real-life scale-outs?

Regards
Udo


Re: How to scale Cassandra?

Posted by Paul Loy <ke...@gmail.com>.
Well, by issuing a nodetool move when a node is under high load, you
basically make that node unresponsive. That's fine, but a nodetool move on
one node also means that that node's replica data needs to move around the
ring and possibly some replica data from the next (or previous) node in the
ring. So how does this affect other nodes wrt RF and quorum? Will quorum
fail until the replicas have moved also?

On Mon, Jul 4, 2011 at 3:08 PM, Dan Hendry <da...@gmail.com> wrote:

> Moving nodes does not result in downtime provided you use proper replication
> factors and read/write consistencies. The typical recommendation is RF=3 and
> QUORUM reads/writes.



-- 
---------------------------------------------
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy

RE: How to scale Cassandra?

Posted by Dan Hendry <da...@gmail.com>.
Moving nodes does not result in downtime provided you use proper replication
factors and read/write consistencies. The typical recommendation is RF=3 and
QUORUM reads/writes.
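
For reference, the arithmetic behind that recommendation (a quick sketch in
Python; this is just the standard consistency-level math):

    # QUORUM requires floor(RF/2) + 1 replicas to respond.
    def quorum(rf):
        return rf // 2 + 1

    rf = 3
    q = quorum(rf)          # 2 of 3 replicas
    # One replica can be down (or busy streaming during a move) and QUORUM
    # reads/writes still succeed; and since q + q > rf, every quorum read
    # overlaps the most recent quorum write.
    assert q + q > rf
    print("RF=%d: quorum=%d, tolerates %d replica down" % (rf, q, rf - q))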

Dan


Re: How to scale Cassandra?

Posted by Paul Loy <ke...@gmail.com>.
That's basically how I understand it.

However, I think it gets better with larger clusters as the proportion of
the ring you move around at any time is much lower.

On Mon, Jul 4, 2011 at 10:54 AM, Subscriber <su...@zfabrik.de> wrote:

> It seems to me that a balanced cluster is a blessing for performance but a
> curse for scalability...




-- 
---------------------------------------------
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy