Posted to user@cassandra.apache.org by Filippo Diotalevi <fi...@ntoklo.com> on 2012/06/07 14:55:25 UTC

Maximum load per node

Hi, 
one of Aaron's latest observations about the max load per Cassandra node caught my attention:
> At ~840GB I'm probably running close
> to the max load I should have on a node,
[AM] roughly 300GB to 400GB is the max load
Since we currently have a Cassandra node with roughly 330GB of data, it looks like this is a good time for us to really understand what that limit is in our case. Also, a (maybe old) Stack Overflow question at http://stackoverflow.com/questions/4775388/how-much-data-per-node-in-cassandra-cluster seems to suggest a higher limit per node.

Just considering the compaction issues, what are the factors we need to account for to determine the max load?

* disk space
The DataStax Cassandra docs state (pg 97) that a major compaction "temporarily doubles disk space usage". Is it a safe estimate to say that the Cassandra machine needs to have roughly the same amount of free disk space as the current load of the Cassandra node, or are there other factors to consider? (I've sketched the naive check I have in mind after these questions.)

* RAM
Does the amount of RAM in the machine (or dedicated to the Cassandra node) affect the speed/efficiency of the compaction process in any way?

* Performance degradation for overloaded nodes?
What kind of performance degradation can we expect for a Cassandra node which is "overloaded" (e.g. with 500GB or more of data)?
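
For reference, below is the naive check I have in mind for the disk space point. It's only a sketch that takes the "temporarily doubles disk space usage" note at face value, and the free-space figure is just an example.

# Naive headroom check for a major compaction, assuming (per the DataStax
# docs quote above) the node may temporarily need about as much free space
# as the data it is compacting. Figures are examples, not measurements.
current_load_gb = 330     # roughly what we have on the node today
free_disk_gb = 500        # free space on the data volume (example figure)

headroom_needed_gb = current_load_gb   # worst case: old and new sstables coexist
print(free_disk_gb >= headroom_needed_gb)   # True -> probably enough headroom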


Thanks for the clarifications,
-- 
Filippo Diotalevi



Re: Maximum load per node

Posted by aaron morton <aa...@thelastpickle.com>.
It's not a hard rule; you can put more data on a node. The 300GB to 400GB idea is mostly concerned with operations; you may want to put less on a node due to higher throughput demands. 

(We are talking about the amount of data on a node, regardless of the RF). 
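
As a rough illustration of that point (the numbers below are made up, and assume evenly balanced tokens): the guideline applies to the physical load each node reports, which already includes replicas, not to that figure multiplied by the RF again.

# Per-node load already includes replicas, so the 300GB to 400GB guideline
# applies to the "Load" figure nodetool reports, not to unique data / RF.
unique_data_gb = 400      # total unique (pre-replication) data, example figure
rf = 3                    # replication factor
nodes = 3                 # cluster size

total_on_disk_gb = unique_data_gb * rf          # data stored across the cluster
per_node_load_gb = total_on_disk_gb / nodes     # ~400GB on each node here
print(per_node_load_gb)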

On the operations side the considerations are:

* If you want to move the node to a new host, moving 400GB at 35MB/sec takes about 3 to 4 hours (this is the speed I recently got for moving 500GB on AWS in the same AZ).

* Repair will need to process all of the data. Assuming the bottleneck is not the CPU, and there are no other background processes running, it will take about 7 hours to read 400GB at the default 16MB/sec (compaction_throughput_mb_per_sec); see the rough numbers sketched after this list.

* There are also some throughput considerations for compaction:

* Major compaction compacts all the sstables and assumes it needs that much space again to write the new file. We normally don't want to do major compactions, though. 

* If you are in a situation where you have lost redundancy for all or part of the key ring, you will want to get new nodes online ASAP. Taking several hours to bring new nodes on may not be acceptable. 

* The more data on disk, the more memory needed. The memory is taken up by bloom filters and index sampling. These can be tuned to reduce the memory footprint, with a potential reduction in read speed (rough sizing is included in the sketch after this list). 

* Using compression helps reduce the on-disk size, and makes some things run faster. My experience is that repair and compaction will still take a while, as they deal with the uncompressed data. 

* Startup time for index sampling is/was an issue (it's faster in 1.1). If the node has more memory and more disk, the time to get the page cache hot will increase.

* As the amount of data per node goes up, potentially so does the working set of hot data. If the memory per node available for the page cache remains the same, latency will potentially increase, e.g. 3 nodes with 800GB each have less memory for the hot set than 6 nodes with 400GB each.
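
To put rough numbers on the move, repair, and memory points above, here is a back-of-envelope sketch. The throughput figures are the ones mentioned above; the Bloom filter maths is the standard sizing formula, and the row count, false positive chance, and per-sample overhead are guessed examples, so treat the memory figures as order-of-magnitude only.

import math

load_gb = 400
load_mb = load_gb * 1024
move_mb_per_sec = 35          # streaming speed observed on AWS, same AZ (above)
compaction_mb_per_sec = 16    # default compaction_throughput_mb_per_sec

# Time to stream the node's data to a new host, and to read it all for
# repair validation at the compaction throughput cap.
move_hours = load_mb / move_mb_per_sec / 3600.0              # ~3.3 hours
repair_read_hours = load_mb / compaction_mb_per_sec / 3600.0 # ~7.1 hours

# Bloom filter memory, using the standard sizing formula
# m = -n * ln(p) / (ln 2)^2 bits for n keys at false positive chance p.
rows = 10 ** 9                # example row count on the node
fp_chance = 0.01              # example false positive chance
bloom_gb = (-rows * math.log(fp_chance) / (math.log(2) ** 2)) / 8 / 1024 ** 3   # ~1.1GB

# Index sampling: one in-memory sample per index_interval rows (default 128);
# ~50 bytes per sample is a guess covering key, position, and object overhead.
index_interval = 128
bytes_per_sample = 50
index_sample_mb = rows / index_interval * bytes_per_sample / 1024.0 ** 2        # ~370MB

print(move_hours, repair_read_hours, bloom_gb, index_sample_mb)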

It's just a rule of thumb to avoid getting into trouble, where trouble is often "help, something went wrong and it takes ages to fix" or "why does X take forever" or "why does it use Y amount of memory". If you are aware of the issues, there is essentially no upper limit on how much data you can put on a node. 

Hope that helps. 
Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/06/2012, at 12:59 AM, Ben Kaehne wrote:

> Does this "max load" have correlation to replication factor?
> 
> IE a 3 node cluster with rf of 3. Should i be worried at {max load} X 3 or what people generally mention the max load is?
> 
> -- 
> -Ben


Re: Maximum load per node

Posted by Ben Kaehne <be...@sirca.org.au>.
Does this "max load" have correlation to replication factor?

IE a 3 node cluster with rf of 3. Should i be worried at {max load} X 3 or
what people generally mention the max load is?


-- 
-Ben