Posted to user@cassandra.apache.org by Hefeng Yuan <hf...@rhapsody.com> on 2011/09/06 21:53:19 UTC

Calculate number of nodes required based on data

Hi,

Is there any suggested way of calculating the number of nodes needed based on data?

We currently have 6 nodes (each with 8G memory) with RF 5 (because we want to be able to survive the loss of 2 nodes).
The memtable flush happens around every 30 min (while not doing compaction), with ~9m serialized bytes.
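As a quick check of that replication arithmetic (a sketch only; it assumes CL QUORUM for reads and writes, which is what we use):

# RF 5 at CL QUORUM: 3 replicas must respond, so up to 2 replicas can be down.
RF = 5
quorum = RF // 2 + 1                                   # floor(5/2) + 1 = 3
print("quorum = %d, can lose %d replicas" % (quorum, RF - quorum))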

The problem is that we see more than 3 nodes doing compaction at the same time, which slows down the application.
(We tried increasing/decreasing compaction_throughput_mb_per_sec; it didn't help much.)

So I'm thinking we should probably add more nodes, but I'm not sure how many more to add.
Based on the data rate, is there any suggested way of calculating the number of nodes required?

Thanks,
Hefeng

Re: Calculate number of nodes required based on data

Posted by Hefeng Yuan <hf...@rhapsody.com>.
Adi, just to make sure my calculation is correct: the configured ops threshold is ~2m and we have 6 nodes, so does that mean each node's threshold is around 300k? I do see that when flushing happens, ops is about 300k, with several at 500k. It seems like the ops threshold is throttling us.

On Sep 7, 2011, at 11:31 AM, Adi wrote:

> On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:
> We didn't change MemtableThroughputInMB/min/maxCompactionThreshold, they're 499/4/32.
> As for why we're flushing at ~9m, I guess it has to do with this: http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
> The only parameter I tried to play with is compaction_throughput_mb_per_sec; I tried cutting it in half and doubling it, and neither helped avoid the simultaneous compactions on nodes.
> 
> I agree that we don't necessarily need to add nodes, as long as we have a way to avoid simultaneous compaction on 4+ nodes.
> 
> Thanks,
> Hefeng
> 
> 
> 
> Can you check in the logs for something like this 
> ...... Memtable.java (line 157) Writing Memtable-<ColumnFamilyName>@1151031968(67138588 bytes, 47430 operations)
> to see the bytes/operations at which the column family gets flushed. In case you are hitting the operations threshold, you can try increasing it to a higher number. The operations threshold is getting hit at less than 2% of the size threshold. I would try bumping up memtable_operations substantially. The default is 1.1624999999999999 (in millions). Try 10 or 20 and see if your CF flushes at a higher size. Keep adjusting it until the frequency/size of flushing becomes satisfactory and hopefully reduces the compaction overhead.
> 
> -Adi
> 
> 
> 
> 
> 
>  
> On Sep 7, 2011, at 10:51 AM, Adi wrote:
> 
>> 
>> On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:
>> Adi,
>> 
>> The reason we're attempting to add more nodes is to solve the long/simultaneous compactions, i.e. the performance issue, not the storage issue yet.
>> We have RF 5 and CL QUORUM for reads and writes, and currently 6 nodes; when 4 nodes are doing compaction in the same period, we're screwed, especially on reads, since a quorum read will hit one of the compacting nodes anyway.
>> My assumption is that if we add more nodes, each node will have less load and therefore need less compaction, and will probably compact faster, eventually avoiding 4+ nodes doing compaction simultaneously.
>> 
>> Any suggestion on how to calculate how many more nodes to add? Or, generally how to plan for number of nodes required, from a performance perspective?
>> 
>> Thanks,
>> Hefeng
>> 
>> 
>> 
>> Adding nodes to delay and reduce compaction is an interesting performance use case :-)  I am thinking you can find a smarter/cheaper way to manage that.
>> Have you looked at 
>> a) increasing memtable throughput
>> What is the nature of your writes? Is it mostly inserts, or does it also have a lot of quick updates of recently inserted data? Increasing memtable_throughput can delay and maybe reduce the compaction cost if you have lots of updates to the same data. You will have to provide for memory if you try this.
>> When you mentioned "with ~9m serialized bytes", is that the memtable throughput? That is quite a low threshold, which will result in a large number of SSTables needing to be compacted. I think the default is 256 MB, and on the lower end the values I have seen are 64 MB or maybe 32 MB.
>> 
>> 
>> b) tweaking min_compaction_threshold and max_compaction_threshold
>> - increasing min_compaction_threshold will delay compactions
>> - decreasing max_compaction_threshold will reduce number of sstables per compaction cycle
>> Are you using the defaults 4-32, or are you trying some different values?
>> 
>> c) splitting column families
>> Again splitting column families can also help because compactions occur serially one CF at a time and that spreads out your compaction cost over time and column families. It requires change in app logic though.
>> 
>> -Adi
>> 
> 
> 


Re: Calculate number of nodes required based on data

Posted by Adi <ad...@gmail.com>.
On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:

> We didn't change MemtableThroughputInMB/min/maxCompactionThreshold, they're
> 499/4/32.
> As for why we're flushing at ~9m, I guess it has to do with this:
> http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
> The only parameter I tried to play with is compaction_throughput_mb_per_sec;
> I tried cutting it in half and doubling it, and neither helped avoid the
> simultaneous compactions on nodes.
>
> I agree that we don't necessarily need to add nodes, as long as we have a
> way to avoid simultaneous compaction on 4+ nodes.
>
> Thanks,
> Hefeng
>
>
>
Can you check in the logs for something like this:
...... Memtable.java (line 157) Writing Memtable-<ColumnFamilyName>@1151031968(67138588 bytes, 47430 operations)
to see the bytes/operations at which the column family gets flushed. In case you
are hitting the operations threshold, you can try increasing it to a higher
number. Right now the operations threshold is getting hit at less than 2% of the
size threshold (~9 MB flushed against a 499 MB throughput setting). I would try
bumping up memtable_operations substantially. The default is 1.1624999999999999
(in millions). Try 10 or 20 and see if your CF flushes at a higher size. Keep
adjusting it until the frequency/size of flushing becomes satisfactory and
hopefully reduces the compaction overhead.
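
For reference, a quick script along these lines can summarize flush sizes and
operation counts per column family (just a sketch: the log location is an
assumption, and it only relies on the flush line format quoted above):

#!/usr/bin/env python
# Sketch: summarize memtable flushes per column family from Cassandra's system.log,
# matching lines like:
#   ... Memtable.java (line 157) Writing Memtable-<CF>@1151031968(67138588 bytes, 47430 operations)
# The default log path below is an assumption; pass your own path as the first argument.
import re
import sys
from collections import defaultdict

FLUSH_RE = re.compile(r"Writing Memtable-(\S+?)@\d+\((\d+) bytes, (\d+) operations\)")

log_path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/cassandra/system.log"
flushes = defaultdict(list)
with open(log_path) as log:
    for line in log:
        m = FLUSH_RE.search(line)
        if m:
            cf, size_bytes, ops = m.group(1), int(m.group(2)), int(m.group(3))
            flushes[cf].append((size_bytes, ops))

for cf, samples in sorted(flushes.items()):
    avg_mb = sum(b for b, _ in samples) / float(len(samples)) / (1024 * 1024)
    avg_ops = sum(o for _, o in samples) / float(len(samples))
    print("%-30s %4d flushes, avg %7.1f MB, avg %9.0f ops" % (cf, len(samples), avg_mb, avg_ops))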

-Adi

> On Sep 7, 2011, at 10:51 AM, Adi wrote:
>
>
> On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:
>
>> Adi,
>>
>> The reason we're attempting to add more nodes is trying to solve the
>> long/simultaneous compactions, i.e. the performance issue, not the storage
>> issue yet.
>> We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes,
>> and when 4 nodes doing compaction at the same period, we're screwed,
>> especially on read, since it'll cover one of the compaction node anyways.
>> My assumption is that if we add more nodes, each node will have less load,
>> and therefore need less compaction, and probably will compact faster,
>> eventually avoiding 4+ nodes doing compaction simultaneously.
>>
>> Any suggestion on how to calculate how many more nodes to add? Or,
>> generally how to plan for number of nodes required, from a performance
>> perspective?
>>
>> Thanks,
>> Hefeng
>>
>>
>>
> Adding nodes to delay and reduce compaction is an interesting performance
> use case :-)  I am thinking you can find a smarter/cheaper way to manage
> that.
> Have you looked at
> a) increasing memtable throughput
> What is the nature of your writes?  Is it mostly inserts or also has lot of
> quick updates of recently inserted data. Increasing memtable_throughput can
> delay and maybe reduce the compaction cost if you have lots of updates to
> same data. You will have to provide for memory if you try this.
> When mentioned "with ~9m serialized bytes" is that the memtable
> throughput? That is quite a low threshold which will result in large number
> of SSTables needing to be compacted. I think the default is 256 MB and on
> the lower end values I have seen are 64 MB or maybe 32 MB.
>
>
> b) tweaking min_compaction_threshold and max_compaction_threshold
> - increasing min_compaction_threshold will delay compactions
> - decreasing max_compaction_threshold will reduce number of sstables per
> compaction cycle
> Are you using the defaults 4-32 or are trying some different values
>
> c) splitting column families
> Again splitting column families can also help because compactions occur
> serially one CF at a time and that spreads out your compaction cost over
> time and column families. It requires change in app logic though.
>
> -Adi
>
>
>

Re: Calculate number of nodes required based on data

Posted by Hefeng Yuan <hf...@rhapsody.com>.
We didn't change MemtableThroughputInMB/min/maxCompactionThreshold; they're 499/4/32.
As for why we're flushing at ~9m, I guess it has to do with this: http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
The only parameter I tried to play with is compaction_throughput_mb_per_sec; I tried cutting it in half and doubling it, and neither helped avoid the simultaneous compactions on nodes.

I agree that we don't necessarily need to add nodes, as long as we have a way to avoid simultaneous compaction on 4+ nodes.

Thanks,
Hefeng

On Sep 7, 2011, at 10:51 AM, Adi wrote:

> 
> On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:
> Adi,
> 
> The reason we're attempting to add more nodes is to solve the long/simultaneous compactions, i.e. the performance issue, not the storage issue yet.
> We have RF 5 and CL QUORUM for reads and writes, and currently 6 nodes; when 4 nodes are doing compaction in the same period, we're screwed, especially on reads, since a quorum read will hit one of the compacting nodes anyway.
> My assumption is that if we add more nodes, each node will have less load and therefore need less compaction, and will probably compact faster, eventually avoiding 4+ nodes doing compaction simultaneously.
> 
> Any suggestion on how to calculate how many more nodes to add? Or, generally how to plan for number of nodes required, from a performance perspective?
> 
> Thanks,
> Hefeng
> 
> 
> 
> Adding nodes to delay and reduce compaction is an interesting performance use case :-)  I am thinking you can find a smarter/cheaper way to manage that.
> Have you looked at 
> a) increasing memtable throughput
> What is the nature of your writes? Is it mostly inserts, or does it also have a lot of quick updates of recently inserted data? Increasing memtable_throughput can delay and maybe reduce the compaction cost if you have lots of updates to the same data. You will have to provide for memory if you try this.
> When you mentioned "with ~9m serialized bytes", is that the memtable throughput? That is quite a low threshold, which will result in a large number of SSTables needing to be compacted. I think the default is 256 MB, and on the lower end the values I have seen are 64 MB or maybe 32 MB.
> 
> 
> b) tweaking min_compaction_threshold and max_compaction_threshold
> - increasing min_compaction_threshold will delay compactions
> - decreasing max_compaction_threshold will reduce number of sstables per compaction cycle
> Are you using the defaults 4-32 or are trying some different values
> 
> c) splitting column families
> Again splitting column families can also help because compactions occur serially one CF at a time and that spreads out your compaction cost over time and column families. It requires change in app logic though.
> 
> -Adi
> 


Re: Calculate number of nodes required based on data

Posted by Adi <ad...@gmail.com>.
On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:

> Adi,
>
> The reason we're attempting to add more nodes is trying to solve the
> long/simultaneous compactions, i.e. the performance issue, not the storage
> issue yet.
> We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes,
> and when 4 nodes doing compaction at the same period, we're screwed,
> especially on read, since it'll cover one of the compaction node anyways.
> My assumption is that if we add more nodes, each node will have less load,
> and therefore need less compaction, and probably will compact faster,
> eternally avoid 4+ nodes doing compaction simultaneously.
>
> Any suggestion on how to calculate how many more nodes to add? Or,
> generally how to plan for number of nodes required, from a performance
> perspective?
>
> Thanks,
> Hefeng
>
>
>
Adding nodes to delay and reduce compaction is an interesting performance
use case :-)  I think you can find a smarter/cheaper way to manage that.
Have you looked at
a) increasing memtable throughput
What is the nature of your writes? Is it mostly inserts, or does it also have a
lot of quick updates of recently inserted data? Increasing memtable_throughput
can delay and maybe reduce the compaction cost if you have lots of updates to
the same data. You will have to provide for memory if you try this.
When you mentioned "with ~9m serialized bytes", is that the memtable throughput?
That is quite a low threshold, which will result in a large number of SSTables
needing to be compacted. I think the default is 256 MB, and on the lower end the
values I have seen are 64 MB or maybe 32 MB.


b) tweaking min_compaction_threshold and max_compaction_threshold
- increasing min_compaction_threshold will delay compactions
- decreasing max_compaction_threshold will reduce number of sstables per
compaction cycle
Are you using the defaults 4-32, or are you trying some different values?
(A rough sketch of the flush/compaction arithmetic behind (a) and (b) follows item (c) below.)

c) splitting column families
Splitting column families can also help, because compactions occur serially,
one CF at a time, which spreads your compaction cost over time and across
column families. It requires a change in app logic though.
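
The rough sketch referenced in item (b), using only the numbers already in this
thread (~9 MB flushed roughly every 30 minutes, min_compaction_threshold = 4);
the 64 MB value is just an illustrative alternative, not a recommendation:

# Back-of-envelope: how often minor compactions trigger for one CF, assuming a
# steady write rate, one SSTable per flush, and a minor compaction once
# min_compaction_threshold SSTables have accumulated.  This ignores the fact
# that larger memtables also mean larger (slower) individual compactions.
def hours_between_minor_compactions(write_mb_per_hour, flush_threshold_mb,
                                    min_compaction_threshold=4):
    hours_per_flush = flush_threshold_mb / write_mb_per_hour
    return hours_per_flush * min_compaction_threshold

write_rate_mb_per_hour = 9.0 / 0.5   # ~9 MB serialized every ~30 minutes

print(hours_between_minor_compactions(write_rate_mb_per_hour, 9.0))    # today: ~2 hours
print(hours_between_minor_compactions(write_rate_mb_per_hour, 64.0))   # 64 MB memtable: ~14 hours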

-Adi

Re: Calculate number of nodes required based on data

Posted by Hefeng Yuan <hf...@rhapsody.com>.
Adi,

The reason we're attempting to add more nodes is to solve the long/simultaneous compactions, i.e. the performance issue, not the storage issue yet.
We have RF 5 and CL QUORUM for reads and writes, and currently 6 nodes; when 4 nodes are doing compaction in the same period, we're screwed, especially on reads, since a quorum read will hit one of the compacting nodes anyway.
My assumption is that if we add more nodes, each node will have less load and therefore need less compaction, and will probably compact faster, eventually avoiding 4+ nodes doing compaction simultaneously.
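
A quick combinatorial check of the "a quorum read will hit a compacting node"
claim (a sketch only; it assumes replicas of a key live on 5 distinct nodes and
that the coordinator may pick any 3 of them for QUORUM):

from itertools import combinations

NODES = range(6)        # 6-node cluster
RF, QUORUM = 5, 3       # RF 5, CL QUORUM -> floor(5/2) + 1 = 3 replicas must respond
COMPACTING = 4          # 4 nodes compacting at the same time

# Smallest number of compacting nodes any quorum can touch, over every choice of
# compacting nodes, replica placement, and quorum.
best_case = min(
    len(set(quorum) & set(compacting))
    for compacting in combinations(NODES, COMPACTING)
    for replicas in combinations(NODES, RF)
    for quorum in combinations(replicas, QUORUM)
)
print("even in the best case, a quorum touches %d compacting node(s)" % best_case)  # -> 1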

Any suggestions on how to calculate how many more nodes to add? Or, more generally, how to plan the number of nodes required from a performance perspective?

Thanks,
Hefeng

On Sep 7, 2011, at 9:56 AM, Adi wrote:

> On Tue, Sep 6, 2011 at 3:53 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:
> Hi,
> 
> Is there any suggested way of calculating number of nodes needed based on data?
>  
> We currently have 6 nodes (each has 8G memory) with RF5 (because we want to be able to survive loss of 2 nodes).
> The flush of memtable happens around every 30 min (while not doing compaction), with ~9m serialized bytes.
> 
> The problem is that we see more than 3 nodes doing compaction at the same time, which slows down the application.
> (tried to increase/decrease compaction_throughput_mb_per_sec, not helping much)
> 
> So I'm thinking probably we should add more nodes, but not sure how many more to add.
> Based on the data rate, is there any suggested way of calculating number of nodes required?
> 
> Thanks,
> Hefeng
> 
> 
> What is the total  amount of data?
> What is the total amount in the biggest column family?
> 
> There is no hard limit per node. Cassandra gurus like more nodes :-). One number for 'happy cassandra users'  I have seen mentioned in discussions is around 250-300 GB per node. But you could store more per node by having multiple column families each storing around 250-300 GB per column family. The main problem being repair/compactions and such operations taking longer and requiring much more spare disk space.
> 
> As for slow down in application during compaction I was wondering 
> what is the CL you are using for read and writes?
> Make sure it is not a client issue - Is your client hitting all nodes in round-robin or some other fashion?
> 
> -Adi


Re: Calculate number of nodes required based on data

Posted by Adi <ad...@gmail.com>.
On Tue, Sep 6, 2011 at 3:53 PM, Hefeng Yuan <hf...@rhapsody.com> wrote:

> Hi,
>
> Is there any suggested way of calculating number of nodes needed based on
> data?
>

> We currently have 6 nodes (each has 8G memory) with RF5 (because we want to
> be able to survive loss of 2 nodes).
> The flush of memtable happens around every 30 min (while not doing
> compaction), with ~9m serialized bytes.
>
> The problem is that we see more than 3 nodes doing compaction at the same
> time, which slows down the application.
> (tried to increase/decrease compaction_throughput_mb_per_sec, not helping
> much)
>
> So I'm thinking probably we should add more nodes, but not sure how many
> more to add.
> Based on the data rate, is there any suggested way of calculating number of
> nodes required?
>
> Thanks,
> Hefeng



What is the total amount of data?
What is the total amount in the biggest column family?

There is no hard limit per node. Cassandra gurus like more nodes :-). One
number for 'happy Cassandra users' I have seen mentioned in discussions is
around 250-300 GB per node. But you could store more per node by having
multiple column families, each storing around 250-300 GB per column family.
The main problem is repair/compaction and similar operations taking longer
and requiring much more spare disk space.

As for the slowdown in the application during compaction, I was wondering:
what CL are you using for reads and writes?
Make sure it is not a client issue - is your client hitting all nodes in
round-robin or some other fashion?

-Adi
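
To turn the 250-300 GB per node guideline above into a rough node count, a
minimal sketch (the RF 5 default comes from this thread; the 500 GB raw-data
figure in the example is purely illustrative):

import math

def nodes_needed(raw_data_gb, replication_factor=5, per_node_target_gb=300):
    # Total on-disk data is roughly raw data x RF; spread it so no node carries
    # more than the per-node "comfort zone".  Spare disk for compaction/repair
    # is needed on top of this, as noted above.
    total_on_disk_gb = raw_data_gb * replication_factor
    return int(math.ceil(total_on_disk_gb / float(per_node_target_gb)))

print(nodes_needed(500))   # e.g. 500 GB raw data, RF 5 -> 2500 GB on disk -> 9 nodes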