Posted to user@cassandra.apache.org by Thomas Borg Salling <tb...@tbsalling.dk> on 2015/04/08 10:36:22 UTC

What's to think of when increasing disk size on Cassandra nodes?

I run a 10-node Cassandra cluster in production. 99% writes; 1% reads; 0%
deletes. The nodes have 32 GB RAM; C* runs with an 8 GB heap. Each node has an
SSD for the commitlog and 2x4 TB spinning disks for data (SSTables). The
schema uses key caching only. The C* version is 2.1.2.

The cluster is projected to run out of free disk space before long, so its
storage capacity needs to be increased. The client prefers increasing disk
size over adding more nodes, so the plan is to replace the 2x4 TB spinning
disks in each node with 3x6 TB spinning disks.

   - Are there any obvious pitfalls/caveats to be aware of here? Such as:

   - Can C* handle up to 18 TB of data per node with this amount of RAM?

      - Is it feasible to increase the disk size by mounting a new (larger)
      disk, copying all SSTables to it, and then mounting it on the same mount
      point as the original (smaller) disk (to replace it)?


( -- also posted on StackOverflow
<http://stackoverflow.com/questions/29509595/whats-to-think-of-when-increasing-disk-size-on-cassandra-nodes>
)

Thanks in advance.


Med venlig hilsen / Best regards,


*Thomas Borg Salling*
Freelance IT architect and programmer.
Java and open source specialist.

tbsalling@tbsalling.dk :: +45 4063 2353 :: @tbsalling
<http://twitter.com/tbsalling> :: tbsalling.dk :: linkedin.com/in/tbsalling

Re: What's to think of when increasing disk size on Cassandra nodes?

Posted by Colin <co...@gmail.com>.
Yikes, 18 TB/node is a very bad idea.

I don't like to go over 2-3 TB per node personally, and you have to be careful with JBOD.  See one of Ellis's latest posts on this and the suggested use of LVM; it is a reversal of the previous position on JBOD.
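
For reference, JBOD in C* means listing each physical disk as its own entry
under data_file_directories in cassandra.yaml, whereas the LVM approach pools
the disks into one logical volume mounted at a single data directory. A
hypothetical sketch of the LVM route (device names are placeholders for the
three new 6 TB disks):

    sudo pvcreate /dev/sdb /dev/sdc /dev/sdd
    sudo vgcreate cassandra_vg /dev/sdb /dev/sdc /dev/sdd
    sudo lvcreate -l 100%FREE -n data cassandra_vg   # one volume spanning all three disks
    sudo mkfs.ext4 /dev/cassandra_vg/data
    sudo mount /dev/cassandra_vg/data /var/lib/cassandra/data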

--
Colin 
+1 612 859 6129
Skype colin.p.clark


Re: What's to think of when increasing disk size on Cassandra nodes?

Posted by Jack Krupansky <ja...@gmail.com>.
I can certainly sympathize if you have IT staff/management who will
willingly spring for some disk drives, but not for full machines, even if
they are relatively commodity boxes. It seems penny-wise and pound-foolish to
me, but management has its own priorities, plus there is the pre-existing
Oracle mindset that prefers dense/fat nodes.

-- Jack Krupansky


Re: What's to think of when increasing disk size on Cassandra nodes?

Posted by Nate McCall <na...@thelastpickle.com>.
First off, I agree that the preferred path is adding nodes, but it is
possible.

> Can C* handle up to 18 TB of data per node with this amount of RAM?

It depends on how deep into the weeds you want to get with tuning and
testing. See below.

>
> Is it feasible to increase the disk size by mounting a new (larger) disk,
copying all SSTables to it, and then mounting it on the same mount point as
the original (smaller) disk (to replace it)?

Yes (with C* off of course).
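
Roughly, the swap on each node might look like this (a sketch only; the
device names and paths below, such as /dev/sdc1 and /mnt/newdisk, are
placeholders, so adapt them to your layout):

    nodetool drain                                 # flush memtables to disk
    sudo service cassandra stop
    sudo mkfs.ext4 /dev/sdc1                       # format the new, larger disk
    sudo mkdir -p /mnt/newdisk
    sudo mount /dev/sdc1 /mnt/newdisk
    sudo rsync -a /var/lib/cassandra/data/ /mnt/newdisk/   # copy all SSTables
    sudo umount /mnt/newdisk
    sudo umount /var/lib/cassandra/data            # detach the old, smaller disk
    sudo mount /dev/sdc1 /var/lib/cassandra/data   # new disk takes over the old mount point
    sudo chown -R cassandra:cassandra /var/lib/cassandra/data
    sudo service cassandra start                   # remember to update /etc/fstab too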

As for tuning, you will need to look at, experiment with, and get a good
understanding of:
- index_interval (turn this up now anyway if you have not already ~ start at
512 and go up from there)
- bloom filter space usage via bloom_filter_fp_chance
- compression metadata storage via chunk_length_kb
- repair time and how compaction_throughput_in_mb_per_sec and
stream_throughput_outbound_megabits_per_sec will affect it

The first three will have a direct negative impact on read performance.
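
For concreteness, the first three are per-table settings you can change from
cqlsh, while the throughput knobs live in cassandra.yaml. The snippet below is
purely illustrative; 'mykeyspace.mytable' is a placeholder and the values are
starting points to experiment with, not recommendations (in 2.1 the index
interval is tuned per table via min_index_interval / max_index_interval):

    cqlsh -e "ALTER TABLE mykeyspace.mytable
              WITH min_index_interval = 512
               AND bloom_filter_fp_chance = 0.1
               AND compression = {'sstable_compression': 'LZ4Compressor',
                                  'chunk_length_kb': '256'};"

    # In cassandra.yaml (restart the node afterwards):
    #   compaction_throughput_mb_per_sec: 16
    #   stream_throughput_outbound_megabits_per_sec: 200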

You will definitely want to use JBOD so you don't have to repair everything
if you lose a single disk, but you will still be degraded for *a very long
time* when you do lose one.

This is hard and takes experimentation and research (I can't emphasize this
part enough), but I've seen it work. That said, the engineering time spent
will probably cost more than buying and deploying additional hardware in the
first place would have. YMMV.


--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: What's to think of when increasing disk size on Cassandra nodes?

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Agreed with Jack.  Cassandra is a database meant to scale horizontally by
adding nodes, and what you're describing is vertical scaling.

Aside from the vertical-scaling issue, unless you're running a very specific
workload (time-series data with Date-Tiered Compaction) and you REALLY know
what you're doing, I wouldn't go above 3-5 TB per node right now.  You'll
start to see GC issues, and your cluster performance will suffer.

Add nodes and sleep comfortably at night.
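
Since adding nodes is the path everyone here is recommending, a rough sketch
of what bringing one more node into the ring involves (assuming vnodes, the
default in 2.1; addresses below are placeholders):

    # On the new node, in cassandra.yaml:
    #   cluster_name: <same as the existing cluster>
    #   seeds: "10.0.0.1,10.0.0.2"      # existing seed nodes
    #   listen_address / rpc_address: the new node's own IP
    # auto_bootstrap defaults to true, so the node streams its share of data.
    sudo service cassandra start
    nodetool status                     # wait for the new node to show UN (Up/Normal)

    # Then, on each pre-existing node, reclaim space for data it no longer owns:
    nodetool cleanup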

Jon


Re: What's to think of when increasing disk size on Cassandra nodes?

Posted by Jack Krupansky <ja...@gmail.com>.
The preferred pattern for scaling data with Cassandra is to add nodes.
Growing the disk on each node is an anti-pattern. The key strength of
Cassandra is that it is a DISTRIBUTED database, so always keep your eye on
distributing your data.

But if you do need to grow disk, be sure to grow RAM and CPU power as well.
More disk without more RAM AND CPU is just asking for trouble. But even
that has its limits relative to the preferred pattern of adding nodes.

-- Jack Krupansky
