Posted to user@cassandra.apache.org by Anshu Vajpayee <an...@gmail.com> on 2016/05/27 19:52:34 UTC

Per node limit for Disk Space

Hi All,
I have a question regarding the maximum disk space limit on a node.

As per DataStax, we can have at most 1 TB of disk space per node for
rotational disks and up to 5 TB for SSDs.

Could you please suggest, based on your experience, what the space limit
on a single node should be without putting too much stress on the node?





Thanks,

RE: Per node limit for Disk Space

Posted by SE...@homedepot.com.
Eric is right on.

Let me share my experience. I have found that dense nodes over 4 TB are a pain to manage (for rebuilds, repair, compaction, etc.) with size-tiered compaction and an essentially single-table schema. On the other hand, 1 TB nodes that yield only about 500 GB of usable space can create rings with far too many nodes (and are too expensive for the usable storage). 2 TB seems to be a good sweet spot that avoids either extreme, but as Eric says, there are lots of factors to weigh in the decision.
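As a rough back-of-the-envelope sketch (the dataset size, replication factor, and 50% usable fraction below are illustrative assumptions, not recommendations), here is how per-node density drives ring size:

# Rough node-count estimate for a given logical dataset size.
# Assumes size-tiered compaction, so roughly half of each node's disk
# is kept free as compaction headroom, as discussed in this thread.

def nodes_needed(dataset_tb, replication_factor=3, disk_per_node_tb=2.0,
                 usable_fraction=0.5):
    """Return the minimum node count for a given logical dataset size."""
    total_on_disk_tb = dataset_tb * replication_factor        # data after replication
    usable_per_node_tb = disk_per_node_tb * usable_fraction   # ~50% headroom for STCS
    return -(-total_on_disk_tb // usable_per_node_tb)         # ceiling division

# 10 TB of logical data at RF=3:
for disk in (1.0, 2.0, 4.0):
    print(f"{disk:.0f} TB disks -> {int(nodes_needed(10, disk_per_node_tb=disk))} nodes")
# 1 TB disks need ~60 nodes, 2 TB ~30, 4 TB ~15 -- which is why 2 TB sits in
# the middle ground between ring size and per-node operational pain.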

Sean Durity – Lead Cassandra Admin

From: Eric Stevens [mailto:mightye@gmail.com]
Sent: Monday, May 30, 2016 10:56 AM
To: user@cassandra.apache.org
Subject: Re: Per node limit for Disk Space

Those are rough guidelines, actual effective node size is going to depend on your read/write workload and the compaction strategy you choose.  The biggest reason data density per node usually needs to be limited is due to data grooming overhead introduced by compaction.  Data at rest essentially becomes I/O debt.  If you're using Leveled compaction, the interest rate on that debt is higher.

If you're writing aggressively, you'll find that you run out of I/O capacity at smaller data-at-rest volumes.  If you use compaction strategies that allow data to eventually stop compacting (Date Tiered, Time Windowed), you may be able to sustain higher data density per node, assuming that some of your data is going into the no-longer-compacting stages.

Beyond that it'll be hard to say what the right size for you is.  Target the recommended numbers, and if you find that you're not running out of I/O as you approach them, you can probably go bigger.  Just remember to keep ~50% of disk capacity free so compaction has room to run.

On Fri, May 27, 2016 at 1:52 PM Anshu Vajpayee <an...@gmail.com> wrote:
Hi All,
I have a question regarding the maximum disk space limit on a node.

As per DataStax, we can have at most 1 TB of disk space per node for rotational disks and up to 5 TB for SSDs.

Could you please suggest, based on your experience, what the space limit on a single node should be without putting too much stress on the node?




Thanks,




Re: Per node limit for Disk Space

Posted by Eric Stevens <mi...@gmail.com>.
Those are rough guidelines, actual effective node size is going to depend
on your read/write workload and the compaction strategy you choose.  The
biggest reason data density per node usually needs to be limited is due to
data grooming overhead introduced by compaction.  Data at rest essentially
becomes I/O debt.  If you're using Leveled compaction, the interest rate on
that debt is higher.
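To make the "debt" framing concrete, here is a toy calculation; the write-amplification factors are made-up illustrative numbers, not measured Cassandra values:

# Illustrative-only sketch of the "I/O debt" idea: every byte ingested is
# rewritten several times by compaction over its lifetime.  The amplification
# factors below are assumptions for the sake of the example.

INGEST_GB_PER_DAY = 200

write_amplification = {
    "SizeTiered": 4,   # assume each byte is rewritten a handful of times
    "Leveled": 10,     # assume a higher rewrite cost per byte ("higher interest rate")
}

for strategy, wa in write_amplification.items():
    compaction_io_gb = INGEST_GB_PER_DAY * wa
    print(f"{strategy}: ~{compaction_io_gb} GB/day of compaction I/O "
          f"for {INGEST_GB_PER_DAY} GB/day ingested")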

If you're writing aggressively, you'll find that you run out of I/O
capacity at smaller data-at-rest volumes.  If you use compaction strategies
that allow data to eventually stop compacting (Date Tiered, Time Windowed),
you may be able to sustain higher data density per node, assuming that some
of your data is going into the no-longer-compacting stages.
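For example, here is a minimal sketch of moving a time-series table to
TimeWindowCompactionStrategy with the DataStax Python driver.  The keyspace,
table, and window settings are hypothetical, and TWCS needs a release that
ships it (3.0.8 / 3.8+):

# Switch a time-series table to TWCS (hypothetical keyspace/table names).
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("metrics")   # hypothetical keyspace

# One-day windows: once a window is no longer being written to, its SSTables
# stop participating in ongoing compaction, which is what allows higher data
# density per node for this kind of workload.
session.execute("""
    ALTER TABLE sensor_readings
    WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1'
    }
""")

cluster.shutdown()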

Beyond that it'll be hard to say what the right size for you is.  Target
the recommended numbers, and if you find that you're not running out of I/O
as you approach them, you can probably go bigger.  Just remember to keep
~50% of disk capacity free so compaction has room to run.
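A quick way to sanity-check that headroom on a running node (the data
directory path below is the package-install default and is only an
assumption; adjust it to match data_file_directories in cassandra.yaml):

# Check free space on the Cassandra data directory against the ~50% guideline.
import shutil

DATA_DIR = "/var/lib/cassandra/data"  # assumed default path

usage = shutil.disk_usage(DATA_DIR)
free_pct = usage.free / usage.total * 100

print(f"{DATA_DIR}: {free_pct:.1f}% free of {usage.total / 1e12:.2f} TB")
if free_pct < 50:
    print("Warning: less than ~50% free; large compactions may not have room to run.")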

On Fri, May 27, 2016 at 1:52 PM Anshu Vajpayee <an...@gmail.com>
wrote:

> Hi All,
> I have a question regarding the maximum disk space limit on a node.
>
> As per DataStax, we can have at most 1 TB of disk space per node for
> rotational disks and up to 5 TB for SSDs.
>
> Could you please suggest, based on your experience, what the space limit
> on a single node should be without putting too much stress on the node?
>
>
>
>
>
> Thanks,
>
>