Posted to user@cassandra.apache.org by William Saar <Wi...@king.com> on 2015/01/12 11:47:16 UTC

Growing SSTable count as Cassandra does not saturate the disk I/O

Hi,
We are running a test with Cassandra 2.1.2 on Fusion I/O drives in which we load about 2 billion rows of data over a few hours each night onto a 6-node cluster. Compactions run 24/7 but do not seem to keep up: the number of SSTables keeps growing while our disks seem heavily underutilized. During compactions we see write throughput of 300 - 500 kB/sec, while other non-Cassandra servers with the same hardware sustain continuous write loads of 25 MB/sec.

We initially ran leveled compaction with compaction throughput set to 0, and tested it with 2, 8 and 16 concurrent compactors. We have just switched to size-tiered compaction, but disk utilization does not seem to increase. Does anyone have an idea how to increase Cassandra's disk utilization for compaction?

Thanks,
William
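
To put the numbers in the post above in perspective, here is a back-of-the-envelope sketch of how small the observed compaction write rate is relative to what the same hardware sustains elsewhere. It uses only the figures quoted in the post (300 to 500 kB/sec versus 25 MB/sec) and assumes decimal units (1 MB = 1000 kB); the function name is purely illustrative.

```python
def utilization_percent(observed_kb_per_sec, reference_mb_per_sec):
    """Observed write rate as a percentage of a reference rate.

    Assumes decimal units: 1 MB = 1000 kB.
    """
    reference_kb_per_sec = reference_mb_per_sec * 1000
    return 100.0 * observed_kb_per_sec / reference_kb_per_sec


# Figures taken from the post: 300-500 kB/sec during compaction,
# 25 MB/sec on comparable non-Cassandra servers.
low = utilization_percent(300, 25)
high = utilization_percent(500, 25)
print(f"compaction writes at {low:.1f}% to {high:.1f}% of the reference rate")
```

By this estimate compaction is writing at roughly 1 to 2 percent of the rate the same drives handle under other workloads, which suggests the bottleneck is somewhere other than the disk.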


Re: Growing SSTable count as Cassandra does not saturate the disk I/O

Posted by Eric Stevens <mi...@gmail.com>.
> Is size-tiered compaction easier on the CPU than leveled compaction?


I don't think so.  It's easier on I/O though, so if you're not I/O bound,
that probably makes you more likely to become CPU bound.

Have you looked at nodetool setcompactionthroughput?
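
For reference, a minimal sketch of driving that command from a script. `nodetool setcompactionthroughput` takes a value in MB/s, and 0 disables throttling; since the original post says throughput was already set to 0, this is mostly useful for confirming the setting on every node. The helper names are hypothetical, and `run` assumes `nodetool` is on the PATH of the node being checked.

```python
import subprocess


def setcompactionthroughput_cmd(mb_per_sec):
    """Build the nodetool argument list; a value of 0 means unthrottled."""
    if mb_per_sec < 0:
        raise ValueError("throughput must be >= 0 (0 disables throttling)")
    return ["nodetool", "setcompactionthroughput", str(int(mb_per_sec))]


def run(cmd):
    """Run a nodetool command locally and return its stdout.

    Assumption: nodetool is installed and on PATH on this machine.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Show the command that would be executed; nothing is run here.
print(" ".join(setcompactionthroughput_cmd(0)))
```

On a live node, `run(setcompactionthroughput_cmd(0))` followed by `run(["nodetool", "compactionstats"])` would apply the setting and then show pending and active compactions.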


RE: Growing SSTable count as Cassandra does not saturate the disk I/O

Posted by William Saar <Wi...@king.com>.
Hi,
Thanks for the reply. Yes, we are definitely CPU bound, and disabling compression increases I/O utilization a lot. However, the compactions still do not use the I/O capacity of the machines, even while spiking the CPU (increasing the number of concurrent compactors does not seem to help). Oddly enough, one node has just 160 SSTables while the rest are at 500-600 SSTables.

Is size-tiered compaction easier on the CPU than leveled compaction?

Thanks,
William


Re: Growing SSTable count as Cassandra does not saturate the disk I/O

Posted by Eric Stevens <mi...@gmail.com>.
Are you using compression on the sstables?  If so, possibly you're CPU
bound instead of disk bound.
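
One rough way to get a feel for how compression can make a write path CPU bound is to time how fast a single core compresses data. The sketch below uses Python's `zlib` purely as a stand-in: Cassandra's own compressors run inside the JVM and will show different numbers, so treat the result as an illustration of the effect, not a measurement of Cassandra. The function name, payload and sizes are all assumptions made for the example.

```python
import os
import time
import zlib


def compression_rate_mb_per_sec(payload, rounds=3):
    """Single-threaded zlib compression throughput in MiB/s over `rounds` passes."""
    start = time.perf_counter()
    for _ in range(rounds):
        zlib.compress(payload)
    elapsed = time.perf_counter() - start
    return (len(payload) * rounds) / (1024 * 1024) / elapsed


# Synthetic, partly compressible payload: one random 4 KiB block repeated
# 256 times (1 MiB total). Real SSTable data will compress differently.
block = os.urandom(4096)
payload = block * 256
rate = compression_rate_mb_per_sec(payload)
print(f"zlib compressed about {rate:.0f} MiB/s on one core")
```

If the per-core compression rate on a node is well below what its disks can absorb, faster storage alone is unlikely to speed up compaction.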
