Posted to user@cassandra.apache.org by Baskar Duraikannu <ba...@gmail.com> on 2011/04/27 12:38:02 UTC

Performance tests using stress testing tool

I have set up a 4 node cluster for testing, and each node has around 25 GB
of data.

I ran read and write tests using 100 and 200 threads, with each thread
reading or writing 50 columns at QUORUM consistency, using the stress tool
against all 4 nodes.

The test servers have 4 cores and 16 GB of RAM.
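
For reference, each stress operation in this test is roughly equivalent to
the following (a minimal pycassa sketch, not the stress tool itself; the
Keyspace1/Standard1 schema, node addresses and ~34-byte values are
assumptions for illustration):

    # Rough sketch of one stress operation using pycassa (illustrative only;
    # schema names, node addresses and value size are assumptions).
    import pycassa
    from pycassa import ConsistencyLevel

    pool = pycassa.ConnectionPool('Keyspace1',
                                  server_list=['node1:9160', 'node2:9160',
                                               'node3:9160', 'node4:9160'])
    cf = pycassa.ColumnFamily(pool, 'Standard1')

    columns = dict(('col%02d' % i, 'x' * 34) for i in range(50))  # 50 columns

    # One write and one read at QUORUM, as in the test
    cf.insert('row-000001', columns,
              write_consistency_level=ConsistencyLevel.QUORUM)
    row = cf.get('row-000001', column_count=50,
                 read_consistency_level=ConsistencyLevel.QUORUM)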

While running the test:

a) I am not seeing CPU usage of more than 10%. In some forums I see that
8 CPUs and 32 GB of RAM is considered a good sweet spot for Cassandra. Is
this true? Also, when would I see real CPU spikes? At the moment it looks
like 4 cores is more than sufficient.

b) iostat -x is reporting an average queue size of around 0.25 and an await
time of around 30 ms. What would be an acceptable queue size and await time?

Please help

Thanks
Baskar

Re: Performance tests using stress testing tool

Posted by Baskar Duraikannu <ba...@gmail.com>.
Thanks Peter.

I believe I found the root cause: the switch we were using was bad.
Now, on a 4 node cluster (each node has 1 quad-core CPU and 16 GB of RAM),
I was able to get around 11,000 writes and 10,050 reads per second
simultaneously (CPU usage is around 45% on all nodes, and the disk queue
size is in the neighbourhood of 10).

Is this in line with what you usually see with Cassandra?
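
As a rough sanity check on the switch theory, the raw column payload alone
at these rates would not fit through a 100 Mbps link (the ~34-byte column
size is an assumption, and Thrift/TCP overhead is ignored, so real traffic
would be even higher):

    # Back-of-envelope: can ~21,000 ops/s of 50 small columns fit in 100 Mbps?
    ops_per_sec = 11000 + 10050          # writes + reads per second
    columns_per_op = 50
    bytes_per_column = 34                # assumed payload size

    payload_mbps = ops_per_sec * columns_per_op * bytes_per_column * 8 / 1e6
    print("approx payload traffic: %.0f Mbps" % payload_mbps)   # ~286 Mbps
    print("exceeds a 100 Mbps link: %s" % (payload_mbps > 100))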


----- Original Message ----- 
From: Peter Schuller 
To: user@cassandra.apache.org 
Sent: Friday, April 29, 2011 12:21 PM
Subject: Re: Performance tests using stress testing tool


> Thanks Peter. I am using the java version of the stress testing tool from
> the contrib folder. Is there any issue that I should be aware of? Do you
> recommend using pystress?

I just saw Brandon file this:
https://issues.apache.org/jira/browse/CASSANDRA-2578

Maybe that's it.

-- 
/ Peter Schuller

Re: Performance tests using stress testing tool

Posted by Peter Schuller <pe...@infidyne.com>.
> Thanks Peter. I am using the java version of the stress testing tool from
> the contrib folder. Is there any issue that I should be aware of? Do you
> recommend using pystress?

I just saw Brandon file this:
https://issues.apache.org/jira/browse/CASSANDRA-2578

Maybe that's it.

-- 
/ Peter Schuller

Re: Performance tests using stress testing tool

Posted by Baskar Duraikannu <ba...@gmail.com>.
Thanks Peter. I am using the java version of the stress testing tool from
the contrib folder. Is there any issue that I should be aware of? Do you
recommend using pystress?

I will rerun the tests to monitor the ethernet stats more closely and will
update.

--
Thanks,
Baskar Duraikannu 


----- Original Message ----- 
From: Peter Schuller 
To: user@cassandra.apache.org 
Sent: Thursday, April 28, 2011 1:21 PM
Subject: Re: Performance tests using stress testing tool


> When I looked at the benchmark client machine, it was not under any stress
> in terms of disk or CPU.

Are you running with the Python multiprocessing module available?
stress should print a warning if it's not. If it's not, you'd end up
in threaded mode, and due to Python's GIL you'd be bottlenecking on
CPU without actually using more than ~1 core on the machine.

> But the test machines are connected through a 10/100 Mbps switch port
> (not gigabit). Can this be a bottleneck?

Maybe, but it seems not so likely unless you've turned up the column size.
Check with 'ifstat 1' whether you're pushing lots of data.

Another possibility is that the writes are periodically blocking due
to compaction (compaction is not yet parallel unless you're running
the 0.8 betas). However, this should show up in the stress client as a
periodic lack of progress; it shouldn't give you a smooth amount of
CPU usage over time.

-- 
/ Peter Schuller

Re: Performance tests using stress testing tool

Posted by Peter Schuller <pe...@infidyne.com>.
> When I looked at the benchmark client machine, it was not under any stress
> in terms of disk or CPU.

Are you running with the Python multiprocessing module available?
stress should print a warning if it's not. If it's not, you'd end up
in threaded mode, and due to Python's GIL you'd be bottlenecking on
CPU without actually using more than ~1 core on the machine.
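
To illustrate the difference (a minimal sketch of CPU-bound work, not
py_stress itself):

    # Minimal sketch (not py_stress itself): CPU-bound work in 4 threads is
    # serialized by the GIL, while 4 processes can actually use 4 cores.
    import time
    from threading import Thread
    from multiprocessing import Process

    def burn(n=5 * 10**6):
        # Pure-Python CPU-bound loop; holds the GIL while it runs.
        total = 0
        i = 0
        while i < n:
            total += i
            i += 1

    def timed(worker_cls):
        workers = [worker_cls(target=burn) for _ in range(4)]
        start = time.time()
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return time.time() - start

    if __name__ == '__main__':
        print("4 threads:   %.1fs" % timed(Thread))    # roughly 4x one burn()
        print("4 processes: %.1fs" % timed(Process))   # roughly 1x on 4 cores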

> But the test machines are connected through a 10/100 Mbps switch port
> (not gigabit). Can this be a bottleneck?

Maybe, but it seems not so likely unless you've turned up the column size.
Check with 'ifstat 1' whether you're pushing lots of data.

Another possibility is that the writes are periodically blocking due
to compaction (compaction is not yet parallel unless you're running
the 0.8 betas). However, this should show up in the stress client as a
periodic lack of progress; it shouldn't give you a smooth amount of
CPU usage over time.

-- 
/ Peter Schuller

Re: Performance tests using stress testing tool

Posted by Baskar Duraikannu <ba...@gmail.com>.
Thanks Peter. 

When I looked at the benchmark client machine, it was not under any stress
in terms of disk or CPU. But the test machines are connected through a
10/100 Mbps switch port (not gigabit). Can this be a bottleneck?

Thanks
Baskar 


----- Original Message ----- 
From: Peter Schuller 
To: user@cassandra.apache.org 
Sent: Thursday, April 28, 2011 2:34 AM
Subject: Re: Performance tests using stress testing tool


> a) I am not seeing CPU usage of more than 10%.

Sounds like the benchmarking client is bottlenecking.

> In some forums I see that 8 CPUs and 32 GB of RAM is considered a good
> sweet spot for Cassandra. Is this true?

Seems reasonable in a very general sense, but of course varies with use-case.

> Also, when would I see real CPU spikes? At the moment it looks like 4
> cores is more than sufficient.

In general, the more requests and columns you read and write, the more
you'll be bottlenecking on CPU. The larger the individual columns (and
thus the fewer the columns), the more you'll be bound on disk instead.

In your case, I think the bottleneck is the benchmark client.

> b) iostat -x is reporting an average queue size of around 0.25 and an
> await time of around 30 ms. What would be an acceptable queue size and
> await time?

Any avgqueuesize significantly below 1 is generally good. For values close
to 1 or higher, it will depend on your access pattern, latency demands, and
the nature of your storage device (e.g., SSDs and RAIDs can sustain
concurrent I/O).

To simplify, there is some maximum number of I/O requests that your
storage device will service concurrently. For a normal disk this is 1
request (ignoring optimizations due to TCQ/NCQ, which can sometimes be
significant). As long as you're below that saturation point, it's mostly
about statistics and varying I/O patterns causing latency: the less
saturated you are, the better your average latency will be.

Once you're *above* saturation, latency goes haywire because you aren't
servicing I/O requests as fast as they come in.

There is a grey area in between where latency will be very sensitive
to smallish changes in I/O load even though aggregate throughput remains
below what can be sustained.

-- 
/ Peter Schuller

Re: Performance tests using stress testing tool

Posted by Peter Schuller <pe...@infidyne.com>.
> a) I am not seeing CPU usage of more than 10%.

Sounds like the benchmarking client is bottlenecking.

> In some forums I see that 8 CPUs and 32 GB of RAM is considered a good
> sweet spot for Cassandra. Is this true?

Seems reasonable in a very general sense, but of course varies with use-case.

> Also, when would I see real CPU spikes? At the moment it looks like 4
> cores is more than sufficient.

In general, the more requests and columns you read and write, the more
you'll be bottlenecking on CPU. The larger the individual columns (and
thus the fewer the columns), the more you'll be bound on disk instead.

In your case, I think the bottleneck is the benchmark client.

> b) iostat -x is reporting an average queue size of around 0.25 and an
> await time of around 30 ms. What would be an acceptable queue size and
> await time?

Any avgqueuesize significantly below 1 is generally good. For values close
to 1 or higher, it will depend on your access pattern, latency demands, and
the nature of your storage device (e.g., SSDs and RAIDs can sustain
concurrent I/O).

To simplify, there is some maximum number of I/O requests that your
storage device will service concurrently. For a normal disk this is 1
request (ignoring optimizations due to TCQ/NCQ, which can sometimes be
significant). As long as you're below that saturation point, it's mostly
about statistics and varying I/O patterns causing latency: the less
saturated you are, the better your average latency will be.

Once you're *above* saturation, latency goes haywire because you aren't
servicing I/O requests as fast as they come in.

There is a grey area in between where latency will be very sensitive
to smallish changes in I/O load even though aggregate throughput remains
below what can be sustained.
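
To make the shape concrete, a toy M/M/1 queue (an idealization, not a model
of a real disk) shows both the grey area and the blow-up past saturation:

    # Toy M/M/1 numbers: average time in system = S / (1 - rho), where rho is
    # utilization and S the per-request service time (8 ms assumed here).
    # An idealization only; real disks with TCQ/NCQ behave differently.
    service_time_ms = 8.0

    for rho in (0.25, 0.5, 0.8, 0.9, 0.95, 0.99):
        latency_ms = service_time_ms / (1 - rho)
        print("utilization %.2f -> avg latency %6.1f ms" % (rho, latency_ms))

    # Below ~0.5 latency barely moves; between ~0.8 and 1.0 it is very
    # sensitive to small changes in load; at or above 1.0 the queue grows
    # without bound and there is no steady-state latency at all.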

-- 
/ Peter Schuller