Posted to user@cassandra.apache.org by Markus Klems <ma...@klems.eu> on 2011/02/15 20:45:29 UTC

Benchmarking Cassandra with YCSB

Hi there,

We are currently benchmarking a Cassandra 0.6.5 cluster with 3
High-Memory Quadruple Extra Large EC2 nodes
(http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
(replication factor 3, random partitioner). We assigned 32 GB of RAM
to the JVM and left 32 GB of RAM for the Ubuntu Linux filesystem
cache. We also raised the per-user process limit to a very large value
via ulimit -u 999999.
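
For reference, here is a minimal sketch of those settings, assuming the
stock Cassandra 0.6 startup script (bin/cassandra.in.sh); the open-file
limit line is an additional, commonly recommended setting rather than
one mentioned above:

  # bin/cassandra.in.sh: pin the JVM heap at 32 GB
  JVM_OPTS="$JVM_OPTS -Xms32G -Xmx32G"

  # in the shell of the user running Cassandra and the YCSB client
  ulimit -u 999999    # maximum number of user processes
  ulimit -n 32768     # open files; often needs raising for many client connections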

Our goal is to achieve maximum throughput by increasing YCSB's
threadcount parameter (i.e. the number of parallel benchmarking client
threads). However, this only improves Cassandra throughput at low
thread counts. At higher thread counts, throughput stops increasing and
even decreases. Do you have any idea why this is happening, and perhaps
suggestions for how to scale throughput to much higher numbers? Why is
throughput hitting a wall, anyway? And where does the
latency/throughput tradeoff come from?

Here is our YCSB configuration:
recordcount=300000
operationcount=1000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
threadcount=500
target=10000
hosts=EC2-1,EC2-2,EC2-3
requestdistribution=uniform
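
A workload file like this is fed to the YCSB client roughly as follows
(the classpath layout and the Cassandra binding class name are
assumptions here and differ between YCSB checkouts; the workload file
name is illustrative):

  # load the data set first (-load), then run the transaction phase (-t)
  java -cp "build/ycsb.jar:<cassandra-binding-and-thrift-jars>" \
       com.yahoo.ycsb.Client -t -s \
       -db com.yahoo.ycsb.db.CassandraClient6 \
       -P workload.properties
  # threadcount and target can also be overridden per run with -threads / -target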

These are typical results for threadcount=1:
Loading workload...
Starting test.
 0 sec: 0 operations;
 10 sec: 11733 operations; 1168.28 current ops/sec; [UPDATE AverageLatency(ms)=0.64] [READ AverageLatency(ms)=1.03]
 20 sec: 24246 operations; 1251.68 current ops/sec; [UPDATE AverageLatency(ms)=0.48] [READ AverageLatency(ms)=1.11]

These are typical results for threadcount=10:
 10 sec: 30428 operations; 3029.77 current ops/sec; [UPDATE AverageLatency(ms)=2.11] [READ AverageLatency(ms)=4.32]
 20 sec: 60838 operations; 3041.91 current ops/sec; [UPDATE AverageLatency(ms)=2.15] [READ AverageLatency(ms)=4.37]

These are typical results for threadcount=100:
 10 sec: 29070 operations; 2895.42 current ops/sec; [UPDATE AverageLatency(ms)=20.53] [READ AverageLatency(ms)=44.91]
 20 sec: 53621 operations; 2455.84 current ops/sec; [UPDATE AverageLatency(ms)=23.11] [READ AverageLatency(ms)=55.39]

These are typical results for threadcount=500:
 10 sec: 30655 operations; 3053.59 current ops/sec; [UPDATE AverageLatency(ms)=72.71] [READ AverageLatency(ms)=187.19]
 20 sec: 68846 operations; 3814.14 current ops/sec; [UPDATE AverageLatency(ms)=65.36] [READ AverageLatency(ms)=191.75]
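
The plateau is consistent with what a closed-loop client predicts: each
YCSB thread waits for one operation to finish before issuing the next,
so throughput is bounded by roughly threads / average latency (Little's
Law). A back-of-the-envelope check against the threadcount=500 run:

  # 50/50 read/update mix: average latency ~ (0.192 s + 0.065 s) / 2 ~ 0.13 s
  echo "500 / 0.13" | bc -l    # ~3846 ops/sec, in line with the ~3814 ops/sec measured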

We never measured more than ~6000 ops/sec. Are there ways to tune
Cassandra that we are not aware of? We made some modifications to the
Cassandra 0.6.5 core for experimental reasons, so it's not easy to
switch to 0.7.x or 0.8.x. However, if switching might solve the scaling
issues, we would consider porting our modifications to a newer
Cassandra version...

Thanks,

Markus Klems

Karlsruhe Institute of Technology, Germany

Re: Benchmarking Cassandra with YCSB

Posted by Markus Klems <ma...@gmail.com>.
Good point. When we looked at the EC2 nodes, we measured roughly 120% CPU utilization. We took this to be a misleading figure on a multi-core machine: our EC2 nodes have 8 virtual cores each, so full utilization would show up as 800%.

Maybe Cassandra 0.6.5 is not so good at making use of multi-core systems?
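
One way to resolve that ambiguity is to look at per-core utilization
rather than the aggregate figure (a generic sketch; it assumes the
sysstat package is installed on the nodes):

  mpstat -P ALL 5     # per-core utilization, sampled every 5 seconds
  # alternatively, run top and press "1" to toggle the per-core display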

On 15.02.2011, at 20:59, Thibaut Britz <th...@trendiction.com> wrote:

> Cassandra is very CPU hungry so you might be hitting a CPU bottleneck.
> What's your CPU usage during these tests?

Re: Benchmarking Cassandra with YCSB

Posted by Markus Klems <ma...@klems.eu>.
Sure, will do. We are currently running a couple of benchmarks on
differently configured EC2 landscapes. We will share our results in
the next few weeks.

On Sat, Feb 19, 2011 at 6:53 PM, Lior Golan <li...@taboola.com> wrote:
> Can you share what numbers you are now getting?

RE: Benchmarking Cassandra with YCSB

Posted by Lior Golan <li...@taboola.com>.
Can you share what numbers you are now getting?


Re: Benchmarking Cassandra with YCSB

Posted by Markus Klems <ma...@klems.eu>.
Hi,

we sorted out the performance problems and tuned the cluster. In
particular, we identified the following weak spot in our setup:
ConcurrentReads and ConcurrentWrites were set to the default values,
which were much too low for our setup. Now we get some serious
numbers.
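
For context, in Cassandra 0.6 these settings live in
conf/storage-conf.xml. The excerpt below is purely illustrative (not
the values used on this cluster, which are not given here); a commonly
cited rule of thumb is roughly 16 x data disks for ConcurrentReads and
8 x CPU cores for ConcurrentWrites:

  <!-- conf/storage-conf.xml (Cassandra 0.6); illustrative values only -->
  <ConcurrentReads>32</ConcurrentReads>
  <ConcurrentWrites>64</ConcurrentWrites>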

Thanks,

Markus

Re: Benchmarking Cassandra with YCSB

Posted by Aaron Morton <aa...@thelastpickle.com>.
Initial thoughts are that you are overloading the cluster. Are there any log lines about dropped messages?

What is the schema, what settings do you have in your Cassandra yaml, and what are the CF stats telling you? E.g. are you switching memtables too quickly? What are the write latency numbers?
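
A hedged sketch of how one might check for those symptoms on a 0.6
node (the log path and JMX port are the stock defaults and may differ):

  grep -i dropped /var/log/cassandra/system.log     # dropped READ/MUTATION messages
  bin/nodetool -host localhost -port 8080 tpstats   # active/pending operations per stage
  bin/nodetool -host localhost -port 8080 cfstats   # per-CF latencies and memtable switch counts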

Also 0.7 is much faster.

Aaron

Re: Benchmarking Cassandra with YCSB

Posted by Thibaut Britz <th...@trendiction.com>.
Cassandra is very CPU hungry, so you might be hitting a CPU bottleneck.
What's your CPU usage during these tests?

