Posted to user@cassandra.apache.org by Huming Wu <hu...@gmail.com> on 2009/08/17 21:14:41 UTC

Cassandra performance

I did some performance testing and I am not impressed :). The data set is
880K unique keys, and each key has 4 columns: 2 strings and 2 integers
(from the client side; to the backend it is all byte[]). After a
high-throughput set phase (very fast), 220MB were injected via
batch_insert. I restarted Cassandra and started a client calling
get_slice at 5000 rps with 100 connections. Here are some graphs over 2
days:

1. rps/qps:  http://farm3.static.flickr.com/2585/3831093496_068b90caa0_o.png
2. latency:  http://farm4.static.flickr.com/3421/3830297179_8decd66e34_o.png
3. CPU: http://farm4.static.flickr.com/3432/3831093584_b5bd459f55_o.png
4. mem: http://farm4.static.flickr.com/3526/3830356879_d09ac2695c_o.png

A couple of observations:

a) Reads are too CPU intensive. With the actual peak rps around 3000,
the CPU usage is already 70%. I doubt I can double the rps and keep
the same read latency.
b) The memory footprint is too big given the data size. I used
incremental GC. I am pretty new to Java, especially performance
tuning, so maybe something is not right in the settings. But here is
the JVM config:

-Xmx6000m -Xms6000m -XX:+HeapDumpOnOutOfMemoryError -XX:NewSize=1000m
-XX:MaxNewSize=1000m -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode

The machines have 8 cores and 8GB RAM.  Here are some configuration
parameters (the client is doing non-blocking get_slice):
    <ReplicationFactor>2</ReplicationFactor>
    <MemtableSizeInMB>1024</MemtableSizeInMB>
    <MemtableObjectCountInMillions>2</MemtableObjectCountInMillions>
    <KeysCachedFraction>1</KeysCachedFraction>
    <ConcurrentReads>8</ConcurrentReads>
    <ConcurrentWrites>32</ConcurrentWrites>

Performance under high throughput is very important to us. I did some
preliminary tests on sustained put and get, and the performance is
worse. But I thought I'd start the report with read-only first.
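
In case it helps interpret the numbers, here is roughly how each row is
built and loaded (a simplified sketch, not the actual loader; the column
names are made up, and the constructors/signatures are from the generated
Thrift bindings on my Aug. 12 trunk, so they may differ elsewhere):

import java.util.*;
import org.apache.cassandra.service.*;  // wherever the generated Thrift classes live on your build

// sketch only: connection handling, batching across keys, and error
// handling are omitted
static void loadRow(Cassandra.Client client, String key, String s1,
                    String s2, int i1, int i2) throws Exception {
    long ts = System.currentTimeMillis();
    List<ColumnOrSuperColumn> cols = new ArrayList<ColumnOrSuperColumn>();
    // 2 string columns and 2 integer columns, all sent as byte[]
    // (integers serialized as text here; the exact encoding doesn't matter
    // for the test)
    cols.add(col("str1", s1.getBytes("UTF-8"), ts));
    cols.add(col("str2", s2.getBytes("UTF-8"), ts));
    cols.add(col("int1", String.valueOf(i1).getBytes("UTF-8"), ts));
    cols.add(col("int2", String.valueOf(i2).getBytes("UTF-8"), ts));

    Map<String, List<ColumnOrSuperColumn>> cfmap =
        new HashMap<String, List<ColumnOrSuperColumn>>();
    cfmap.put("Standard1", cols);
    // table, key, per-column-family map, consistency level
    client.batch_insert("Table1", key, cfmap, 1);
}

static ColumnOrSuperColumn col(String name, byte[] value, long ts)
        throws Exception {
    return new ColumnOrSuperColumn(
        new Column(name.getBytes("UTF-8"), value, ts), null);
}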

Any comments on those numbers?

Thanks,
Huming

p.s. I am using trunk as of Aug. 12

svn info
Path: .
URL: https://svn.apache.org/repos/asf/incubator/cassandra/trunk
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 803947
Node Kind: directory
Schedule: normal
Last Changed Author: jbellis
Last Changed Rev: 803716
Last Changed Date: 2009-08-12 21:27:24 +0000 (Wed, 12 Aug 2009)

Re: Cassandra performance

Posted by Huming Wu <hu...@gmail.com>.
On Wed, Aug 19, 2009 at 5:45 PM, Michael Greene<mi...@gmail.com> wrote:
> What Jonathan said.  Also, if you have the ability to switch your profiler
> between wall mode and CPU mode, I would recommend it to give you a better
> overall picture of what is going on.  If your profiler can switch between
> 'sampling' and 'tracing' modes that would also be useful.
> I'm not sure if that picture and the profiling description are intended to
> explain the need for incremental GC, but I don't understand it.  What
> difference in memory growth are you seeing?  I'm not sure I can explain that
> based on the GC docs I've read (although I'll be the first to admit I have a
> relatively limited understanding of how it works).

Right. I am probably looking at the wrong place (the waiting time is
counted as CPU usage in the report). I'll check whether the profiler
has such a switch and get a real picture of where the CPU cycles are
spent. The incremental GC is not related to the CPU profiling; I was
just responding to your suggestion of removing it from the JVM
settings...

Thanks,
Huming

Re: Cassandra performance

Posted by Michael Greene <mi...@gmail.com>.
What Jonathan said.  Also, if you have the ability to switch your profiler
between wall mode and CPU mode, I would recommend it to give you a better
overall picture of what is going on.  If your profiler can switch between
'sampling' and 'tracing' modes that would also be useful.
I'm not sure if that picture and the profiling description are intended to
explain the need for incremental GC, but I don't understand it.  What
difference in memory growth are you seeing?  I'm not sure I can explain that
based on the GC docs I've read (although I'll be the first to admit I have a
relatively limited understanding of how it works).

Michael

On Wed, Aug 19, 2009 at 7:37 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> be careful when profiling blocking io -- I bet that means that "I'm
> spending all my time blocking for more data to read since there is
> only one call per second."
>
> the internal Cassandra MessagingService uses nonblocking io, but the
> Thrift stuff is just your standard thread pool with blocking sockets.
>
> On Wed, Aug 19, 2009 at 5:32 PM, Huming Wu<hu...@gmail.com> wrote:
> > On Tue, Aug 18, 2009 at 12:02 PM, Michael
> > Greene<mi...@gmail.com> wrote:
> >> According to the HBase guys and confirmed by
> >>
> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms,
> >> on an 8-core machine you are not going to want to enable
> >> -XX:+CMSIncrementalMode -- it is for 1 or 2 core machines that are
> >> using the CMS GC.  This should affect your latency numbers positively,
> >> but I wouldn't see it changing the CPU usage much.
> >
> > Looks like I do need the incremental GC though; otherwise the memory
> > growth is very fast... On the get_slice CPU, I tried profiling (on a
> > single node at low load, 1 get_slice call per second) and it looks like
> > org.apache.thrift.transport.TIOStreamTransport.read sucks up all the CPU.
> > Here is the snapshot:
> > http://farm3.static.flickr.com/2595/3838562412_8ffb42ea8c_o.png
> > (sounds like an issue in Thrift). Any ideas?
> >
> > Thanks,
> > Huming
> >
>

Re: Cassandra performance

Posted by Jonathan Ellis <jb...@gmail.com>.
be careful when profiling blocking io -- I bet that means that "I'm
spending all my time blocking for more data to read since there is
only one call per second."

the internal Cassandra MessagingService uses nonblocking io, but the
Thrift stuff is just your standard thread pool with blocking sockets.

On Wed, Aug 19, 2009 at 5:32 PM, Huming Wu<hu...@gmail.com> wrote:
> On Tue, Aug 18, 2009 at 12:02 PM, Michael
> Greene<mi...@gmail.com> wrote:
>> According to the HBase guys and confirmed by
>> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms,
>> on an 8-core machine you are not going to want to enable
>> -XX:+CMSIncrementalMode -- it is for 1 or 2 core machines that are
>> using the CMS GC.  This should affect your latency numbers positively,
>> but I wouldn't see it changing the CPU usage much.
>
> Looks like I do need the incremental GC though; otherwise the memory
> growth is very fast... On the get_slice CPU, I tried profiling (on a
> single node at low load, 1 get_slice call per second) and it looks like
> org.apache.thrift.transport.TIOStreamTransport.read sucks up all the CPU.
> Here is the snapshot:
> http://farm3.static.flickr.com/2595/3838562412_8ffb42ea8c_o.png
> (sounds like an issue in Thrift). Any ideas?
>
> Thanks,
> Huming
>

Re: Cassandra performance

Posted by Huming Wu <hu...@gmail.com>.
On Tue, Aug 18, 2009 at 12:02 PM, Michael
Greene<mi...@gmail.com> wrote:
> According to the HBase guys and confirmed by
> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms,
> on an 8-core machine you are not going to want to enable
> -XX:+CMSIncrementalMode -- it is for 1 or 2 core machines that are
> using the CMS GC.  This should affect your latency numbers positively,
> but I wouldn't see it changing the CPU usage much.

Looks like I do need the incremental GC though; otherwise the memory
growth is very fast... On the get_slice CPU, I tried profiling (on a
single node at low load, 1 get_slice call per second) and it looks like
org.apache.thrift.transport.TIOStreamTransport.read sucks up all the CPU.
Here is the snapshot:
http://farm3.static.flickr.com/2595/3838562412_8ffb42ea8c_o.png
(sounds like an issue in Thrift). Any ideas?

Thanks,
Huming

Re: Cassandra performance

Posted by Michael Greene <mi...@gmail.com>.
According to the HBase guys and confirmed by
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms,
on an 8-core machine you are not going to want to enable
-XX:+CMSIncrementalMode -- it is for 1 or 2 core machines that are
using the CMS GC.  This should affect your latency numbers positively,
but I wouldn't see it changing the CPU usage much.
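
So, roughly your current line with just the incremental flag dropped
(same heap sizes; I haven't benchmarked this on your workload, so treat
it as a starting point):

-Xmx6000m -Xms6000m -XX:+HeapDumpOnOutOfMemoryError -XX:NewSize=1000m
-XX:MaxNewSize=1000m -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC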

Michael

On Tue, Aug 18, 2009 at 12:51 PM, Huming Wu<hu...@gmail.com> wrote:
> Here is the log4j setting (yes, it is just INFO):
> log4j.rootLogger=INFO,stdout,R
>
> I am not sure why the latency dropped during the last 24 hours. One stat
> I can see (from other reports) is that disk reads dropped 80% (reads
> per sec) over the last 24 hours, which may explain the drop (fewer disk
> reads).
>
> I wanted to bring up this discussion on performance since ultimately
> this is what matters most to a production system (at least for us).
> And I'd be glad to contribute in this area (still fairly new to
> Cassandra though). I was hoping someone had done similar tests so the
> numbers could be cross-examined. Performance numbers always depend on
> the test data; I provided the data profile in my original email.
>
> By the way, to measure the performance I inserted timing code into
> get_slice and batch_insert. For example, for get_slice:
>
> public List<ColumnOrSuperColumn> get_slice(...)
> {
>     long t0 = System.currentTimeMillis();
>     ....
>
>     float lat = (System.currentTimeMillis() - t0) / 1000.0f; // seconds
>     return cols;
> }
> I report this latency for every call to our in-house metrics system,
> so this is a server-side report (not client-side).
>
> To me, for sustained reads, Cassandra uses too much CPU. Has anyone
> seen the same thing?
>
> Thanks,
> Huming
>

Re: Cassandra performance

Posted by Huming Wu <hu...@gmail.com>.
Here is the log4j setting (yes, it is just INFO):
log4j.rootLogger=INFO,stdout,R

I am not sure why the latency dropped during the last 24 hours. One stat
I can see (from other reports) is that disk reads dropped 80% (reads
per sec) over the last 24 hours, which may explain the drop (fewer disk
reads).

I wanted to bring up this discussion on performance since ultimately
this is what matters most to a production system (at least for us).
And I'd be glad to contribute in this area (still fairly new to
Cassandra though). I was hoping someone had done similar tests so the
numbers could be cross-examined. Performance numbers always depend on
the test data; I provided the data profile in my original email.

By the way, to measure the performance I inserted timing code into
get_slice and batch_insert. For example, for get_slice:

public List<ColumnOrSuperColumn> get_slice(...)
{
    long t0 = System.currentTimeMillis();
    ....

    float lat = (System.currentTimeMillis() - t0) / 1000.0f; // seconds
    return cols;
}
I report this latency for every call to our in-house metrics system,
so this is a server-side report (not client-side).

To me, for sustained reads, Cassandra uses too much CPU. Has anyone
seen the same thing?

Thanks,
Huming

Re: Cassandra performance

Posted by Jonathan Ellis <jb...@gmail.com>.
What happened about 20h in to make the latency drop so dramatically?

On Mon, Aug 17, 2009 at 12:14 PM, Huming Wu<hu...@gmail.com> wrote:
> I did some performance testing and I am not impressed :). The data set is
> 880K unique keys, and each key has 4 columns: 2 strings and 2 integers
> (from the client side; to the backend it is all byte[]). After a
> high-throughput set phase (very fast), 220MB were injected via
> batch_insert. I restarted Cassandra and started a client calling
> get_slice at 5000 rps with 100 connections. Here are some graphs over 2
> days:
>
> 1. rps/qps:  http://farm3.static.flickr.com/2585/3831093496_068b90caa0_o.png
> 2. latency:  http://farm4.static.flickr.com/3421/3830297179_8decd66e34_o.png
> 3. CPU: http://farm4.static.flickr.com/3432/3831093584_b5bd459f55_o.png
> 4. mem: http://farm4.static.flickr.com/3526/3830356879_d09ac2695c_o.png
>
> A couple of observations:
>
> a) Reads are too CPU intensive. With the actual peak rps around 3000,
> the CPU usage is already 70%. I doubt I can double the rps and keep
> the same read latency.
> b) The memory footprint is too big given the data size. I used
> incremental GC. I am pretty new to Java, especially performance
> tuning, so maybe something is not right in the settings. But here is
> the JVM config:
>
> -Xmx6000m -Xms6000m -XX:+HeapDumpOnOutOfMemoryError -XX:NewSize=1000m
> -XX:MaxNewSize=1000m -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode
>
> The machines have 8 cores and 8GB RAM.  Here are some configuration
> parameters (the client is doing non-blocking get_slice):
>    <ReplicationFactor>2</ReplicationFactor>
>    <MemtableSizeInMB>1024</MemtableSizeInMB>
>    <MemtableObjectCountInMillions>2</MemtableObjectCountInMillions>
>    <KeysCachedFraction>1</KeysCachedFraction>
>    <ConcurrentReads>8</ConcurrentReads>
>    <ConcurrentWrites>32</ConcurrentWrites>
>
> Performance under high throughput is very important to us. I did some
> preliminary tests on sustained put and get, and the performance is
> worse. But I thought I'd start the report with read-only first.
>
> Any comments on those numbers?
>
> Thanks,
> Huming
>
> p.s. I am using trunk as of Aug. 12
>
> svn info
> Path: .
> URL: https://svn.apache.org/repos/asf/incubator/cassandra/trunk
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 803947
> Node Kind: directory
> Schedule: normal
> Last Changed Author: jbellis
> Last Changed Rev: 803716
> Last Changed Date: 2009-08-12 21:27:24 +0000 (Wed, 12 Aug 2009)
>

Re: Cassandra performance

Posted by Tangram Liu <la...@gmail.com>.
Did you set the root logger to INFO instead of DEBUG in log4j.properties?

2009/8/18 Huming Wu <hu...@gmail.com>

> > What sort of slicing are you doing?  This will impact CPU usage.
>
> The slice is all columns (under one column family which is Standard1):
>             List<ColumnOrSuperColumn> cols = thriftClient_.get_slice("Table1",
>                             myKey,
>                             new ColumnParent("Standard1", null),
>                             new SlicePredicate(colNames_, null),
>                             1);
>
> Thanks,
> Huming
>

Re: Cassandra performance

Posted by Huming Wu <hu...@gmail.com>.
> What sort of slicing are you doing?  This will impact CPU usage.

The slice is all columns (under one column family which is Standard1):
            List<ColumnOrSuperColumn> cols = thriftClient_.get_slice("Table1",
                            myKey,
                            new ColumnParent("Standard1", null),
                            new SlicePredicate(colNames_, null),
                            1);

Thanks,
Huming

Re: Cassandra performance

Posted by Michael Greene <mi...@gmail.com>.
What sort of slicing are you doing?  This will impact CPU usage.
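
For example, pulling back a handful of named columns is a different amount
of work per call than scanning a row with a SliceRange. Roughly (the
constructor arguments follow the generated Thrift bindings I have locally,
so double-check against yours):

// fetch exactly two named columns
SlicePredicate byNames = new SlicePredicate(
    java.util.Arrays.asList("col1".getBytes(), "col2".getBytes()), null);

// scan up to 100 columns of the row in comparator order
SlicePredicate byRange = new SlicePredicate(null,
    new SliceRange(new byte[0], new byte[0], false, 100));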

Michael

Huming Wu<hu...@gmail.com> wrote:
> I did some performance testing and I am not impressed :). The data set is
> 880K unique keys, and each key has 4 columns: 2 strings and 2 integers
> (from the client side; to the backend it is all byte[]). After a
> high-throughput set phase (very fast), 220MB were injected via
> batch_insert. I restarted Cassandra and started a client calling
> get_slice at 5000 rps with 100 connections.
[...]
> a) Reads are too CPU intensive. With the actual peak rps around 3000,
> the CPU usage is already 70%. I doubt I can double the rps and keep
> the same read latency.
[...]
> Any comments on those numbers?