Posted to user@cassandra.apache.org by James Lee <Ja...@metaswitch.com> on 2013/06/14 16:19:57 UTC

Cassandra periodically stops responding to write requests under load

Hello,

I have been doing some scripted load testing of Cassandra as part of determining what hardware to deploy Cassandra on for the particular load profile our application will generate.  I'm seeing generally good performance, but with periods where the Cassandra node entirely stops responding to write requests for several seconds at a time.  I don't have much experience of Cassandra performance tuning, and would very much appreciate some pointers on what I can do to improve matters.

__Load profile__

The load profile I've tested is the following:
-- A single Cassandra node
-- 40 keyspaces
-- Each keyspace has 2 small column families with a handful of rows, and 1 large column family with approx 25,000 rows.  The rows are <1k in size.

The perf test then has 20 client instances connect to the server (they use pycassa), each looping over the following operations (the loop is sketched just after the list):
-- Read a random row from the large column family in a random one of the keyspaces
-- Write that row into a random one of the other keyspaces (the test is arranged so that this is likely to be a non-existing row in the new keyspace)
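For concreteness, each client instance's inner loop is roughly equivalent to the sketch below; the keyspace, column family, and row key names are placeholders, not the real test script:

import random
import time
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

# One pool per keyspace, since pycassa pools are keyspace-scoped.
KEYSPACES = ['ks_%02d' % i for i in range(40)]
pools = dict((ks, ConnectionPool(ks, server_list=['server:9160'])) for ks in KEYSPACES)
cfs = dict((ks, ColumnFamily(pools[ks], 'large_cf')) for ks in KEYSPACES)

while True:
    src, dst = random.sample(KEYSPACES, 2)     # two distinct keyspaces
    key = 'row%05d' % random.randrange(25000)  # each large CF holds ~25,000 rows
    columns = cfs[src].get(key)                # read a random row...
    cfs[dst].insert(key, columns)              # ...write it into another keyspace
    time.sleep(0.04)                           # ~25 cycles/s x 20 clients = 500/s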

I run the test so that 500 read/write cycles are generated per second (total across all instances).  The Cassandra server keeps up fine with this rate, and is using only a small fraction of the CPU/memory available.  Every several minutes, though, there is a several-second period during which no writes are serviced.  This seems to coincide with memtables being flushed to disk.
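The outages show up clearly from the client side if each write is timed; schematically, continuing the names from the sketch above (illustration only, not the actual test code):

t0 = time.time()
cfs[dst].insert(key, columns)
stall = time.time() - t0
if stall > 1.0:  # multi-second gaps are the write outages described above
    print('%s: insert stalled for %.1fs' % (time.ctime(), stall))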

Note that this read/write rate is an order of magnitude lower than the maximum load this server is able to cope with if pushed as hard as possible by the clients.

__Tuning attempted__

I've tried making several changes to see if any of them improved matters (the cassandra.yaml settings involved are sketched after the list):
-- I've tried putting the commitlog directory on a separate drive.  That didn't make any appreciable difference.
-- I've used a RAID array for the data directory to improve write performance.  This significantly reduces the length of the slow period (from ~10s to ~2s), but doesn't eliminate it.  I've tried RAID10 and RAID0 using varying numbers of drives, but there doesn't seem to be a significant difference between the two.
-- I've used multiple drives for the data directory, symlinking the directories for different keyspaces to different drives.  That didn't improve things significantly compared to using a single drive.
-- Reducing the commitlog segment interval to force more frequent, smaller flushes doesn't make any difference.
-- Increasing the memtable flush queue size doesn't make any difference.
-- Disabling compaction doesn't help either.
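For reference, the knobs referred to above live in cassandra.yaml; the 1.2 names are below, with purely illustrative values:

commitlog_segment_size_in_mb: 8   # default is 32; smaller segments recycle more often
memtable_flush_queue_size: 8      # default is 4; how many full memtables may queue for flushing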

Any suggestions would be much appreciated.

Thanks,

James Lee


RE: Cassandra periodically stops responding to write requests under load

Posted by James Lee <Ja...@metaswitch.com>.
Hi Rob,

Thanks for the reply.  To answer your questions below:

I'm using the following JVM:
java version "1.7.0_10"
Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

Stock Cassandra version 1.2.2

Heap settings are as follows (the system has 48GB of physical RAM, with over 2/3 of it unused while Cassandra is running): -Xms8192M -Xmx8192M -Xmn2048M

I don't see any GC logs around the time of the slowness.
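(By "GC logs" I mean output from the standard logging options that ship commented out in cassandra-env.sh, roughly the following; the log path is illustrative:

-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-Xloggc:/var/log/cassandra/gc.log

PrintGCApplicationStoppedTime is the interesting one for this problem, since it records stop-the-world pauses even when no collection is logged.)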

For now I'm testing with a single node only.  I'm expecting to have multiple nodes with all data fully replicated (e.g. 2 nodes, replication factor 2), so am testing with a single node to check performance in the case where all other nodes have failed.  Ideally I'd like to get away with as few nodes as possible...

I've put the full command line of the running java process below, in case any of it is relevant:
/usr/jdk1.7.0_10/bin/java -ea -javaagent:/dsc-cassandra-1.2.2/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8192M -Xmx8192M -Xmn2048M -XX:+HeapDumpOnOutOfMemoryError -Xss180k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -cp /dsc-cassandra-1.2.2/bin/../conf:/dsc-cassandra-1.2.2/bin/../build/classes/main:/dsc-cassandra-1.2.2/bin/../build/classes/thrift:/dsc-cassandra-1.2.2/bin/../lib/antlr-3.2.jar:/dsc-cassandra-1.2.2/bin/../lib/apache-cassandra-1.2.2.jar:/dsc-cassandra-1.2.2/bin/../lib/apache-cassandra-clientutil-1.2.2.jar:/dsc-cassandra-1.2.2/bin/../lib/apache-cassandra-thrift-1.2.2.jar:/dsc-cassandra-1.2.2/bin/../lib/avro-1.4.0-fixes.jar:/dsc-cassandra-1.2.2/bin/../lib/avro-1.4.0-sources-fixes.jar:/dsc-cassandra-1.2.2/bin/../lib/commons-cli-1.1.jar:/dsc-cassandra-1.2.2/bin/../lib/commons-codec-1.2.jar:/dsc-cassandra-1.2.2/bin/../lib/commons-lang-2.6.jar:/dsc-cassandra-1.2.2/bin/../lib/compress-lzf-0.8.4.jar:/dsc-cassandra-1.2.2/bin/../lib/concurrentlinkedhashmap-lru-1.3.jar:/dsc-cassandra-1.2.2/bin/../lib/guava-13.0.1.jar:/dsc-cassandra-1.2.2/bin/../lib/high-scale-lib-1.1.2.jar:/dsc-cassandra-1.2.2/bin/../lib/jackson-core-asl-1.9.2.jar:/dsc-cassandra-1.2.2/bin/../lib/jackson-mapper-asl-1.9.2.jar:/dsc-cassandra-1.2.2/bin/../lib/jamm-0.2.5.jar:/dsc-cassandra-1.2.2/bin/../lib/jbcrypt-0.3m.jar:/dsc-cassandra-1.2.2/bin/../lib/jline-1.0.jar:/dsc-cassandra-1.2.2/bin/../lib/jna-3.4.0.jar:/dsc-cassandra-1.2.2/bin/../lib/json-simple-1.1.jar:/dsc-cassandra-1.2.2/bin/../lib/libthrift-0.7.0.jar:/dsc-cassandra-1.2.2/bin/../lib/log4j-1.2.16.jar:/dsc-cassandra-1.2.2/bin/../lib/lz4-1.1.0.jar:/dsc-cassandra-1.2.2/bin/../lib/metrics-core-2.0.3.jar:/dsc-cassandra-1.2.2/bin/../lib/netty-3.5.9.Final.jar:/dsc-cassandra-1.2.2/bin/../lib/servlet-api-2.5-20081211.jar:/dsc-cassandra-1.2.2/bin/../lib/slf4j-api-1.7.2.jar:/dsc-cassandra-1.2.2/bin/../lib/slf4j-log4j12-1.7.2.jar:/dsc-cassandra-1.2.2/bin/../lib/snakeyaml-1.6.jar:/dsc-cassandra-1.2.2/bin/../lib/snappy-java-1.0.4.1.jar:/dsc-cassandra-1.2.2/bin/../lib/snaptree-0.1.jar org.apache.cassandra.service.CassandraDaemon

Thanks,
James

-----Original Message-----
From: Robert Coli [mailto:rcoli@eventbrite.com] 
Sent: 14 June 2013 17:17
To: user@cassandra.apache.org
Subject: Re: Cassandra periodically stops responding to write requests under load

On Fri, Jun 14, 2013 at 7:19 AM, James Lee <Ja...@metaswitch.com> wrote:
>  I'm seeing generally good performance, but with periods where the 
> Cassandra node entirely stops responding to write requests for several 
> seconds at a time.  I don't have much experience of Cassandra 
> performance tuning, and would very much appreciate some pointers on what I can do to improve matters.

It is relatively common for a Cassandra node to become unresponsive for a few seconds when doing various things. However, as one usually has multiple replicas for any given key, this transient unavailability does not meaningfully impact overall availability. Pausing for more than a few seconds is relatively uncommon and probably does indicate either suboptimal configuration or excessive workload.

> -- I've used a RAID array for the data directory to improve write 
> performance.  This significantly reduces the length of the slow period 
> (from ~10s to ~2s), but doesn't eliminate it.  I've tried RAID10 and 
> RAID0 using varying number of drives, but there doesn't seem to be a 
> significant difference between the two.

Do you see disk saturation when you're flushing? This statement suggests that you might be.

> -- I've used multiple drives for the data directory, symlinking the 
> directories for different keyspaces to different drives.  That didn't 
> improve things significantly compared to using a single drive.

I would not expect this to improve things if you are bounded on how quickly you can flush from a single thread.

Stock questions:

1) What JVM?
2) What heap settings?
3) Do you also see GC logs around flush time?
4) Are you testing a single node only?

=Rob

Re: Cassandra periodically stops responding to write requests under load

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Jun 14, 2013 at 7:19 AM, James Lee <Ja...@metaswitch.com> wrote:
>  I’m seeing generally good performance, but with periods where the Cassandra node entirely stops
> responding to write requests for several seconds at a time.  I don’t have
> much experience of Cassandra performance tuning, and would very much
> appreciate some pointers on what I can do to improve matters.

It is relatively common for a Cassandra node to become unresponsive
for a few seconds when doing various things. However, as one usually
has multiple replicas for any given key, this transient unavailability
does not meaningfully impact overall availability. Pausing for more
than a few seconds is relatively uncommon and probably does indicate
either suboptimal configuration or excessive workload.
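(With pycassa, for instance, a pool that knows about several nodes will
retry a timed-out request against another server; something along these
lines, with hypothetical host names:

pool = ConnectionPool('myks', server_list=['node1:9160', 'node2:9160'],
                      timeout=0.5, max_retries=5)

so a node pausing for a flush just looks like one failed attempt to the
client.)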

> -- I've used a RAID array for the data directory to improve write
> performance.  This significantly reduces the length of the slow period (from
> ~10s to ~2s), but doesn't eliminate it.  I've tried RAID10 and RAID0 using
> varying number of drives, but there doesn't seem to be a significant
> difference between the two.

Do you see disk saturation when you're flushing? This statement
suggests that you might be.
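For example, run "iostat -x 1" against the data drives while a flush is
in flight; %util pinned near 100 during the write outages would confirm
disk saturation.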

> -- I've used multiple drives for the data directory, symlinking the
> directories for different keyspaces to different drives.  That didn't
> improve things significantly compared to using a single drive.

I would not expect this to improve things if you are bounded on how
quickly you can flush from a single thread.
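Note that flush parallelism is controlled by memtable_flush_writers in
cassandra.yaml, which defaults to one per configured data_file_directory;
symlinked keyspace directories don't raise that count, so you would need
to set it explicitly, e.g.:

memtable_flush_writers: 2

for extra drives to add any flush throughput.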

Stock questions:

1) What JVM?
2) What heap settings?
3) Do you also see GC logs around flush time?
4) Are you testing a single node only?

=Rob

RE: Cassandra periodically stops responding to write requests under load

Posted by S C <as...@outlook.com>.
How big is your HEAP?
From: asf11@outlook.com
To: user@cassandra.apache.org
Subject: RE: Cassandra periodically stops responding to write requests under load
Date: Fri, 14 Jun 2013 10:09:24 -0500

What version of Cassandra are you using? Did you look at whether Cassandra is undergoing GC?
-SC

RE: Cassandra periodically stops responding to write requests under load

Posted by S C <as...@outlook.com>.
What version of Cassandra are you using? Did you look at whether Cassandra is undergoing GC?
-SC