Posted to user@cassandra.apache.org by Chris Dean <ct...@sokitomi.com> on 2010/05/28 04:48:07 UTC

ec2 tests

I'm interested in performing some simple performance tests on EC2.  I
was thinking of using py_stress and Cassandra deployed on 3 servers with
one separate machine to run py_stress.

Are there any particular configuration settings I should use?  I was
planning on changing the JVM heap size to reflect the Large Instances
we're using.
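
For concreteness, the kind of invocation I have in mind looks something like
this (a sketch; flags per the 0.6-era contrib/py_stress tool, and the node
address is a placeholder, so verify against stress.py --help):

    # insert one million keys from 50 client threads against one node
    python stress.py -d 10.0.0.1 -o insert -n 1000000 -t 50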

Thanks!

Cheers,
Chris Dean

Re: ec2 tests

Posted by Joe Stump <jo...@joestump.net>.
On Jun 18, 2010, at 6:39 PM, Olivier Mallassi wrote:

> and I did not see any improvements (Cassandra stays around 7000 W/sec). 

It's a brave new world where N+1 scaling with 7,000 writes per second per node is considered suboptimal performance.

--Joe


Re: ec2 tests

Posted by Olivier Mallassi <om...@octo.com>.
I tried the following:
- still one Cassandra node on one EC2 m1.large instance; from two other
m1.large instances, I run 4 stress.py processes (50 threads each, 2 stress.py
per instance)
- RAID0 EBS for data and the ephemeral disk (/dev/sda1 partition) for the
commit log
- -Xmx4G

and I did not see any improvements (Cassandra stays around 7000 W/sec).

CPU spikes up to 130%, but I have two 2.5 GHz CPUs.
The avgqu-sz goes up to 20 (sometimes more) for the device /dev/sda1 that
stores the commit log.
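
For reference, I am reading avgqu-sz from iostat's extended device stats:

    # extended per-device stats, refreshed every 5 seconds; avgqu-sz is
    # the average length of the request queue for each device
    iostat -x 5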

Do you think the ConcurrentWrites or MemtableThroughputInMB parameters should
be increased? (I am using the default values right now.)
Any suggestions are welcome. ;o)
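
For reference, both parameters live in conf/storage-conf.xml in 0.6. The
relevant elements look like this (the values shown are the shipped defaults
as I understand them, so double-check your file):

    <ConcurrentWrites>32</ConcurrentWrites>
    <MemtableThroughputInMB>64</MemtableThroughputInMB>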

On Fri, Jun 18, 2010 at 7:42 PM, Benjamin Black <b...@b3k.us> wrote:

> On Fri, Jun 18, 2010 at 8:00 AM, Olivier Mallassi <om...@octo.com> wrote:
> > I use the default conf settings (Xmx 1G, ConcurrentWrites 32...) except
> > for the commitlog and DataFileDirectory: I have a RAID0 EBS for the
> > commit log and another RAID0 EBS for data.
> > I can't get past 7500 writes/sec (when launching 4 stress.py at the same
> > time).
> > Moreover, I can see some pending tasks in the
> > org.cassandra.db.ColumnFamilyStores.Keyspace1.Standard1 MBean.
> > Any ideas on the bottleneck?
>
> Your instance has 7.5G of RAM, but you are limiting Cassandra to 1G.
> Increase -Xmx to 4G for a start.  You are likely to get significantly
> better performance with the ephemeral drive, as well.  I suggest
> testing with commitlog on the ephemeral drive for comparison.
>
>
> b
>



-- 
............................................................
Olivier Mallassi
OCTO Technology
............................................................
50, Avenue des Champs-Elysées
75008 Paris

Mobile: (33) 6 28 70 26 61
Tél: (33) 1 58 56 10 00
Fax: (33) 1 58 56 10 01

http://www.octo.com
Octo Talks! http://blog.octo.com

Re: ec2 tests

Posted by Benjamin Black <b...@b3k.us>.
On Fri, Jun 18, 2010 at 8:00 AM, Olivier Mallassi <om...@octo.com> wrote:
> I use the default conf settings (Xmx 1G, ConcurrentWrites 32...) except for
> the commitlog and DataFileDirectory: I have a RAID0 EBS for the commit log
> and another RAID0 EBS for data.
> I can't get past 7500 writes/sec (when launching 4 stress.py at the same
> time).
> Moreover, I can see some pending tasks in the
> org.cassandra.db.ColumnFamilyStores.Keyspace1.Standard1 MBean.
> Any ideas on the bottleneck?

Your instance has 7.5G of RAM, but you are limiting Cassandra to 1G.
Increase -Xmx to 4G for a start.  You are likely to get significantly
better performance with the ephemeral drive, as well.  I suggest
testing with commitlog on the ephemeral drive for comparison.
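
A minimal sketch, assuming the stock 0.6 bin/cassandra.in.sh layout (the
heap flags live in JVM_OPTS there; with HotSpot the last -Xms/-Xmx given
wins, so appending works, or edit the existing entries directly):

    # raise min and max heap to 4G on a 7.5G m1.large
    JVM_OPTS="$JVM_OPTS -Xms4G -Xmx4G"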


b

Re: ec2 tests

Posted by Olivier Mallassi <om...@octo.com>.
@Chris: Thanks. I will keep you updated if I find something.

@Joe: I am not saying this is a bad number, just that it is still not enough
for us (we want to limit the number of nodes) ;o)
If I look at the latest benchmarks, version 0.6.2 is around 13000 w/sec, so I
should be able to reach 10000 w/sec (in fact, that is almost the case in a
non-virtualized environment).
I am just trying to understand where the bottleneck is.

What do you mean by "N+1 scaling"? I am not sure I understand the expression.

Thanks.

On Saturday, June 19, 2010, Chris Dean  wrote:
>> @Chris, Did you get any bench you could share with us?
>
> We're still working on it.  It's a lower priority task so it will take a
> while to finish.  So far we've run on all the AWS data centers in the US
> and used several different setups.  We also did a test on Rackspace with
> one setup and some whitebox servers we had in the office.  (The whitebox
> servers are still running I believe.)
>
> I don't have the numbers here, but the fastest by far are the
> non-virtualized whitebox servers.  No real surprise.  Rackspace was
> faster than AWS US-West, and US-West faster than US-East.
>
> We always use 3 Cassandra servers and one or two machines to run
> stress.py.  I don't think we're seeing the 7500 writes/sec so maybe our
> config is wrong.  You'll have to be patient until my colleague writes
> this all up.
>
> Cheers,
> Chris Dean
>

-- 
............................................................
Olivier Mallassi
OCTO Technology
............................................................
50, Avenue des Champs-Elysées
75008 Paris

Mobile: (33) 6 28 70 26 61
Tél: (33) 1 58 56 10 00
Fax: (33) 1 58 56 10 01

http://www.octo.com
Octo Talks! http://blog.octo.com

Re: ec2 tests

Posted by Chris Dean <ct...@sokitomi.com>.
> @Chris, Did you get any bench you could share with us?

We're still working on it.  It's a lower priority task so it will take a
while to finish.  So far we've run on all the AWS data centers in the US
and used several different setups.  We also did a test on Rackspace with
one setup and some whitebox servers we had in the office.  (The whitebox
servers are still running I believe.)

I don't have the numbers here, but the fastest by far are the
non-virtualized whitebox servers.  No real surprise.  Rackspace was
faster than AWS US-West, and US-West faster than US-East.

We always use 3 Cassandra servers and one or two machines to run
stress.py.  I don't think we're seeing the 7500 writes/sec so maybe our
config is wrong.  You'll have to be patient until my colleague writes
this all up.

Cheers,
Chris Dean

Re: ec2 tests

Posted by Olivier Mallassi <om...@octo.com>.
Hi all,

@Chris, Did you get any bench you could share with us?

I am running the same kind of test on EC2 (m1.large instances):
- one VM for stress.py (which can be launched several times)
- another VM for a single Cassandra node

I use the default conf settings (Xmx 1G, ConcurrentWrites 32...) except for
the commitlog and DataFileDirectory: I have a RAID0 EBS for the commit log and
another RAID0 EBS for data.

I can't get past 7500 writes/sec (when launching 4 stress.py at the same
time).
Moreover, I can see some pending tasks in the
org.cassandra.db.ColumnFamilyStores.Keyspace1.Standard1 MBean.
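
For what it's worth, I can watch the same pending-task counters without a JMX
client via nodetool; a sketch, where the host is a placeholder and the option
spelling may vary a little between releases:

    # thread pool stats, including pending tasks per stage
    bin/nodetool -host 10.0.0.1 tpstats
    # per-column-family stats (memtable sizes, pending operations)
    bin/nodetool -host 10.0.0.1 cfstats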

Any ideas on the bottleneck?

Thanks a lot.

oliv/

On Fri, May 28, 2010 at 5:14 PM, gabriele renzi <rf...@gmail.com> wrote:

> On Fri, May 28, 2010 at 3:48 PM, Mark Greene <gr...@gmail.com> wrote:
> > First thing I would do is stripe your EBS volumes. I've seen blogs that
> > say this helps and blogs that say it's fairly marginal.
>
>
> just to point out: another option is to stripe the ephemeral drives
> (if using instances > small)
>



-- 
............................................................
Olivier Mallassi
OCTO Technology
............................................................
50, Avenue des Champs-Elysées
75008 Paris

Mobile: (33) 6 28 70 26 61
Tél: (33) 1 58 56 10 00
Fax: (33) 1 58 56 10 01

http://www.octo.com
Octo Talks! http://blog.octo.com

Re: ec2 tests

Posted by gabriele renzi <rf...@gmail.com>.
On Fri, May 28, 2010 at 3:48 PM, Mark Greene <gr...@gmail.com> wrote:
> First thing I would do is stripe your EBS volumes. I've seen blogs that say
> this helps and blogs that say it's fairly marginal.


just to point out: another option is to stripe the ephemeral drives
(if using instances > small)
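
A sketch of what that looks like on an m1.large, where the two ephemeral
drives usually show up as /dev/sdb and /dev/sdc (device names vary by AMI,
and the mount point is just an example):

    # stripe the two ephemeral drives into a single RAID0 device
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
    mkfs.xfs /dev/md0                        # or ext3, per your preference
    mount /dev/md0 /var/lib/cassandra/data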

Re: ec2 tests

Posted by Mark Greene <gr...@gmail.com>.
First thing I would do is stripe your EBS volumes. I've seen blogs that say
this helps and blogs that say it's fairly marginal. (You may want to try
Rackspace Cloud, as their local storage is much faster.)

Second, I would start out with N=2 and set W=1 and R=1. That will mirror
your data across two of the three nodes and possibly give you stale data on
the reads. If you feel you need stronger durability, you can increase N and W.
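
In Cassandra terms, a sketch: N is the keyspace's ReplicationFactor in
storage-conf.xml, while W and R are the per-request ConsistencyLevels your
client picks, so ONE on both paths here (the keyspace name below is just the
stress.py default):

    <Keyspace Name="Keyspace1">
        <ReplicationFactor>2</ReplicationFactor>
        <!-- W=1 / R=1 means clients read and write at ConsistencyLevel.ONE -->
    </Keyspace>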

As for heap memory, don't use 100% of the available physical RAM.
Remember, the object heap will be smaller than your overall JVM process heap.

That should get you started.


On Fri, May 28, 2010 at 3:10 AM, Chris Dean <ct...@sokitomi.com> wrote:

> Mark Greene <gr...@gmail.com> writes:
> > If you give us an objective for the test, that will help. Are you trying
> > to get max write throughput? Read throughput? Weak consistency?
>
> I would like reads to be as fast as I can get them.  My real-world problem
> is write heavy, but the latency requirements are minimal on that side.
> If there are any particular config settings that would help with the slow
> EC2 IO, that would be great to know.
>
> Cheers,
> Chris Dean
>

Re: ec2 tests

Posted by Chris Dean <ct...@sokitomi.com>.
Mark Greene <gr...@gmail.com> writes:
> If you give us an objective for the test, that will help. Are you trying to
> get max write throughput? Read throughput? Weak consistency?

I would like reads to be as fast as I can get them.  My real-world problem
is write heavy, but the latency requirements are minimal on that side.
If there are any particular config settings that would help with the slow
EC2 IO, that would be great to know.

Cheers,
Chris Dean

Re: ec2 tests

Posted by Mark Greene <gr...@gmail.com>.
If you give us an objective for the test, that will help. Are you trying to
get max write throughput? Read throughput? Weak consistency?

On Thu, May 27, 2010 at 8:48 PM, Chris Dean <ct...@sokitomi.com> wrote:

> I'm interested in performing some simple performance tests on EC2.  I
> was thinking of using py_stress and Cassandra deployed on 3 servers with
> one separate machine to run py_stress.
>
> Are there any particular configuration settings I should use?  I was
> planning on changing the JVM heap size to reflect the Large Instances
> we're using.
>
> Thanks!
>
> Cheers,
> Chris Dean
>