Posted to user@cassandra.apache.org by "Jim R. Wilson" <wi...@gmail.com> on 2010/04/21 17:56:59 UTC

Re: At what point does the cluster get faster than the individual nodes?

Hi Mark,

I'm a relative newcomer to Cassandra, but I believe the common
experience is that you start seeing gains after 5 nodes in a
column-oriented data store.  It may also depend on your usage pattern.
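
As a rough back-of-the-envelope sketch (not a benchmark), the arithmetic
behind replication and QUORUM can be written out as below. It assumes
perfectly even load across nodes and ignores network and coordinator
overhead; the function names are made up for illustration, and the rates
are the standalone insert numbers quoted further down.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a QUORUM read or write."""
    return replication_factor // 2 + 1


def ideal_cluster_rate(node_rates, replication_factor):
    """Upper bound on client ops/second, assuming perfectly even load.

    Each client write is stored on `replication_factor` nodes, so every
    node does replication_factor / len(node_rates) of the client work.
    The slowest node saturates first and caps the whole cluster.
    """
    n = len(node_rates)
    return min(node_rates) * n / replication_factor


# Standalone insert rates quoted in the original message (per node).
rates = [12000, 12000, 9000, 7000]

print(quorum(2))                         # with RF=2, QUORUM waits for both replicas
print(ideal_cluster_rate(rates, 2))      # all four nodes, RF=2
print(ideal_cluster_rate(rates[:2], 2))  # the two fastest nodes as a pair
```

Under this simplified model, a 2-node pair with replication factor 2
cannot beat a single node, because every node stores every write.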

Others may know better - hope this helps!

-- Jim R. Wilson (jimbojw)

On Wed, Apr 21, 2010 at 11:28 AM, Mark Jones <MJ...@imagehawk.com> wrote:
> I’m seeing a cluster of 4 (replication factor=2) perform about as slowly
> overall as, or barely faster than, the slowest node in the group.  When I
> run the 4 nodes individually, I see:
>
>
>
> For inserts:
>
> Two nodes @ 12000/second
>
> 1 node @ 9000/second
>
> 1 node @ 7000/second
>
>
>
> For reads:
>
> Abysmal: less than 1000/second (individual lookups, not range slices).
> Disk utilization is at 88+%.
>
>
>
>
>
> How many nodes are required before you see a net positive gain on inserts
> and reads (QUORUM consistency on both)?
>
> When I use my 2 fastest nodes as a pair, the throughput is around 9000
> inserts/second.
>
>
>
> What is a good to excellent hardware config for Cassandra?  I have separate
> drives for data and commit log and 8GB in 3 machines (all dual core).  My
> fastest insert node has 4GB and a triple core processor.
>
>
>
> I’ve run py_stress, and my C++ code beats it by several thousand
> inserts/second toward the end of the runs, so I don’t think it is my app,
> and I’ve removed the super columns per some suggestions yesterday.
>
>
>
> When Cassandra is working, it performs well.  The problem is that it
> frequently slows down to < 50% of its peak and occasionally slows down to 0
> inserts/second, which greatly reduces aggregate throughput.