Posted to user@cassandra.apache.org by Richard Grossman <ri...@gmail.com> on 2010/01/20 17:23:55 UTC

How to lower the number of "has reached its threshold" messages?

I use Cassandra to store data, and the data load is massive.
On Jonathan's advice I made the loading code multithreaded, but now the
servers are simply overloaded when I load data.

I constantly get these messages:

channelShow has reached its threshold; switching in a fresh Memtable
INFO - Enqueuing flush of Memtable(channelShow)@23496883

which is normal, I think, but maybe there is a parameter I can set to get
fewer messages like this.

The second question is really direct: do you think it's insane to use EC2
small instances to build a Cassandra cluster? The machines are virtualized
with 2 GB of memory.

Thanks

Re: How to lower the number of "has reached its threshold" messages?

Posted by Ryan Daum <ry...@thimbleware.com>.
> The second question is really direct: do you think it's insane to use EC2
> small instances to build a Cassandra cluster? The machines are virtualized
> with 2 GB of memory.
>
We are using 6 EC2 small instances + EBS to build a cluster. I can't say
it's seen significant stress -- perhaps 50 writes a second -- but performance
has been acceptable for us on writes. Reads are not as fast as I would like
-- 3600 row keys with 60,000 super columns retrieved in about 10 seconds --
but again, acceptable for our purposes.

I'd be interested to see how you net out and what you decide for
configuration. Large instances are unfortunately quite a bit more expensive
than small ones, but at least they have the advantage of a 64-bit CPU and
more memory.

Ryan

Re: How to lower the number of "has reached its threshold" messages?

Posted by Jonathan Ellis <jb...@gmail.com>.
On Wed, Jan 20, 2010 at 10:23 AM, Richard Grossman <ri...@gmail.com> wrote:
> I use Cassandra to store data, and the data load is massive.
> On Jonathan's advice I made the loading code multithreaded, but now the
> servers are simply overloaded when I load data.
>
> I constantly get these messages:
>
> channelShow has reached its threshold; switching in a fresh Memtable
> INFO - Enqueuing flush of Memtable(channelShow)@23496883

http://wiki.apache.org/cassandra/MemtableThresholds
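
As a rough illustration of why the flush messages get more frequent under a
heavier load: a memtable is flushed once it crosses a size or operation-count
threshold, so a faster writer reaches the threshold sooner. The sketch below
is only a simplified model under that assumption. The function name, its
parameters, and the numbers are hypothetical and are not Cassandra settings
or API.

```python
# Simplified model: a memtable is flushed when either its accumulated size
# or its operation count crosses a configured threshold.
# All names and numbers here are illustrative assumptions, not Cassandra's API.

def flushes_per_hour(writes_per_sec, avg_write_bytes,
                     size_threshold_mb, ops_threshold):
    """Estimate how often one column family's memtable would be flushed."""
    bytes_per_sec = writes_per_sec * avg_write_bytes
    # Seconds until each threshold would be hit at a steady write rate.
    secs_to_size = size_threshold_mb * 1024 * 1024 / bytes_per_sec
    secs_to_ops = ops_threshold / writes_per_sec
    # Whichever threshold is reached first triggers the flush.
    secs_per_flush = min(secs_to_size, secs_to_ops)
    return 3600 / secs_per_flush


if __name__ == "__main__":
    # Hypothetical load: 5000 writes/s of about 200 bytes each.
    for size_mb, ops in [(64, 100_000), (128, 300_000), (256, 1_000_000)]:
        rate = flushes_per_hour(5000, 200, size_mb, ops)
        print(f"{size_mb} MB / {ops} ops -> {rate:.1f} flushes per hour")
```

In this model, raising the thresholds lowers the flush rate, but memtables
are held in the JVM heap, so on a machine with 2 GB of memory there is not
much room to raise them.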

> The second question is really direct: do you think it's insane to use EC2
> small instances to build a Cassandra cluster? The machines are virtualized
> with 2 GB of memory.

If I were going to use EC2 I would try high-cpu medium as well
depending on workload.

I would also look at RCS (Rackspace Cloud Servers) b/c of burstable
CPU and better local disk performance.  (EC2 also has EBS, but maybe
you don't want to rely on network storage on EC2 --
https://www.cloudkick.com/blog/2010/jan/12/visual-ec2-latency/)

-Jonathan