Posted to user@cassandra.apache.org by Jesse McConnell <je...@gmail.com> on 2010/03/09 14:43:47 UTC

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

in my experience #2 will work well up to the point where it triggers
a limitation of cassandra (slated to be resolved in 0.7 \o/) where all
of the columns under a given key must be able to fit into memory.  For
things like indexes over data I have opted to shard the keys for really
large data sets to get around this until it's fixed (rough sketch below)....
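
something like this is what I mean by sharding -- purely illustrative
python here, the names and shard count are made up rather than anything
cassandra ships with:

  import hashlib

  NUM_SHARDS = 16  # assumption: pick it so each physical row stays small

  def sharded_row_key(logical_key, column_name):
      """Spread one logical row's columns across NUM_SHARDS physical rows."""
      shard = int(hashlib.md5(column_name.encode("utf-8")).hexdigest(), 16)
      return "%s:%d" % (logical_key, shard % NUM_SHARDS)

writes go to sharded_row_key(...), and reading the whole logical row back
means NUM_SHARDS row reads merged client side.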

I suspect that if you doubled the test for #2 once or twice you'd start
seeing OOMs.

also, #2 will end up with a lumpy distribution around the cluster, since
all the data under a given key needs to be able to fit on one machine;
#1 will spread out a bit more evenly.
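
to see why, remember the partitioner hashes just the row key to pick a
node, so every column under that key lands on the same machine.  very
rough sketch of that placement (not cassandra's real ring code; I'm
faking node tokens by hashing node names):

  import bisect, hashlib

  nodes = ["node-a", "node-b", "node-c"]
  ring = sorted((int(hashlib.md5(n.encode()).hexdigest(), 16), n)
                for n in nodes)
  tokens = [t for t, _ in ring]

  def node_for_key(row_key):
      token = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
      return ring[bisect.bisect_right(tokens, token) % len(ring)][1]

500k fat keys means 500k placement decisions, 50M skinny keys means 50M,
which averages out a lot more evenly across the ring.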

cheers,
jesse

--
jesse mcconnell
jesse.mcconnell@gmail.com



On Tue, Mar 9, 2010 at 07:15, Sylvain Lebresne <sy...@yakaz.com> wrote:
> Hello,
>
> I've done some tests and it seems that having many rows with few columns
> is better than having few rows with many columns, at least as far as read
> performance is concerned.
> Using stress.py, on a quad core 2.27GHz with 4GB of RAM and the
> out-of-the-box cassandra configuration, I inserted:
>
>  1) 50000000 rows (that's 50 million) with 1 column each
> (stress.py -n 50000000 -c 1)
>  2) 500000 rows (that's 500 thousand) with 100 columns each
> (stress.py -n 500000 -c 100)
>
> that is, it ends up with 50 million columns in both cases (I use such big
> numbers so that in case 2 the resulting data is too big to fit in the
> system caches; when it does fit, the problem I'm mentioning below
> doesn't show).
> Those two 'tests' have been done separately, with data flushed completely
> between them. I let cassandra compact everything each time, shut down the
> server and started it again (so that no data is in a memtable). Then I
> tried reading columns, one at a time, using:
>  1) stress.py -t 10 -o read -n 50000000 -c 1 -r
>  2) stress.py -t 10 -o read -n 500000 -c 1 -r
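>
> (For reference, the read phase boils down to something like the loop below:
> each thread picks a random key in the inserted range and fetches a single
> column, and reads/second is just completed requests over elapsed time.
> read_one_column is a placeholder for the actual client call, not a real API.)
>
>   import random, time
>
>   def measure_reads(read_one_column, num_keys, duration_s=60):
>       done, start = 0, time.time()
>       while time.time() - start < duration_s:
>           key = str(random.randrange(num_keys))  # -r: keys chosen at random
>           read_one_column(key)                   # -c 1: one column per read
>           done += 1
>       return done / (time.time() - start)        # reads/second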
>
> In case 1) I get around 200 reads/second and that's pretty stable. The
> disk is spinning like crazy (~25% io_wait), very little cpu or memory used,
> performance is IO bound, which is expected.
> In case 2) however, it starts with reasonable performance (400+
> reads/second), but it very quickly drops to an average of 80 reads/second
> (after a minute and a half or so), and it doesn't go up significantly after
> that. It turns out this seems to be a GC problem. Indeed, the info log (I'm
> running trunk from today, but I first saw the problem on an older version
> of trunk) shows, every few seconds, lines like:
>  GC for ConcurrentMarkSweep: 4599 ms, 57247304 reclaimed leaving
> 1033481216 used; max is 1211498496
> I'm not surprised that performance is bad with such GC pauses. I'm surprised
> to have such GC pauses.
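>
> Doing the arithmetic on that log line (just the numbers it reports):
>
>   max_heap   = 1211498496   # ~1.13 GiB max heap
>   used_after = 1033481216   # still live right after the collection
>   reclaimed  =   57247304   # freed by a 4599 ms pause
>
>   print(100.0 * used_after / max_heap)   # ~85% of the heap still in use
>   print(reclaimed / 1024.0 / 1024.0)     # only ~55 MiB reclaimed
>
> So each multi-second CMS pause barely frees anything, which would explain
> why the collections keep happening back to back.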
>
> Note that in case 1) the resulting data 'weighs' ~14G, while in case 2) it
> 'weighs' only ~2.4G.
>
> Let me add that I used stress.py to try to isolate the problem, but I first
> ran into it in an application I'm writing where I had rows with around 1000
> columns of 30K each. With about 1000 rows, I had awful performance, like 5
> reads/second on average. I tried switching to 1 million rows, each having 1
> column of 30K, and ended up with more than 300 reads/second.
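>
> (Both layouts hold roughly the same total volume, only the per-row size
> changes; quick check:)
>
>   per_wide_row   = 1000 * 30 * 1024   # ~29 MiB under a single key
>   per_narrow_row = 30 * 1024          # ~30 KiB under a single key
>   print(1000 * per_wide_row, 1000000 * per_narrow_row)   # ~30 GB either way
>
> So the only real difference between the two is how much data sits under
> one key.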
>
> Any ideas or insights? Am I doing something utterly wrong?
> Thanks in advance.
>
> --
> Sylvain
>