Posted to user@cassandra.apache.org by Giorgos Margaritis <gm...@gmail.com> on 2012/06/09 17:00:28 UTC

get_range_slices() latency

Hi all,

I'm using YCSB to test Cassandra's performance on key range gets. I have
installed YCSB on one node and the latest Cassandra server on another node.
Using one thread, I insert 10GB of uniformly random keys into Cassandra with
YCSB while performing range gets (get_range_slices): after every 1000 puts,
I perform 1 range get. Keys are 100 bytes, values 1KB. Each range get
retrieves a random number of entries between 1 and 100. The Cassandra node
has 3GB RAM and one 7500RPM SATA disk. I use the default configuration for
Cassandra. I have also repeated the experiment above with 10 threads instead
of one.
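
For concreteness, the operation mix described above can be sketched roughly
as follows. This is only an illustration in Python, not the actual YCSB
code: run_workload, random_key, and the in-memory sorted list standing in
for Cassandra are hypothetical names made up for this sketch, and the hex
key encoding is an assumption. The stand-in scans in key order, which is
not necessarily the order a real cluster returns rows in.

```python
import bisect
import random

KEY_BYTES = 100        # key size from the setup above
PUTS_PER_SCAN = 1000   # one range get after every 1000 puts
MAX_SCAN_ROWS = 100    # each range get asks for 1..100 rows


def random_key(rng):
    # Uniformly random fixed-length key; hex text is an assumption of this sketch.
    return "%0*x" % (KEY_BYTES, rng.getrandbits(KEY_BYTES * 4))


def run_workload(total_puts, seed=0):
    """Replay the put/scan mix against an in-memory sorted list (stand-in store)."""
    rng = random.Random(seed)
    keys = []    # sorted keys; hypothetical stand-in for the server's data
    scans = []   # (start_key, requested_rows, returned_rows) per range get
    for i in range(1, total_puts + 1):
        bisect.insort(keys, random_key(rng))
        if i % PUTS_PER_SCAN == 0:
            start = random_key(rng)
            count = rng.randint(1, MAX_SCAN_ROWS)
            lo = bisect.bisect_left(keys, start)
            scans.append((start, count, len(keys[lo:lo + count])))
    return len(keys), scans
```

Calling run_workload(5000) performs 5000 puts and 5 range gets, which is the
same 1000:1 ratio as in my runs, only at a much smaller scale.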

I measure the time needed for each get_range_slices() call both in Cassandra
and in YCSB. I was surprised by the extremely low latencies I got, and I'm
not sure I understand why (or whether they are correct). E.g. I see 5ms
latencies, even lower than disk seek latencies, even though there are
multiple files on disk and Cassandra should have to check all of them to
satisfy a get_range_slices() call (Bloom filters are of no use for range
queries). Since the node has 3GB RAM and I insert 10GB of data, the data
cannot all be cached in memory, so the calls cannot all be satisfied from
there.
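
To show why 5ms looks too low to me, here is a rough back-of-envelope
sketch in Python. The disk speed, data size, RAM size, and row size are the
figures from my setup; AVG_SEEK_MS is an assumed typical value (I have not
measured it), and the model of one random access per file touched, with
transfer time and per-row storage overhead ignored, is a simplification of
mine, not a description of Cassandra's actual read path.

```python
RPM = 7500                    # disk speed quoted above
DATA_BYTES = 10 * 1024 ** 3   # 10GB inserted
RAM_BYTES = 3 * 1024 ** 3     # 3GB of RAM on the node
ROW_BYTES = 100 + 1024        # 100-byte key + 1KB value, storage overhead ignored
AVG_SEEK_MS = 8.0             # ASSUMPTION: typical head-seek time, not measured


def avg_rotational_latency_ms(rpm):
    # On average the platter has to turn half a revolution before the read.
    return 0.5 * 60000.0 / rpm


def expected_scan_ms(files_touched, seek_ms=AVG_SEEK_MS, rpm=RPM):
    # Simplified model: one random access (seek + rotation) per file touched.
    return files_touched * (seek_ms + avg_rotational_latency_ms(rpm))


rows = DATA_BYTES // ROW_BYTES                 # roughly 9.5 million rows
max_cached_fraction = RAM_BYTES / DATA_BYTES   # at most 30% of the data fits in RAM
```

At 7500RPM the average rotational latency alone is 4ms, so under these
assumptions even a single uncached file access should cost more than 5ms,
and a range get that touches several files should cost a multiple of that.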

So, since there are multiple files on disk (I don't know whether leveled,
i.e. LevelDB-style, compaction is the default or not, but in either case
there is more than one file on disk that should be checked for each range
get), and since, at least after inserting 3-4GB, each get_range_slices()
must touch disk and must touch more than one file, is it possible for a
get_range_slices() to be satisfied in 5ms? Am I missing something?

Thanks!