Posted to user@cassandra.apache.org by Philippe <wa...@gmail.com> on 2010/04/13 16:05:48 UTC

RE : Re: RE : Re: Two dimensional matrices

I'm confused: don't range queries such as the ones we've been discussing
require using an ordered partitioner?

On 13 Apr 2010 15:58, "Eric Evans" <ee...@rackspace.com> wrote:

On Tue, 2010-04-13 at 08:57 +0200, Philippe wrote:
> Okay so if i switch columns and super columns i...
Sure.


> Assuming this is all correct, what are the consequences of these
> design decisions in terms of p...
Partition tolerance doesn't really come into play here, and as for how
the data is distributed, that depends on your dataset, and your choice
of partitioner.

--

Eric Evans
eevans@rackspace.com

Re: RE : Re: RE : Re: Two dimensional matrices

Posted by Philippe <wa...@gmail.com>.
> > I'm confused: don't range queries such as the ones we've been
> > discussing require using an ordered partitioner?
>
> Alright, so distribution depends on your choice of token.
>
>
Ah yes, I get it now: with a naive ordered partitioner, the key is
associated with the node whose token is numerically closest, and that is
where the "master" replica is located. Is that right?
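
To make the question concrete, here is a minimal sketch of a token-ring
lookup. It assumes the usual ring convention, in which a node owns the keys
between the previous node's token (exclusive) and its own token (inclusive),
which is slightly different from "numerically closest". The tokens and the
function name owner_token are made up for illustration; this is not Cassandra
code.

```python
from bisect import bisect_left


def owner_token(tokens, key):
    """Return the token of the node assumed to own `key`.

    `tokens` must be a sorted list of node tokens. With an order-preserving
    scheme, keys and tokens are compared directly (here: as strings).
    """
    # First token that is >= key; keys past the last token wrap to the first.
    i = bisect_left(tokens, key)
    return tokens[i % len(tokens)]


# Hypothetical three-node ring.
tokens = sorted(["g", "n", "t"])

print(owner_token(tokens, "apple"))  # before the first token
print(owner_token(tokens, "h"))      # nearer to "g", but owned by "n"
print(owner_token(tokens, "zebra"))  # past the last token, wraps around
```

Because keys keep their sort order on the ring, a contiguous key range maps
to a contiguous run of nodes, which is what makes range queries possible.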

Now let's assume I am using super columns as {X} and columns as {timeFrame}.
Over time each row will grow very large, because X can (very sparsely) go up
to 2^28.
i) Does Cassandra load all columns every time it reads a row? Same question
for super columns.
ii) Similarly, does it cache all columns in memory?
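
For reference, the layout described above can be pictured as a sparse nested
map: row key, then super column {X}, then column {timeFrame}. The sketch below
is only a toy in-memory model with made-up helper names (new_row, put,
slice_x) and made-up timeFrame labels; it says nothing about how Cassandra
stores or reads the data.

```python
from collections import defaultdict


def new_row():
    # super column name X -> {timeFrame: value}
    return defaultdict(dict)


def put(row, x, time_frame, value):
    """Store one cell; only cells that are written take up space."""
    row[x][time_frame] = value


def slice_x(row, x_start, x_end):
    """Return the super columns with x_start <= X <= x_end, in X order."""
    return {x: dict(row[x]) for x in sorted(row) if x_start <= x <= x_end}


row = new_row()
put(row, 5, "2010-04-13T16", 1.0)
put(row, 5, "2010-04-13T17", 2.0)
put(row, 2 ** 28, "2010-04-13T16", 3.0)  # X is sparse: 2 super columns only

print(len(row))
print(slice_x(row, 0, 10))
```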

Now for some orders of magnitude: let's say a row is about 20KB and the
cluster is running smoothly on low-end servers. There are millions of rows
per node.
i) If I were to only issue gets on the key, what order of magnitude can I
expect to reach: 10/s, 100/s, 1000/s or 10,000/s?
ii) If I were to issue a slice on just the keys, does Cassandra optimize the
gets, or does it run every get on the server and then concatenate the results
to send to the client?
iii) Is slicing on the columns going to improve the time to get the data on
the server side, or does it just cut down on network traffic?
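
As a rough sanity check on those orders of magnitude, the raw payload
bandwidth implied by each rate can be worked out directly. This sketch assumes
whole 20KB rows are returned, takes 1KB as 1000 bytes, and ignores protocol
overhead and replication; it says nothing about what Cassandra can actually
sustain.

```python
ROW_BYTES = 20 * 1000  # assumed row size: 20KB, decimal kilobytes


def payload_mb_per_s(gets_per_second):
    """Payload bandwidth in MB/s if every get returns one whole row."""
    return gets_per_second * ROW_BYTES / 1e6


for rate in (10, 100, 1000, 10000):
    print(f"{rate:>6} gets/s -> {payload_mb_per_s(rate):g} MB/s")
```

At 10,000 gets/s the payload alone would be about 200 MB/s, which is more than
a single 1 Gbit/s link (about 125 MB/s) can carry, so at that rate the network
would matter as much as the server.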

Thanks
Philippe

Re: RE : Re: RE : Re: Two dimensional matrices

Posted by Eric Evans <ee...@rackspace.com>.
On Tue, 2010-04-13 at 16:05 +0200, Philippe wrote:
> I'm confused: don't range queries such as the ones we've been
> discussing require using an ordered partitioner?

Alright, so distribution depends on your choice of token.

-- 
Eric Evans
eevans@rackspace.com