Posted to user@cassandra.apache.org by eugene miretsky <eu...@gmail.com> on 2017/11/15 16:12:50 UTC

CQL Map vs clustering keys

Hi,

What would be the tradeoffs between using

1) Map

(
id UUID PRIMARY KEY,
myMap map<int,text>
);

2) Clustering key

(
id UUID,
key int,
val text,
PRIMARY KEY (id, key)
);

My understanding is that maps are stored very similarly to clustering
columns, where the map key is part of the SSTable's column name. The main
difference seems to be that with maps all the key/value pairs get retrieved
together, while with clustering keys we can retrieve individual rows, or a
range of keys.

Cheers,
Eugene
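
For reference, the two fragments above could be written out as complete statements roughly as follows. This is only a sketch: the table names kv_map and kv_rows and the UUID literal are placeholders that do not appear in the original message.

```cql
-- Option 1: one row per id, all pairs held in a single map column.
CREATE TABLE kv_map (
    id uuid PRIMARY KEY,
    myMap map<int, text>
);

-- Option 2: one row per (id, key) pair; key is the clustering column.
CREATE TABLE kv_rows (
    id uuid,
    key int,
    val text,
    PRIMARY KEY (id, key)
);

-- Map: selecting the column returns every key/value pair for that id.
SELECT myMap FROM kv_map
WHERE id = 123e4567-e89b-12d3-a456-426655440000;

-- Clustering key: fetch a single pair ...
SELECT val FROM kv_rows
WHERE id = 123e4567-e89b-12d3-a456-426655440000 AND key = 5;

-- ... or a contiguous range of keys within the partition.
SELECT key, val FROM kv_rows
WHERE id = 123e4567-e89b-12d3-a456-426655440000 AND key >= 5 AND key <= 10;
```

The last two queries are the access patterns the question refers to: with the clustering-key layout, the filter on key is applied within the partition rather than after reading the whole collection.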

Re: CQL Map vs clustering keys

Posted by DuyHai Doan <do...@gmail.com>.

Yes, your remark is correct.

However, once CASSANDRA-7396 (currently in the 4.0 trunk) is released, you
will be able to get a slice of map values using their (sorted) keys:

SELECT map[fromKey ... toKey] FROM TABLE ...

Needless to say, it will also be possible to get a single element from the
map by its key with the SELECT map[key] syntax.

Storage engine-wise, it will work exactly like clustering columns.
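
As a rough illustration of the selection syntax described above: the sketch below assumes a hypothetical table kv_map with a map<int, text> column, and assumes the two-dot range form (from..to) for the slice, which is how I understand the feature to have been implemented. Treat the exact spelling as an assumption and check it against the CQL documentation for your version.

```cql
-- Hypothetical table, not from the original thread.
CREATE TABLE kv_map (
    id uuid PRIMARY KEY,
    myMap map<int, text>
);

-- Single element by key (requires a version that includes CASSANDRA-7396).
SELECT myMap[5] FROM kv_map
WHERE id = 123e4567-e89b-12d3-a456-426655440000;

-- Slice of the map between two keys (assumed '..' range syntax).
SELECT myMap[5..10] FROM kv_map
WHERE id = 123e4567-e89b-12d3-a456-426655440000;
```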




Re: CQL Map vs clustering keys

Posted by eugene miretsky <eu...@gmail.com>.

Thanks!

So assuming C* 3.0 and that my table stores only one collection, using
clustering keys will be more performant?

Extending this to sets - would doing something like this make sense?

(
id UUID,
val text,
PRIMARY KEY (id, val)
);

SELECT count(*) FROM TABLE WHERE id = 123 AND val = 'test' // Key exists if count != 0
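
Spelled out in full, the set-as-clustering-column idea might look like the sketch below. The table name set_rows and the UUID literal are placeholders, not something from the thread.

```cql
-- Hypothetical table: each (id, val) row acts as one element of the set.
CREATE TABLE set_rows (
    id uuid,
    val text,
    PRIMARY KEY (id, val)
);

-- Add an element. Inserts are upserts, so adding the same value twice
-- still leaves a single row.
INSERT INTO set_rows (id, val)
VALUES (123e4567-e89b-12d3-a456-426655440000, 'test');

-- Membership test: one row comes back if the element exists, none otherwise.
SELECT val FROM set_rows
WHERE id = 123e4567-e89b-12d3-a456-426655440000 AND val = 'test';

-- Remove an element.
DELETE FROM set_rows
WHERE id = 123e4567-e89b-12d3-a456-426655440000 AND val = 'test';
```

Since the full primary key is specified, selecting the row directly gives the same answer as count(*) without the aggregation step.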


Re: CQL Map vs clustering keys

Posted by Jon Haddad <jo...@jonhaddad.com>.

In 3.0, clustering columns are not actually part of the column name anymore. Yay. Aaron Morton wrote a detailed analysis of the 3.x storage engine here: http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html

The advantage of maps is a single table that can contain a very flexible data model, of maps and sets all in the same table.  Fun times.

The advantage of using clustering keys is performance, and you can use WAY more K/V pairs.

Jon
