Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2015/01/01 19:46:02 UTC

is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

I think the two tables are the same.  Correct?

create table foo (

    source text,
    target text,
    primary key( source, target )
)


vs

create table foo (

    source text,
    target set<text>,
    primary key( source )
)

… meaning that the first one, under the covers is represented the same as
the second.  As a slice.

Am I correct?

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Jens Rantil <je...@tink.se>.
...they also have somewhat different conflict/repair resolution.

On Thu, Jan 1, 2015 at 8:06 PM, DuyHai Doan <do...@gmail.com> wrote:

> Storage-engine wise, they are almost equivalent, though there are some
> minor differences:
> 1) with Set structure, you cannot store more than 64kb worth of data
> 2) collections and maps are loaded entirely by Cassandra for each query,
> whereas with clustering columns you can select a slice of columns
> On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>> I think the two tables are the same.  Correct?
>>
>> create table foo (
>>
>>     source text,
>>     target text,
>>     primary key( source, target )
>> )
>>
>>
>> vs
>>
>> create table foo (
>>
>>     source text,
>>     target set<text>,
>>     primary key( source )
>> )
>>
>> … meaning that the first one, under the covers is represented the same as
>> the second.  As a slice.
>>
>> Am I correct?
>>
>>
>>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Sylvain Wallez <sy...@apache.org>.
On 04/01/2015 11:52, Sylvain Lebresne wrote:
> On Sun, Jan 4, 2015 at 12:48 AM, Sylvain Wallez <sylvain@apache.org> wrote:
>
>     Indeed this makes sense for map keys and set values, but AFAIU
>     from the docs this also applies to map and list _values_: " The
>     maximum size of an item in a collection is 64K"
>
>
> Somehow it appears that from Jack's quote you've only read what was in 
> the parentheses. The part that was not in parentheses was:
>   "Collections values are currently limited to 64K because the 
> serialized form used uses shorts to encode the elements length"
>
> That's a limitation of the binary protocol if you will, not an 
> internal storage one.
>
> I'll add that this protocol limitation has in fact already been lifted 
> in v3 of the protocol (so C* 2.1), but the documentation may not be 
> entirely up to date yet.

Got it! I thought it was referring to the storage format, not the 
protocol. Hence the misunderstanding.

Thanks!

Sylvain

-- 
Sylvain Wallez - http://bluxte.net


Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Sun, Jan 4, 2015 at 12:48 AM, Sylvain Wallez <sy...@apache.org> wrote:

>  Indeed this makes sense for map keys and set values, but AFAIU from the
> docs this also applies to map and list _values_: " The maximum size of an
> item in a collection is 64K"
>

Somehow it appears that from Jack's quote you've only read what was in the
parentheses. The part that was not in parentheses was:
  "Collections values are currently limited to 64K because the serialized
form used uses shorts to encode the elements length"

That's a limitation of the binary protocol if you will, not an internal
storage one.

I'll add that this protocol limitation has in fact already been lifted in
v3 of the protocol (so C* 2.1), but the documentation may not be entirely up
to date yet.



>
>
> http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_collections_c.html
>
> Or are collection values also represented as keys?
>
> Sylvain
>
> On 03/01/2015 20:50, Jack Krupansky wrote:
>
> See: https://issues.apache.org/jira/browse/CASSANDRA-5355
>
>  "Collections values are currently limited to 64K because the serialized
> form used uses shorts to encode the elements length (and for sets elements
> and key map, because they are part of the internal column name that is
> itself limited to 64K)."
>
>  -- Jack Krupansky
>
> On Sat, Jan 3, 2015 at 2:31 PM, Sylvain Wallez <sy...@apache.org> wrote:
>
>>  From what I understand from the docs, the 64k limit applies to both the
>> number of items in a collection and the size of its elements?
>>
>> Why is there a constraint on value size in collections, when other types
>> such as blob or text can be larger?
>>
>> Thanks,
>> Sylvain
>>
>> On 01/01/2015 20:04, DuyHai Doan wrote:
>>
>> Storage-engine wise, they are almost equivalent, though there are some
>> minor differences:
>>
>>  1) with Set structure, you cannot store more than 64kb worth of data
>> 2) collections and maps are loaded entirely by Cassandra for each query,
>> whereas with clustering columns you can select a slice of columns
>>
>>
>>
>> On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>>
>>> I think the two tables are the same.  Correct?
>>>
>>>  create table foo (
>>>
>>>      source text,
>>>     target text,
>>>     primary key( source, target )
>>> )
>>>
>>>
>>>  vs
>>>
>>>   create table foo (
>>>
>>>      source text,
>>>     target set<text>,
>>>     primary key( source )
>>> )
>>>
>>>  … meaning that the first one, under the covers is represented the same
>>> as the second.  As a slice.
>>>
>>>  Am I correct?
>>>
>>>
>>
>>
>>
>>   --
>> Sylvain Wallez - http://bluxte.net
>>
>>
>
>
> --
> Sylvain Wallez - http://bluxte.net
>
>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Sylvain Wallez <sy...@apache.org>.
Indeed this makes sense for map keys and set values, but AFAIU from the 
docs this also applies to map and list _values_: " The maximum size of 
an item in a collection is 64K"

http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_collections_c.html

Or are collection values also represented as keys?

Sylvain

On 03/01/2015 20:50, Jack Krupansky wrote:
> See: https://issues.apache.org/jira/browse/CASSANDRA-5355
>
> "Collections values are currently limited to 64K because the 
> serialized form used uses shorts to encode the elements length (and 
> for sets elements and key map, because they are part of the internal 
> column name that is itself limited to 64K)."
>
> -- Jack Krupansky
>
> On Sat, Jan 3, 2015 at 2:31 PM, Sylvain Wallez <sylvain@apache.org> wrote:
>
>     From what I understand from the docs, the 64k limit applies to
>     both the number of items in a collection and the size of its elements?
>
>     Why is there a constraint on value size in collections, when other
>     types such as blob or text can be larger?
>
>     Thanks,
>     Sylvain
>
>     On 01/01/2015 20:04, DuyHai Doan wrote:
>>     Storage-engine wise, they are almost equivalent, though there
>>     are some minor differences:
>>
>>     1) with Set structure, you cannot store more than 64kb worth of data
>>     2) collections and maps are loaded entirely by Cassandra for each
>>     query, whereas with clustering columns you can select a slice of
>>     columns
>>
>>
>>
>>     On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <burton@spinn3r.com> wrote:
>>
>>         I think the two tables are the same.  Correct?
>>
>>         create table foo (
>>
>>             source text,
>>             target text,
>>             primary key( source, target )
>>         )
>>
>>
>>         vs
>>
>>         create table foo (
>>
>>             source text,
>>             target set<text>,
>>             primary key( source )
>>         )
>>
>>         … meaning that the first one, under the covers is represented
>>         the same as the second.  As a slice.
>>
>>         Am I correct?
>>
>>
>>
>
>
>     -- 
>     Sylvain Wallez - http://bluxte.net
>
>


-- 
Sylvain Wallez - http://bluxte.net


Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Jack Krupansky <ja...@gmail.com>.
See: https://issues.apache.org/jira/browse/CASSANDRA-5355

"Collections values are currently limited to 64K because the serialized
form used uses shorts to encode the elements length (and for sets elements
and key map, because they are part of the internal column name that is
itself limited to 64K)."

-- Jack Krupansky

On Sat, Jan 3, 2015 at 2:31 PM, Sylvain Wallez <sy...@apache.org> wrote:

>  From what I understand from the docs, the 64k limit applies to both the
> number of items in a collection and the size of its elements?
>
> Why is there a constraint on value size in collections, when other types
> such as blob or text can be larger?
>
> Thanks,
> Sylvain
>
> On 01/01/2015 20:04, DuyHai Doan wrote:
>
> Storage-engine wise, they are almost equivalent, though there are some
> minor differences:
>
>  1) with Set structure, you cannot store more than 64kb worth of data
> 2) collections and maps are loaded entirely by Cassandra for each query,
> whereas with clustering columns you can select a slice of columns
>
>
>
> On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I think the two tables are the same.  Correct?
>>
>>  create table foo (
>>
>>      source text,
>>     target text,
>>     primary key( source, target )
>> )
>>
>>
>>  vs
>>
>>   create table foo (
>>
>>      source text,
>>     target set<text>,
>>     primary key( source )
>> )
>>
>>  … meaning that the first one, under the covers is represented the same
>> as the second.  As a slice.
>>
>>  Am I correct?
>>
>>
>
>
>
> --
> Sylvain Wallez - http://bluxte.net
>
>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Sylvain Wallez <sy...@apache.org>.
 From what I understand from the docs, the 64k limit applies to both the 
number of items in a collection and the size of its elements?

Why is there a constraint on value size in collections, when other types 
such as blob or text can be larger?

Thanks,
Sylvain

On 01/01/2015 20:04, DuyHai Doan wrote:
> Storage-engine wise, they are almost equivalent, though there are 
> some minor differences:
>
> 1) with Set structure, you cannot store more than 64kb worth of data
> 2) collections and maps are loaded entirely by Cassandra for each 
> query, whereas with clustering columns you can select a slice of columns
>
>
>
> On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <burton@spinn3r.com> wrote:
>
>     I think the two tables are the same. Correct?
>
>     create table foo (
>
>         source text,
>         target text,
>         primary key( source, target )
>     )
>
>
>     vs
>
>     create table foo (
>
>         source text,
>         target set<text>,
>         primary key( source )
>     )
>
>     … meaning that the first one, under the covers is represented the
>     same as the second.  As a slice.
>
>     Am I correct?
>
>
>


-- 
Sylvain Wallez - http://bluxte.net


Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Kevin Burton <bu...@spinn3r.com>.
Ah! I had forgotten about both of those issues. Good points.

On Thu, Jan 1, 2015 at 11:04 AM, DuyHai Doan <do...@gmail.com> wrote:

> Storage-engine wise, they are almost equivalent, though there are some
> minor differences:
>
> 1) with Set structure, you cannot store more than 64kb worth of data
> 2) collections and maps are loaded entirely by Cassandra for each query,
> whereas with clustering columns you can select a slice of columns
>
>
>
> On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I think the two tables are the same.  Correct?
>>
>> create table foo (
>>
>>     source text,
>>     target text,
>>     primary key( source, target )
>> )
>>
>>
>> vs
>>
>> create table foo (
>>
>>     source text,
>>     target set<text>,
>>     primary key( source )
>> )
>>
>> … meaning that the first one, under the covers is represented the same as
>> the second.  As a slice.
>>
>> Am I correct?
>>
>>
>>
>


-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Jan 2, 2015 at 11:35 AM, Tyler Hobbs <ty...@datastax.com> wrote:

>
> This is not true (with one minor exception).  All operations on sets and
> maps require no reads.  The same is true for appends and prepends on lists,
> but delete and set operations on lists with (non-zero) indexes require the
> list to be read first.  However, the entire list does not need to be
> re-written to disk.
>

Thank you guys for the correction; a case where I am glad to be wrong. I
must have been thinking about the delete/set operations and have drawn an
erroneous inference. :)

=Rob

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Tyler Hobbs <ty...@datastax.com>.
On Fri, Jan 2, 2015 at 1:13 PM, Eric Stevens <mi...@gmail.com> wrote:

> > And also stored entirely for each UPDATE. Change one element,
> re-serialize the whole thing to disk.
>
> Is this true?  I thought updates (adds, removes, but not overwrites)
> affected just the indicated columns.  Isn't it just the reads that involve
> reading the entire collection?


This is not true (with one minor exception).  All operations on sets and
maps require no reads.  The same is true for appends and prepends on lists,
but delete and set operations on lists with (non-zero) indexes require the
list to be read first.  However, the entire list does not need to be
re-written to disk.
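
To illustrate which statements fall into which category, here is a minimal
CQL sketch. The table collection_demo and its columns are hypothetical and
exist only for this example; the comments restate the behaviour described
above rather than anything verified against a particular Cassandra version.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE collection_demo (
    id int PRIMARY KEY,
    tags set<text>,
    attrs map<text, text>,
    events list<text>
);

-- Sets and maps: adding or removing elements is a plain write (no read).
UPDATE collection_demo SET tags = tags + {'x'} WHERE id = 1;
UPDATE collection_demo SET tags = tags - {'x'} WHERE id = 1;
UPDATE collection_demo SET attrs['k'] = 'v' WHERE id = 1;
DELETE attrs['k'] FROM collection_demo WHERE id = 1;

-- Lists: append and prepend are also plain writes.
UPDATE collection_demo SET events = events + ['last'] WHERE id = 1;
UPDATE collection_demo SET events = ['first'] + events WHERE id = 1;

-- Lists: setting or deleting by index is the case where the list
-- has to be read before the write is applied.
UPDATE collection_demo SET events[1] = 'changed' WHERE id = 1;
DELETE events[1] FROM collection_demo WHERE id = 1;
```

In every case the statement names only the elements being changed, so the
whole collection is not re-serialized by the client.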

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Eric Stevens <mi...@gmail.com>.
> And also stored entirely for each UPDATE. Change one element,
re-serialize the whole thing to disk.

Is this true?  I thought updates (adds, removes, but not overwrites)
affected just the indicated columns.  Isn't it just the reads that involve
reading the entire collection?

DS docs talk about reading whole collections, but I don't see anything
about having to overwrite the entire collection each time.  That would
indicate a read then write style operation, which is antipatterny.

> When you query a table containing a collection, Cassandra retrieves the
collection in its entirety
http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_set_t.html



On Fri, Jan 2, 2015 at 11:48 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Jan 1, 2015 at 11:04 AM, DuyHai Doan <do...@gmail.com> wrote:
>
>> 2) collections and maps are loaded entirely by Cassandra for each query,
>> whereas with clustering columns you can select a slice of columns
>>
>
> And also stored entirely for each UPDATE. Change one element, re-serialize
> the whole thing to disk.
>
> =Rob
>

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Jan 1, 2015 at 11:04 AM, DuyHai Doan <do...@gmail.com> wrote:

> 2) collections and maps are loaded entirely by Cassandra for each query,
> whereas with clustering columns you can select a slice of columns
>

And also stored entirely for each UPDATE. Change one element, re-serialize
the whole thing to disk.

=Rob

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

Posted by DuyHai Doan <do...@gmail.com>.
Storage-engine wise, they are almost equivalent, though there are some
minor differences:

1) with Set structure, you cannot store more than 64kb worth of data
2) collections and maps are loaded entirely by Cassandra for each query,
whereas with clustering columns you can select a slice of columns
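
As a rough illustration of point 2, here is a minimal CQL sketch. The table
names links_by_target and links_as_set and the sample values are made up for
this example; only the column layout comes from the two schemas in the
question. It reflects the Cassandra versions discussed in this thread, and
later releases may behave differently.

```cql
-- Hypothetical table names; the columns mirror the two schemas being compared.
CREATE TABLE links_by_target (
    source text,
    target text,
    PRIMARY KEY (source, target)   -- target is a clustering column
);

CREATE TABLE links_as_set (
    source text PRIMARY KEY,
    target set<text>
);

-- Clustering column: a range (slice) of targets can be read from one partition.
SELECT target FROM links_by_target
 WHERE source = 'a' AND target >= 'm' AND target < 'n';

-- Set: the collection is returned as a whole; there is no equivalent
-- range restriction on its elements here.
SELECT target FROM links_as_set WHERE source = 'a';
```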



On Thu, Jan 1, 2015 at 7:46 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> I think the two tables are the same.  Correct?
>
> create table foo (
>
>     source text,
>     target text,
>     primary key( source, target )
> )
>
>
> vs
>
> create table foo (
>
>     source text,
>     target set<text>,
>     primary key( source )
> )
>
> … meaning that the first one, under the covers is represented the same as
> the second.  As a slice.
>
> Am I correct?
>
>
>