Posted to user@cassandra.apache.org by Ted Zlatanov <tz...@lifelogs.com> on 2010/03/03 14:15:57 UTC

why have ColumnFamilies?

I don't understand the advantages of ColumnFamilies over a
SuperColumnFamily with just one supercolumn.  Why have the former if the
latter is functionally equivalent?

Thanks
Ted

Re: why have ColumnFamilies?

Posted by Alexandre Conrad <al...@gmail.com>.

2010/3/3 Gary Dusbabek <gd...@gmail.com>:
> It is basically correct.  Your diagram could be extended by indicating
> that columns consist of name, value, timestamp and that super columns
> have names and that a column family consists exclusively of columns or
> super columns (never both).

Thanks, Gary, for the clarification. So they can't be mixed within the
same ColumnFamily.

> I might switch Row and ColumnFamily in the hierarchy too since the
> rows live in column families and not necessarily in the keyspace.  The
> API masks this detail away though, by allowing you to supply a single
> keyspace and row key and then run operations that span column families.
> So maybe that is more of an interpretation that depends on which side
> of the client you spend most of your time on.

Interesting. I haven't used Cassandra in practice yet, but I'm
currently trying to understand its data model. Reading more and more
about it sharpens my blurry picture of Cassandra.

-- 
Alex
twitter.com/alexconrad

Re: why have ColumnFamilies?

Posted by Gary Dusbabek <gd...@gmail.com>.

On Wed, Mar 3, 2010 at 07:43, Alexandre Conrad
<al...@gmail.com> wrote:
> As far as I understand, here's how I organize Cassandra entities:
>
> http://paste.pocoo.org/show/185126/
>
> Is this somehow correct?
>
> --
> Alex
> twitter.com/alexconrad
>

It is basically correct.  Your diagram could be extended by indicating
that columns consist of name, value, timestamp and that super columns
have names and that a column family consists exclusively of columns or
super columns (never both).

I might switch Row and ColumnFamily in the hierarchy too since the
rows live in column families and not necessarily in the keyspace.  The
API masks this detail away though, by allowing you to supply a single
keyspace and row key and then run operations that span column families.
So maybe that is more of an interpretation that depends on which side
of the client you spend most of your time on.

Gary.
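
A minimal Python sketch of the structure Gary describes, for
illustration only. The class names and the is_super flag are
assumptions made for this example, not Cassandra's actual API. It
models columns as (name, value, timestamp), super columns as named
groups of columns, and a column family that accepts only one of the
two kinds:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Column:
    name: str
    value: bytes
    timestamp: int


@dataclass
class SuperColumn:
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)


class ColumnFamily:
    """Hypothetical model: rows live inside the column family."""

    def __init__(self, name: str, is_super: bool = False):
        self.name = name
        self.is_super = is_super
        # row key -> {entry name -> Column or SuperColumn}
        self.rows: Dict[str, dict] = {}

    def insert(self, row_key: str, entry) -> None:
        # A family holds columns or super columns, never both.
        expected = SuperColumn if self.is_super else Column
        if not isinstance(entry, expected):
            raise TypeError(
                f"{self.name} only accepts {expected.__name__} entries"
            )
        self.rows.setdefault(row_key, {})[entry.name] = entry
```

Inserting a SuperColumn into a family created with is_super=False
raises TypeError, which mirrors the "never both" rule.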

Re: why have ColumnFamilies?

Posted by Ted Zlatanov <tz...@lifelogs.com>.

On Wed, 3 Mar 2010 15:21:42 +0100 Alexandre Conrad <al...@gmail.com> wrote: 

AC> 2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
>> KeySpace->Row->ColumnFamily->Column[name, value]
>> (a two-level map)
>> 
>> KeySpace->Row->SuperColumnFamily->SuperColumn[name]->Column[name, value]
>> (a three-level map)

AC> Thanks for the explanation. So this means that entities under
AC> SuperColumnFamily can only be SuperColumns and ColumnFamilies can only
AC> hold Columns.

AC> I guess the term SuperColumnFamily (which I haven't seen documented)
AC> is used when you want to implicitly say that SuperColumns are used
AC> beneath.

Sorry for the confusion.  I'm using SuperColumnFamily as a shortcut to
say "ColumnFamily of type 'Super'" which is what you'll find in the
configuration.

Your diagram was not incorrect.  It shows both Columns and SuperColumns
under the same family, which can't happen in practice, although logically
they do live under the same branch.  So it's sort of correct but confusing :)

Ted

Re: why have ColumnFamilies?

Posted by Alexandre Conrad <al...@gmail.com>.

2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
> KeySpace->Row->ColumnFamily->Column[name, value]
> (a two-level map)
>
> KeySpace->Row->SuperColumnFamily->SuperColumn[name]->Column[name, value]
> (a three-level map)

Thanks for the explanation. So this means that entities under
SuperColumnFamily can only be SuperColumns and ColumnFamilies can only
hold Columns.

I guess the term SuperColumnFamily (which I haven't seen documented)
is used when you want to implicitly say that SuperColumns are used
beneath.

-- 
Alex
twitter.com/alexconrad

Re: why have ColumnFamilies?

Posted by Ted Zlatanov <tz...@lifelogs.com>.

On Wed, 3 Mar 2010 14:43:14 +0100 Alexandre Conrad <al...@gmail.com> wrote: 

AC> 2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
>> I don't understand the advantages of ColumnFamilies over a
>> SuperColumnFamily with just one supercolumn.  Why have the former if the
>> latter is functionally equivalent?

AC> As far as I understand, here's how I organize Cassandra entities:

AC> http://paste.pocoo.org/show/185126/

AC> Is this somehow correct?

This was your diagram (fixed-width font required):

> KeySpace                            
>    |                                
>    +-- Row                          
>         |                           
>         +-- ColumnFamily            
>                  |                  
>                  +-- Column         
>                  |                  
>                  +-- SuperColumn    
>                          |          
>                          +-- Column 

That's incorrect.  Here's a (shorter, correct) version:

KeySpace->Row->ColumnFamily->Column[name, value]

(a two-level map)

KeySpace->Row->SuperColumnFamily->SuperColumn[name]->Column[name, value]

(a three-level map)

My point was that conceptually, the three-level map can express the
two-level map.

Ted
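
To make the two-level versus three-level comparison concrete, here
is a rough Python sketch using plain nested dicts. It is not
Cassandra code: the DEFAULT_SC name and the emulate_put/emulate_get
helpers are hypothetical, and timestamps are left out. It shows how
a three-level map with one fixed super column can stand in for a
two-level map:

```python
# Two-level map: row key -> column name -> value
standard_cf = {
    "user42": {"email": "alex@example.com", "city": "Paris"},
}

# Three-level map: row key -> super column name -> column name -> value
super_cf = {
    "user42": {
        "home": {"city": "Paris"},
        "work": {"city": "Lyon"},
    },
}

# Hypothetical name for the single super column used for emulation.
DEFAULT_SC = "default"


def emulate_put(cf, row_key, column, value):
    """Write a 'standard' column into a three-level map."""
    cf.setdefault(row_key, {}).setdefault(DEFAULT_SC, {})[column] = value


def emulate_get(cf, row_key, column):
    """Read a 'standard' column back from a three-level map."""
    return cf[row_key][DEFAULT_SC][column]
```

Every read and write pays for one extra lookup level, which is one
reason the two forms may not be interchangeable in practice even if
they are equivalent conceptually.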

Re: why have ColumnFamilies?

Posted by Alexandre Conrad <al...@gmail.com>.

2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
> I don't understand the advantages of ColumnFamilies over a
> SuperColumnFamily with just one supercolumn.  Why have the former if the
> latter is functionally equivalent?

Being pretty new to Cassandra's terminology, I'm not sure what
a SuperColumnFamily is. Or maybe this page is not complete:
http://wiki.apache.org/cassandra/DataModel

As far as I understand, here's how I organize Cassandra entities:

http://paste.pocoo.org/show/185126/

Is this somehow correct?

-- 
Alex
twitter.com/alexconrad

Re: why have ColumnFamilies?

Posted by Ted Zlatanov <tz...@lifelogs.com>.

On Wed, 3 Mar 2010 08:56:16 -0600 Jonathan Ellis <jb...@gmail.com> wrote: 

JE> I would rather move to a more flexible model ("as many levels of
JE> nesting as you want") than a less-flexible one.

That's very exciting.  I've often wished for "just one more level" while
putting Cassandra schemas together, so I hope this happens in a future
release.

Thanks
Ted

Re: why have ColumnFamilies?

Posted by Tatu Saloranta <ts...@gmail.com>.

On Wed, Mar 3, 2010 at 6:56 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> I would rather move to a more flexible model ("as many levels of
> nesting as you want") than a less-flexible one.

+1

This is one of the patterns that I have seen many times: providing for
"as many levels as you want" may not be more difficult than providing a
fixed number, but it offers many more possibilities. So it would be great
to see a move in this direction.

-+ Tatu +-
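
As a rough illustration of the "as many levels as you want" idea,
here is a small Python sketch of a map addressed by a path of keys.
It is only a toy: the put and get helpers are hypothetical and say
nothing about how Cassandra would store or expose such a model.

```python
def put(store, path, value):
    """Store value under a path of keys, creating levels as needed."""
    *parents, leaf = path
    node = store
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def get(store, path):
    """Follow a path of keys and return whatever is stored there."""
    node = store
    for key in path:
        node = node[key]
    return node


store = {}
put(store, ["user42", "email"], "alex@example.com")        # two levels
put(store, ["user42", "addr", "home", "city"], "Paris")    # four levels
```

The same two functions handle any depth, so the fixed two-level and
three-level shapes become special cases of one path-based lookup.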

Re: why have ColumnFamilies?

Posted by Jonathan Ellis <jb...@gmail.com>.

I would rather move to a more flexible model ("as many levels of
nesting as you want") than a less-flexible one.

2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
> On Wed, 3 Mar 2010 07:23:48 -0600 Jonathan Ellis <jb...@gmail.com> wrote:
>
> JE> 2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
>>> I don't understand the advantages of ColumnFamilies over a
>>> SuperColumnFamily with just one supercolumn.  Why have the former if the
>>> latter is functionally equivalent?
>
> JE> http://issues.apache.org/jira/browse/CASSANDRA-598
>
> So is there a vague plan to move to just SuperColumnFamilies once this
> is resolved?  The API can emulate ColumnFamilies, obviously.  Or are
> ColumnFamilies here to stay?
>
> Ted
>
>

Re: why have ColumnFamilies?

Posted by Ted Zlatanov <tz...@lifelogs.com>.

On Wed, 3 Mar 2010 07:23:48 -0600 Jonathan Ellis <jb...@gmail.com> wrote: 

JE> 2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
>> I don't understand the advantages of ColumnFamilies over a
>> SuperColumnFamily with just one supercolumn.  Why have the former if the
>> latter is functionally equivalent?

JE> http://issues.apache.org/jira/browse/CASSANDRA-598

So is there a vague plan to move to just SuperColumnFamilies once this
is resolved?  The API can emulate ColumnFamilies, obviously.  Or are
ColumnFamilies here to stay?

Ted

Re: why have ColumnFamilies?

Posted by Jonathan Ellis <jb...@gmail.com>.

http://issues.apache.org/jira/browse/CASSANDRA-598

2010/3/3 Ted Zlatanov <tz...@lifelogs.com>:
> I don't understand the advantages of ColumnFamilies over a
> SuperColumnFamily with just one supercolumn.  Why have the former if the
> latter is functionally equivalent?
>
> Thanks
> Ted
>
>