Posted to dev@cassandra.apache.org by Eric Evans <ee...@acunu.com> on 2011/12/20 23:56:13 UTC

CQL support for compound columns

There has been a discussion taking place in CASSANDRA-2474[1]
regarding the language and semantics of compound columns in CQL.
Though the issue was only opened in July, and despite extended periods
of inactivity, it is monstrously long.  Additionally, the discussion
necessarily includes inline visual aids (tables, graphics, and
verbatim code snippets) that are constantly being revised, which only
compounds (pun intended) the problem.  I feel that this is not only
making the discussion less constructive, but may also be scaring
people off (and IMO, this issue could stand to be discussed among a
larger group anyway).

I propose two things: 1) that we move the discussion to this mailing
list, and 2) that we track the various approaches in the wiki.

For the latter of these, I've stubbed out a page[2] that would
hopefully serve as a starting point.

Thoughts?


[1]: https://issues.apache.org/jira/browse/CASSANDRA-2474
[2]: http://wiki.apache.org/cassandra/Cassandra2474

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

https://issues.apache.org/jira/browse/CASSANDRA-3685

On Thu, Dec 29, 2011 at 6:34 PM, Eric Evans <ee...@acunu.com> wrote:
> On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> That's to allow defining column names that are not text/utf8.  So you
>> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
>> actual 128-bit uuid binary value internally, not its string
>> representation.  Put another way, this would affect the CqlMetadata
>> name_types map.
>>
>> However, we already have the "column names are always strings"
>> limitations with existing CQL DDL so it probably makes more sense to
>> consider it separately from transposition.
>
> Right, and to get a jump on that bikeshedding I'd propose that look
> something like:
>
> CREATE TABLE test (
>   int(10) text,
>   uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
> )
>
> or...
>
> CREATE TABLE test (
>   (int)10 text,
>   (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
> )
>
> But I digress, that's probably best left for another issue and another time. :)
>
>
>> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans <ee...@acunu.com> wrote:
>>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>> Gamma proposal update:
>>>>
>>>> The more I think about it the less happy I am with omitting support
>>>> for sparse columns.  Remember that dense composites may only be
>>>> inserted and deleted, not updated, since they are just a tuple of
>>>> values with "column names" determined by schema and/or convention.
>>>>
>>>> I think we can support sparse columns well in a way that improves the
>>>> conceptual integrity for the dense composites as well:
>>>>
>>>> {code}
>>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>>> -- posted_at, 'value': blob)
>>>> CREATE TABLE timeline (
>>>>   user_id int,
>>>>   posted_at uuid,
>>>>   column string,
>>>>   value blob,
>>>>   PRIMARY KEY(user_id, posted_at)
>>>> ) TRANSPOSED;
>>>>
>>>> -- entire transposed row is stored as a single dense composite column
>>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>>>> -- composite column's value is unused in this case.
>>>> CREATE TABLE events (
>>>>   series text,
>>>>   ts1 int,
>>>>   cat text,
>>>>   subcat text,
>>>>   "1337" uuid,
>>>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>>> {code}
>>>
>>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>>> (or link to a previous description if I missed it)?
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Thu, Dec 29, 2011 at 3:44 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> That's to allow defining column names that are not text/utf8.  So you
> could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
> actual 128-bit uuid binary value internally, not its string
> representation.  Put another way, this would affect the CqlMetadata
> name_types map.
>
> However, we already have the "column names are always strings"
> limitations with existing CQL DDL so it probably makes more sense to
> consider it separately from transposition.

Right, and to get a jump on that bikeshedding I'd propose that look
something like:

CREATE TABLE test (
   int(10) text,
   uuid(92d21d0a-d6cb-437c-9d3f-b67aa733a19f) bigint
)

or...

CREATE TABLE test (
   (int)10 text,
   (uuid)92d21d0a-d6cb-437c-9d3f-b67aa733a19f bigint
)

But I digress, that's probably best left for another issue and another time. :)


> On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans <ee...@acunu.com> wrote:
>> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> Gamma proposal update:
>>>
>>> The more I think about it the less happy I am with omitting support
>>> for sparse columns.  Remember that dense composites may only be
>>> inserted and deleted, not updated, since they are just a tuple of
>>> values with "column names" determined by schema and/or convention.
>>>
>>> I think we can support sparse columns well in a way that improves the
>>> conceptual integrity for the dense composites as well:
>>>
>>> {code}
>>> -- "column" and "value" are sparse; a transposed row will be stored as
>>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>>> -- posted_at, 'value': blob)
>>> CREATE TABLE timeline (
>>>   user_id int,
>>>   posted_at uuid,
>>>   column string,
>>>   value blob,
>>>   PRIMARY KEY(user_id, posted_at)
>>> ) TRANSPOSED;
>>>
>>> -- entire transposed row is stored as a single dense composite column
>>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>>> -- composite column's value is unused in this case.
>>> CREATE TABLE events (
>>>   series text,
>>>   ts1 int,
>>>   cat text,
>>>   subcat text,
>>>   "1337" uuid,
>>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>>> {code}
>>
>> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
>> (or link to a previous description if I missed it)?

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

That's to allow defining column names that are not text/utf8.  So you
could have column name "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" be an
actual 128-bit uuid binary value internally, not its string
representation.  Put another way, this would affect the CqlMetadata
name_types map.

However, we already have the "column names are always strings"
limitations with existing CQL DDL so it probably makes more sense to
consider it separately from transposition.
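
To make the binary-versus-string distinction concrete, here is a minimal Python sketch. It is illustrative only: it uses the standard `uuid` module and no Cassandra API, and the variable names are made up for the example. It compares what a text/utf8 column name would hold with what a uuid-typed column name would hold.

```python
import uuid

name_text = "92d21d0a-d6cb-437c-9d3f-b67aa733a19f"

# What a text/utf8 column name would store: the 36-character string.
as_string = name_text.encode("utf-8")

# What a uuid-typed column name would store: the raw 128-bit value.
as_binary = uuid.UUID(name_text).bytes

print(len(as_string))  # 36
print(len(as_binary))  # 16
```

The two forms identify the same UUID, but they are different byte sequences, so they would compare and sort differently as column names.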

On Thu, Dec 29, 2011 at 3:22 PM, Eric Evans <ee...@acunu.com> wrote:
> On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> Gamma proposal update:
>>
>> The more I think about it the less happy I am with omitting support
>> for sparse columns.  Remember that dense composites may only be
>> inserted and deleted, not updated, since they are just a tuple of
>> values with "column names" determined by schema and/or convention.
>>
>> I think we can support sparse columns well in a way that improves the
>> conceptual integrity for the dense composites as well:
>>
>> {code}
>> -- "column" and "value" are sparse; a transposed row will be stored as
>> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
>> -- posted_at, 'value': blob)
>> CREATE TABLE timeline (
>>   user_id int,
>>   posted_at uuid,
>>   column string,
>>   value blob,
>>   PRIMARY KEY(user_id, posted_at)
>> ) TRANSPOSED;
>>
>> -- entire transposed row is stored as a single dense composite column
>> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
>> -- composite column's value is unused in this case.
>> CREATE TABLE events (
>>   series text,
>>   ts1 int,
>>   cat text,
>>   subcat text,
>>   "1337" uuid,
>>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
>> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
>> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
>> {code}
>
> Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
> (or link to a previous description if I missed it)?
>
>> Thus, columns included in the (transposed) primary key will be
>> "dense," and not updateable, which conforms to our existing practice
>> that keys are not updateable.  Remaining columns will be updateable
>> since they will each map to a separate physical column.
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Wed, Dec 28, 2011 at 1:05 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> Gamma proposal update:
>
> The more I think about it the less happy I am with omitting support
> for sparse columns.  Remember that dense composites may only be
> inserted and deleted, not updated, since they are just a tuple of
> values with "column names" determined by schema and/or convention.
>
> I think we can support sparse columns well in a way that improves the
> conceptual integrity for the dense composites as well:
>
> {code}
> -- "column" and "value" are sparse; a transposed row will be stored as
> -- two columns of (user_id, posted_at, 'column': string) and (user_id,
> -- posted_at, 'value': blob)
> CREATE TABLE timeline (
>   user_id int,
>   posted_at uuid,
>   column string,
>   value blob,
>   PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED;
>
> -- entire transposed row is stored as a single dense composite column
> -- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
> -- composite column's value is unused in this case.
> CREATE TABLE events (
>   series text,
>   ts1 int,
>   cat text,
>   subcat text,
>   "1337" uuid,
>   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
>   PRIMARY KEY(series, ts1, cat, subcat, "1337",
> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
> ) TRANSPOSED WITH COLUMN NAMES ("1337" int,
> "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
> {code}

Could you explain what this "TRANSPOSED WITH COLUMN NAMES" syntax does
(or link to a previous description if I missed it)?

> Thus, columns included in the (transposed) primary key will be
> "dense," and not updateable, which conforms to our existing practice
> that keys are not updateable.  Remaining columns will be updateable
> since they will each map to a separate physical column.



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Fri, Dec 30, 2011 at 11:12 AM, Rick Shaw <wf...@gmail.com> wrote:
> +1 on Gamma
> +1 on having the capability to specify a value.
>
> My only reservation is the choice of the keyword "TABLE", which is going to
> be a source of continued confusion.

TABLE is an alias for COLUMNFAMILY (pretty much always has been); it's
there because it seemed cruel and unusual to penalize people with that
muscle memory.

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Rick Shaw <wf...@gmail.com>.

+1 on Gamma
+1 on having the capability to specify a value.

My only reservation is the choice of the keyword "TABLE", which is going to
be a source of continued confusion.

On Fri, Dec 30, 2011 at 11:58 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> I think we're closing in on something workable.
>
> Dropping TRANSPOSED from Gamma as redundant with respect to the
> composite PRIMARY KEY definition.
>
> Should we support column values in non-sparse rows by adding a
> VALUE(column_name) section?
>
> CREATE TABLE timeline (
>    user_id int,
>    posted_at uuid,
>    body string,
>    posted_by string,
>    PRIMARY KEY(user_id, posted_at, posted_by),
>    VALUE(body)
> );
>
> (Open to better suggestions for that keyword.)
>
>
> On Thu, Dec 29, 2011 at 3:13 PM, Eric Evans <ee...@acunu.com> wrote:
> > On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis <jb...@gmail.com>
> wrote:
> >> I've updated the wiki page at
> >> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
> >> Background section that hopefully clears up where I'm going with this
> >> sparse/dense business.
> >>
> >> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
> >> implicitly using the first element of PRIMARY KEY as the row key.  We
> >> could make it explicit with another WITH option to the TRANSPOSED
> >> clause:
> >>
> >> {{{
> >> CREATE TABLE timeline (
> >>    user_id int,
> >>    posted_at uuid,
> >>    column string,
> >>    value blob,
> >>    PRIMARY KEY(user_id, posted_at)
> >> ) TRANSPOSED WITH ROW KEY(user_id)
> >> }}}
> >>
> >> This makes things more verbose (this would be a required clause) but
> >> I'm okay with that if consensus is that being explicit here is better.
> >
> > I think that was a reaction to an earlier iteration.  Assuming that
> > the only place where order matters is in that primary key definition,
> > then I think it makes sense without the "... WITH ROW KEY..." bit.
> >
> >
> >
> > --
> > Eric Evans
> > Acunu | http://www.acunu.com | @acunu
>
>
>
>  --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

Re: CQL support for compound columns

Posted by Sylvain Lebresne <sy...@datastax.com>.

I had complaints that my preceding mail was unreadable (thanks gmail for
fucking my formatting up), so I've posted the same thing with nice
formatting on the JIRA ticket.

--
Sylvain

On Tue, Jan 3, 2012 at 7:08 PM, Sylvain Lebresne <sy...@datastax.com> wrote:
> Ok, I think I'm warming up to what we're getting at. I would change
> the syntax of the VALUE() thing however. Instead of:
>
> CREATE TABLE timeline (
>     userid int,
>     posted_at uuid,
>     body string,
>     PRIMARY KEY (userid, posted_at),
>     VALUE(body)
> )
>
> I would prefer:
>
> CREATE COMPACT TABLE timeline (
>     userid int,
>     posted_at uuid,
>     body string,
>     PRIMARY KEY (userid, posted_at)
> )
>
> The reason being that it really influences the implementation layout
> of the CF in C*. Namely, the non-compact CF defined by
>
> CREATE TABLE timeline (
>     userid int,
>     posted_at uuid,
>     body string,
>     PRIMARY KEY (userid, posted_at)
> )
>
> would look in C* like:
>
> <userid> : {
>     <posted_at>:'body' -> <value>
> }
>
> while the COMPACT variant would be:
>
> <userid> : {
>     <posted_at> -> <value>
> }
>
> which uses the fact that there is only 1 field not part of the key to
> "optimize" the layout. And I believe making COMPACT part of the CREATE
> emphasizes better that it's a property of the definition itself (that
> cannot be changed) rather than of that specific 'body' field. It also
> makes the rule for COMPACT tables rather simple: "a compact table should
> have only one field not part of the primary key"; you don't have to deal
> with errors like someone defining two VALUE() for instance.
>
> That being said, I'd like to try to summarize where we're at (including
> the COMPACT change above) and add a few random ideas along the way.
> Please correct me if I've got something wrong.
>
> I think we have 4 different cases, 2 for 'standard' CFs without
> composites:
> - static CFs (the only case CQL handles really well today)
> - dynamic CFs (wide rows, time series if you prefer)
> and 2 for CFs with composite column names:
> - 'dense' composites (typically time series but where the key is
>   naturally multi-part)
> - 'sparse' composites (aka super columns)
>
> Let me try to give an example for each, with how it would translate
> internally and example queries.
>
> Standard "static" CF
> --------------------
>
> "For each user, holds his info"
>
> CREATE TABLE users (
>     userid uuid PRIMARY KEY,
>     firstname text,
>     lastname text,
>     age int
> )
>
> In C*:
> <userid> : {
>     'firstname' -> <value>
>     'lastname' -> <value>
>     'age' -> <value>
> }
>
> Query:
> SELECT firstname, lastname FROM users WHERE userid = '...';
>
> Standard "dynamic" CF
> ---------------------
>
> "For each user, keep each url he clicked on with the date of last click"
>
> CREATE COMPACT TABLE clicks (
>     userid uuid,
>     url text,
>     timestamp date,
>     PRIMARY KEY (userid, url)
> )
>
> In C*:
> <userid> : {
>     <url> -> <timestamp>
> }
>
> Query:
> SELECT url, timestamp FROM clicks WHERE userid = '...';
> SELECT timestamp FROM clicks WHERE userid = '...' and url = 'http://...';
>
> 'dense' composite
> -----------------
>
> "For each user, keep ip and port from where he connected with the date
> of last connection"
>
> CREATE COMPACT TABLE connections (
>     userid uuid,
>     ip binary,
>     port int,
>     timestamp date,
>     PRIMARY KEY (userid, ip, port)
> )
>
> In C*:
> <userid> : {
>     <ip>:<port> -> <timestamp>
> }
>
> Query:
> SELECT ip, port, timestamp FROM connections WHERE userid = '...';
>
> 'sparse' composite
> ------------------
>
> "User timeline"
>
> CREATE TABLE timeline (
>     userid uuid,
>     posted_at date,
>     body text,
>     posted_by text,
>     PRIMARY KEY (userid, posted_at)
> );
>
> In C*:
> <userid> : {
>     <posted_at>:'body' -> <value>
>     <posted_at>:'posted_by' -> <value>
> }
>
> Query:
> SELECT body, posted_by FROM timeline WHERE userid = '...' and
> posted_at = '2 January 2010'
>
> Note: I think we really should also be able to do queries like:
>
> SELECT posted_at, body, posted_by FROM timeline WHERE userid = '...' and
> posted_at > '2 January 2010'
>
> but that's more akin to the modification of the syntax for slices.
>
> Random other ideas
> ------------------
>
> 1) We could allow something like:
>     CONSTRAINT key PRIMARY KEY (userid, ip, port)
>   which would then allow writing
>     SELECT timestamp FROM connections WHERE key = ('...', 192.168.0.1, 80);
>   (I believe this is the 'standard' notation to name a 'composite' key in SQL)
>
> 2) Above we're only handling the use of composites for column names, but
>   they can be useful for values (and row keys) and it could be nice to
>   have an easy notation for that (clearly a following ticket however).
>   What about:
>
> CREATE COMPACT TABLE timeline (
>     userid_part1 text,
>     userid_part2 int,
>     posted_at date,
>     posted_by uuid,
>     body text,
>     header text,
>     GROUP (userid_part1, userid_part2) AS userid,
>     PRIMARY KEY (userid, posted_at, posted_by),
>     GROUP (header, body)
> )
>
> In C*:
> <userid_part1>:<userid_part2> : {
>     <posted_at>:<posted_by> -> <header>:<body>
> }
>
> Query:
> SELECT posted_at, posted_by, body, header FROM timeline WHERE
> userid = ('john', 32)
>
> --
> Sylvain
> On Mon, Jan 2, 2012 at 8:29 PM, Eric Evans <ee...@acunu.com> wrote:
>> On Mon, Jan 2, 2012 at 12:55 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> On Mon, Jan 2, 2012 at 10:53 AM, Eric Evans <ee...@acunu.com> wrote:
>>>> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
>>>> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
>>>> applied to a KEY() function.  It's also a little strange the way it
>>>> appears in the grouping of column specs, when it's actually defining a
>>>> grouping or relationship of them (maybe this is what you meant about
>>>> using TRANSPOSED WITH <options> to emphasize the non-standard).
>>>
>>> Fear not, I can set your mind at ease. :)
>>>
>>> Personally I think the syntax works reasonably well in its own right,
>>> but my main reason for the proposed syntax is that it is actually
>>> standard SQL for composite primary keys at least as far back as SQL
>>> 92, as a subcategory of table constraints.  The SQL standard is not
>>> freely linkable, but see
>>> http://www.postgresql.org/docs/9.1/static/sql-createtable.html for a
>>> real-world example.
>>
>> OK, I stand corrected (and my mind is at ease :) ).
>>
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Sylvain Lebresne <sy...@datastax.com>.

Ok, I think I'm warming up to what we're getting at. I would change
the syntax of the VALUE() thing however. Instead of:

CREATE TABLE timeline (
    userid int,
    posted_at uuid,
    body string,
    PRIMARY KEY (userid, posted_at),
    VALUE(body)
)

I would prefer:

CREATE COMPACT TABLE timeline (
    userid int,
    posted_at uuid,
    body string,
    PRIMARY KEY (userid, posted_at)
)

The reason being that it really influences the implementation layout
of the CF in C*. Namely, the non-compact CF defined by

CREATE TABLE timeline (
    userid int,
    posted_at uuid,
    body string,
    PRIMARY KEY (userid, posted_at)
)

would look in C* like:

<userid> : {
    <posted_at>:'body' -> <value>
}

while the COMPACT variant would be:

<userid> : {
    <posted_at> -> <value>
}

which uses the fact that there is only 1 field not part of the key to
"optimize" the layout. And I believe making COMPACT part of the CREATE
emphasizes better that it's a property of the definition itself (that
cannot be changed) rather than of that specific 'body' field. It also
makes the rule for COMPACT tables rather simple: "a compact table should
have only one field not part of the primary key"; you don't have to deal
with errors like someone defining two VALUE() for instance.
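
The compact and non-compact layouts described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `internal_layout` is a hypothetical helper, not anything in Cassandra, and it represents a composite column name as a plain tuple.

```python
def internal_layout(row, key_columns, compact=False):
    """Map one CQL row (a dict) to (row_key, {column_name: value}).

    key_columns[0] is the row key; the remaining key columns form the
    leading components of the (composite) column name.
    """
    row_key = row[key_columns[0]]
    prefix = tuple(row[c] for c in key_columns[1:])
    fields = [c for c in row if c not in key_columns]
    if compact:
        # The rule quoted above: exactly one field outside the primary key.
        if len(fields) != 1:
            raise ValueError("a compact table needs exactly one non-key field")
        return row_key, {prefix: row[fields[0]]}
    # Non-compact: one physical column per field, with the field name
    # appended as the last component of the column name.
    return row_key, {prefix + (f,): row[f] for f in fields}


row = {"userid": 42, "posted_at": "2012-01-03", "body": "hello"}
key = ["userid", "posted_at"]

print(internal_layout(row, key))
# (42, {('2012-01-03', 'body'): 'hello'})
print(internal_layout(row, key, compact=True))
# (42, {('2012-01-03',): 'hello'})
```

In this model the compact form drops the trailing 'body' component, which is why it only works when a single non-key field exists.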

That being said, I'd like to try to summarize where we're at (including
the COMPACT change above) and add a few random ideas along the way.
Please correct me if I've got something wrong.

I think we have 4 different cases, 2 for 'standard' CFs without
composites:
- static CFs (the only case CQL handles really well today)
- dynamic CFs (wide rows, time series if you prefer)
and 2 for CFs with composite column names:
- 'dense' composites (typically time series but where the key is
  naturally multi-part)
- 'sparse' composites (aka super columns)

Let me try to give an example for each, with how it would translate
internally and example queries.

Standard "static" CF
--------------------

"For each user, holds his info"

CREATE TABLE users (
    userid uuid PRIMARY KEY,
    firstname text,
    lastname text,
    age int
)

In C*:
<userid> : {
    'firstname' -> <value>
    'lastname' -> <value>
    'age' -> <value>
}

Query:
SELECT firstname, lastname FROM users WHERE userid = '...';

Standard "dynamic" CF
---------------------

"For each user, keep each url he clicked on with the date of last click"

CREATE COMPACT TABLE clicks (
    userid uuid,
    url text,
    timestamp date,
    PRIMARY KEY (userid, url)
)

In C*:
<userid> : {
    <url> -> <timestamp>
}

Query:
SELECT url, timestamp FROM clicks WHERE userid = '...';
SELECT timestamp FROM clicks WHERE userid = '...' and url = 'http://...';

'dense' composite
-----------------

"For each user, keep ip and port from where he connected with the date
of last connection"

CREATE COMPACT TABLE connections (
    userid uuid,
    ip binary,
    port int,
    timestamp date,
    PRIMARY KEY (userid, ip, port)
)

In C*:
<userid> : {
    <ip>:<port> -> <timestamp>
}

Query:
SELECT ip, port, timestamp FROM connections WHERE userid = '...';

'sparse' composite
------------------

"User timeline"

CREATE TABLE timeline (
    userid uuid,
    posted_at date,
    body text,
    posted_by text,
    PRIMARY KEY (userid, posted_at)
);

In C*:
<userid> : {
    <posted_at>:'body' -> <value>
    <posted_at>:'posted_by' -> <value>
}

Query:
SELECT body, posted_by FROM timeline WHERE userid = '...' and
posted_at = '2 January 2010'

Note: I think we really should also be able to do queries like:

SELECT posted_at, body, posted_by FROM timeline WHERE userid = '...' and
posted_at > '2 January 2010'

but that's more akin to the modification of the syntax for slices.
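
The `posted_at > ...` query in the note above amounts to a slice over the sorted composite column names of one internal row. Here is a rough Python sketch of that idea, under explicit assumptions: dates are stood in for by integers such as 20100102, column names are (posted_at, field) tuples kept in sorted order, and `slice_after` is a made-up helper rather than a Cassandra call.

```python
from bisect import bisect_right

# One internal row of the sparse "timeline" CF: composite column names
# (posted_at, field), held in sorted order.
columns = sorted({
    (20100101, "body"): "first post",
    (20100101, "posted_by"): "alice",
    (20100102, "body"): "second post",
    (20100102, "posted_by"): "bob",
    (20100103, "body"): "third post",
    (20100103, "posted_by"): "carol",
}.items())


def slice_after(columns, posted_at):
    """Return CQL-style rows whose posted_at is strictly greater than the bound."""
    dates = [name[0] for name, _ in columns]
    # First physical column whose leading component exceeds the bound.
    start = bisect_right(dates, posted_at)
    rows = {}
    for (ts, field), value in columns[start:]:
        # Regroup the physical columns sharing a posted_at into one row.
        rows.setdefault(ts, {"posted_at": ts})[field] = value
    return list(rows.values())


print(slice_after(columns, 20100102))
# [{'posted_at': 20100103, 'body': 'third post', 'posted_by': 'carol'}]
```

Because the columns are already sorted by their first component, the inequality turns into one binary search plus a contiguous read.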

Random other ideas
------------------

1) We could allow something like:
    CONSTRAINT key PRIMARY KEY (userid, ip, port)
  which would then allow writing
    SELECT timestamp FROM connections WHERE key = ('...', 192.168.0.1, 80);
  (I believe this is the 'standard' notation to name a 'composite' key in SQL)

2) Above we're only handling the use of composites for column names, but
  they can be useful for values (and row keys) and it could be nice to
  have an easy notation for that (clearly a following ticket however).
  What about:

CREATE COMPACT TABLE timeline (
    userid_part1 text,
    userid_part2 int,
    posted_at date,
    posted_by uuid,
    body text,
    header text,
    GROUP (userid_part1, userid_part2) AS userid,
    PRIMARY KEY (userid, posted_at, posted_by),
    GROUP (header, body)
)

In C*:
<userid_part1>:<userid_part2> : {
    <posted_at>:<posted_by> -> <header>:<body>
}

Query:
SELECT posted_at, posted_by, body, header FROM timeline WHERE
userid = ('john', 32)

--
Sylvain
On Mon, Jan 2, 2012 at 8:29 PM, Eric Evans <ee...@acunu.com> wrote:
> On Mon, Jan 2, 2012 at 12:55 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> On Mon, Jan 2, 2012 at 10:53 AM, Eric Evans <ee...@acunu.com> wrote:
>>> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
>>> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
>>> applied to a KEY() function.  It's also a little strange the way it
>>> appears in the grouping of column specs, when it's actually defining a
>>> grouping or relationship of them (maybe this is what you meant about
>>> using TRANSPOSED WITH <options> to emphasize the non-standard).
>>
>> Fear not, I can set your mind at ease. :)
>>
>> Personally I think the syntax works reasonably well in its own right,
>> but my main reason for the proposed syntax is that it is actually
>> standard SQL for composite primary keys at least as far back as SQL
>> 92, as a subcategory of table constraints.  The SQL standard is not
>> freely linkable, but see
>> http://www.postgresql.org/docs/9.1/static/sql-createtable.html for a
>> real-world example.
>
> OK, I stand corrected (and my mind is at ease :) ).
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Mon, Jan 2, 2012 at 12:55 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Mon, Jan 2, 2012 at 10:53 AM, Eric Evans <ee...@acunu.com> wrote:
>> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
>> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
>> applied to a KEY() function.  It's also a little strange the way it
>> appears in the grouping of column specs, when it's actually defining a
>> grouping or relationship of them (maybe this is what you meant about
>> using TRANSPOSED WITH <options> to emphasize the non-standard).
>
> Fear not, I can set your mind at ease. :)
>
> Personally I think the syntax works reasonably well in its own right,
> but my main reason for the proposed syntax is that it is actually
> standard SQL for composite primary keys at least as far back as SQL
> 92, as a subcategory of table constraints.  The SQL standard is not
> freely linkable, but see
> http://www.postgresql.org/docs/9.1/static/sql-createtable.html for a
> real-world example.

OK, I stand corrected (and my mind is at ease :) ).


-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

On Mon, Jan 2, 2012 at 10:53 AM, Eric Evans <ee...@acunu.com> wrote:
> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
> applied to a KEY() function.  It's also a little strange the way it
> appears in the grouping of column specs, when it's actually defining a
> grouping or relationship of them (maybe this is what you meant about
> using TRANSPOSED WITH <options> to emphasize the non-standard).

Fear not, I can set your mind at ease. :)

Personally I think the syntax works reasonably well in its own right,
but my main reason for the proposed syntax is that it is actually
standard SQL for composite primary keys at least as far back as SQL
92, as a subcategory of table constraints.  The SQL standard is not
freely linkable, but see
http://www.postgresql.org/docs/9.1/static/sql-createtable.html for a
real-world example.
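
As a quick check of the table-constraint form, the sketch below runs it against SQLite through Python's standard `sqlite3` module. SQLite is a stand-in here (the link above is PostgreSQL), and the column types and sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE timeline (
        user_id   INTEGER,
        posted_at TEXT,
        posted_by TEXT,
        body      TEXT,
        PRIMARY KEY (user_id, posted_at, posted_by)
    )
""")

insert = "INSERT INTO timeline VALUES (?, ?, ?, ?)"
conn.execute(insert, (1, "2012-01-02", "alice", "first"))
# Same user_id, different posted_at: a distinct key, so this is accepted.
conn.execute(insert, (1, "2012-01-03", "alice", "second"))

try:
    # Repeats the full (user_id, posted_at, posted_by) tuple: rejected.
    conn.execute(insert, (1, "2012-01-02", "alice", "duplicate"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM timeline").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

Uniqueness applies to the whole tuple, so many rows can share a user_id as long as the remaining key components differ.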

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Edward Capriolo <ed...@gmail.com>.

Maybe the ship on this has sailed, but I am a bit miffed about "create
table". CQL is going out of its way to make things easy for people. But
if someone does not understand the concept of a column family, making it
easy for them to design something that is an anti-pattern seems odd to me.

As an admin I have been called many times to troubleshoot database
performance issues. It sometimes boils down to a bad schema choice. At
later/production stages these become hard to dig out of. It usually takes
more hardware, converting GB or TB of data, and application cutovers.

I do not call "column families" "tables". If someone newer to Cassandra did
I would correct them. Why not call Java references pointers? I hate being
ambiguous on key terminology.



On Mon, Jan 2, 2012 at 11:53 AM, Eric Evans <ee...@acunu.com> wrote:

> On Sat, Dec 31, 2011 at 1:12 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> > On Fri, Dec 30, 2011 at 12:30 PM, Eric Evans <ee...@acunu.com> wrote:
> >>> CREATE TABLE timeline (
> >>>    user_id int,
> >>>    posted_at uuid,
> >>>    body string,
> >>>    posted_by string,
> >>>    PRIMARY KEY(user_id, posted_at, posted_by),
> >>>    VALUE(body)
> >>> );
> >>
> >> I think the value declaration also helps in that it's one more thing
> >> that provides cues as to the data model it creates (more expressive).
> >> But this got me thinking, why not introduce something special for the
> >> composite name as well?  That way the PRIMARY KEY syntax (which comes
> >> preloaded with meaning and expectations) could be kept more SQLish,
> >> and the whole thing looks more like an extension to the language as
> >> opposed to a modification.
> >>
> >> Say:
> >>
> >> CREATE TABLE timeline (
> >>  user_id int PRIMARY KEY,
> >>  posted_at uuid,
> >>  body text,
> >>  posted_by text,
> >>  COMPOSITE_NAME(posted_at, posted_by),
> >>  COMPOSITE_VALUE(body)
> >> )
> >
> > I went back and forth on this mentally, but I come down as -0 on CN
> > instead of PK.  For two reasons:
> >
> > First, the composite PRIMARY KEY is a better description of what you
> > can actually do with the data.  In a relational model, a PK of user_id
> > means there is only one (user_id, posted_at, body, posted_by) row with
> > a given user_id.  Which is not the case here.  PK = (row key +
> > composite components) captures exactly what is "immutable and unique"
> > in a given object, so it's actually exactly what it's meant for and
> > not an abuse at all.  (It even fits nicely with the "queries involving
> > the PK are always indexed" assumption that isn't required by the SQL
> > standard but every other database does anyway because it makes the
> > most sense.)
>
> Yeah, you're right, PK is a better fit for this.
>
> Now that I'm forced to think about it a bit more, I think my un-SQL
> reaction is probably rooted more in the abuse of the PRIMARY KEY
> syntax, than the meaning it conveys.
>
> In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
> KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
> applied to a KEY() function.  It's also a little strange the way it
> appears in the grouping of column specs, when it's actually defining a
> grouping or relationship of them (maybe this is what you meant about
> using TRANSPOSED WITH <options> to emphasize the non-standard).
>
> I wonder if there isn't a way to keep the PRIMARY KEY connection while
> making it a little more SQL (and hence more intuitive).  Maybe
> something like:
>
>
> CREATE TABLE timeline (
>  (user_id int, posted_at uuid, posted_by text) PRIMARY KEY,
>  body text
> )
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Sat, Dec 31, 2011 at 1:12 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Fri, Dec 30, 2011 at 12:30 PM, Eric Evans <ee...@acunu.com> wrote:
>>> CREATE TABLE timeline (
>>>    user_id int,
>>>    posted_at uuid,
>>>    body string,
>>>    posted_by string,
>>>    PRIMARY KEY(user_id, posted_at, posted_by),
>>>    VALUE(body)
>>> );
>>
>> I think the value declaration also helps in that it's one more thing
>> that provides cues as to the data model it creates (more expressive).
>> But this got me thinking, why not introduce something special for the
>> composite name as well?  That way the PRIMARY KEY syntax (which comes
>> preloaded with meaning and expectations) could be kept more SQLish,
>> and the whole thing looks more like an extension to the language as
>> opposed to a modification.
>>
>> Say:
>>
>> CREATE TABLE timeline (
>>  user_id int PRIMARY KEY,
>>  posted_at uuid,
>>  body text,
>>  posted_by text,
>>  COMPOSITE_NAME(posted_at, posted_by),
>>  COMPOSITE_VALUE(body)
>> )
>
> I went back and forth on this mentally, but I come down as -0 on CN
> instead of PK.  For two reasons:
>
> First, the composite PRIMARY KEY is a better description of what you
> can actually do with the data.  In a relational model, a PK of user_id
> means there is only one (user_id, posted_at, body, posted_by) row with
> a given user_id.  Which is not the case here.  PK = (row key +
> composite components) captures exactly what is "immutable and unique"
> in a given object, so it's actually exactly what it's meant for and
> not an abuse at all.  (It even fits nicely with the "queries involving
> the PK are always indexed" assumption that isn't required by the SQL
> standard but every other database does anyway because it makes the
> most sense.)

Yeah, you're right, PK is a better fit for this.

Now that I'm forced to think about it a bit more, I think my un-SQL
reaction is probably rooted more in the abuse of the PRIMARY KEY
syntax, than the meaning it conveys.

In SQL, PRIMARY KEY is a modifier to a column spec, and here PRIMARY
KEY(user_id, posted_at, posted_by) reads like a PRIMARY modifier
applied to a KEY() function.  It's also a little strange the way it
appears in the grouping of column specs, when it's actually defining a
grouping or relationship of them (maybe this is what you meant about
using TRANSPOSED WITH <options> to emphasize the non-standard).

I wonder if there isn't a way to keep the PRIMARY KEY connection while
making it a little more SQL (and hence more intuitive).  Maybe
something like:


CREATE TABLE timeline (
  (user_id int, posted_at uuid, posted_by text) PRIMARY KEY,
  body text
)


-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

On Fri, Dec 30, 2011 at 12:30 PM, Eric Evans <ee...@acunu.com> wrote:
>> CREATE TABLE timeline (
>>    user_id int,
>>    posted_at uuid,
>>    body string,
>>    posted_by string,
>>    PRIMARY KEY(user_id, posted_at, posted_by),
>>    VALUE(body)
>> );
>
> I think the value declaration also helps in that it's one more thing
> that provides cues as to the data model it creates (more expressive).
> But this got me thinking, why not introduce something special for the
> composite name as well?  That way the PRIMARY KEY syntax (which comes
> preloaded with meaning and expectations) could be kept more SQLish,
> and the whole thing looks more like an extension to the language as
> opposed to a modification.
>
> Say:
>
> CREATE TABLE timeline (
>  user_id int PRIMARY KEY,
>  posted_at uuid,
>  body text,
>  posted_by text,
>  COMPOSITE_NAME(posted_at, posted_by),
>  COMPOSITE_VALUE(body)
> )

I went back and forth on this mentally, but I come down as -0 on CN
instead of PK.  For two reasons:

First, the composite PRIMARY KEY is a better description of what you
can actually do with the data.  In a relational model, a PK of user_id
means there is only one (user_id, posted_at, body, posted_by) row with
a given user_id.  Which is not the case here.  PK = (row key +
composite components) captures exactly what is "immutable and unique"
in a given object, so it's actually exactly what it's meant for and
not an abuse at all.  (It even fits nicely with the "queries involving
the PK are always indexed" assumption that isn't required by the SQL
standard but every other database does anyway because it makes the
most sense.)

The only place where we do violence to relational expectations by
using PK this way is that "insert should raise an error if there's
already a row with that PK" (instead of updating that row the way we
do).  But, this is already a quirk we inflict on the use of PK, and I
don't think it's a big deal.
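
As a rough illustration of that difference, here is a minimal Python sketch. It is not Cassandra code: the class names StrictTable and UpsertTable and the in-memory dict storage are hypothetical, chosen only to contrast "insert raises on a duplicate PK" with "insert overwrites the existing row".

```python
class StrictTable:
    """Relational expectation: INSERT fails if the primary key already exists."""

    def __init__(self, primary_key):
        self.primary_key = primary_key  # tuple of column names
        self.rows = {}                  # PK tuple -> row dict

    def _key(self, row):
        return tuple(row[c] for c in self.primary_key)

    def insert(self, row):
        key = self._key(row)
        if key in self.rows:
            raise ValueError(f"duplicate primary key: {key}")
        self.rows[key] = dict(row)


class UpsertTable(StrictTable):
    """The quirk described above: INSERT silently replaces the existing row."""

    def insert(self, row):
        self.rows[self._key(row)] = dict(row)


pk = ("user_id", "posted_at", "posted_by")
first = {"user_id": 1, "posted_at": "t1", "posted_by": "alice", "body": "hello"}
second = {"user_id": 1, "posted_at": "t1", "posted_by": "alice", "body": "edited"}

upsert = UpsertTable(pk)
upsert.insert(first)
upsert.insert(second)  # same PK: overwrites, no error

strict = StrictTable(pk)
strict.insert(first)
try:
    strict.insert(second)  # same PK: rejected
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

Both tables end up with a single row for that key; the only difference is whether the second insert is treated as an error or as an update.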

Second, it feels like this bleeds too much of the implementation into
the DDL.  What if we move to a storage engine like CASSANDRA-678 and
we represent this as more like the traditional relational rows than
composite columns?  Unlikely, I know, but the spirit of SQL is to
specify what you want to do and let the engine figure out how to do it
most efficiently.

Following the second line of reasoning made me realize that using
TRANSPOSED WITH <options> for things besides the PRIMARY KEY
definition [i.e., the COMPOSITE VALUE option] has the advantage of
emphasizing that it's a non-standard option that allows the
implementation to bleed through into the DDL.  But, I can see the
advantage in regularity of having COMPOSITE VALUE adjacent to the PK
definition as well.  So I'm fine either way, but I wanted to point
that out.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Fri, Dec 30, 2011 at 10:58 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> I think we're closing in on something workable.

I agree.

> Dropping TRANSPOSED from Gamma as redundant with respect to the
> composite PRIMARY KEY definition.

+1

> Should we support column values in non-sparse rows by adding a
> VALUE(column_name) section?

I like that.

> CREATE TABLE timeline (
>    user_id int,
>    posted_at uuid,
>    body string,
>    posted_by string,
>    PRIMARY KEY(user_id, posted_at, posted_by),
>    VALUE(body)
> );

I think the value declaration also helps in that it's one more thing
that provides cues as to the data model it creates (more expressive).
But this got me thinking, why not introduce something special for the
composite name as well?  That way the PRIMARY KEY syntax (which comes
preloaded with meaning and expectations) could be kept more SQLish,
and the whole thing looks more like an extension to the language as
opposed to a modification.

Say:

CREATE TABLE timeline (
  user_id int PRIMARY KEY,
  posted_at uuid,
  body text,
  posted_by text,
  COMPOSITE_NAME(posted_at, posted_by),
  COMPOSITE_VALUE(body)
)

> (Open to better suggestions for that keyword.)

Yeah, same.

> On Thu, Dec 29, 2011 at 3:13 PM, Eric Evans <ee...@acunu.com> wrote:
>> On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> I've updated the wiki page at
>>> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
>>> Background section that hopefully clears up where I'm going with this
>>> sparse/dense business.
>>>
>>> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
>>> implicitly using the first element of PRIMARY KEY as the row key.  We
>>> could make it explicit with another WITH option to the TRANSPOSED
>>> clause:
>>>
>>> {{{
>>> CREATE TABLE timeline (
>>>    user_id int,
>>>    posted_at uuid,
>>>    column string,
>>>    value blob,
>>>    PRIMARY KEY(user_id, posted_at)
>>> ) TRANSPOSED WITH ROW KEY(user_id)
>>> }}}
>>>
>>> This makes things more verbose (this would be a required clause) but
>>> I'm okay with that if consensus is that being explicit here is better.
>>
>> I think that was a reaction to an earlier iteration.  Assuming that
>> the only place where order matters is in that primary key definition,
>> then I think it makes sense without the "... WITH ROW KEY..." bit.
>>
>>
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

I think we're closing in on something workable.

Dropping TRANSPOSED from Gamma as redundant with respect to the
composite PRIMARY KEY definition.

Should we support column values in non-sparse rows by adding a
VALUE(column_name) section?

CREATE TABLE timeline (
    user_id int,
    posted_at uuid,
    body string,
    posted_by string,
    PRIMARY KEY(user_id, posted_at, posted_by),
    VALUE(body)
);

(Open to better suggestions for that keyword.)
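
For readers following along, a minimal Python sketch of how this definition could be read, assuming (as discussed earlier in the thread) that the first PRIMARY KEY element is the row key, the remaining elements form the composite column name, and the VALUE column is the column value. The names to_physical and store are hypothetical, and the nested dicts only stand in for wide rows.

```python
from collections import defaultdict

# Column order taken from PRIMARY KEY(user_id, posted_at, posted_by); VALUE(body).
PRIMARY_KEY = ("user_id", "posted_at", "posted_by")
VALUE_COLUMN = "body"


def to_physical(row):
    """Split a logical row into (row key, composite column name, column value)."""
    row_key = row[PRIMARY_KEY[0]]                         # first PK element
    column_name = tuple(row[c] for c in PRIMARY_KEY[1:])  # remaining PK elements
    return row_key, column_name, row[VALUE_COLUMN]


def store(rows):
    """Group logical rows into wide physical rows, one per row key."""
    physical = defaultdict(dict)
    for row in rows:
        row_key, column_name, value = to_physical(row)
        physical[row_key][column_name] = value
    return dict(physical)


posts = [
    {"user_id": 1, "posted_at": "t1", "posted_by": "alice", "body": "hello"},
    {"user_id": 1, "posted_at": "t2", "posted_by": "bob", "body": "hi"},
    {"user_id": 2, "posted_at": "t1", "posted_by": "carol", "body": "hey"},
]
timeline = store(posts)
```

Under these assumptions the two posts for user 1 land in the same physical row, each as one composite column.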


On Thu, Dec 29, 2011 at 3:13 PM, Eric Evans <ee...@acunu.com> wrote:
> On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> I've updated the wiki page at
>> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
>> Background section that hopefully clears up where I'm going with this
>> sparse/dense business.
>>
>> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
>> implicitly using the first element of PRIMARY KEY as the row key.  We
>> could make it explicit with another WITH option to the TRANSPOSED
>> clause:
>>
>> {{{
>> CREATE TABLE timeline (
>>    user_id int,
>>    posted_at uuid,
>>    column string,
>>    value blob,
>>    PRIMARY KEY(user_id, posted_at)
>> ) TRANSPOSED WITH ROW KEY(user_id)
>> }}}
>>
>> This makes things more verbose (this would be a required clause) but
>> I'm okay with that if consensus is that being explicit here is better.
>
> I think that was a reaction to an earlier iteration.  Assuming that
> the only place where order matters is in that primary key definition,
> then I think it makes sense without the "... WITH ROW KEY..." bit.
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Thu, Dec 29, 2011 at 12:04 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> I've updated the wiki page at
> http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
> Background section that hopefully clears up where I'm going with this
> sparse/dense business.
>
> Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
> implicitly using the first element of PRIMARY KEY as the row key.  We
> could make it explicit with another WITH option to the TRANSPOSED
> clause:
>
> {{{
> CREATE TABLE timeline (
>    user_id int,
>    posted_at uuid,
>    column string,
>    value blob,
>    PRIMARY KEY(user_id, posted_at)
> ) TRANSPOSED WITH ROW KEY(user_id)
> }}}
>
> This makes things more verbose (this would be a required clause) but
> I'm okay with that if consensus is that being explicit here is better.

I think that was a reaction to an earlier iteration.  Assuming that
the only place where order matters is in that primary key definition,
then I think it makes sense without the "... WITH ROW KEY..." bit.



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

I've updated the wiki page at
http://wiki.apache.org/cassandra/Cassandra2474 with a more in-depth
Background section that hopefully clears up where I'm going with this
sparse/dense business.

Eric mentioned on IRC that he's uneasy about the PRIMARY KEY syntax
implicitly using the first element of PRIMARY KEY as the row key.  We
could make it explicit with another WITH option to the TRANSPOSED
clause:

{{{
CREATE TABLE timeline (
    user_id int,
    posted_at uuid,
    column string,
    value blob,
    PRIMARY KEY(user_id, posted_at)
) TRANSPOSED WITH ROW KEY(user_id)
}}}

This makes things more verbose (this would be a required clause) but
I'm okay with that if consensus is that being explicit here is better.

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

Gamma proposal update:

The more I think about it the less happy I am with omitting support
for sparse columns.  Remember that dense composites may only be
inserted and deleted, not updated, since they are just a tuple of
values with "column names" determined by schema and/or convention.

I think we can support sparse columns well in a way that improves the
conceptual integrity for the dense composites as well:

{code}
-- "column" and "value" are sparse; a transposed row will be stored as
-- two columns of (user_id, posted_at, 'column': string) and
-- (user_id, posted_at, 'value': blob)
CREATE TABLE timeline (
   user_id int,
   posted_at uuid,
   column string,
   value blob,
   PRIMARY KEY(user_id, posted_at)
) TRANSPOSED;

-- entire transposed row is stored as a single dense composite column
-- (series, ts1, cat, subcat, 1337, 92d21d0a-...: []).  Note that the
-- composite column's value is unused in this case.
CREATE TABLE events (
   series text,
   ts1 int,
   cat text,
   subcat text,
   "1337" uuid,
   "92d21d0a-d6cb-437c-9d3f-b67aa733a19f" bigint,
   PRIMARY KEY(series, ts1, cat, subcat, "1337",
"92d21d0a-d6cb-437c-9d3f-b67aa733a19f")
) TRANSPOSED WITH COLUMN NAMES ("1337" int,
"92d21d0a-d6cb-437c-9d3f-b67aa733a19f" uuid);
{code}

Thus, columns included in the (transposed) primary key will be
"dense," and not updateable, which conforms to our existing practice
that keys are not updateable.  Remaining columns will be updateable
since they will each map to a separate physical column.
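
A small Python sketch of that distinction, following the tuple notation used in the comments above. The helper names transpose_sparse and transpose_dense are hypothetical and the dicts merely model physical columns; this is an illustration under those assumptions, not the storage engine's behavior.

```python
def transpose_sparse(row, primary_key):
    """One physical column per non-key field: (PK values..., field name) -> value."""
    prefix = tuple(row[c] for c in primary_key)
    return {
        prefix + (name,): value
        for name, value in row.items()
        if name not in primary_key
    }


def transpose_dense(row, primary_key):
    """A single physical column named by the whole PK tuple; its value is unused."""
    return {tuple(row[c] for c in primary_key): b""}


post = {"user_id": 1, "posted_at": "t1", "column": "body", "value": b"hello"}
sparse = transpose_sparse(post, ("user_id", "posted_at"))

# Updating a sparse field overwrites exactly one physical column in place.
sparse[(1, "t1", "value")] = b"edited"

event_pk = ("series", "ts1", "cat", "subcat")
event = {"series": "s1", "ts1": 10, "cat": "a", "subcat": "b"}
dense = transpose_dense(event, event_pk)

# Changing a dense field changes the column name itself, so it can only be
# expressed as a delete of the old column followed by an insert of a new one.
del dense[("s1", 10, "a", "b")]
dense.update(transpose_dense({**event, "subcat": "c"}, event_pk))
```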

(I've updated the wiki page.)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: CQL support for compound columns

Posted by Eric Evans <ee...@acunu.com>.

On Sat, Dec 24, 2011 at 9:22 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> Hmm, I thought I sent this but it was sitting in my Drafts box.
>
> Anyway, I've updated http://wiki.apache.org/cassandra/Cassandra2474 to
> flesh out the earlier proposals and editorialize a little in
> "Discussion Summary" sections.
>
> I'm skeptical that splitting discussion across Jira and the ML is a big
> improvement over Jira-only but I'm willing to give it a try.

Yeah, it probably depends on how much more discussion there will be.
The summary in the wiki is fantastic though.  It's especially nice
being able to see the most recent proposal, up to date with all of the
incremental changes, and in one place.  Thanks for that!

> On Tue, Dec 20, 2011 at 4:56 PM, Eric Evans <ee...@acunu.com> wrote:
>> There has been a discussion taking place in CASSANDRA-2474[1]
>> regarding the language and semantics of compound columns in CQL.
>> Though the issue was only opened in July, and despite extended periods
>> of inactivity, it is monstrously long.  Additionally, the discussion
>> necessarily includes inline visual aids (tables, graphics, and
>> verbatim code snippets) that are constantly being revised, which only
>> compounds (pun intended) the problem.  I feel as though this is not
>> only making the discussion less constructive, but that it may be
>> scaring people off, (and IMO, this issue could use to be discussed
>> among a larger group anyway).
>>
>> I propose two things, 1) that we move the discussion to this mailing
>> list, and 2) that we track the various approaches in the wiki.

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: CQL support for compound columns

Posted by Jonathan Ellis <jb...@gmail.com>.

Hmm, I thought I sent this but it was sitting in my Drafts box.

Anyway, I've updated http://wiki.apache.org/cassandra/Cassandra2474 to
flesh out the earlier proposals and editorialize a little in
"Discussion Summary" sections.

I'm skeptical that splitting discussion across Jira and the ML is a big
improvement over Jira-only but I'm willing to give it a try.

On Tue, Dec 20, 2011 at 4:56 PM, Eric Evans <ee...@acunu.com> wrote:
> There has been a discussion taking place in CASSANDRA-2474[1]
> regarding the language and semantics of compound columns in CQL.
> Though the issue was only opened in July, and despite extended periods
> of inactivity, it is monstrously long.  Additionally, the discussion
> necessarily includes inline visual aids (tables, graphics, and
> verbatim code snippets) that are constantly being revised, which only
> compounds (pun intended) the problem.  I feel as though this is not
> only making the discussion less constructive, but that it may be
> scaring people off, (and IMO, this issue could use to be discussed
> among a larger group anyway).
>
> I propose two things, 1) that we move the discussion to this mailing
> list, and 2) that we track the various approaches in the wiki.
>
> For the latter of these, I've stubbed out a page[2] that would
> hopefully serve as a starting point.
>
> Thoughts?
>
>
> [1]: https://issues.apache.org/jira/browse/CASSANDRA-2474
> [2]: http://wiki.apache.org/cassandra/Cassandra2475
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com