Posted to user@cassandra.apache.org by Maciej Miklas <ma...@gmail.com> on 2014/05/19 17:20:12 UTC

CQL 3 and wide rows

Hi *,

I’ve checked the DataStax driver code for CQL 3, and it looks like the column
names for a particular table are fully loaded into memory. Is this true?

Cassandra should support wide rows, meaning tables with millions of
columns. Knowing that, I would expect some kind of iterator for column names. Am
I missing something here?


Regards,
Maciej Miklas

Re: CQL 3 and wide rows

Posted by Maciej Miklas <ma...@gmail.com>.

Hello Jack,

You have given a perfect example of a wide row. Each reading from a sensor creates a new column within a row. With Hector/CLI it was also possible to have millions of columns within a single row. According to this page, http://wiki.apache.org/cassandra/CassandraLimitations, a single row can have 2 billion columns.

How does this relate to CQL 3 and tables?

I still do not understand it because:
- it looks like the driver loads all column names into memory, so it seems to me that the 2 billion limit from CLI is no longer valid
- Map and Set values do not support iteration


Regards,
Maciej


On 19 May 2014, at 17:31, Jack Krupansky <ja...@basetechnology.com> wrote:

> You might want to review this blog post on supporting dynamic columns in CQL3, which points out that “the way to model dynamic cells in CQL is with a compound primary key.”
>  
> See:
> http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
>  
> -- Jack Krupansky

Re: CQL 3 and wide rows

Posted by Jack Krupansky <ja...@basetechnology.com>.

You might want to review this blog post on supporting dynamic columns in CQL3, which points out that “the way to model dynamic cells in CQL is with a compound primary key.”

See:
http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

-- Jack Krupansky


Re: CQL 3 and wide rows

Posted by Maciej Miklas <ma...@gmail.com>.

Thank you Nate - now I understand it! This is a real improvement compared to CLI :)

Regards,
Maciej


On 20 May 2014, at 17:16, Nate McCall <na...@thelastpickle.com> wrote:

> Something like this might work:
> 
> 
> cqlsh:my_keyspace> CREATE TABLE my_widerow (
>                  ...   id text,
>                  ...   my_col timeuuid,
>                  ...   PRIMARY KEY (id, my_col)
>                  ... ) WITH caching='KEYS_ONLY' AND
>                  ...   compaction={'class': 'LeveledCompactionStrategy'};
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
> cqlsh:my_keyspace> select * from my_widerow;
> 
>  id         | my_col
> ------------+--------------------------------------
>  some_key_1 | 7266d240-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 73ba0630-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10
> 
> cqlsh:my_keyspace> select * from my_widerow where id = 'some_key_1' and my_col > 73ba0630-e030-11e3-a50d-8b2f9bfbfa10;
> 
>  id         | my_col
> ------------+--------------------------------------
>  some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10
> 
> cqlsh:my_keyspace> select * from my_widerow where id = 'some_key_1' and my_col > 73ba0630-e030-11e3-a50d-8b2f9bfbfa10 and my_col < 76227ab0-e030-11e3-a50d-8b2f9bfbfa10;
> 
>  id         | my_col
> ------------+--------------------------------------
>  some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
>  some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10
> 
> 
> 
> These queries would all work fine from the DS Java Driver. Note that only the cells that are needed are pulled into memory:
> 
> 
> ./bin/nodetool cfstats my_keyspace my_widerow
>    ...
>    Column Family: my_widerow
>    ...
>    Average live cells per slice (last five minutes): 6.0
>    ...
> 
> 
> This shows that we are slicing across 6 rows on average for the last couple of select statements. 
> 
> Hope that helps.
> 
> 
> 
> -- 
> -----------------
> Nate McCall
> Austin, TX
> @zznate
> 
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com

Re: CQL 3 and wide rows

Posted by Nate McCall <na...@thelastpickle.com>.

Something like this might work:


cqlsh:my_keyspace> CREATE TABLE my_widerow (
                 ...   id text,
                 ...   my_col timeuuid,
                 ...   PRIMARY KEY (id, my_col)
                 ... ) WITH caching='KEYS_ONLY' AND
                 ...   compaction={'class': 'LeveledCompactionStrategy'};
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> insert into my_widerow (id, my_col) values ('some_key_1',now());
cqlsh:my_keyspace> select * from my_widerow;

 id         | my_col
------------+--------------------------------------
 some_key_1 | 7266d240-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 73ba0630-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10

cqlsh:my_keyspace> select * from my_widerow where id = 'some_key_1' and my_col > 73ba0630-e030-11e3-a50d-8b2f9bfbfa10;

 id         | my_col
------------+--------------------------------------
 some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 76227ab0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 76cfd1b0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 777364b0-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 7aa061b0-e030-11e3-a50d-8b2f9bfbfa10

cqlsh:my_keyspace> select * from my_widerow where id = 'some_key_1' and my_col > 73ba0630-e030-11e3-a50d-8b2f9bfbfa10 and my_col < 76227ab0-e030-11e3-a50d-8b2f9bfbfa10;

 id         | my_col
------------+--------------------------------------
 some_key_1 | 74404d30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 74defe30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75569f30-e030-11e3-a50d-8b2f9bfbfa10
 some_key_1 | 75bf9a30-e030-11e3-a50d-8b2f9bfbfa10



These queries would all work fine from the DS Java Driver. Note that only
the cells that are needed are pulled into memory:


./bin/nodetool cfstats my_keyspace my_widerow
   ...
   Column Family: my_widerow
   ...
   Average live cells per slice (last five minutes): 6.0
   ...


This shows that we are slicing across 6 rows on average for the last couple
of select statements.
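
A minimal Python sketch of why such range queries only touch the cells they need. It is an in-memory toy model, not driver or server code: the `Partition` class and its method names are made up for illustration, and small integers stand in for the timeuuid values. The idea is that clustering values within one partition are kept sorted, so a bounded query can jump to the start of the range and stop at its end.

```python
import bisect


class Partition:
    """Toy model of one partition: clustering values kept in sorted order."""

    def __init__(self):
        self._keys = []  # sorted clustering column values

    def insert(self, clustering_value):
        # Keep the list sorted; writing the same primary key twice is an upsert.
        i = bisect.bisect_left(self._keys, clustering_value)
        if i == len(self._keys) or self._keys[i] != clustering_value:
            self._keys.insert(i, clustering_value)

    def slice_range(self, gt=None, lt=None):
        """Return values with gt < value < lt, locating both bounds by binary search."""
        start = 0 if gt is None else bisect.bisect_right(self._keys, gt)
        end = len(self._keys) if lt is None else bisect.bisect_left(self._keys, lt)
        return self._keys[start:end]


partition = Partition()
for t in range(1, 11):          # ten inserts, as in the cqlsh session
    partition.insert(t)

print(len(partition.slice_range(gt=2)))        # 8 values, like the second select
print(len(partition.slice_range(gt=2, lt=7)))  # 4 values, like the third select
```

With ten values, the two bounded slices return 8 and 4 values, the same row counts as the two range selects above.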

Hope that helps.



-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: CQL 3 and wide rows

Posted by Maciej Miklas <ma...@gmail.com>.

Hi Aron,

Thanks for the answer!


Let's consider CLI-style pseudo-code such as this:

for (int i = 0; i < 10_000_000; i++) {
  set['rowKey1']['myCol::' + i] = UUID.randomUUID();
}


The code above will create a single row that contains 10^7 columns sorted by 'i'. This works fine, and this is a wide row as I understand it: a row that holds many columns, AND I can read just part of it with the right slice query. On the other hand, I can iterate over all columns without extra latency because the data is stored on a single node. I’ve been using similar structures as a replacement for secondary indexes - it’s a well-known pattern.

How would I model it in CQL 3?

1) I could create a Map, but Maps are fully loaded into memory, and a Map containing 10^7 elements is definitely a problem. Plus it’s a big waste of RAM, considering that I only need to read a small subset.

2) I could alter the table for each new column, which would create a structure similar to the one from my CLI example. But it looks to me that all column names are loaded into RAM, which is still a big limitation. I hope that I am wrong here - I am not sure.

3) I could redesign my model and divide the data into many rows, but why would I do that if I can use wide rows?

My idea of a wide row is a row that can hold a large number of key-value pairs (in any form), where I can filter on those keys to efficiently load only the part that I currently need.
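
One way to get this kind of iteration, in line with the clustering-column approach suggested elsewhere in this thread, is to read a partition page by page and restart each page just after the last key seen. Below is a minimal Python sketch of that loop. It assumes a query of the shape "WHERE id = ? AND my_col > ? LIMIT ?"; the `fetch_page` callback and the in-memory `make_fake_fetch` helper are hypothetical stand-ins for a real driver call.

```python
import bisect


def iterate_partition(fetch_page, page_size=1000):
    """Yield every clustering key of one partition, holding one page in memory at a time."""
    last = None
    while True:
        # Ask only for keys strictly after the last one already seen.
        page = fetch_page(after=last, limit=page_size)
        if not page:
            return
        yield from page
        last = page[-1]


def make_fake_fetch(sorted_keys):
    """Hypothetical in-memory stand-in for a range query with a LIMIT."""
    def fetch_page(after, limit):
        start = 0 if after is None else bisect.bisect_right(sorted_keys, after)
        return sorted_keys[start:start + limit]
    return fetch_page


keys = list(range(25))
for key in iterate_partition(make_fake_fetch(keys), page_size=10):
    pass  # process one key at a time; at most 10 keys are held per page
```

Only `page_size` keys are in memory at any time, regardless of how many keys the partition holds.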


Regards,
Maciej 


On 20 May 2014, at 09:06, Aaron Morton <aa...@thelastpickle.com> wrote:

> In a CQL 3 table the only **column** names are the ones defined in the table; in the example below there are three column names.
> 
> 
>>> CREATE TABLE keyspace.widerow (
>>> row_key text,
>>> wide_row_column text,
>>> data_column text,
>>> PRIMARY KEY (row_key, wide_row_column));
>>> 
>>> Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.
> 
> Internally there may be more **cells** (as we now call the internal columns). In the example above, each value of row_key will create a single partition (as we now call internal storage engine rows). In each of those partitions there will be cells for each CQL 3 row that has the same row_key; those cells will use a Composite for the name. The first part of the composite will be the value of wide_row_column and the second will be the literal name of the non-primary-key column.
> 
> IMHO Wide partitions (storage engine rows) are more prevalent in CQL3 than thrift models. 
> 
>> But still - I do not see Iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.
> Nowadays you can do pretty much everything you can in the CLI. Provide an example and we may be able to help.
> 
> Cheers
> Aaron
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com

Re: CQL 3 and wide rows

Posted by Maciej Miklas <ma...@gmail.com>.

yes :)

On 20 May 2014, at 14:24, Jack Krupansky <ja...@basetechnology.com> wrote:

> To keep the terminology clear, your “row_key” is actually the “partition key”, and “wide_row_column” is actually a “clustering column”, and the combination of your row_key and wide_row_column is a “compound primary key”.
>  
> -- Jack Krupansky

Re: CQL 3 and wide rows

Posted by Jack Krupansky <ja...@basetechnology.com>.

To keep the terminology clear, your “row_key” is actually the “partition key”, and “wide_row_column” is actually a “clustering column”, and the combination of your row_key and wide_row_column is a “compound primary key”.

-- Jack Krupansky


Re: CQL 3 and wide rows

Posted by Aaron Morton <aa...@thelastpickle.com>.

In a CQL 3 table the only **column** names are the ones defined in the table; in the example below there are three column names.


>> CREATE TABLE keyspace.widerow (
>> row_key text,
>> wide_row_column text,
>> data_column text,
>> PRIMARY KEY (row_key, wide_row_column));
>> 
>> Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.

Internally there may be more **cells** (as we now call the internal columns). In the example above, each value of row_key will create a single partition (as we now call internal storage engine rows). In each of those partitions there will be cells for each CQL 3 row that has the same row_key; those cells will use a Composite for the name. The first part of the composite will be the value of wide_row_column and the second will be the literal name of the non-primary-key column.
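
A rough Python sketch of this layout, using plain dicts and tuples as stand-ins for partitions and Composite names. The sample sensor data and the function name are made up for illustration, and the model is simplified (for instance, it ignores any bookkeeping cells the storage engine may add).

```python
def cql_rows_to_cells(rows):
    """Group CQL rows by partition key and flatten each into (composite name, value) cells."""
    partitions = {}
    for row in rows:
        cells = partitions.setdefault(row["row_key"], {})
        # Composite cell name: (clustering value, literal name of the data column).
        cells[(row["wide_row_column"], "data_column")] = row["data_column"]
    # Within a partition, cells are kept sorted by their composite name.
    return {key: sorted(cells.items()) for key, cells in partitions.items()}


# Three CQL rows against the widerow table; values are hypothetical.
rows = [
    {"row_key": "sensor1", "wide_row_column": "t2", "data_column": "21.5"},
    {"row_key": "sensor1", "wide_row_column": "t1", "data_column": "20.9"},
    {"row_key": "sensor2", "wide_row_column": "t1", "data_column": "18.0"},
]
layout = cql_rows_to_cells(rows)
for partition_key, cells in layout.items():
    print(partition_key, cells)
```

The three CQL rows end up in two partitions. The schema still has only three column names, while the number of cells in a partition grows with every new wide_row_column value.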

IMHO Wide partitions (storage engine rows) are more prevalent in CQL3 than thrift models. 

> But still - I do not see Iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.
Nowadays you can do pretty much everything you can in the CLI. Provide an example and we may be able to help.

Cheers
Aaron

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 20/05/2014, at 8:18 am, Maciej Miklas <ma...@gmail.com> wrote:

> Hi James,
> 
> Clustering is based on rows. I think that you meant not clustering columns, but compound columns. Still, all columns belong to a single table and are stored within a single folder on one computer. And it looks to me (but I’m not sure) that the CQL 3 driver loads all column names into memory - which is confusing to me. On one hand we have wide rows, but on the other we load the whole thing into RAM...
> 
> My understanding of a wide row is a row that supports millions of columns, or similar things like a map or set. In CLI you would generate column names (or use compound columns) to simulate a set or map; in CQL 3 you would use some static names plus Map or Set structures, or you could still alter the table and have a large number of columns. But still - I do not see Iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.
> 
> 
> Regards,
> Maciej

Re: CQL 3 and wide rows

Posted by Maciej Miklas <ma...@gmail.com>.

Hi James,

Clustering is based on rows. I think that you meant not clustering columns, but compound columns. Still, all columns belong to a single table and are stored within a single folder on one computer. And it looks to me (but I’m not sure) that the CQL 3 driver loads all column names into memory - which is confusing to me. On one hand we have wide rows, but on the other we load the whole thing into RAM...

My understanding of a wide row is a row that supports millions of columns, or similar things like a map or set. In CLI you would generate column names (or use compound columns) to simulate a set or map; in CQL 3 you would use some static names plus Map or Set structures, or you could still alter the table and have a large number of columns. But still - I do not see Iteration, so it looks to me that CQL 3 is limited when compared to CLI/Hector.

Regards,
Maciej

On 19 May 2014, at 17:30, James Campbell <ja...@breachintelligence.com> wrote:

> Maciej,
> 
> In CQL3 "wide rows" are expected to be created using clustering columns.  So while the schema will have a relatively small number of named columns, the effect is a wide row.  For example:
> 
> CREATE TABLE keyspace.widerow (
> row_key text,
> wide_row_column text,
> data_column text,
> PRIMARY KEY (row_key, wide_row_column));
> 
> Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.
> 
> James

RE: CQL 3 and wide rows

Posted by James Campbell <ja...@breachintelligence.com>.

Maciej,


In CQL3 "wide rows" are expected to be created using clustering columns.  So while the schema will have a relatively small number of named columns, the effect is a wide row.  For example:

CREATE TABLE keyspace.widerow (
row_key text,
wide_row_column text,
data_column text,
PRIMARY KEY (row_key, wide_row_column));

Check out, for example, http://www.datastax.com/dev/blog/schema-in-cassandra-1-1.


James
