You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@cassandra.apache.org by Matope Ono <ma...@gmail.com> on 2014/05/16 06:23:07 UTC

Query first 1 columns for each partitioning keys in CQL?

Hi, I'm modeling some queries in CQL3.

I'd like to query first 1 columns for each partitioning keys in CQL3.

For example:

create table posts(
> author ascii,
> created_at timeuuid,
> entry text,
> primary key(author,created_at)
> );
> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> john');
> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> mike');
> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');


And I want results like below.

mike,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by mike
> john,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by john


I think that this is what "SELECT FIRST " statements did in CQL2.

The only way I came across in CQL3 is "retrieve whole records and drop
manually",
but it's obviously not efficient.

Could you please tell me more straightforward way in CQL3?

Re: Query first 1 columns for each partitioning keys in CQL?

Posted by Matope Ono <ma...@gmail.com>.

Hmm. Something like a user-managed-index looks the only way to do what I
want to do.
Thank you, I'll try that.


2014-05-17 18:07 GMT+09:00 DuyHai Doan <do...@gmail.com>:

> Clearly with your current data model, having X latest post for each author
> is not possible.
>
>  However, what's about this ?
>
> CREATE TABLE latest_posts_per_user (
>    author ascii
>    latest_post map<uuid,text>,
>    PRIMARY KEY (author)
> )
>
>  The latest_post will keep a collection of X latest posts for each user.
> Now the challenge is to "update" this latest_post map every time an user
> create a new post. This can be done in a single CQL3 statement: UPDATE
> latest_posts_per_user SET latest_post = latest_post + {new_uuid: 'new
> entry', oldest_uuid: null} WHERE author = xxx;
>
>  You'll need to know the uuid of the oldest post to remove it from the map
>
>
>
> On Sat, May 17, 2014 at 8:53 AM, 後藤 泰陽 <ma...@gmail.com> wrote:
>
>> Hello,
>>
>> Thank you for your addressing.
>>
>> But I consider LIMIT to be a keyword to limits result numbers from WHOLE
>> results retrieved by the SELECT statement.
>> The result with SELECT.. LIMIT is below. Unfortunately, This is not what
>> I wanted.
>> I wante latest posts of each authors. (Now I doubt if CQL3 can't
>> represent it)
>>
>> cqlsh:blog_test> create table posts(
>>              ... author ascii,
>>              ... created_at timeuuid,
>>              ... entry text,
>>              ... primary key(author,created_at)
>>              ... )WITH CLUSTERING ORDER BY (created_at DESC);
>> cqlsh:blog_test>
>> cqlsh:blog_test> insert into posts(author,created_at,entry) values
>> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>> john');
>> cqlsh:blog_test> insert into posts(author,created_at,entry) values
>> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
>> john');
>> cqlsh:blog_test> insert into posts(author,created_at,entry) values
>> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>> mike');
>> cqlsh:blog_test> insert into posts(author,created_at,entry) values
>> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
>> mike');
>> cqlsh:blog_test> select * from posts limit 2;
>>
>>  author | created_at                           | entry
>>
>> --------+--------------------------------------+------------------------------
>>    mike | 1c4d9000-83e9-11e2-8080-808080808080 |  This is a new entry by
>> mike
>>    mike | 4e52d000-6d1f-11e2-8080-808080808080 | This is an old entry by
>> mike
>>
>>
>>
>>
>> 2014/05/16 23:54、Jonathan Lacefield <jl...@datastax.com> のメール：
>>
>> Hello,
>>
>>  Have you looked at using the CLUSTERING ORDER BY and LIMIT features of
>> CQL3?
>>
>>  These may help you achieve your goals.
>>
>>
>> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html
>>
>> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
>>
>> Jonathan Lacefield
>> Solutions Architect, DataStax
>> (404) 822 3487
>> <http://www.linkedin.com/in/jlacefield>
>>
>> <http://www.datastax.com/cassandrasummit14>
>>
>>
>>
>> On Fri, May 16, 2014 at 12:23 AM, Matope Ono <ma...@gmail.com>wrote:
>>
>>> Hi, I'm modeling some queries in CQL3.
>>>
>>> I'd like to query first 1 columns for each partitioning keys in CQL3.
>>>
>>> For example:
>>>
>>> create table posts(
>>>> author ascii,
>>>> created_at timeuuid,
>>>> entry text,
>>>> primary key(author,created_at)
>>>> );
>>>> insert into posts(author,created_at,entry) values
>>>> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>>>> john');
>>>> insert into posts(author,created_at,entry) values
>>>> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
>>>> insert into posts(author,created_at,entry) values
>>>> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>>>> mike');
>>>> insert into posts(author,created_at,entry) values
>>>> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');
>>>
>>>
>>> And I want results like below.
>>>
>>> mike,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by mike
>>>> john,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by john
>>>
>>>
>>> I think that this is what "SELECT FIRST " statements did in CQL2.
>>>
>>> The only way I came across in CQL3 is "retrieve whole records and drop
>>> manually",
>>> but it's obviously not efficient.
>>>
>>> Could you please tell me more straightforward way in CQL3?
>>>
>>
>>
>>
>

Re: Query first 1 columns for each partitioning keys in CQL?

Posted by DuyHai Doan <do...@gmail.com>.

Clearly with your current data model, having X latest post for each author
is not possible.

 However, what's about this ?

CREATE TABLE latest_posts_per_user (
   author ascii
   latest_post map<uuid,text>,
   PRIMARY KEY (author)
)

 The latest_post will keep a collection of X latest posts for each user.
Now the challenge is to "update" this latest_post map every time an user
create a new post. This can be done in a single CQL3 statement: UPDATE
latest_posts_per_user SET latest_post = latest_post + {new_uuid: 'new
entry', oldest_uuid: null} WHERE author = xxx;

 You'll need to know the uuid of the oldest post to remove it from the map



On Sat, May 17, 2014 at 8:53 AM, 後藤 泰陽 <ma...@gmail.com> wrote:

> Hello,
>
> Thank you for your addressing.
>
> But I consider LIMIT to be a keyword to limits result numbers from WHOLE
> results retrieved by the SELECT statement.
> The result with SELECT.. LIMIT is below. Unfortunately, This is not what I
> wanted.
> I wante latest posts of each authors. (Now I doubt if CQL3 can't represent
> it)
>
> cqlsh:blog_test> create table posts(
>              ... author ascii,
>              ... created_at timeuuid,
>              ... entry text,
>              ... primary key(author,created_at)
>              ... )WITH CLUSTERING ORDER BY (created_at DESC);
> cqlsh:blog_test>
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
> john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> mike');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
> mike');
> cqlsh:blog_test> select * from posts limit 2;
>
>  author | created_at                           | entry
>
> --------+--------------------------------------+------------------------------
>    mike | 1c4d9000-83e9-11e2-8080-808080808080 |  This is a new entry by
> mike
>    mike | 4e52d000-6d1f-11e2-8080-808080808080 | This is an old entry by
> mike
>
>
>
>
> 2014/05/16 23:54、Jonathan Lacefield <jl...@datastax.com> のメール：
>
> Hello,
>
>  Have you looked at using the CLUSTERING ORDER BY and LIMIT features of
> CQL3?
>
>  These may help you achieve your goals.
>
>
> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html
>
> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
>
> Jonathan Lacefield
> Solutions Architect, DataStax
> (404) 822 3487
> <http://www.linkedin.com/in/jlacefield>
>
> <http://www.datastax.com/cassandrasummit14>
>
>
>
> On Fri, May 16, 2014 at 12:23 AM, Matope Ono <ma...@gmail.com> wrote:
>
>> Hi, I'm modeling some queries in CQL3.
>>
>> I'd like to query first 1 columns for each partitioning keys in CQL3.
>>
>> For example:
>>
>> create table posts(
>>> author ascii,
>>> created_at timeuuid,
>>> entry text,
>>> primary key(author,created_at)
>>> );
>>> insert into posts(author,created_at,entry) values
>>> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>>> john');
>>> insert into posts(author,created_at,entry) values
>>> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
>>> insert into posts(author,created_at,entry) values
>>> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>>> mike');
>>> insert into posts(author,created_at,entry) values
>>> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');
>>
>>
>> And I want results like below.
>>
>> mike,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by mike
>>> john,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by john
>>
>>
>> I think that this is what "SELECT FIRST " statements did in CQL2.
>>
>> The only way I came across in CQL3 is "retrieve whole records and drop
>> manually",
>> but it's obviously not efficient.
>>
>> Could you please tell me more straightforward way in CQL3?
>>
>
>
>

Re: Query first 1 columns for each partitioning keys in CQL?

Posted by Bryan Talbot <br...@playnext.com>.

I think there are several issues in your schema and queries.

First, the schema can't efficiently return the single newest post for every
author. It can efficiently return the newest N posts for a particular
author.

On Fri, May 16, 2014 at 11:53 PM, 後藤 泰陽 <ma...@gmail.com> wrote:

>
> But I consider LIMIT to be a keyword to limits result numbers from WHOLE
> results retrieved by the SELECT statement.
>


This is happening due to the incorrect use of minTimeuuid() function. All
of your created_at values are equal so you're essentially getting 2 (order
not defined) values that have the lowest created_at value.

The minTimeuuid() function is mean to be used in the WHERE clause of a
SELECT statement often with maxTimeuuid() to do BETWEEN sort of queries on
timeuuid values.




> The result with SELECT.. LIMIT is below. Unfortunately, This is not what I
> wanted.
> I wante latest posts of each authors. (Now I doubt if CQL3 can't represent
> it)
>
> cqlsh:blog_test> create table posts(
>              ... author ascii,
>              ... created_at timeuuid,
>              ... entry text,
>              ... primary key(author,created_at)
>              ... )WITH CLUSTERING ORDER BY (created_at DESC);
> cqlsh:blog_test>
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
> john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
> mike');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values
> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by
> mike');
> cqlsh:blog_test> select * from posts limit 2;
>
>  author | created_at                           | entry
>
> --------+--------------------------------------+------------------------------
>    mike | 1c4d9000-83e9-11e2-8080-808080808080 |  This is a new entry by
> mike
>    mike | 4e52d000-6d1f-11e2-8080-808080808080 | This is an old entry by
> mike
>
>
>
>


To get most recent posts by a particular author, you'll need statements
more like this:

cqlsh:test> insert into posts(author,created_at,entry) values
('john',now(),'This is an old entry by john'); cqlsh:test> insert into
posts(author,created_at,entry) values ('john',now(),'This is a new entry by
john'); cqlsh:test> insert into posts(author,created_at,entry) values
('mike',now(),'This is an old entry by mike'); cqlsh:test> insert into
posts(author,created_at,entry) values ('mike',now(),'This is a new entry by
mike');

and then you can get posts by 'john' ordered by newest to oldest as:

cqlsh:test> select author, created_at, dateOf(created_at), entry from posts
where author = 'john' limit 2 ;

 author | created_at                           | dateOf(created_at)       |
entry
--------+--------------------------------------+--------------------------+------------------------------
   john | 7cb1ac30-df85-11e3-bb46-4d2d68f17aa6 | 2014-05-19 11:43:36-0700 |
 This is a new entry by john
   john | 74bb6750-df85-11e3-bb46-4d2d68f17aa6 | 2014-05-19 11:43:23-0700 |
This is an old entry by john


-Bryan

Re: Query first 1 columns for each partitioning keys in CQL?

Posted by 後藤泰陽 <ma...@gmail.com>.

Hello,

Thank you for your addressing.

But I consider LIMIT to be a keyword to limits result numbers from WHOLE results retrieved by the SELECT statement.
The result with SELECT.. LIMIT is below. Unfortunately, This is not what I wanted.
I wante latest posts of each authors. (Now I doubt if CQL3 can't represent it)

> cqlsh:blog_test> create table posts(
>              ... author ascii,
>              ... created_at timeuuid,
>              ... entry text,
>              ... primary key(author,created_at)
>              ... )WITH CLUSTERING ORDER BY (created_at DESC);
> cqlsh:blog_test> 
> cqlsh:blog_test> insert into posts(author,created_at,entry) values ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by mike');
> cqlsh:blog_test> insert into posts(author,created_at,entry) values ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');
> cqlsh:blog_test> select * from posts limit 2;
> 
>  author | created_at                           | entry
> --------+--------------------------------------+------------------------------
>    mike | 1c4d9000-83e9-11e2-8080-808080808080 |  This is a new entry by mike
>    mike | 4e52d000-6d1f-11e2-8080-808080808080 | This is an old entry by mike



2014/05/16 23:54、Jonathan Lacefield <jl...@datastax.com> のメール：

> Hello,
> 
>  Have you looked at using the CLUSTERING ORDER BY and LIMIT features of CQL3?
> 
>  These may help you achieve your goals.
> 
>   http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html
>   http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
> 
> Jonathan Lacefield
> Solutions Architect, DataStax
> (404) 822 3487
> 
> 
> 
> 
> 
> 
> On Fri, May 16, 2014 at 12:23 AM, Matope Ono <ma...@gmail.com> wrote:
> Hi, I'm modeling some queries in CQL3.
> 
> I'd like to query first 1 columns for each partitioning keys in CQL3.
> 
> For example:
> 
> create table posts(
> 	author ascii,
> 	created_at timeuuid,
> 	entry text,
> 	primary key(author,created_at)
> );
> insert into posts(author,created_at,entry) values ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by john');
> insert into posts(author,created_at,entry) values ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
> insert into posts(author,created_at,entry) values ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by mike');
> insert into posts(author,created_at,entry) values ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');
> 
> And I want results like below.
> 
> mike,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by mike
> john,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by john
> 
> I think that this is what "SELECT FIRST " statements did in CQL2.
> 
> The only way I came across in CQL3 is "retrieve whole records and drop manually",
> but it's obviously not efficient.
> 
> Could you please tell me more straightforward way in CQL3?
>

Re: Query first 1 columns for each partitioning keys in CQL?

Posted by Jonathan Lacefield <jl...@datastax.com>.

Hello,

 Have you looked at using the CLUSTERING ORDER BY and LIMIT features of
CQL3?

 These may help you achieve your goals.


http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html

http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
<http://www.linkedin.com/in/jlacefield>

<http://www.datastax.com/cassandrasummit14>



On Fri, May 16, 2014 at 12:23 AM, Matope Ono <ma...@gmail.com> wrote:

> Hi, I'm modeling some queries in CQL3.
>
> I'd like to query first 1 columns for each partitioning keys in CQL3.
>
> For example:
>
> create table posts(
>> author ascii,
>> created_at timeuuid,
>> entry text,
>> primary key(author,created_at)
>> );
>> insert into posts(author,created_at,entry) values
>> ('john',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>> john');
>> insert into posts(author,created_at,entry) values
>> ('john',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by john');
>> insert into posts(author,created_at,entry) values
>> ('mike',minTimeuuid('2013-02-02 10:00+0000'),'This is an old entry by
>> mike');
>> insert into posts(author,created_at,entry) values
>> ('mike',minTimeuuid('2013-03-03 10:00+0000'),'This is a new entry by mike');
>
>
> And I want results like below.
>
> mike,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by mike
>> john,1c4d9000-83e9-11e2-8080-808080808080,This is a new entry by john
>
>
> I think that this is what "SELECT FIRST " statements did in CQL2.
>
> The only way I came across in CQL3 is "retrieve whole records and drop
> manually",
> but it's obviously not efficient.
>
> Could you please tell me more straightforward way in CQL3?
>