You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by "pavlikus@gmail.com" <pa...@gmail.com> on 2013/11/08 12:10:31 UTC
How would you model that?
Hey guys, I need to retrieve a list of distinct users based on their
activity datetime. How can I model a table to store that kind of
information?
The straightforward decision was this:
CREATE TABLE user_activity (user text primary key, ts timeuuid);
but it turned out it is impossible to do a select like this:
select * from user_activity order by ts;
as it fails with "ORDER BY is only supported when the partition key is
restricted by an EQ or an IN".
How would you model the thing? Just need to have a list of users based on
their last activity timestamp...
Thanks!
Re: How would you model that?
Posted by Aaron Morton <aa...@thelastpickle.com>.
> Hey guys, I need to retrieve a list of distinct users based on their activity datetime. How can I model a table to store that kind of information?
If it’s for an arbitrary time slice it will be tricky, if you can use pre set time slices something like this would work:
CREATE TABLE (
timeslice_start timestamp,
timeslice_size int,
user text,
PRIMARY KEY ( (timeslice_start, timeslice_size), user)
);
That would give you the unique users in a time slice, e.g. unique for a 4 hour window.
Cheers
-----------------
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
On 9/11/2013, at 12:56 am, Franc Carter <fr...@sirca.org.au> wrote:
>
> How about something like using a time-range as the key (e.g an hour depending on your update rate) and a composite (time:user) as the column name
>
> cheers
>
>
>
> On Fri, Nov 8, 2013 at 10:45 PM, Laing, Michael <mi...@nytimes.com> wrote:
> You could try this:
>
> CREATE TABLE user_activity (shard text, user text, ts timeuuid, primary key (shard, ts));
>
> select user, ts from user_activity where shard in ('00', '01', ...) order by ts desc;
>
> Grab each user and ts the first time you see that user.
>
> Use as many shards as you think you need to control row size and spread the load.
>
> Set ttls to expire user_activity entries when you are no longer interested in them.
>
> ml
>
>
> On Fri, Nov 8, 2013 at 6:10 AM, pavlikus@gmail.com <pa...@gmail.com> wrote:
> Hey guys, I need to retrieve a list of distinct users based on their activity datetime. How can I model a table to store that kind of information?
>
> The straightforward decision was this:
>
> CREATE TABLE user_activity (user text primary key, ts timeuuid);
>
> but it turned out it is impossible to do a select like this:
>
> select * from user_activity order by ts;
>
> as it fails with "ORDER BY is only supported when the partition key is restricted by an EQ or an IN".
>
> How would you model the thing? Just need to have a list of users based on their last activity timestamp...
>
> Thanks!
>
>
>
>
>
> --
> Franc Carter | Systems architect | Sirca Ltd
> franc.carter@sirca.org.au | www.sirca.org.au
> Tel: +61 2 8355 2514
> Level 4, 55 Harrington St, The Rocks NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215
>
Re: How would you model that?
Posted by Franc Carter <fr...@sirca.org.au>.
How about something like using a time-range as the key (e.g an hour
depending on your update rate) and a composite (time:user) as the column
name
cheers
On Fri, Nov 8, 2013 at 10:45 PM, Laing, Michael
<mi...@nytimes.com>wrote:
> You could try this:
>
> CREATE TABLE user_activity (shard text, user text, ts timeuuid, primary
> key (shard, ts));
>
> select user, ts from user_activity where shard in ('00', '01', ...) order
> by ts desc;
>
> Grab each user and ts the first time you see that user.
>
> Use as many shards as you think you need to control row size and spread
> the load.
>
> Set ttls to expire user_activity entries when you are no longer interested
> in them.
>
> ml
>
>
> On Fri, Nov 8, 2013 at 6:10 AM, pavlikus@gmail.com <pa...@gmail.com>wrote:
>
>> Hey guys, I need to retrieve a list of distinct users based on their
>> activity datetime. How can I model a table to store that kind of
>> information?
>>
>> The straightforward decision was this:
>>
>> CREATE TABLE user_activity (user text primary key, ts timeuuid);
>>
>> but it turned out it is impossible to do a select like this:
>>
>> select * from user_activity order by ts;
>>
>> as it fails with "ORDER BY is only supported when the partition key is
>> restricted by an EQ or an IN".
>>
>> How would you model the thing? Just need to have a list of users based on
>> their last activity timestamp...
>>
>> Thanks!
>>
>>
>
--
*Franc Carter* | Systems architect | Sirca Ltd
<ma...@sirca.org.au>
franc.carter@sirca.org.au | www.sirca.org.au
Tel: +61 2 8355 2514
Level 4, 55 Harrington St, The Rocks NSW 2000
PO Box H58, Australia Square, Sydney NSW 1215
Re: How would you model that?
Posted by "Laing, Michael" <mi...@nytimes.com>.
You could try this:
CREATE TABLE user_activity (shard text, user text, ts timeuuid, primary key
(shard, ts));
select user, ts from user_activity where shard in ('00', '01', ...) order
by ts desc;
Grab each user and ts the first time you see that user.
Use as many shards as you think you need to control row size and spread the
load.
Set ttls to expire user_activity entries when you are no longer interested
in them.
ml
On Fri, Nov 8, 2013 at 6:10 AM, pavlikus@gmail.com <pa...@gmail.com>wrote:
> Hey guys, I need to retrieve a list of distinct users based on their
> activity datetime. How can I model a table to store that kind of
> information?
>
> The straightforward decision was this:
>
> CREATE TABLE user_activity (user text primary key, ts timeuuid);
>
> but it turned out it is impossible to do a select like this:
>
> select * from user_activity order by ts;
>
> as it fails with "ORDER BY is only supported when the partition key is
> restricted by an EQ or an IN".
>
> How would you model the thing? Just need to have a list of users based on
> their last activity timestamp...
>
> Thanks!
>
>