You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@cassandra.apache.org by "Ryan, Brent" <BR...@cvent.com> on 2013/06/19 15:08:45 UTC

timeuuid and cql3 query

I'm experimenting with a data model that will need to ingest a lot of data that will need to be query able by time.  In the example below, I want to be able to run a query like "select * from count3 where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:00') and ts < minTimeuuid('2013-06-18 16:24:00');".  However, in certain cases this query fails with the error "Bad Request: Start key must sort before (or equal to) finish key in your partitioner!".  It's not clear to be why this happens or what the issue is as it seems like a bug.

Here's the table:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

It has data like so:

cqlsh:Statistics> select counter,dateof(ts),key1,value from count3;

 counter | dateof(ts)               | key1 | value
---------+--------------------------+------+-------
    test | 2013-06-18 16:23:25-0400 |    1 |     1
    test | 2013-06-18 16:23:28-0400 |    1 |     1
    test | 2013-06-18 16:23:28-0400 |    1 |     1
    test | 2013-06-18 16:23:28-0400 |    1 |     1
    test | 2013-06-18 16:23:29-0400 |    1 |     1
    test | 2013-06-18 16:23:29-0400 |    1 |     1
    test | 2013-06-18 16:23:29-0400 |    1 |     1
    test | 2013-06-18 16:23:30-0400 |    1 |     1
    test | 2013-06-18 16:23:30-0400 |    1 |     1
    test | 2013-06-18 16:23:31-0400 |    1 |     1
    test | 2013-06-18 16:23:31-0400 |    1 |     1
    test | 2013-06-18 16:23:31-0400 |    1 |     1
    test | 2013-06-18 16:23:32-0400 |    1 |     1
    test | 2013-06-18 16:23:32-0400 |    1 |     1


DOESN'T WORK:
    cqlsh:Statistics> select * from count3 where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:00') and ts < minTimeuuid('2013-06-18 16:24:00');
Bad Request: Start key must sort before (or equal to) finish key in your partitioner!

WORKS FINE:
cqlsh:Statistics> select * from count3 where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:25') and ts < minTimeuuid('2013-06-18 16:23:31');

 counter | ts                                   | key1 | value
---------+--------------------------------------+------+-------
    test | edee0df0-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | ef9a5e60-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | efccb900-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | effb1c00-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | f0284680-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | f05b8b80-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | f08c5f80-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | f0c6f780-d854-11e2-ac46-cba9e55f995d |    1 |     1
    test | f1018f80-d854-11e2-ac46-cba9e55f995d |    1 |     1


Thanks,
Brent

Re: timeuuid and cql3 query

Posted by "Ryan, Brent" <BR...@cvent.com>.

I'm using the byte ordered partitioner.

Sent from my iPhone

On Jun 19, 2013, at 11:26 AM, "Sylvain Lebresne" <sy...@datastax.com>> wrote:

You're using the ordered partitioner, right?


On Wed, Jun 19, 2013 at 5:06 PM, Davide Anastasia <da...@gmail.com>> wrote:

Hi Tyler,
I am interested in this scenario as well: could you please elaborate further your answer?

Thanks a lot,
Davide

On 19 Jun 2013 16:01, "Tyler Hobbs" <ty...@datastax.com>> wrote:

On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com>> wrote:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.


--
Tyler Hobbs
DataStax<http://datastax.com/>

Re: timeuuid and cql3 query

Posted by Sylvain Lebresne <sy...@datastax.com>.

You're using the ordered partitioner, right?


On Wed, Jun 19, 2013 at 5:06 PM, Davide Anastasia <
davide.anastasia@gmail.com> wrote:

> Hi Tyler,
> I am interested in this scenario as well: could you please elaborate
> further your answer?
>
> Thanks a lot,
> Davide
> On 19 Jun 2013 16:01, "Tyler Hobbs" <ty...@datastax.com> wrote:
>
>>
>> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:
>>
>>>
>>>  CREATE TABLE count3 (
>>>   counter text,
>>>   ts timeuuid,
>>>   key1 text,
>>>   value int,
>>>   PRIMARY KEY ((counter, ts))
>>> )
>>>
>>
>> Instead of doing a composite partition key, remove a set of parens and
>> let ts be your clustering key.  That will cause cql rows to be stored in
>> sorted order by the ts column (for a given value of "counter") and allow
>> you to do the kind of query you're looking for.
>>
>>
>> --
>> Tyler Hobbs
>> DataStax <http://datastax.com/>
>>
>

Re: timeuuid and cql3 query

Posted by Davide Anastasia <da...@gmail.com>.

Hi Tyler,
I am interested in this scenario as well: could you please elaborate
further your answer?

Thanks a lot,
Davide
On 19 Jun 2013 16:01, "Tyler Hobbs" <ty...@datastax.com> wrote:

>
> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:
>
>>
>>  CREATE TABLE count3 (
>>   counter text,
>>   ts timeuuid,
>>   key1 text,
>>   value int,
>>   PRIMARY KEY ((counter, ts))
>> )
>>
>
> Instead of doing a composite partition key, remove a set of parens and let
> ts be your clustering key.  That will cause cql rows to be stored in sorted
> order by the ts column (for a given value of "counter") and allow you to do
> the kind of query you're looking for.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>

Re: timeuuid and cql3 query

Posted by "Ryan, Brent" <BR...@cvent.com>.

Yes.  The problem is that I can't use "counter" as the partition key otherwise I'd wind up with hot spots in my cluster where majority of the data is being written to single node in the cluster.  The only real way around this problem with Cassandra is to follow along with what this blog does:

http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

From: Eric Stevens <mi...@gmail.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Friday, June 21, 2013 8:38 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query

It's my understanding that if cardinality of the first part of the primary key has low cardinality, you will struggle with cluster balance as (unless you use WITH COMPACT STORAGE) the first entry of the primary key equates to the row key from the traditional interface, thus all entries related to a single value for the "counter" column will map to the same partition.

So consider the cardinality of this field, if cardinality is low, you might need to remodel with PRIMARY KEY (counter, ts, key1) then tack on WITH COMPACT STORAGE (then the entire primary key becomes the row key, but you can only have one column which is not part of the primary key)  If cardinality of "counter" is high, then you have nothing to worry about.

On Wed, Jun 19, 2013 at 3:16 PM, Francisco Andrades Grassi <bi...@gmail.com>> wrote:
Hi,

I believe what he's recommending is:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (counter, ts)
)

That way counter will be your partitioning key, and all the rows that have the same counter value will be clustered (stored as a single wide row sorted by the ts value). In this scenario the query:

 where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:00') and ts < minTimeuuid('2013-06-18 16:24:00');

would actually be a sequential read on a wide row on a single node.

--
Francisco Andrades Grassi
www.bigjocker.com<http://www.bigjocker.com/>
@bigjocker

On Jun 19, 2013, at 12:17 PM, "Ryan, Brent" <BR...@cvent.com>> wrote:

Tyler,

You're recommending this schema instead, correct?

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
)

I believe I tried this as well and ran into similar problems but I'll try it again.  I'm using the "ByteOrderedPartitioner" if that helps with the latest version of DSE community edition which I believe is Cassandra 1.2.3.

Thanks,
Brent

From: Tyler Hobbs <ty...@datastax.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 11:00 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query

On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com>> wrote:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.

--
Tyler Hobbs
DataStax<http://datastax.com/>

Re: timeuuid and cql3 query

Posted by Eric Stevens <mi...@gmail.com>.

It's my understanding that if cardinality of the first part of the primary
key has low cardinality, you will struggle with cluster balance as (unless
you use WITH COMPACT STORAGE) the first entry of the primary key equates to
the row key from the traditional interface, thus all entries related to a
single value for the "counter" column will map to the same partition.

So consider the cardinality of this field, if cardinality is low, you might
need to remodel with PRIMARY KEY (counter, ts, key1) then tack on WITH
COMPACT STORAGE (then the entire primary key becomes the row key, but you
can only have one column which is not part of the primary key)  If
cardinality of "counter" is high, then you have nothing to worry about.


On Wed, Jun 19, 2013 at 3:16 PM, Francisco Andrades Grassi <
bigjocker@gmail.com> wrote:

> Hi,
>
> I believe what he's recommending is:
>
> CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (counter, ts)
> )
>
> That way *counter* will be your partitioning key, and all the rows that
> have the same *counter* value will be clustered (stored as a single wide
> row sorted by the *ts* value). In this scenario the query:
>
>  where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:00') and ts
> < minTimeuuid('2013-06-18 16:24:00');
>
> would actually be a sequential read on a wide row on a single node.
>
> --
> Francisco Andrades Grassi
> www.bigjocker.com
> @bigjocker
>
> On Jun 19, 2013, at 12:17 PM, "Ryan, Brent" <BR...@cvent.com> wrote:
>
>  Tyler,
>
>  You're recommending this schema instead, correct?
>
>  CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> )
>
>  I believe I tried this as well and ran into similar problems but I'll
> try it again.  I'm using the "ByteOrderedPartitioner" if that helps with
> the latest version of DSE community edition which I believe is Cassandra
> 1.2.3.
>
>
>  Thanks,
> Brent
>
>
>   From: Tyler Hobbs <ty...@datastax.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 11:00 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>
> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:
>
>>
>>  CREATE TABLE count3 (
>>   counter text,
>>   ts timeuuid,
>>   key1 text,
>>   value int,
>>   PRIMARY KEY ((counter, ts))
>> )
>>
>
> Instead of doing a composite partition key, remove a set of parens and let
> ts be your clustering key.  That will cause cql rows to be stored in sorted
> order by the ts column (for a given value of "counter") and allow you to do
> the kind of query you're looking for.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
>
>

Re: timeuuid and cql3 query

Posted by Francisco Andrades Grassi <bi...@gmail.com>.

Hi,

I believe what he's recommending is:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (counter, ts)
)

That way counter will be your partitioning key, and all the rows that have the same counter value will be clustered (stored as a single wide row sorted by the ts value). In this scenario the query:

 where counter = 'test' and ts > minTimeuuid('2013-06-18 16:23:00') and ts < minTimeuuid('2013-06-18 16:24:00');

would actually be a sequential read on a wide row on a single node.

--
Francisco Andrades Grassi
www.bigjocker.com
@bigjocker

On Jun 19, 2013, at 12:17 PM, "Ryan, Brent" <BR...@cvent.com> wrote:

> Tyler,
> 
> You're recommending this schema instead, correct?
> 
> CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> )
> 
> I believe I tried this as well and ran into similar problems but I'll try it again.  I'm using the "ByteOrderedPartitioner" if that helps with the latest version of DSE community edition which I believe is Cassandra 1.2.3.
> 
> 
> Thanks,
> Brent
> 
> 
> From: Tyler Hobbs <ty...@datastax.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 11:00 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
> 
> 
> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:
> 
> CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY ((counter, ts))
> )
> 
> Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.
> 
> 
> -- 
> Tyler Hobbs
> DataStax

Re: timeuuid and cql3 query

Posted by Sylvain Lebresne <sy...@datastax.com>.

So part of it is a bug, namely
https://issues.apache.org/jira/browse/CASSANDRA-5666. In summary CQL3
should not accept: ts > minTimeuuid('2013-06-17 22:36:16') and ts <
minTimeuuid('2013-06-20 22:44:02'), because it does no know how to handle
it properly. What it should support is token(ts) >
token(minTimeuuid('2013-06-17 22:36:16')) and token(ts) <
token(minTimeuuid('2013-06-20 22:44:02')). And that is different because
the token always sort by bytes, and comparing timeuuid by bytes does not
yield a time based ordering.

Long story short, using non-equal condition on the partition key (i.e. the
first part of your primary key) is generally not advised. Or to put it
another way, the use of the byte ordering partitioner is discouraged. But
if you still want to use the ordering partitioner and do range queries on
the partition key, do not use a timeuuid, because the ordering that the
partitioner enforce will not be one that is meaningful (due to the timeuuid
layout).

--
Sylvain



On Wed, Jun 19, 2013 at 7:04 PM, Ryan, Brent <BR...@cvent.com> wrote:

>  Note that it seems to work when you structure your schema in this
> example below, BUT this is a problem because all of my data will wind up
> hitting a single node in my cassandra cluster because the partitioning key
> is "counter" and that isn't unique enough.  I was hoping that I wasn't
> going to need to build up my own "sharding" scheme as this blog talks about
> (http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra)
> because this becomes much harder for other clients to integrate with
> because they now need to know how my data is structured in order to get it
> out.
>
>  CREATE TABLE count5 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (counter, ts)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count5 where
> counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts <
> minTimeuuid('2013-06-18 22:44:02');
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:43:53-0400 |    1 |     1
>     test | 2013-06-18 22:43:54-0400 |    1 |     1
>     test | 2013-06-18 22:43:55-0400 |    1 |     1
>     test | 2013-06-18 22:43:56-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:59-0400 |    1 |     1
>     test | 2013-06-18 22:44:00-0400 |    1 |     1
>     test | 2013-06-18 22:44:01-0400 |    1 |     1
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count5 where
> counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts <
> minTimeuuid('2013-06-20 22:44:02');
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:43:53-0400 |    1 |     1
>     test | 2013-06-18 22:43:54-0400 |    1 |     1
>     test | 2013-06-18 22:43:55-0400 |    1 |     1
>     test | 2013-06-18 22:43:56-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:59-0400 |    1 |     1
>     test | 2013-06-18 22:44:00-0400 |    1 |     1
>     test | 2013-06-18 22:44:01-0400 |    1 |     1
>     test | 2013-06-18 22:44:02-0400 |    1 |     1
>     test | 2013-06-18 22:44:02-0400 |    1 |     1
>     test | 2013-06-18 22:44:03-0400 |    1 |     1
>     test | 2013-06-18 22:44:04-0400 |    1 |     1
>     test | 2013-06-18 22:44:05-0400 |    1 |     1
>     test | 2013-06-18 22:44:06-0400 |    1 |     1
>
>
>   From: <Ryan>, Brent Ryan <br...@cvent.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 12:56 PM
>
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>   Here's an example of that not working:
>
>  cqlsh:Test> desc table count4;
>
>  CREATE TABLE count4 (
>   ts timeuuid,
>   counter text,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count4;
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:36:16-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:19-0400 |    1 |     1
>     test | 2013-06-18 22:36:19-0400 |    1 |     1
>     test | 2013-06-18 22:36:20-0400 |    1 |     1
>     test | 2013-06-18 22:36:20-0400 |    1 |     1
>     test | 2013-06-18 22:36:21-0400 |    1 |     1
>     test | 2013-06-18 22:36:21-0400 |    1 |     1
>     test | 2013-06-18 22:36:22-0400 |    1 |     1
>     test | 2013-06-18 22:36:22-0400 |    1 |     1
>     test | 2013-06-18 22:36:23-0400 |    1 |     1
>     test | 2013-06-18 22:36:23-0400 |    1 |     1
>     test | 2013-06-18 22:36:25-0400 |    1 |     1
>     test | 2013-06-18 22:36:27-0400 |    1 |     1
>     test | 2013-06-18 22:36:28-0400 |    1 |     1
>
>  cqlsh:Statistics> select counter,dateof(ts),key1,value from count4 where
> ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-19
> 22:36:20');
> Bad Request: 2 Start key must sort before (or equal to) finish key in your
> partitioner!
>
>
>
>  Any ideas?  Seems like a bug to me, right?
>
>  Brent
>
>   From: <Ryan>, Brent Ryan <br...@cvent.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 12:47 PM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>   Tyler,
>
>  You're recommending this schema instead, correct?
>
>  CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> )
>
>  I believe I tried this as well and ran into similar problems but I'll
> try it again.  I'm using the "ByteOrderedPartitioner" if that helps with
> the latest version of DSE community edition which I believe is Cassandra
> 1.2.3.
>
>
>  Thanks,
> Brent
>
>
>   From: Tyler Hobbs <ty...@datastax.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 11:00 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>
> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:
>
>>
>>  CREATE TABLE count3 (
>>   counter text,
>>   ts timeuuid,
>>   key1 text,
>>   value int,
>>   PRIMARY KEY ((counter, ts))
>> )
>>
>
> Instead of doing a composite partition key, remove a set of parens and let
> ts be your clustering key.  That will cause cql rows to be stored in sorted
> order by the ts column (for a given value of "counter") and allow you to do
> the kind of query you're looking for.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>

Re: timeuuid and cql3 query

Posted by "Ryan, Brent" <BR...@cvent.com>.

Note that it seems to work when you structure your schema in this example below, BUT this is a problem because all of my data will wind up hitting a single node in my cassandra cluster because the partitioning key is "counter" and that isn't unique enough.  I was hoping that I wasn't going to need to build up my own "sharding" scheme as this blog talks about (http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra) because this becomes much harder for other clients to integrate with because they now need to know how my data is structured in order to get it out.

CREATE TABLE count5 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (counter, ts)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh:Test> select counter,dateof(ts),key1,value from count5 where counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-18 22:44:02');

 counter | dateof(ts)               | key1 | value
---------+--------------------------+------+-------
    test | 2013-06-18 22:43:53-0400 |    1 |     1
    test | 2013-06-18 22:43:54-0400 |    1 |     1
    test | 2013-06-18 22:43:55-0400 |    1 |     1
    test | 2013-06-18 22:43:56-0400 |    1 |     1
    test | 2013-06-18 22:43:58-0400 |    1 |     1
    test | 2013-06-18 22:43:58-0400 |    1 |     1
    test | 2013-06-18 22:43:59-0400 |    1 |     1
    test | 2013-06-18 22:44:00-0400 |    1 |     1
    test | 2013-06-18 22:44:01-0400 |    1 |     1

cqlsh:Test> select counter,dateof(ts),key1,value from count5 where counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-20 22:44:02');

 counter | dateof(ts)               | key1 | value
---------+--------------------------+------+-------
    test | 2013-06-18 22:43:53-0400 |    1 |     1
    test | 2013-06-18 22:43:54-0400 |    1 |     1
    test | 2013-06-18 22:43:55-0400 |    1 |     1
    test | 2013-06-18 22:43:56-0400 |    1 |     1
    test | 2013-06-18 22:43:58-0400 |    1 |     1
    test | 2013-06-18 22:43:58-0400 |    1 |     1
    test | 2013-06-18 22:43:59-0400 |    1 |     1
    test | 2013-06-18 22:44:00-0400 |    1 |     1
    test | 2013-06-18 22:44:01-0400 |    1 |     1
    test | 2013-06-18 22:44:02-0400 |    1 |     1
    test | 2013-06-18 22:44:02-0400 |    1 |     1
    test | 2013-06-18 22:44:03-0400 |    1 |     1
    test | 2013-06-18 22:44:04-0400 |    1 |     1
    test | 2013-06-18 22:44:05-0400 |    1 |     1
    test | 2013-06-18 22:44:06-0400 |    1 |     1


From: <Ryan>, Brent Ryan <br...@cvent.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 12:56 PM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query

Here's an example of that not working:

cqlsh:Test> desc table count4;

CREATE TABLE count4 (
  ts timeuuid,
  counter text,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh:Test> select counter,dateof(ts),key1,value from count4;

 counter | dateof(ts)               | key1 | value
---------+--------------------------+------+-------
    test | 2013-06-18 22:36:16-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:19-0400 |    1 |     1
    test | 2013-06-18 22:36:19-0400 |    1 |     1
    test | 2013-06-18 22:36:20-0400 |    1 |     1
    test | 2013-06-18 22:36:20-0400 |    1 |     1
    test | 2013-06-18 22:36:21-0400 |    1 |     1
    test | 2013-06-18 22:36:21-0400 |    1 |     1
    test | 2013-06-18 22:36:22-0400 |    1 |     1
    test | 2013-06-18 22:36:22-0400 |    1 |     1
    test | 2013-06-18 22:36:23-0400 |    1 |     1
    test | 2013-06-18 22:36:23-0400 |    1 |     1
    test | 2013-06-18 22:36:25-0400 |    1 |     1
    test | 2013-06-18 22:36:27-0400 |    1 |     1
    test | 2013-06-18 22:36:28-0400 |    1 |     1

cqlsh:Statistics> select counter,dateof(ts),key1,value from count4 where ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-19 22:36:20');
Bad Request: 2 Start key must sort before (or equal to) finish key in your partitioner!



Any ideas?  Seems like a bug to me, right?

Brent

From: <Ryan>, Brent Ryan <br...@cvent.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 12:47 PM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query

Tyler,

You're recommending this schema instead, correct?

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
)

I believe I tried this as well and ran into similar problems but I'll try it again.  I'm using the "ByteOrderedPartitioner" if that helps with the latest version of DSE community edition which I believe is Cassandra 1.2.3.


Thanks,
Brent


From: Tyler Hobbs <ty...@datastax.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 11:00 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query


On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com>> wrote:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.


--
Tyler Hobbs
DataStax<http://datastax.com/>

Re: timeuuid and cql3 query

Posted by "Ryan, Brent" <BR...@cvent.com>.

Here's an example of that not working:

cqlsh:Test> desc table count4;

CREATE TABLE count4 (
  ts timeuuid,
  counter text,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh:Test> select counter,dateof(ts),key1,value from count4;

 counter | dateof(ts)               | key1 | value
---------+--------------------------+------+-------
    test | 2013-06-18 22:36:16-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:18-0400 |    1 |     1
    test | 2013-06-18 22:36:19-0400 |    1 |     1
    test | 2013-06-18 22:36:19-0400 |    1 |     1
    test | 2013-06-18 22:36:20-0400 |    1 |     1
    test | 2013-06-18 22:36:20-0400 |    1 |     1
    test | 2013-06-18 22:36:21-0400 |    1 |     1
    test | 2013-06-18 22:36:21-0400 |    1 |     1
    test | 2013-06-18 22:36:22-0400 |    1 |     1
    test | 2013-06-18 22:36:22-0400 |    1 |     1
    test | 2013-06-18 22:36:23-0400 |    1 |     1
    test | 2013-06-18 22:36:23-0400 |    1 |     1
    test | 2013-06-18 22:36:25-0400 |    1 |     1
    test | 2013-06-18 22:36:27-0400 |    1 |     1
    test | 2013-06-18 22:36:28-0400 |    1 |     1

cqlsh:Statistics> select counter,dateof(ts),key1,value from count4 where ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-19 22:36:20');
Bad Request: 2 Start key must sort before (or equal to) finish key in your partitioner!



Any ideas?  Seems like a bug to me, right?

Brent

From: <Ryan>, Brent Ryan <br...@cvent.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 12:47 PM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query

Tyler,

You're recommending this schema instead, correct?

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
)

I believe I tried this as well and ran into similar problems but I'll try it again.  I'm using the "ByteOrderedPartitioner" if that helps with the latest version of DSE community edition which I believe is Cassandra 1.2.3.


Thanks,
Brent


From: Tyler Hobbs <ty...@datastax.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 11:00 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query


On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com>> wrote:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.


--
Tyler Hobbs
DataStax<http://datastax.com/>

Re: timeuuid and cql3 query

Posted by "Ryan, Brent" <BR...@cvent.com>.

Tyler,

You're recommending this schema instead, correct?

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY (ts, counter)
)

I believe I tried this as well and ran into similar problems but I'll try it again.  I'm using the "ByteOrderedPartitioner" if that helps with the latest version of DSE community edition which I believe is Cassandra 1.2.3.


Thanks,
Brent


From: Tyler Hobbs <ty...@datastax.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, June 19, 2013 11:00 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: timeuuid and cql3 query


On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com>> wrote:

CREATE TABLE count3 (
  counter text,
  ts timeuuid,
  key1 text,
  value int,
  PRIMARY KEY ((counter, ts))
)

Instead of doing a composite partition key, remove a set of parens and let ts be your clustering key.  That will cause cql rows to be stored in sorted order by the ts column (for a given value of "counter") and allow you to do the kind of query you're looking for.


--
Tyler Hobbs
DataStax<http://datastax.com/>

Re: timeuuid and cql3 query

Posted by Tyler Hobbs <ty...@datastax.com>.

On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BR...@cvent.com> wrote:

>
>  CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY ((counter, ts))
> )
>

Instead of doing a composite partition key, remove a set of parens and let
ts be your clustering key.  That will cause cql rows to be stored in sorted
order by the ts column (for a given value of "counter") and allow you to do
the kind of query you're looking for.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>