Posted to user@cassandra.apache.org by Ravikumar Govindarajan <ra...@gmail.com> on 2012/11/15 12:39:59 UTC

Offsets and Range Queries

Usually we do a SELECT * FROM .... ORDER BY .... LIMIT 26,25 for pagination
purposes, but specifying an offset is not available for range queries in
Cassandra.

I always have to specify a start-key to achieve this. Are there reasons for
choosing such an approach rather than providing an absolute offset?

--
Ravi

Re: Offsets and Range Queries

Posted by aaron morton <aa...@thelastpickle.com>.
> I assume it's because of the read-time iterators, which go over the results merging/reducing/collating them one by one and so are not well suited to jumping to arbitrary offsets, given the practically huge number of columns involved, right?
Not really. You can have a slice that starts in the middle of a row of 10 million columns just by using a start column.

Having a slice operation that is constrained in size improves the overall throughput of the server and reduces (JVM) GC churn on the server.
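
As a rough illustration of the two points above (a sketch only, not Cassandra's actual read path): if the column names within a row are kept sorted, a slice can seek straight to a start column and return at most a fixed number of columns. The `slice_columns` helper and the in-memory list are hypothetical stand-ins for a wide row.

```python
from bisect import bisect_left


def slice_columns(sorted_names, start, count):
    """Return up to `count` column names >= `start` from a sorted list.

    Hypothetical helper that models a size-bounded slice over one wide row.
    """
    i = bisect_left(sorted_names, start)  # binary search, no scan from the first column
    return sorted_names[i:i + count]      # never more than `count` results


# A row of 10 million columns would behave the same way; 100,000 keeps the demo quick.
row = [f"col{n:08d}" for n in range(100_000)]
page = slice_columns(row, "col00050000", 25)
print(page[0], page[-1], len(page))
```

The caller only ever holds one bounded page in memory, which is the property the size limit is meant to protect.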

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/11/2012, at 7:02 PM, Ravikumar Govindarajan <ra...@gmail.com> wrote:

> Thanks Ed, for the clarifications
> 
> Yes, you are correct that when using absolute offsets the apps have to handle repeatable reads, not the databases themselves, but SQL databases do provide such an option at the app's peril!
> 
> "Slices have a fixed size, this ensures that the "query" does not execute for arbitrary lengths of time."
> 
> I assume it's because of the read-time iterators, which go over the results merging/reducing/collating them one by one and so are not well suited to jumping to arbitrary offsets, given the practically huge number of columns involved, right? Did I understand it correctly?
> 
> We are now faced with persisting each page's first and last key for prev/next navigation. The problem quickly gets complex when we have to support multiple pages per user. I just wanted to know if there are any known workarounds for this.
> 
> --
> Ravi
> 


Re: Offsets and Range Queries

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Thanks Ed, for the clarifications

Yes, you are correct that when using absolute offsets the apps have to
handle repeatable reads, not the databases themselves, but SQL databases do
provide such an option at the app's peril!

"Slices have a fixed size, this ensures that the "query" does not
execute for arbitrary lengths of time."

I assume it's because of the read-time iterators, which go over the results
merging/reducing/collating them one by one and so are not well suited to
jumping to arbitrary offsets, given the practically huge number of columns
involved, right? Did I understand it correctly?

We are now faced with persisting each page's first and last key for
prev/next navigation. The problem quickly gets complex when we have to
support multiple pages per user. I just wanted to know if there are any
known workarounds for this.
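
To make the prev/next requirement concrete, here is a minimal sketch of navigation that needs only the first and last key of the page currently shown. It assumes the store can return a bounded slice after a key and a bounded slice before a key; whether a reverse traversal is available for a given range query is an assumption, and the sorted in-memory list and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def next_page(keys, last_key, size):
    """Up to `size` keys strictly after `last_key` (None means start from the beginning)."""
    i = bisect_right(keys, last_key) if last_key is not None else 0
    return keys[i:i + size]


def prev_page(keys, first_key, size):
    """Up to `size` keys strictly before `first_key`, returned in ascending order."""
    j = bisect_left(keys, first_key)
    return keys[max(0, j - size):j]


keys = [f"k{n:03d}" for n in range(100)]  # stand-in for a sorted key range
p1 = next_page(keys, None, 25)            # first page
p2 = next_page(keys, p1[-1], 25)          # "next" uses the last key of page 1
back = prev_page(keys, p2[0], 25)         # "prev" uses the first key of page 2
print(p2[0], p2[-1], back == p1)
```

With this shape, each open page for a user is described by just two keys, which could travel with the client request instead of being persisted.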

--
Ravi

On Thu, Nov 15, 2012 at 9:03 PM, Edward Capriolo <ed...@gmail.com> wrote:

> There are several reasons. First, there is no "absolute offset". The
> rows are sorted by the data. If someone inserts new data between your
> previous query and this one, the rows have changed.
>
> Unless you are doing select queries inside a transaction with repeatable
> read, and your database supports this, the query you mention does not
> really have "absolute offsets" either. The results of the query can
> change between reads.
>
> In Cassandra we do not execute large queries (that might result in
> temp tables or whatever) and allow you to page them. Slices have a
> fixed size; this ensures that the "query" does not execute for
> arbitrary lengths of time.

Re: Offsets and Range Queries

Posted by Edward Capriolo <ed...@gmail.com>.
There are several reasons. First, there is no "absolute offset". The
rows are sorted by the data. If someone inserts new data between your
previous query and this one, the rows have changed.

Unless you are doing select queries inside a transaction with repeatable
read, and your database supports this, the query you mention does not
really have "absolute offsets" either. The results of the query can
change between reads.
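
A small sketch of the effect described above, using a sorted in-memory list as a stand-in for the rows (the function names are hypothetical, and this is not how any particular database implements paging): when a row is inserted between two page reads, offset-based paging repeats a row, while paging from the last key seen does not.

```python
from bisect import bisect_right, insort


def page_by_offset(rows, offset, limit):
    """Offset paging: position is counted from the start on every read."""
    return rows[offset:offset + limit]


def page_by_key(rows, after_key, limit):
    """Key paging: continue strictly after the last key already seen."""
    i = bisect_right(rows, after_key) if after_key is not None else 0
    return rows[i:i + limit]


rows = ["b", "d", "f", "h", "j", "l"]
first = page_by_offset(rows, 0, 3)               # ['b', 'd', 'f']
insort(rows, "a")                                # an insert lands before page 2 is read
second_by_offset = page_by_offset(rows, 3, 3)    # ['f', 'h', 'j'], 'f' shown twice
second_by_key = page_by_key(rows, first[-1], 3)  # ['h', 'j', 'l']
print(second_by_offset, second_by_key)
```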

In Cassandra we do not execute large queries (that might result in
temp tables or whatever) and allow you to page them. Slices have a
fixed size; this ensures that the "query" does not execute for
arbitrary lengths of time.


On Thu, Nov 15, 2012 at 6:39 AM, Ravikumar Govindarajan
<ra...@gmail.com> wrote:
> Usually we do a SELECT * FROM .... ORDER BY .... LIMIT 26,25 for pagination
> purposes, but specifying an offset is not available for range queries in
> Cassandra.
>
> I always have to specify a start-key to achieve this. Are there reasons for
> choosing such an approach rather than providing an absolute offset?
>
> --
> Ravi