Posted to user@cassandra.apache.org by Ramesh Natarajan <ra...@gmail.com> on 2011/09/26 18:51:33 UTC

reverse range query performance

Hi,

I am trying to use a range query to retrieve a set of columns in reverse
order. The API documentation describes a bool parameter, reversed, which
should return the results in reverse order.

Let's say my row has about 1500 columns with column names 1 to 1500, and I
query for columns 1500 (start) to 1400 (finish) with reversed set to true.

Does Cassandra read the entire row (columns 1 to 1500) and then return
columns 1400 to 1500, or is it optimized to look directly at columns 1400
to 1500?

thanks
Ramesh


SliceRange

A SliceRange is a structure that stores basic range, ordering and limit
information for a query that will return multiple columns. It could be
thought of as Cassandra's version of LIMIT and ORDER BY.

Attribute  Type     Default  Required  Description
---------  -------  -------  --------  ------------------------------------------
start      binary   n/a      Y         The column name to start the slice with.
                                       This attribute is not required, though
                                       there is no default value, and can be
                                       safely set to '', i.e., an empty byte
                                       array, to start with the first column
                                       name. Otherwise, it must be a valid value
                                       under the rules of the Comparator defined
                                       for the given ColumnFamily.

finish     binary   n/a      Y         The column name to stop the slice at.
                                       This attribute is not required, though
                                       there is no default value, and can be
                                       safely set to an empty byte array to not
                                       stop until count results are seen.
                                       Otherwise, it must also be a valid value
                                       to the ColumnFamily Comparator.

reversed   bool     false    Y         Whether the results should be ordered in
                                       reversed order. Similar to
                                       ORDER BY blah DESC in SQL.

count      integer  100      Y         How many columns to return. Similar to
                                       LIMIT 100 in SQL. May be arbitrarily
                                       large, but Thrift will materialize the
                                       whole result into memory before returning
                                       it to the client, so be aware that you
                                       may be better served by iterating through
                                       slices by passing the last value of one
                                       call in as the start of the next instead
                                       of increasing count arbitrarily large.
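
To make the SliceRange attributes concrete, here is a minimal sketch in Python that models them over an in-memory sorted list. It is a toy model, not the Thrift client API: get_slice and iter_pages are hypothetical helper names, integers stand in for binary column names, and None stands in for the empty byte array. It also shows the paging approach from the count description, where the last name of one page becomes the start of the next.

```python
from bisect import bisect_left, bisect_right


def get_slice(names, start=None, finish=None, reversed_=False, count=100):
    """Toy model of a SliceRange over a sorted list of column names.

    None stands in for the empty byte array (no bound on that side).
    Both start and finish are inclusive.
    """
    if not reversed_:
        lo = 0 if start is None else bisect_left(names, start)
        hi = len(names) if finish is None else bisect_right(names, finish)
        return names[lo:hi][:count]
    # Reversed: start is the upper bound and finish is the lower bound.
    hi = len(names) if start is None else bisect_right(names, start)
    lo = 0 if finish is None else bisect_left(names, finish)
    return names[lo:hi][::-1][:count]


def iter_pages(names, start, finish, reversed_, page_size):
    """Yield pages, restarting each call at the last name already seen."""
    first = True
    while True:
        # start is inclusive, so later calls ask for one extra column ...
        want = page_size if first else page_size + 1
        page = get_slice(names, start, finish, reversed_, want)
        if not first:
            page = page[1:]  # ... and drop the one repeated from the last page
        if not page:
            return
        yield page
        start = page[-1]
        first = False


# Usage: the row from the question, columns named 1 to 1500.
names = list(range(1, 1501))
for page in iter_pages(names, 1500, 1400, True, 25):
    print(page[0], "..", page[-1])
```

Because start is inclusive, each page after the first repeats the previous page's last column, which is why the sketch requests one extra column and discards it.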

Re: reverse range query performance

Posted by aaron morton <aa...@thelastpickle.com>.
It does not matter too much, but are you looking to get all the columns for some known keys (get_slice, multiget_slice)? Or are you getting the columns for keys within a range (get_range_slices)?

If you do a reversed query, the server will skip to the "end" of the column range. Here is some info I wrote about how the different slice predicates work: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
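
As a rough illustration of the difference between skipping to the end of the range and reading the whole row, here is a small Python sketch over an in-memory sorted list. It is only a model under assumptions: the function names are hypothetical, integers stand in for column names, and it says nothing about how the server actually locates columns on disk (the linked post covers that).

```python
from bisect import bisect_right


def reversed_slice_by_scan(names, start, finish):
    """Read every column, then filter and reverse (the naive approach)."""
    touched = 0
    hits = []
    for name in names:
        touched += 1
        if finish <= name <= start:
            hits.append(name)
    return hits[::-1], touched


def reversed_slice_by_seek(names, start, finish):
    """Binary-search to start, then walk backwards until finish.

    touched counts only the columns that are read and returned; the
    handful of comparisons made by the binary search are not counted.
    """
    i = bisect_right(names, start) - 1
    touched = 0
    hits = []
    while i >= 0 and names[i] >= finish:
        hits.append(names[i])
        touched += 1
        i -= 1
    return hits, touched


# Usage: the row from the question, columns named 1 to 1500.
names = list(range(1, 1501))
print(reversed_slice_by_scan(names, 1500, 1400)[1])  # columns touched by a full scan
print(reversed_slice_by_seek(names, 1500, 1400)[1])  # columns touched after seeking
```

Both functions return the same columns in the same order; only the number of columns touched differs.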


Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
