Posted to user@cassandra.apache.org by Philippe <wa...@gmail.com> on 2012/01/16 22:58:53 UTC

Hector + Range query problem

Hello,
I've been trying to retrieve rows based on a key range, but every single time
I test, Hector retrieves ALL the rows, no matter what range I give it.
What can I possibly be doing wrong? Thanks.

I'm testing on a single-node, RF=1 cluster (C* 1.0.5) with one column
family (I've added and truncated the CF quite a few times during my tests).
Each row has a single column whose name is the byte value "2". The keys are
0,1,2,3 (shifted by a number of bits). The values are 0,1,2,3.
Running "list" in the CLI gives me:

Using default limit of 100
-------------------
RowKey: 000000000000000002
=> (column=02, value=00, timestamp=1326750723079000)
-------------------
RowKey: 010000000000000002
=> (column=02, value=01, timestamp=1326750723239000)
-------------------
RowKey: 020000000000000002
=> (column=02, value=02, timestamp=1326750723329000)
-------------------
RowKey: 030000000000000002
=> (column=02, value=03, timestamp=1326750723416000)

4 Rows Returned.



Hector code:

RangeSlicesQuery<TileKey, Byte, byte[]> query =
    HFactory.createRangeSlicesQuery(keyspace, keySerializer,
        columnNameSerializer, BytesArraySerializer.get());
query.setColumnFamily(overlay)
     .setKeys(keyStart, keyEnd)
     .setColumnNames((byte) 2);
query.execute();


The execution log shows

1359 [main] INFO  com.sensorly.heatmap.drawing.cassandra.CassandraTileDao  - Range query from TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] to TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] => morton codes = [000000000000000002,010000000000000002]
getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] with 1 columns, morton = 000000000000000002
getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] with 1 columns, morton = 010000000000000002
getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=1, zoom=2] with 1 columns, morton = 020000000000000002
getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=1, zoom=2] with 1 columns, morton = 030000000000000002

=> ALL rows are returned when I really expect it to only return the 1st one.
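
For readers unfamiliar with Morton codes, the log above is consistent with a key built by interleaving the bits of tilex and tiley: (0,0), (1,0), (0,1) and (1,1) map to 0, 1, 2 and 3. The following is a minimal sketch of that idea only. The class name, the 16-bit width and the choice of x in the even bit positions are assumptions inferred from the log, not the poster's actual TileKey code, and the trailing 02 byte of the real keys is not modelled here.

```java
// Illustrative sketch: the bit layout is an assumption inferred from the log output,
// not the actual key serializer used in the thread.
final class MortonSketch {

    /** Interleaves the low 16 bits of x and y: x bits go to even positions, y bits to odd ones. */
    static long interleave(int x, int y) {
        long code = 0;
        for (int bit = 0; bit < 16; bit++) {
            code |= ((long) (x >> bit) & 1L) << (2 * bit);
            code |= ((long) (y >> bit) & 1L) << (2 * bit + 1);
        }
        return code;
    }

    public static void main(String[] args) {
        int[][] tiles = {{0, 0}, {1, 0}, {0, 1}, {1, 1}};
        for (int[] t : tiles) {
            System.out.printf("tilex=%d tiley=%d -> morton=%d%n",
                    t[0], t[1], interleave(t[0], t[1]));
        }
    }
}
```

Under this layout, neighbouring tiles tend to get nearby codes, which is presumably why a key range scan is attractive for this data.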

Re: Hector + Range query problem

Posted by Philippe <wa...@gmail.com>.
Hi Aaron,

Nope: I'm using BOP (ByteOrderedPartitioner); I forgot to mention it in my
original message.

I changed it to a multiget and it works, but I think the range query would be
more efficient, so I'd really like to solve this.
Thanks
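
As a point of reference for the BOP case, rows under a byte-ordered partitioner are sorted by the raw key bytes, compared as unsigned values. The sketch below shows which of the four keys from the CLI listing fall inside the queried range under that ordering. It is a standalone illustration with hypothetical names, not Hector or Cassandra code, and it assumes both range bounds are inclusive (which is the usual understanding of key ranges in get_range_slices).

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration; class and method names are hypothetical.
final class ByteOrderRangeSketch {

    /** Parses a hex string such as "010000000000000002" into bytes. */
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Lexicographic comparison treating each byte as unsigned (0..255). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Returns the keys k with start <= k <= end (both bounds assumed inclusive). */
    static List<String> keysInRange(List<String> hexKeys, String startHex, String endHex) {
        byte[] start = fromHex(startHex);
        byte[] end = fromHex(endHex);
        List<String> hits = new ArrayList<>();
        for (String k : hexKeys) {
            byte[] key = fromHex(k);
            if (compareUnsigned(key, start) >= 0 && compareUnsigned(key, end) <= 0) {
                hits.add(k);
            }
        }
        return hits;
    }
}
```

With the four keys from the listing and the range [000000000000000002, 010000000000000002], this selects the first two keys. If the end key is indeed inclusive, a byte-ordered range scan would be expected to return two rows for this query, not one and not all four.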
On 18 Jan 2012 09:18, "aaron morton" <aa...@thelastpickle.com> wrote:

> Does this help ?
> http://wiki.apache.org/cassandra/FAQ#range_rp
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com

Re: Hector + Range query problem

Posted by aaron morton <aa...@thelastpickle.com>.
Does this help?
http://wiki.apache.org/cassandra/FAQ#range_rp

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
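
For context on the FAQ entry linked above: with RandomPartitioner, rows are placed by a token derived from an MD5 hash of the key, so a scan between two keys walks token order, which is generally unrelated to key order. The sketch below illustrates that with the four keys from this thread. It is a simplified illustration with hypothetical names; the token formula (absolute value of the MD5 digest read as a big integer) is an assumption about how RandomPartitioner derives tokens, not a copy of Cassandra's code.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified illustration; class and method names are hypothetical.
final class Md5TokenSketch {

    /** Parses a hex string such as "010000000000000002" into bytes. */
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Assumed token formula: |MD5(key)| interpreted as a signed big-endian integer. */
    static BigInteger token(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Returns a copy of the keys sorted by token, i.e. the order a token-based scan would follow. */
    static List<String> keysInTokenOrder(List<String> hexKeys) {
        List<String> sorted = new ArrayList<>(hexKeys);
        sorted.sort(Comparator.comparing((String k) -> token(fromHex(k))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("000000000000000002", "010000000000000002",
                "020000000000000002", "030000000000000002");
        for (String k : keysInTokenOrder(keys)) {
            System.out.println(k + " -> token " + token(fromHex(k)));
        }
    }
}
```

Running main prints the keys in token order. Whether that order happens to match key order for these four keys is a matter of chance, which is the point of the FAQ entry: key ranges are only meaningful with an order-preserving partitioner.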
