Posted to user@cassandra.apache.org by aaron morton <aa...@thelastpickle.com> on 2012/06/01 03:32:04 UTC

Re: About Composite range queries

> If you hash 4 composite keys, let's say ('A','B','C'), ('A','D','C'), ('A','E','X'), ('A','R','X'), you have only 4 hashes or you have more?
Four

> If it's 4, how come you are able to range query for example between start_column=('A', 'D') and end_column=('A','E') and get this column ('A','D','C')

That's a slice query against columns; the column name is not hashed. Column names are sorted according to the comparator, which can differ from the raw byte order.
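
As a rough illustration of why no extra hashes or intermediate keys are needed, here is a minimal sketch in plain Python (not pycassa, and not Cassandra's actual comparator). It assumes the column names of one row are kept sorted part by part, which is also how Python compares tuples; `slice_columns` is a hypothetical helper, and real composite slices have extra rules about inclusive and exclusive bounds that are ignored here.

```python
from bisect import bisect_left, bisect_right

# Column names inside a single row, kept in comparator order.
# Assumption: the comparator orders composites part by part,
# which is the same way Python orders tuples.
columns = sorted([
    ("A", "B", "C"),
    ("A", "D", "C"),
    ("A", "E", "X"),
    ("A", "R", "X"),
])


def slice_columns(sorted_names, start, end):
    """Return every name n with start <= n <= end (plain tuple comparison).

    Hypothetical helper: a slice is just a lookup in an already sorted
    sequence, so nothing is hashed and no extra keys are stored.
    """
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, end)
    return sorted_names[lo:hi]


# A shorter tuple sorts before any longer tuple that it prefixes, so
# ("A", "D") sits just before ("A", "D", "C"), and ("A", "E") sits
# just before ("A", "E", "X").
result = slice_columns(columns, ("A", "D"), ("A", "E"))
print(result)
```

In this simplified model the slice from ('A','D') to ('A','E') contains only ('A','D','C'), because ('A','E','X') sorts after the end bound.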

A range query is against rows. Row keys are hashed (when using the RandomPartitioner) to create tokens, and rows are stored in token order.
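
The row side can be sketched in the same spirit. This is only an approximation: the RandomPartitioner is MD5 based, but the `token` function below is a simplified stand-in for it, and `pack` is a hypothetical serialization of a composite key, not Cassandra's on-disk or wire format.

```python
import hashlib


def pack(parts):
    """Hypothetical serialization of a composite key into bytes."""
    return b":".join(p.encode("utf-8") for p in parts)


def token(row_key: bytes) -> int:
    """Simplified stand-in for an MD5-based partitioner token.

    The partitioner only sees bytes; it does not know or care that
    the key is made of several parts.
    """
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


keys = [("A", "B", "C"), ("A", "D", "C"), ("A", "E", "X"), ("A", "R", "X")]

# One hash per key, however many parts the composite has.
tokens = {k: token(pack(k)) for k in keys}
print(len(tokens))

# Rows are placed by token, so their order is generally unrelated
# to the order of the key values themselves.
by_token = sorted(keys, key=tokens.get)
print(by_token)
```

Four composite row keys give four tokens. A range query walks rows in token order, which is why it does not return rows in key order under this kind of partitioner.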

> the composites are like chapters between the whole keys set, there must be intermediate keys added?

Not sure what you mean. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/06/2012, at 12:52 AM, Cyril Auburtin wrote:

> but sorry, I don't understand
> 
> If you hash 4 composite keys, let's say ('A','B','C'), ('A','D','C'), ('A','E','X'), ('A','R','X'), you have only 4 hashes or you have more?
> 
> If it's 4, how come you are able to range query for example between start_column=('A', 'D') and end_column=('A','E') and get this column ('A','D','C')
> 
> the composites are like chapters between the whole keys set, there must be intermediate keys added?
> 
> 
> 2012/5/31 aaron morton <aa...@thelastpickle.com>
> it is hashed once. 
> 
> To the partitioner it's just some bytes. Other parts of the code care about its structure.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 31/05/2012, at 7:00 PM, Cyril Auburtin wrote:
> 
>> Thx for the answer
>> One more thing: a composite key is not hashed only once, I guess?
>> Is it hashed once for each part of the composite?
>> So does this mean there are two or three or ... times as many keys as for normal column keys?
>> 
>> On 31 May 2012 02:59, "aaron morton" <aa...@thelastpickle.com> wrote:
>> Composite Columns compare each part in turn, so the values are ordered as you've shown them. 
>> 
>> However, the rows are not ordered according to key value. They are ordered using the random token generated by the partitioner; see http://wiki.apache.org/cassandra/FAQ#range_rp
>> 
>>> What is the real advantage compared to super column families?
>> They are faster. 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 29/05/2012, at 10:08 PM, Cyril Auburtin wrote:
>> 
>>> How does Cassandra make it possible to range query on a composite key?
>>> 
>>> "key1" => (A:A:C), (A:B:C), (A:C:C), (A:D:C), (B:A:C)
>>> 
>>> like get_range ("key1", start_column=(A,"), end_column=(A, C)); will return [ (A:B:C), (A:C:C) ] (in pycassa)
>>> 
>>> I mean does the composite implementation add much overhead to make it work?
>>> Does it need additional column families to be able to range query on the individual parts of the composite (first, second and third)?
>>> 
>>> What is the real advantage compared to super column families?
>>> 
>>> "key1" => A: (A,C), (B,C), (C,C), (D,C)  , B: (A,C)
>>> 
>>> thx
>> 
> 
> 


Re: About Composite range queries

Posted by Cyril Auburtin <cy...@gmail.com>.
OK, sorry, I thought columns inside a row had their keys hashed as well.
So they are just stored as raw bytes.

thx
