Posted to user@hbase.apache.org by Mark <st...@gmail.com> on 2011/08/22 03:47:53 UTC

Sorting question

Why when scanning do I see the following sort order?

"foo  bar"
"foo bar"
"foo"

I thought that "foo" would be sorted before "foo bar" since this is 
natural ordering. Why am I seeing these results?

Re: Sorting question

Posted by Mark <st...@gmail.com>.
I was thinking that with 0x00 as the delimiter, I can use 0x01 for the stop key.
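
A minimal sketch of that idea in Python, assuming row keys of the form query + 0x00 + suffix. The helper name `prefix_scan_range` and the sample keys are hypothetical, and a sorted list stands in for the table. It relies on the start row being inclusive and the stop row being exclusive.

```python
from bisect import bisect_left

DELIM = b"\x00"  # assumed field delimiter
STOP = b"\x01"   # the byte value immediately after the delimiter

def prefix_scan_range(query: bytes):
    """Return (start_row, stop_row) covering every key 'query + DELIM + ...'."""
    return query + DELIM, query + STOP

# Hypothetical row keys, kept in byte order.
rows = sorted([
    b"foo\x009223372035540718648",
    b"foo bar\x009223372035540718511",
    b"foo\x009223372035540718700",
    b"fop\x009223372035540718000",
])

start, stop = prefix_scan_range(b"foo")
# Emulate a scan: start row inclusive, stop row exclusive.
matched = rows[bisect_left(rows, start):bisect_left(rows, stop)]
print(matched)
```

Only the two keys whose query field is exactly "foo" fall inside the range; "foo bar" is excluded because the byte after "foo" is 0x20, which is greater than 0x01.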

On 8/22/11 7:20 PM, Chris Tarnas wrote:
> That has worked well for me when I don't care about using printable characters.
>
> -chris

Re: Sorting question

Posted by Chris Tarnas <cf...@email.com>.
That has worked well for me when I don't care about using printable characters. 

-chris

On Aug 22, 2011, at 6:06 PM, Mark <st...@gmail.com> wrote:

> How about an empty byte (0x00)?

Re: Sorting question

Posted by Mark <st...@gmail.com>.
How about an empty byte (0x00)?

On 8/22/11 6:03 PM, Chris Tarnas wrote:
> Generally you want your delimiter to sort lower than any valid character. For normal character data I've found that tab (0x09) works well; it's pretty much the first option. Forward slash (0x2f) is less reliable, depending on what other non-alphanumeric characters are allowed.
>
> -chris

Re: Sorting question

Posted by Chris Tarnas <cf...@email.com>.
Generally you want your delimiter to sort lower than any valid character. For normal character data I've found that tab (0x09) works well; it's pretty much the first option. Forward slash (0x2f) is less reliable, depending on what other non-alphanumeric characters are allowed.

-chris
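
To illustrate the point, here is a small Python sketch that builds composite keys with three different delimiters and sorts them byte-wise. The field layout (index name, query, trailing value) and the helpers `make_key` and `scan_order` are assumptions for illustration, loosely modeled on the keys that appear elsewhere in this thread.

```python
def make_key(delim: bytes, *fields: bytes) -> bytes:
    """Join key fields with a single delimiter byte (hypothetical layout)."""
    return delim.join(fields)

queries = [b"foo bar", b"foo", b"foo  bar"]

def scan_order(delim: bytes) -> list:
    """Return the query field of each key, in byte-sorted key order."""
    keys = sorted(make_key(delim, b"idx_query", q, b"1") for q in queries)
    return [k.split(delim)[1] for k in keys]

for name, delim in [("slash 0x2f", b"/"), ("tab 0x09", b"\t"), ("null 0x00", b"\x00")]:
    print(name, scan_order(delim))
```

With slash, "foo" sorts last because space (0x20) is lower than the delimiter; with tab or a null byte, "foo" sorts first because the delimiter is lower than any character in the data.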



On Aug 22, 2011, at 5:04 PM, Mark wrote:

> I have another question though ;)
> 
> Is there a better separator I could use to accomplish natural sorting? Also what is the preferred way to use start and stop keys when scanning? For example: STARTROW => "foo", ENDROW => "foo#{what should go here?}".
> 
> Thanks

Re: Sorting question

Posted by Mark <st...@gmail.com>.
I have another question though ;)

Is there a better separator I could use to accomplish natural sorting? 
Also what is the preferred way to use start and stop keys when scanning? 
For example: STARTROW => "foo", ENDROW => "foo#{what should go here?}".

Thanks

On 8/22/11 4:59 PM, Mark wrote:
> After further investigation it turns out it is my use case.
>
> My keys are actually in the form of:
> "idx_query/foo bar/9223372035540718511"
> "idx_query/foo/9223372035540718648"
>
> Now that I look at it, it makes perfect sense why "foo bar" comes
> before "foo/".
>
> Sorry for the confusion.

Re: Sorting question

Posted by Mark <st...@gmail.com>.
After further investigation it turns out it is my use case.

My keys are actually in the form of:
"idx_query/foo bar/9223372035540718511"
"idx_query/foo/9223372035540718648"

Now that I look at it, it makes perfect sense why "foo bar" comes before
"foo/".

Sorry for the confusion.
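
The comparison can be checked by hand. A short Python sketch, using the two keys above and a hypothetical `first_difference` helper, finds the first position where they differ and the byte values at that position:

```python
def first_difference(a: bytes, b: bytes):
    """Return (index, byte_of_a, byte_of_b) at the first differing position,
    or None if one key is a prefix of the other."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    return None

key_a = b"idx_query/foo bar/9223372035540718511"
key_b = b"idx_query/foo/9223372035540718648"

diff = first_difference(key_a, key_b)
print(diff)           # index of the first mismatch and the two byte values
print(key_a < key_b)  # lexicographic comparison on byte values
```

The keys first differ right after "foo": one has a space (32) and the other has the slash delimiter (47), so the "foo bar" key sorts first.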

On 8/22/11 9:16 AM, Chris Tarnas wrote:
> Good point on the sorting issues with thrift - what client language are you using? Using perl I have not seen inconsistencies in ordering.
>
> Do your strings have any particular terminator that is being included but not seen in your output? Can you send out the rowkeys from scans in the HBase shell? That would help narrow it down.
>
> -chris

Re: Sorting question

Posted by Chris Tarnas <cf...@email.com>.
Good point on the sorting issues with thrift - what client language are you using? Using perl I have not seen inconsistencies in ordering.

Do your strings have any particular terminator that is being included but not seen in your output? Can you send out the rowkeys from scans in the HBase shell? That would help narrow it down.

-chris



On Aug 22, 2011, at 10:55 AM, Jesse Hutton wrote:

> I don't use the thrift API, but my suspicion is that it doesn't return
> results in the correct order. You're not the only one I've seen report
> strange things about results ordering recently, and IIRC they were using
> thrift as well.
> 
> Can you verify that the results sort the same using the Java API or even by
> looking at it in the HBase shell?
> 
> Jesse

Re: Sorting question

Posted by Jesse Hutton <je...@gmail.com>.
I don't use the thrift API, but my suspicion is that it doesn't return
results in the correct order. You're not the only one I've seen report
strange things about results ordering recently, and IIRC they were using
thrift as well.

Can you verify that the results sort the same using the Java API or even by
looking at it in the HBase shell?

Jesse

On Mon, Aug 22, 2011 at 11:28 AM, Mark <st...@gmail.com> wrote:

> I'm still confused about how "foo " is less than "foo". Aren't their
> respective bytes [102, 111, 111, 32] and [102, 111, 111]?

Re: Sorting question

Posted by Mark <st...@gmail.com>.
I'm still confused about how "foo " is less than "foo". Aren't their
respective bytes [102, 111, 111, 32] and [102, 111, 111]?
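
For reference, those byte values are right, and a plain lexicographic byte comparison puts the shorter key first when it is a prefix of the longer one. A quick Python check follows; Python's `bytes` comparison is used here only as a stand-in for a byte-wise row comparison, which is an assumption of this sketch.

```python
foo_space = "foo ".encode("ascii")
foo = "foo".encode("ascii")

print(list(foo_space))  # byte values of "foo "
print(list(foo))        # byte values of "foo"

# When one key is a strict prefix of the other, the shorter key sorts first.
print(foo < foo_space)
print(sorted([foo_space, foo]))
```

So two bare keys "foo" and "foo " on their own would not produce the order reported; something else in the stored key has to account for it.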

On 8/22/11 7:33 AM, Mark wrote:
> Is there any way around this to achieve natural ordering? Thanks

Re: Sorting question

Posted by Mark <st...@gmail.com>.
Is there any way around this to achieve natural ordering? Thanks

On 8/21/11 10:17 PM, Chris Tarnas wrote:
> HBase doesn't use localized sorting rules; it sorts on the byte value. Space is ASCII 32, a value less than the alphanumeric characters.
>
> -chris

Re: Sorting question

Posted by Chris Tarnas <cf...@tarnas.org>.
HBase doesn't use localized sorting rules; it sorts on the byte value. Space is ASCII 32, a value less than the alphanumeric characters.

-chris
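
As a rough illustration in Python (not HBase code), sorting encoded strings as raw bytes shows where the space character falls relative to digits, letters, and underscore. The sample strings are made up for the example.

```python
samples = ["foobar", "foo bar", "foo0", "fooBar", "foo_bar"]

# Encode to bytes and sort by raw byte value, with no locale rules involved.
by_bytes = sorted(s.encode("utf-8") for s in samples)

for key in by_bytes:
    # Show each key with the byte value that follows the shared "foo" prefix.
    print(key, key[3])
```

The key containing the space comes first, ahead of the digit, the uppercase letter, the underscore, and the lowercase letter, in that order.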

On Aug 21, 2011, at 8:11 PM, Mark <st...@gmail.com> wrote:

> FYI, I am using the openScannerWithPrefix Thrift API call.

Re: Sorting question

Posted by Mark <st...@gmail.com>.
FYI, I am using the openScannerWithPrefix Thrift API call.

On 8/21/11 6:47 PM, Mark wrote:
> Why when scanning do I see the following sort order?
>
> "foo  bar"
> "foo bar"
> "foo"
>
> I thought that "foo" would be sorted before "foo bar" since this is 
> natural ordering. Why am I seeing these results?