Posted to user@hbase.apache.org by "Pamecha, Abhishek" <ap...@x.com> on 2012/08/29 01:11:28 UTC

sorting by value

Hi

I probably know the usual answer, but are there any tricks for sorting by value in HBase? The only option I know of is to somehow embed the value in the key. The value is not a timestamp but an ordinary number.

I want to find, say, the top 10 values from a range of columns. The range could span millions of columns and is dynamic for the most part.

Any pointers?

Thanks,
Abhishek


Re: sorting by value

Posted by "Pamecha, Abhishek" <ap...@x.com>.
Thanks St.Ack and Tom. Yes, I came up with a similar scheme too: store the rank as part of the key. Where it broke down for me was k-dimensional data, say, where ranks are stored for dimension A but the query requires sorting by dimension B.

For now I have to settle for predetermined dimensions on which to sort, whereas my ideal requirement is to pick the top N of all points in an arbitrary hyperplane of a k-dimensional space.

Thanks
Abhishek
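
When the sort dimension is not known in advance, one fallback is to read the whole slice and keep only the best N candidates in memory. The sketch below is not HBase code: it assumes points arrive as plain tuples from some scan, that the "hyperplane" is axis-aligned (some coordinates held fixed), and the names top_n_in_slice, fixed and sort_dim are made up for illustration.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, ...]


def top_n_in_slice(points: Iterable[Point],
                   fixed: Dict[int, float],
                   sort_dim: int,
                   n: int) -> List[Point]:
    """Return the n points with the largest sort_dim coordinate.

    `fixed` maps a dimension index to the value it must have, which
    describes an axis-aligned slice (an assumption of this sketch).
    """
    # Lazily keep only the points that lie in the requested slice.
    in_slice = (p for p in points
                if all(p[d] == v for d, v in fixed.items()))
    # nlargest consumes the stream while holding at most n candidates.
    return heapq.nlargest(n, in_slice, key=lambda p: p[sort_dim])


points = [(1, 5, 2), (1, 9, 7), (2, 8, 1), (1, 3, 9)]
best_by_dim2 = top_n_in_slice(points, fixed={0: 1}, sort_dim=2, n=2)
best_by_dim1 = top_n_in_slice(points, fixed={0: 1}, sort_dim=1, n=2)
```

The trade-off is that every query reads the full slice, so this only makes sense where precomputing a sort order per dimension is not practical.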



On Aug 30, 2012, at 23:14, "Tom Brown" <to...@gmail.com> wrote:

> We do numerical sorting within some of our tables.  We put the numerical
> values as fixed length byte arrays within the keys (and flipped the sign
> bit so negative values are lexicographically lower than positive values).
> 
> Of course, it's still part of the key so that technique doesn't work for
> everyone, but it does allow us to do some limited sorting.
> 
> --Tom
> 
> On Thursday, August 30, 2012, Stack wrote:
> 
>> On Tue, Aug 28, 2012 at 4:11 PM, Pamecha, Abhishek <apamecha@x.com>
>> wrote:
>>> Hi
>>>
>>> I probably know the usual answer but are there any tricks to do some
>>> sort of sort by value in HBase. The only option I know is to somehow embed
>>> value in the key part. The value is not a timestamp but a normal number.
>>>
>>> I want to find out, say, top 10 from a range of columns. The range could
>>> be millions of columns. The range is dynamic for the most part.
>>>
>>> Any pointers?
>>>
>> 
>> 
>> HBase sorts by rows and then within a row by column family, column
>> qualifier, type, then timestamp.  You cannot have it natively sort
>> values for you, not unless you make the values be rows and/or columns
>> in another, separate table.
>> 
>> St.Ack
>> 

Re: sorting by value

Posted by "Pamecha, Abhishek" <ap...@x.com>.
Btw, I liked the sign-bit flipping for negative values. It didn't occur to me right away that they would be a problem.


Re: sorting by value

Posted by Tom Brown <to...@gmail.com>.
We do numerical sorting within some of our tables.  We put the numerical
values as fixed length byte arrays within the keys (and flipped the sign
bit so negative values are lexicographically lower than positive values).

Of course, it's still part of the key so that technique doesn't work for
everyone, but it does allow us to do some limited sorting.

--Tom
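
A minimal sketch of this kind of encoding, in Python rather than against the HBase client API. It assumes signed 64-bit values stored as 8 big-endian bytes; the width and the names encode_sortable and decode_sortable are illustrative and not taken from the post.

```python
import struct

SIGN_BIT = 1 << 63          # top bit of a 64-bit two's-complement value
MASK_64 = (1 << 64) - 1


def encode_sortable(value: int) -> bytes:
    """Encode a signed 64-bit int so that byte order matches numeric order."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit integer")
    # Reinterpret as unsigned two's complement, then flip the sign bit so
    # negatives (top bit 1) sort below non-negatives (top bit 0).
    unsigned = (value & MASK_64) ^ SIGN_BIT
    return struct.pack(">Q", unsigned)   # fixed length, big-endian


def decode_sortable(key: bytes) -> int:
    """Invert encode_sortable."""
    (unsigned,) = struct.unpack(">Q", key)
    unsigned ^= SIGN_BIT
    # Map the unsigned bit pattern back to a signed value.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned


values = [5, -3, 0, -100, 42]
by_bytes = sorted(values, key=encode_sortable)   # compare encoded bytes only
```

Without the flip, the two's-complement bytes of a negative number start with a 1 bit and would sort after every positive number.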


Re: sorting by value

Posted by Stack <st...@duboce.net>.
On Tue, Aug 28, 2012 at 4:11 PM, Pamecha, Abhishek <ap...@x.com> wrote:
> Hi
>
> I probably know the usual answer but are there any tricks to do some sort of sort by value in HBase. The only option I know is to somehow embed value in the key part. The value is not a timestamp but a normal number.
>
> I want to find out, say, top 10 from a range of columns. The range could be millions of columns. The range is dynamic for the most part.
>
> Any pointers?
>


HBase sorts by rows and then within a row by column family, column
qualifier, type, then timestamp.  You cannot have it natively sort
values for you, not unless you make the values be rows and/or columns
in another, separate table.

St.Ack
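
To make the separate-table idea concrete, here is a small in-memory sketch. It assumes a second "index" table whose row key is the encoded value followed by the original column id, with the value inverted so the largest comes first. The IndexTable class and descending_key function are hypothetical stand-ins that only mimic byte-ordered rows; they are not HBase API.

```python
import struct
from bisect import insort

MAX_U64 = (1 << 64) - 1
SIGN_BIT = 1 << 63


def descending_key(value: int, column_id: bytes) -> bytes:
    """Index row key: larger values produce smaller keys, so they sort first."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit integer")
    ascending = (value & MAX_U64) ^ SIGN_BIT   # sign-flipped, sorts ascending
    inverted = MAX_U64 - ascending             # reverse the order
    # Appending the column id keeps rows with equal values distinct.
    return struct.pack(">Q", inverted) + column_id


class IndexTable:
    """In-memory stand-in for a table that keeps rows in byte order."""

    def __init__(self):
        self._rows = []   # sorted list of (row_key, column_id, value)

    def put(self, column_id: bytes, value: int) -> None:
        insort(self._rows, (descending_key(value, column_id), column_id, value))

    def top_n(self, n: int):
        # Reading the first n rows plays the role of a short scan.
        return [(col, val) for _, col, val in self._rows[:n]]


index = IndexTable()
for col, val in [(b"c1", 10), (b"c2", -5), (b"c3", 99), (b"c4", 10)]:
    index.put(col, val)
top3 = index.top_n(3)
```

In a real deployment the index would have to be kept in step with the main table: when a value changes, the old index row has to be deleted as well as the new one written.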