Posted to user@hbase.apache.org by Jonathan Bishop <jb...@gmail.com> on 2012/11/05 19:41:31 UTC

Using doubles and longs as ordering row values

Hi,

In my application my row values are doubles. I would like my scans to
traverse the rows in order of increasing values.

But if I simply use

double d  = ....
byte[] row = Bytes.toBytes(d);

I will get an ordering which is based on the byte values of doubles, not on
the value of the doubles themselves.

I see also that integer values have the same issue due to the first bit
being the sign bit. So negative values will come after positive values.

Any suggestions?

Thanks,

Jon
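
To make the problem concrete, here is a small sketch in plain Java with no HBase dependency. The class and method names are hypothetical. rawBytes is assumed to mirror what Bytes.toBytes(double) produces (the big-endian IEEE 754 bits), and compareUnsigned stands in for the unsigned lexicographic comparison HBase applies to row keys.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch only: RawDoubleOrderDemo, rawBytes and compareUnsigned are hypothetical names.
class RawDoubleOrderDemo {

    // Big-endian IEEE 754 bits of d (assumed equivalent to Bytes.toBytes(double)).
    static byte[] rawBytes(double d) {
        return ByteBuffer.allocate(8).putLong(Double.doubleToRawLongBits(d)).array();
    }

    // Unsigned, byte-by-byte lexicographic comparison.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        double[] values = {3.5, -2.0, 0.25, -7.0};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = rawBytes(values[i]);
        }
        // Sort the keys the way a scan would see them.
        Arrays.sort(keys, RawDoubleOrderDemo::compareUnsigned);
        for (byte[] key : keys) {
            System.out.println(ByteBuffer.wrap(key).getDouble());
        }
    }
}
```

With these inputs the non-negative values come out in numeric order, but the negative values land after them and in reverse numeric order, because the sign bit is the most significant bit of the first byte.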

Re: Using doubles and longs as ordering row values

Posted by anil gupta <an...@gmail.com>.
Hi Jonathan,

If possible, try to avoid using Double for row keys, because Double has
precision problems: many decimal fractions cannot be represented exactly in
binary floating point.
Here are more details about the problems with doubles:
1. http://stackoverflow.com/questions/12699040/double-vs-long-serialization-in-java
2. http://stackoverflow.com/questions/179427/how-to-resolve-a-java-rounding-double-issue

You might get wrong results due to these problems with Double. You can use two
longs or ints to store the decimal value as the row key.

HTH,
Anil Gupta
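
One possible reading of the "two longs or ints" suggestion, sketched in plain Java: split an exact decimal into a whole part (a long) and a fixed-width fractional part (an int), then write both big-endian. Everything here is an assumption made for illustration: the class name, the choice of six fractional digits, the use of BigDecimal as input, and the sign-bit flip on the whole part so that negative values sort first.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch only: DecimalKeySketch and FRACTION_DIGITS are hypothetical.
class DecimalKeySketch {

    // Assumed fixed precision; values with more fractional digits are rejected.
    static final int FRACTION_DIGITS = 6;

    static byte[] encode(BigDecimal value) {
        // Floor, not truncate, so the fraction is always in [0, 1),
        // even for negative values (-1.5 becomes -2 plus 0.5).
        BigDecimal whole = value.setScale(0, RoundingMode.FLOOR);
        BigDecimal fraction = value.subtract(whole);

        long wholePart = whole.longValueExact();
        // Throws ArithmeticException if digits beyond FRACTION_DIGITS would be lost.
        int fractionPart = fraction.movePointRight(FRACTION_DIGITS).intValueExact();

        return ByteBuffer.allocate(12)
                .putLong(wholePart ^ Long.MIN_VALUE) // flip sign bit: negatives sort first
                .putInt(fractionPart)                // non-negative, so big-endian sorts as-is
                .array();
    }

    public static void main(String[] args) {
        byte[] a = encode(new BigDecimal("-1.5"));
        byte[] b = encode(new BigDecimal("12.34"));
        // Arrays.compareUnsigned needs Java 9 or later.
        System.out.println(Arrays.compareUnsigned(a, b) < 0);
    }
}
```

The key is 12 bytes wide for every value, which keeps the byte-wise comparison aligned. The trade-off is a fixed precision and range chosen up front.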




On Mon, Nov 5, 2012 at 11:38 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> On Mon, Nov 5, 2012 at 10:41 AM, Jonathan Bishop <jb...@gmail.com>
> wrote:
> > Hi,
> >
> > In my application my row values are doubles. I would like my scans to
> > traverse the rows in order of increasing values.
> >
> > But if I simply use
> >
> > double d  = ....
> > byte[] row = Bytes.toBytes(d);
> >
> > I will get an ordering which is based on the byte values of doubles, not on
> > the value of the doubles themselves.
>
> It's the same ordering as long as you don't use negative values.
>
> >
> > I see also that integer values have the same issue due to the first bit
> > being the sign bit. So negative values will come after positive values.
> >
> > Any suggestions?
>
> Don't use row keys that can be negative? :)
>
> Also don't use a single number as a row key, see
> http://hbase.apache.org/book.html#rowkey.design
>
> J-D
>



-- 
Thanks & Regards,
Anil Gupta

Re: Using doubles and longs as ordering row values

Posted by Jean-Daniel Cryans <jd...@apache.org>.
On Mon, Nov 5, 2012 at 10:41 AM, Jonathan Bishop <jb...@gmail.com> wrote:
> Hi,
>
> In my application my row values are doubles. I would like my scans to
> traverse the rows in order of increasing values.
>
> But if I simply use
>
> double d  = ....
> byte[] row = Bytes.toBytes(d);
>
> I will get an ordering which is based on the byte values of doubles, not on
> the value of the doubles themselves.

It's the same ordering as long as you don't use negative values.

>
> I see also that integer values have the same issue due to the first bit
> being the sign bit. So negative values will come after positive values.
>
> Any suggestions?

Don't use row keys that can be negative? :)

Also don't use a single number as a row key, see
http://hbase.apache.org/book.html#rowkey.design

J-D

Re: Using doubles and longs as ordering row values

Posted by Nick Dimiduk <nd...@gmail.com>.
On Thu, Nov 29, 2012 at 3:00 PM, David Koch <og...@googlemail.com> wrote:

> I am having a similar issue, only I need to preserve the order of
> qualifiers which are serialized signed longs - rather than row keys.


Orderly is not rowkey specific. You can use its serialization anywhere.

-n
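
The underlying idea is indeed independent of where the bytes end up. As a minimal sketch in plain Java (this is not orderly's API; the class and method names are hypothetical): flipping the sign bit of a signed long before writing it big-endian yields bytes whose unsigned order matches the numeric order. The TreeMap below only stands in for the byte-wise sorting of qualifiers within a row.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.TreeMap;

// Sketch only: SortableLong is a hypothetical helper, not part of orderly or HBase.
class SortableLong {

    // Flip the sign bit so negative values get the smaller unsigned byte patterns.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    // Reverse the flip to recover the original value.
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        // Unsigned byte-wise ordering (Arrays.compareUnsigned needs Java 9 or later).
        TreeMap<byte[], String> qualifiers =
                new TreeMap<byte[], String>((a, b) -> Arrays.compareUnsigned(a, b));
        for (long q : new long[] {42L, -7L, 0L, Long.MIN_VALUE, Long.MAX_VALUE}) {
            qualifiers.put(encode(q), "value for " + q);
        }
        // Iterates from Long.MIN_VALUE up to Long.MAX_VALUE.
        for (byte[] key : qualifiers.keySet()) {
            System.out.println(decode(key));
        }
    }
}
```

The encoding is fixed-width and reversible, so the same bytes can serve as a row key, a row key component, or a column qualifier.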

Re: Using doubles and longs as ordering row values

Posted by Marcos Ortiz <ml...@uci.cu>.
On 12/01/2012 10:01 AM, David Koch wrote:
> Hello Lars,
>
> Thank you. Where can I find the lily library? I can't find it on github or
> Google.
http://www.lilyproject.org/lily/index.html
>
> /David
>
>
> On Fri, Nov 30, 2012 at 3:54 AM, lars hofhansl <lh...@yahoo.com> wrote:
>
>> As I said, look at the lily library they have solved exactly that problem,
>> I've used that before.
>>
>> It has encodings for Ints/Longs/Floats/Doubles/BigDecimals, to encode them
>> such the byte array will sort according to the magnitude of the value which
>> includes the sign and the floating point exponent.
>>
>> It's a very common problem :)
>>
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: David Koch <og...@googlemail.com>
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, November 29, 2012 3:00 PM
>> Subject: Re: Using doubles and longs as ordering row values
>>
>> Hello,
>>
>> I am having a similar issue, only I need to preserve the order of
>> qualifiers which are serialized signed longs - rather than row keys. The
>> latter is addressed by the orderly library which was mentioned above. Can
>> this library be re-used for my purpose? I imagine this is not an exotic
>> requirement so I am also interested in knowing how other people have solved
>> this problem.
>>
>> Thank you,
>>
>> /David
>>
>>
>> On Tue, Nov 6, 2012 at 6:07 AM, Jonathan Bishop <jbishop.rwc@gmail.com> wrote:
>>> Thanks Dave,
>>>
>>> That looks like what I need.
>>>
>>> Jon
>>>
>>>
>>> On Mon, Nov 5, 2012 at 4:27 PM, Dave Latham <la...@davelink.net> wrote:
>>>
>>>> This fork looks a bit more up to date:
>>>> https://github.com/ndimiduk/orderly
>>>>
>>>> On Mon, Nov 5, 2012 at 4:26 PM, Dave Latham <la...@davelink.net> wrote:
>>>>> Here's a project to deal with this issue specifically.  I'm not sure of
>>>>> its status:
>>>>> https://github.com/conikeec/orderly
>>>>>
>>>>>
>>>>> On Mon, Nov 5, 2012 at 4:01 PM, lars hofhansl <lh...@yahoo.com> wrote:
>>>>>> Have a look at the lily library. It has code to encode Longs/Doubles into
>>>>>> bytes such that resulting bytes sort as expected based on the numbers.
>>>>>> -- Lars
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci

-- 

Marcos Luis Ortíz Valmaseda
about.me/marcosortiz <http://about.me/marcosortiz>
@marcosluis2186 <http://twitter.com/marcosluis2186>




Re: Using doubles and longs as ordering row values

Posted by David Koch <og...@googlemail.com>.
Hello Lars,

Thank you. Where can I find the lily library? I can't find it on github or
Google.

/David



Re: Using doubles and longs as ordering row values

Posted by lars hofhansl <lh...@yahoo.com>.
As I said, look at the lily library; they have solved exactly that problem, and I've used it before.

It has encodings for Ints/Longs/Floats/Doubles/BigDecimals that encode them such that the byte array sorts according to the value, taking into account the sign and the floating-point exponent.

It's a very common problem :)


-- Lars
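
For illustration, a common way to get such an encoding for doubles, sketched in plain Java. This is not lily's or orderly's actual format, and the class and method names are hypothetical. The trick relies on the IEEE 754 layout (sign, then exponent, then mantissa): for non-negative values only the sign bit is flipped, and for negative values all bits are inverted so that larger magnitudes sort earlier.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch only: SortableDouble is a hypothetical helper, not lily's or orderly's encoding.
class SortableDouble {

    static byte[] encode(double d) {
        long bits = Double.doubleToLongBits(d);
        // Negative (sign bit set): invert every bit, reversing the order of magnitudes.
        // Non-negative: set the sign bit so these sort after all negatives.
        long sortable = (bits < 0) ? ~bits : (bits ^ Long.MIN_VALUE);
        return ByteBuffer.allocate(Long.BYTES).putLong(sortable).array();
    }

    static double decode(byte[] key) {
        long sortable = ByteBuffer.wrap(key).getLong();
        // High bit set means the original value was non-negative.
        long bits = (sortable < 0) ? (sortable ^ Long.MIN_VALUE) : ~sortable;
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double[] values = {3.5, -2.0, 0.25, -7.0, 0.0};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = encode(values[i]);
        }
        // Unsigned byte-wise sort (Arrays.compareUnsigned needs Java 9 or later).
        Arrays.sort(keys, (a, b) -> Arrays.compareUnsigned(a, b));
        for (byte[] key : keys) {
            System.out.println(decode(key));
        }
    }
}
```

Under this scheme -0.0 sorts immediately before 0.0, and NaN (as canonicalized by Double.doubleToLongBits) sorts after positive infinity; whether that is acceptable depends on the application.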





Re: Using doubles and longs as ordering row values

Posted by David Koch <og...@googlemail.com>.
Hello,

I am having a similar issue, only I need to preserve the order of
qualifiers which are serialized signed longs - rather than row keys. The
latter is addressed by the orderly library which was mentioned above. Can
this library be re-used for my purpose? I imagine this is not an exotic
requirement so I am also interested in knowing how other people have solved
this problem.

Thank you,

/David



Re: Using doubles and longs as ordering row values

Posted by Jonathan Bishop <jb...@gmail.com>.
Thanks Dave,

That looks like what I need.

Jon



Re: Using doubles and longs as ordering row values

Posted by Nick Dimiduk <nd...@gmail.com>.
I've used orderly a little; it works pretty well. There are some edge
cases, particularly around null values. I'm not sure of the upstream status
either. Check my git log; I haven't done much to this library. It is
generally quite useful, so I don't mind maintaining it if you have patches
for bug fixes.

-n


Re: Using doubles and longs as ordering row values

Posted by Dave Latham <la...@davelink.net>.
This fork looks a bit more up to date:
https://github.com/ndimiduk/orderly


Re: Using doubles and longs as ordering row values

Posted by Dave Latham <la...@davelink.net>.
Here's a project to deal with this issue specifically.  I'm not sure of
its status:
https://github.com/conikeec/orderly


Re: Using doubles and longs as ordering row values

Posted by lars hofhansl <lh...@yahoo.com>.
Have a look at the lily library. It has code to encode Longs/Doubles into bytes such that the resulting bytes sort as expected based on the numbers.

-- Lars


