Posted to user@phoenix.apache.org by Lavrenty Eskin <la...@NetCracker.com> on 2015/01/08 09:44:07 UTC

Numbers low-level format in Phoenix

Hello all,
I'm surprised that Phoenix does not store numbers in the HBase 'Byte' format. That looks like a lot of overhead, doesn't it?
Take the value 1234567890 (0x499602D2), for example:
Phoenix stores it as the string '\x00\x00\x00\x00I\x96\x02\xD2'.
Why can't it store it in the HBase format, as value=[B@499602d2?

Another question: why are the wrong bytes written when you write from the HBase shell?
Bytes.toBytes(1234567890)       -->>    value=[B@13217cf6,
Bytes.toBytes(1234567890L)      -->>    value=[B@3caab4f


________________________________
The information transmitted herein is intended only for the person or entity to which it is addressed and may contain confidential, proprietary and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.

Re: Numbers low-level format in Phoenix

Posted by Gabriel Reid <ga...@gmail.com>.
Hi Lavrenty,

I assume that a JRuby script that makes use of the HBase API will
work, but you'd have to try that out to be sure.

It will probably be less work to just work via the JDBC API provided
by Phoenix to write data. This should also be possible via JRuby, and
that way you won't have to worry about accidentally writing data that
isn't encoded correctly.

- Gabriel


On Mon, Jan 12, 2015 at 8:22 AM, Lavrenty Eskin
<la...@netcracker.com> wrote:
> Hi Gabriel,
>
> Thanks for explaining how the HBase shell works. What do you think: if I write a JRuby script with byte array support, will that work, or will it convert the values into strings again?
> By the way, JRuby maps java.lang.Long and java.math.BigInteger to its own types, Fixnum and Bignum respectively. You can check it this way:
>
> Hbase shell
>> import java.lang.Long
>> ll = Long.valueOf(1234567890)
> => 1234567890
>> ll.to_byte_array
> NoMethodError: undefined method 'to_byte_array' for 1234567890:Fixnum
>

RE: Numbers low-level format in Phoenix

Posted by Lavrenty Eskin <la...@NetCracker.com>.
Hi Gabriel,

Thanks for explaining how the HBase shell works. What do you think: if I write a JRuby script with byte array support, will that work, or will it convert the values into strings again?
By the way, JRuby maps java.lang.Long and java.math.BigInteger to its own types, Fixnum and Bignum respectively. You can check it this way:

Hbase shell
> import java.lang.Long
> ll = Long.valueOf(1234567890)
=> 1234567890
> ll.to_byte_array
NoMethodError: undefined method 'to_byte_array' for 1234567890:Fixnum


-----Original Message-----
From: Gabriel Reid [mailto:gabriel.reid@gmail.com]
Sent: Thursday, January 08, 2015 7:30 PM
To: user@phoenix.apache.org
Subject: Re: Numbers low-level format in Phoenix

Hi Lavrenty,

Thanks for posting that hbase shell session info.

You're currently inserting strings like "[B@1e0f477f" into HBase. It's actually not possible to insert binary (byte array) values into HBase using the HBase shell like you're doing. Internally, the HBase shell calls the equivalent to toString().getBytes() on any value that you write to HBase.

If you want to write binary-encoded data directly to HBase (without using Phoenix) you'll need to use the Java API, and not the HBase shell, or at least not the typical built-in shell commands.

- Gabriel



Re: Numbers low-level format in Phoenix

Posted by Gabriel Reid <ga...@gmail.com>.
Hi Lavrenty,

Thanks for posting that hbase shell session info.

You're currently inserting strings like "[B@1e0f477f" into HBase. It's
actually not possible to insert binary (byte array) values into HBase
using the HBase shell like you're doing. Internally, the HBase shell
calls the equivalent to toString().getBytes() on any value that you
write to HBase.

If you want to write binary-encoded data directly to HBase (without
using Phoenix) you'll need to use the Java API, and not the HBase
shell, or at least not the typical built-in shell commands.

- Gabriel
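
To make the point above concrete, here is a minimal Java sketch that runs without a cluster. It assumes java.nio.ByteBuffer as a stand-in for HBase's Bytes.toBytes(long) (both give 8 big-endian bytes). The class name ShellStringification and the helpers encodeLong and stringified are hypothetical names for illustration, and stringified only imitates the toString().getBytes() behaviour described in this message; it is not the shell's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ShellStringification {
    // Stand-in for HBase's Bytes.toBytes(long): 8 bytes, big-endian.
    public static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Imitates the behaviour described above: the value is turned into
    // text with toString() first, and the bytes of that text are stored.
    public static byte[] stringified(Object value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] binary = encodeLong(1234567890L);
        byte[] asText = stringified(binary);

        // The real encoding is always 8 bytes long.
        System.out.println(binary.length);

        // The stringified form is text such as [B@1e0f477f. The hex part is
        // an identity hash, so it differs between runs and carries no
        // information about the number.
        System.out.println(new String(asText, StandardCharsets.UTF_8));
    }
}
```

The first array is the same 8 bytes that the Phoenix-written rows in the scan show; the second is what ends up in the cell when a byte array goes through toString() before being written.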


On Thu, Jan 8, 2015 at 1:45 PM, Lavrenty Eskin
<la...@netcracker.com> wrote:
> Here is a full table scan from the HBase shell. The table was created from Phoenix. The first two values (UL = Unsigned Long) were upserted from Phoenix via the JDBC driver (each upsert generates two KV pairs).
>
> The other KV pairs were inserted into the same table via the HBase shell, like this:
>
> put 'TEST_EVENTS', '913912383771315636525190071419507789000', 'ref:type_id', Bytes.toBytes(Long.parseLong("1234567890", 10)), 1419507789742
> Each conversion attempt is described in its row key. None of them can be read correctly from Phoenix afterwards; it returns unexpected numbers instead of 1234567890.
>
> hbase(main):026:0> scan 't3_lavr'
> ROW                                             COLUMN+CELL
>  key-PHOENIX-UL-1234567890                      column=ref:_0, timestamp=1419935384987, value=
>  key-PHOENIX-UL-1234567890                      column=ref:type_id, timestamp=1419935384987, value=\x00\x00\x00\x00I\x96\x02\xD2
>
>  key-PHOENIX-UL-1234567891                      column=ref:_0, timestamp=1419935641902, value=
>  key-PHOENIX-UL-1234567891                      column=ref:type_id, timestamp=1419935641902, value=\x00\x00\x00\x00I\x96\x02\xD3
>
>  key-hbase-Bytes.fromHex(0x00000000499602D2)    column=ref:type_id, timestamp=1419507789742, value=[B@6b3c7a2c
>  key-hbase-Bytes.fromHex(0x499602D2)            column=ref:type_id, timestamp=1419507789742, value=[B@5385550c
>  key-hbase-Bytes.fromHex(0x80000000499602D2)    column=ref:type_id, timestamp=1419507789742, value=[B@2c9a49b3
>  key-hbase-Bytes.fromHex(499602D2)              column=ref:type_id, timestamp=1419507789742, value=[B@1e0f477f
>  key-hbase-Bytes.fromHex(D2029649)              column=ref:type_id, timestamp=1419507789742, value=[B@722bfe37
>  key-hbase-Bytes.toStringBinary(Bytes.toBytes(l column=ref:type_id, timestamp=1419507789742, value=\x5Cx00\x5Cx00\x5Cx00\x5Cx00I\x5Cx96\x5Cx02\x5CxD2
>  on))
>  key-hbase-Bytes.vintToBytes(bi)                column=ref:type_id, timestamp=1419507789742, value=[B@16a0c14b
>  key-hbase-Bytes.vintToBytes(lon)               column=ref:type_id, timestamp=1419507789742, value=[B@4b2d49e2
>  key-hbase-[B@499602D2                          column=ref:type_id, timestamp=1419507789742, value=[B@499602D2
>  key-hbase-bytes.tobytes(1234567890)            column=ref:type_id, timestamp=1419507789742, value=[B@337f23d0
>  key-hbase-bytes.tobytes(long)                  column=ref:type_id, timestamp=1419507789742, value=[B@263da67d
>  key-hbase-bytes.tobytes(long.valueOf)          column=ref:type_id, timestamp=1419507789742, value=[B@73e8c7d7
>  key-hbase-bytes.tobytes(long.valueOf10)        column=ref:type_id, timestamp=1419507789742, value=[B@4778d705
>  key-hbase-bytes.tobytes(long.valueOf16)        column=ref:type_id, timestamp=1419507789742, value=[B@5a83fb14

RE: Numbers low-level format in Phoenix

Posted by Lavrenty Eskin <la...@NetCracker.com>.
Here is a full table scan from the HBase shell. The table was created from Phoenix. The first two values (UL = Unsigned Long) were upserted from Phoenix via the JDBC driver (each upsert generates two KV pairs).

The other KV pairs were inserted into the same table via the HBase shell, like this:

put 'TEST_EVENTS', '913912383771315636525190071419507789000', 'ref:type_id', Bytes.toBytes(Long.parseLong("1234567890", 10)), 1419507789742
Each conversion attempt is described in its row key. None of them can be read correctly from Phoenix afterwards; it returns unexpected numbers instead of 1234567890.

hbase(main):026:0> scan 't3_lavr'
ROW                                             COLUMN+CELL
 key-PHOENIX-UL-1234567890                      column=ref:_0, timestamp=1419935384987, value=
 key-PHOENIX-UL-1234567890                      column=ref:type_id, timestamp=1419935384987, value=\x00\x00\x00\x00I\x96\x02\xD2

 key-PHOENIX-UL-1234567891                      column=ref:_0, timestamp=1419935641902, value=
 key-PHOENIX-UL-1234567891                      column=ref:type_id, timestamp=1419935641902, value=\x00\x00\x00\x00I\x96\x02\xD3

 key-hbase-Bytes.fromHex(0x00000000499602D2)    column=ref:type_id, timestamp=1419507789742, value=[B@6b3c7a2c
 key-hbase-Bytes.fromHex(0x499602D2)            column=ref:type_id, timestamp=1419507789742, value=[B@5385550c
 key-hbase-Bytes.fromHex(0x80000000499602D2)    column=ref:type_id, timestamp=1419507789742, value=[B@2c9a49b3
 key-hbase-Bytes.fromHex(499602D2)              column=ref:type_id, timestamp=1419507789742, value=[B@1e0f477f
 key-hbase-Bytes.fromHex(D2029649)              column=ref:type_id, timestamp=1419507789742, value=[B@722bfe37
 key-hbase-Bytes.toStringBinary(Bytes.toBytes(l column=ref:type_id, timestamp=1419507789742, value=\x5Cx00\x5Cx00\x5Cx00\x5Cx00I\x5Cx96\x5Cx02\x5CxD2
 on))
 key-hbase-Bytes.vintToBytes(bi)                column=ref:type_id, timestamp=1419507789742, value=[B@16a0c14b
 key-hbase-Bytes.vintToBytes(lon)               column=ref:type_id, timestamp=1419507789742, value=[B@4b2d49e2
 key-hbase-[B@499602D2                          column=ref:type_id, timestamp=1419507789742, value=[B@499602D2
 key-hbase-bytes.tobytes(1234567890)            column=ref:type_id, timestamp=1419507789742, value=[B@337f23d0
 key-hbase-bytes.tobytes(long)                  column=ref:type_id, timestamp=1419507789742, value=[B@263da67d
 key-hbase-bytes.tobytes(long.valueOf)          column=ref:type_id, timestamp=1419507789742, value=[B@73e8c7d7
 key-hbase-bytes.tobytes(long.valueOf10)        column=ref:type_id, timestamp=1419507789742, value=[B@4778d705
 key-hbase-bytes.tobytes(long.valueOf16)        column=ref:type_id, timestamp=1419507789742, value=[B@5a83fb14
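
The row whose value shows up as \x5Cx00\x5Cx00... can be explained with a short Java sketch. It assumes a simplified display rule (printable ASCII other than the backslash is printed as-is, every other byte as \xHH) that imitates how the shell prints cell values. The class name ScanRendering and the method render are hypothetical, and java.nio.ByteBuffer stands in for Bytes.toBytes(long).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ScanRendering {
    // Simplified, assumed display rule: printable ASCII (except the
    // backslash) is shown unchanged, any other byte as \xHH.
    public static String render(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            int ch = b & 0xFF;
            if (ch >= ' ' && ch <= '~' && ch != '\\') {
                out.append((char) ch);
            } else {
                out.append(String.format("\\x%02X", ch));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The 8 binary bytes of the number, as in the Phoenix-written rows.
        byte[] binary = ByteBuffer.allocate(Long.BYTES).putLong(1234567890L).array();

        // Rendering the binary value once gives the escaped text.
        String once = render(binary);
        System.out.println(once);   // \x00\x00\x00\x00I\x96\x02\xD2

        // Storing that text as the cell value and rendering it again
        // escapes each backslash (0x5C) a second time.
        String twice = render(once.getBytes(StandardCharsets.US_ASCII));
        System.out.println(twice);  // \x5Cx00\x5Cx00\x5Cx00\x5Cx00I\x5Cx96\x5Cx02\x5CxD2
    }
}
```

Under that assumption, the Phoenix rows hold the 8 binary bytes, while the toStringBinary row holds the escaped text itself, which is why its backslashes appear as \x5C in the scan.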


-----Original Message-----
From: Gabriel Reid [mailto:gabriel.reid@gmail.com]
Sent: Thursday, January 08, 2015 3:13 PM
To: user@phoenix.apache.org
Subject: Re: Numbers low-level format in Phoenix

It sounds like you might be storing the toString() representation of a byte array in HBase.

Could you post an example snippet of the code you're using to store things in HBase, as well as a snippet of how you're reading this data in the HBase shell (or wherever you're reading it)?


Re: Numbers low-level format in Phoenix

Posted by Gabriel Reid <ga...@gmail.com>.
It sounds like you might be storing the toString() representation of a
byte array in HBase, rather than the bytes themselves.

Could you post an example snippet of the code you're using to store
things in HBase, as well as a snippet of how you're reading this data
in the HBase shell (or wherever you're reading it)?
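
For illustration, here is a minimal Java sketch of how that kind of mix-up can happen. It uses only the JDK: ByteBuffer stands in for HBase's Bytes.toBytes(long), which also yields 8 big-endian bytes. The class and method names are made up for this example, so treat it as a sketch rather than as what your code is actually doing.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ToStringPitfall {
    // Stand-in for HBase's Bytes.toBytes(long): 8 bytes, big-endian.
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // The suspected mistake: storing the text of byte[].toString()
    // instead of the bytes themselves.
    static byte[] accidentalEncoding(byte[] raw) {
        // raw.toString() is "[B@" plus an identity hash in hex; it says
        // nothing about the contents of the array.
        return raw.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = encodeLong(1234567890L);
        System.out.println(raw.length);      // 8
        System.out.println(raw.toString());  // something like [B@13217cf6, varies per run
        System.out.println(accidentalEncoding(raw).length);
    }
}
```

If the second form ends up in a cell, a reader that expects a serialized number will interpret the ASCII characters of "[B@..." as numeric bytes and come back with an unrelated value.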

On Thu, Jan 8, 2015 at 12:44 PM, Lavrenty Eskin
<la...@netcracker.com> wrote:
> Hi Gabriel,
>
> But why, then, do I get two different string representations of the byte array in the HBase shell?
> For byte arrays stored from Phoenix I get \x00\x00\x00\x00I\x96\x02\xD2, and for those stored from HBase I get [B@13217cf6.
> At the same time, Phoenix misinterprets "[B@13217cf6" and returns a value like -323837278362736236786 instead of 1234567890.
> I need to understand how to store values via the HBase API so that Phoenix then reads them correctly.
>
> Thanks

RE: Numbers low-level format in Phoenix

Posted by Lavrenty Eskin <la...@NetCracker.com>.
Hi Gabriel,

But why, then, do I get two different string representations of the byte array in the HBase shell?
For byte arrays stored from Phoenix I get \x00\x00\x00\x00I\x96\x02\xD2, and for those stored from HBase I get [B@13217cf6.
At the same time, Phoenix misinterprets "[B@13217cf6" and returns a value like -323837278362736236786 instead of 1234567890.
I need to understand how to store values via the HBase API so that Phoenix then reads them correctly.

Thanks

Re: Numbers low-level format in Phoenix

Posted by Gabriel Reid <ga...@gmail.com>.
Hi Lavrenty,

Phoenix actually does store numerical data using byte arrays, in a
similar fashion to what the HBase Bytes class does. There's more
information on the various types and their underlying encoding
available here: http://phoenix.apache.org/language/datatypes.html
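
As a rough sketch of what that page describes: the unsigned types (for example UNSIGNED_LONG) are documented as matching Bytes.toBytes(long), while the signed types (for example BIGINT) are documented as the same 8 bytes with the sign bit flipped, so that negative values sort before positive ones. The JDK-only Java below imitates both layouts. The class and method names are invented for illustration, and this is not Phoenix's own code.

```java
import java.nio.ByteBuffer;

class PhoenixLongEncodingSketch {
    // Same layout as HBase's Bytes.toBytes(long): 8 bytes, big-endian.
    // Assumed to correspond to the documented UNSIGNED_LONG encoding.
    static byte[] unsignedLongBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Assumed to correspond to the documented BIGINT encoding:
    // the same 8 bytes with the sign bit flipped.
    static byte[] bigintBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    // Plain hex dump, two uppercase digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(unsignedLongBytes(1234567890L))); // 00000000499602D2
        System.out.println(hex(bigintBytes(1234567890L)));       // 80000000499602D2
    }
}
```

If that reading of the page is right, a value written through the plain HBase API with Bytes.toBytes(long) would line up with an UNSIGNED_LONG column but not with a BIGINT one.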

I'm guessing you got the string representation
("\x00\x00\x00\x00I\x96\x02\xD2") from the HBase shell -- this is a
string representation of the 8-byte array holding the serialized
value of 1234567890. The strings you posted like "[B@13217cf6" are
the default string representation of byte arrays in Java. To convert
these to a human-readable value (like what the HBase shell does), you
could do the following:

    Bytes.toStringBinary(Bytes.toBytes(1234567890L));
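
To show roughly what that rendering does without needing HBase on the classpath, here is a JDK-only approximation. It prints printable ASCII bytes as themselves and everything else as \xNN. The class name is made up, and the real Bytes.toStringBinary may differ in exactly which characters it prints literally, so this is a sketch only.

```java
import java.nio.ByteBuffer;

class ToStringBinarySketch {
    // Approximation of Bytes.toStringBinary: printable ASCII is kept,
    // every other byte (and the backslash itself) becomes \xNN.
    static String toStringBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int ch = b & 0xFF;
            if (ch >= ' ' && ch <= '~' && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for Bytes.toBytes(1234567890L): 8 bytes, big-endian.
        byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(1234567890L).array();
        System.out.println(toStringBinary(raw)); // \x00\x00\x00\x00I\x96\x02\xD2
    }
}
```

The "I" in the middle is simply byte 0x49, which happens to be a printable character.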

- Gabriel

On Thu, Jan 8, 2015 at 9:44 AM, Lavrenty Eskin
<la...@netcracker.com> wrote:
> Hello all,
> I'm surprised that Phoenix does not store numbers in the HBase 'Byte' format. That looks like a big overhead, doesn't it?
> Take the value 1234567890 (0x499602D2):
> Phoenix stores it as the string '\x00\x00\x00\x00I\x96\x02\xD2'
> But why can't it store it in the HBase format, value=[B@499602d2 ?
>
> Another issue: why are the wrong bytes written when you write from the HBase shell?
> Bytes.toBytes(1234567890)       -->>    value=[B@13217cf6,
> Bytes.toBytes(1234567890L)      -->>    value=[B@3caab4f