Posted to dev@phoenix.apache.org by Ah...@swisscom.com on 2014/07/22 09:45:57 UTC

Column Mapping issue via view

Hi there,

While trying to map a locally created HBase table in Phoenix via CREATE VIEW (mapping to an existing HBase table), I am seeing strange, seemingly corrupt data in columns of the following data types:

INTEGER
BIGINT
DECIMAL


I can only see the correct data if I declare all the column mappings in the view as VARCHAR, which is not acceptable: DML operations (DELETE, UPDATE, SUM) need the proper data types on these fields, and semantically INTEGER data should be mapped as INTEGER rather than characters. More puzzling, I am seeing strange data from this view in the numeric data types even though the INT data in HBase has no such issue. In one article I found on Google, it was suggested that this could be due to the serialization of the data while loading into HBase. The data looks OK in HBase but not when queried through Phoenix:

https://groups.google.com/forum/#!topic/phoenix-hbase-user/wvgzItxliZs

I used the importtsv and completebulkload JARs to load the HBase table data. Where can I control the correct serialization of INTEGER data types while loading into HBase with these loaders? What could be the cause of this, and how is it best fixed?
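To illustrate what I suspect is happening (this is only my guess, using plain JDK classes and a made-up value, not the actual loader code): if the loader stored the digits as text bytes, reading those same bytes back as a 4-byte big-endian integer would produce exactly this kind of garbage:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class GarbageDemo {
    public static void main(String[] args) {
        // Suspicion: the loader stored "1234" as its text bytes {0x31,0x32,0x33,0x34}.
        byte[] stored = "1234".getBytes(StandardCharsets.US_ASCII);
        // A reader expecting a 4-byte big-endian INTEGER decodes those same bytes as:
        int decoded = ByteBuffer.wrap(stored).getInt();
        System.out.println(decoded); // 825373492 -- a "strange" value, not 1234
    }
}
```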

Best Regards,
Ahmed.






Re: Column Mapping issue via view

Posted by James Taylor <ja...@apache.org>.
Hi Ahmed,

First, take a look at this FAQ if you haven't already seen it:
http://phoenix.apache.org/faq.html#How_I_map_Phoenix_table_to_an_existing_HBase_table

If you're mapping to an existing HBase table, then the serialization
that was done to create the existing data must match the serialization
expected by Phoenix. In general, our UNSIGNED_* types match the way
the HBase Bytes.toBytes(<Java primitive type>) methods do
serialization. For example, if the data was serialized using
Bytes.toBytes(int), then declare your type as UNSIGNED_INT. See
http://phoenix.apache.org/language/datatypes.html for a complete list
of our data types (and how they map to HBase serialized data). Note
that if your data contains negative numbers, you're out of luck, as
this data will not sort correctly with respect to positive numbers.
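To make the sort issue concrete, here's a minimal sketch in plain Java (ByteBuffer produces the same 4-byte big-endian layout as Bytes.toBytes(int), so no HBase jar is needed to try it):

```java
import java.nio.ByteBuffer;

public class SignBitDemo {
    // Same 4-byte big-endian two's-complement layout as Bytes.toBytes(int).
    static byte[] toBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // HBase compares key/value bytes as unsigned values, byte by byte.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return 0;
    }

    public static void main(String[] args) {
        // -1 serializes to 0xFFFFFFFF and 1 to 0x00000001, so in unsigned
        // byte order -1 sorts AFTER 1: the sign bit inverts the ordering.
        System.out.println(compareUnsigned(toBytes(-1), toBytes(1)) > 0); // true
    }
}
```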

As for DECIMAL, HBase does not have a type like this. Take a look
instead at the UNSIGNED_FLOAT or UNSIGNED_DOUBLE types.

There are limitations in what can be mapped to. In particular, if the
row key of the existing data represents multiple columns of data, then
more often than not you won't be able to directly map the HBase table
to a Phoenix table since the separator character used will often not
match what Phoenix expects.
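For example (the '|' separator and field values here are hypothetical; the zero byte reflects how Phoenix delimits variable-length VARCHAR key columns), the byte patterns simply won't line up:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SeparatorDemo {
    public static void main(String[] args) {
        // Hypothetical existing row key: two string fields joined with '|'.
        byte[] existing = "CH|2014".getBytes(StandardCharsets.UTF_8);
        // Phoenix separates variable-length VARCHAR key columns with a zero byte.
        byte[] phoenix = "CH\0002014".getBytes(StandardCharsets.UTF_8);
        // Different byte patterns: a two-column Phoenix PK can't parse the '|' key.
        System.out.println(Arrays.equals(existing, phoenix)); // false
    }
}
```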

One other alternative is to re-write the table in a Phoenix-compliant
manner. There are many ways this can be done: using a Pig script and
our Pig integration, writing out a CSV file and then loading it with
our Bulk CSV Loader, or using map-reduce and some of our utility
functions.
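Whichever route you take, the core transformation per cell is the same. A rough sketch (the method name is made up, and ByteBuffer stands in for Bytes.toBytes(int)): parse the text bytes that importtsv wrote, then emit the 4-byte big-endian form UNSIGNED_INT expects:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ReserializeSketch {
    // importtsv stores numeric cells as UTF-8 text; convert each to the
    // 4-byte big-endian form that Phoenix's UNSIGNED_INT expects (the same
    // layout Bytes.toBytes(int) produces).
    static byte[] textToUnsignedInt(byte[] textCell) {
        int v = Integer.parseInt(new String(textCell, StandardCharsets.UTF_8));
        if (v < 0) {
            throw new IllegalArgumentException("UNSIGNED_INT cannot hold " + v);
        }
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        byte[] fixed = textToUnsignedInt("42".getBytes(StandardCharsets.UTF_8));
        // 42 -> 00 00 00 2a
        System.out.printf("%02x %02x %02x %02x%n",
                fixed[0], fixed[1], fixed[2], fixed[3]);
    }
}
```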

Thanks,
James
