Posted to user@phoenix.apache.org by Marek Wiewiorka <ma...@gmail.com> on 2015/04/06 17:45:39 UTC

hive external table

Hi All,
I am trying to create an external table mapped to a Phoenix table and I
cannot make it work.
I tried this project:
https://github.com/nmaillard/Phoenix-Hive
but it fails to build against Phoenix 4.3.0.
I compiled it using 4.2.2, but it still throws an exception when I try
to access an external table.

Before I try to dig deeper - I would like to ask if there is any other
way of using Phoenix in a Hive environment?
Maybe you can recommend a different approach for transferring data from
Phoenix to Hive?

Many thanks in advance!

Marek

Re: hive external table

Posted by Marek Wiewiorka <ma...@gmail.com>.
Hi Nicolas - thanks a lot, I will give it a try and let you know!

Thanks!
Marek

2015-04-10 11:12 GMT+02:00 Nicolas Maillard <nm...@hortonworks.com>:

>  Hello Marek
>
>
>  I have corrected your type issues in the public Phoenix-Hive project;
> you can go ahead and test the latest version.
>
> I have tested against Phoenix 4.3 to make sure all is well.
>
> This being said, this build still depends on Phoenix 4.2; moving to 4.3
> required a small rewrite that has been done in the private repo, and I'd
> rather update everyone at the same time when everything gets merged back
> into standard Phoenix. So this is a patch for you and the public repo.
>
>
>  Let me know if everything works better now. I have many improvements
> planned for the next versions of this connector, but I am still looking
> for initial testers to make sure it all stays stable.
>
>
>  regards
>
>
> Hortonworks, Inc. <http://hortonworks.com/>
> Nicolas Maillard - Solutions Engineer, EMEA
> +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>       ------------------------------
> *From:* Nicolas Maillard
> *Sent:* Thursday, April 09, 2015 6:52 PM
> *To:* user@phoenix.apache.org
> *Subject:* Re: hive external table
>
>
> Hello Marek
>
>
>  Thanks for the test. I'm doing a first fix as we speak and will extend
> my unit tests to make sure it stays fixed.
>
> The timestamp fix is good to go; I'll run a couple more tests and push
> a new commit this evening.
>
>
>       ------------------------------
> *From:* Marek Wiewiorka <ma...@gmail.com>
> *Sent:* Tuesday, April 07, 2015 10:48 AM
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: hive external table
>
>  Hi Nicolas,
> I transformed hu_date (a timestamp column) into bigint using the
> to_number function in Phoenix.
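>
> The Phoenix-side step was roughly like this (a simplified sketch - the
> exact export DDL may differ from what I actually ran):
>
> ```sql
> -- Export table keyed on an epoch-millis bigint instead of a timestamp
> CREATE TABLE se_dwh.homes_usage_day_export (
>     hu_date      BIGINT NOT NULL,
>     hu_ho_id     INTEGER NOT NULL,
>     hu_stream_id INTEGER NOT NULL,
>     hu_usage     DOUBLE
>     CONSTRAINT pk PRIMARY KEY (hu_date, hu_ho_id, hu_stream_id));
>
> -- TO_NUMBER on a time/timestamp column yields its epoch value
> UPSERT INTO se_dwh.homes_usage_day_export
> SELECT TO_NUMBER(hu_ts), hu_ho_id, hu_stream_id, hu_usage
> FROM se_dwh.homes_usage_hour;
> ```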
> I created a new external table in hive as follows:
>  hive> CREATE EXTERNAL TABLE phx_usage_day(
>     >      hu_date bigint,
>     >     hu_ho_id int,
>     >     hu_stream_id int,
>     >     hu_usage double)
>     > STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>     > TBLPROPERTIES(
>     >     'phoenix.hbase.table.name'='se_dwh.homes_usage_day_export ',
>     >     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>     >     'phoenix.rowkeys'='hu_date,hu_ho_id,hu_stream_id,hu_usage '
>     > );
>  however, the problem is now with the double type (no longer with timestamp):
>  hive> select * from phx_usage_day limit 1;
> OK
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be
> cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
>
>  so I changed the hu_usage data type from double to float, and then I
> got an exception like this:
>
>  hive> select * from phx_usage_day limit 1;
> OK
> java.lang.ClassCastException: java.lang.Double cannot be cast to
> java.lang.Float
>         at
> org.apache.phoenix.hive.util.HiveTypeUtil.SQLType2Writable(HiveTypeUtil.java:111)
>         at
> org.apache.phoenix.hive.PhoenixSerde.deserialize(PhoenixSerde.java:100)
>         at
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:647)
>         at
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
>         at
> org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:137)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1524)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:285)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>         at
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException
> Time taken: 0.069 seconds
>
>
>  finally I changed it to string and it worked:
>  hive> select * from phx_usage_day limit 5;
> OK
> 1388534400000   1000    0       8.42
> 1388534400000   1001    0       7.071000000000001
> 1388534400000   1002    0       9.101999999999997
> 1388534400000   1003    0       26.349999999999994
> 1388534400000   1004    0       48.421
> Time taken: 0.064 seconds, Fetched: 5 row(s)
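>
> If needed, the bigint can still be rendered as a timestamp on the Hive
> side - roughly like this (untested sketch; from_unixtime expects
> seconds, so the milliseconds are divided down):
>
> ```sql
> SELECT from_unixtime(hu_date DIV 1000) AS hu_date_ts,
>        hu_ho_id, hu_stream_id, hu_usage
> FROM phx_usage_day
> LIMIT 5;
> ```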
>
>
>  So I think there is a problem with both Timestamp and Double types.
> I didn't try Float.
>
>  Regards,
> Marek
>
>
>
>
>
> 2015-04-06 22:59 GMT+02:00 Nicolas Maillard <nm...@hortonworks.com>:
>
>>  Hello Marek
>>
>>
>>  I am currently using a private branch prior to pushing the
>> contribution. I remember hitting the same problem when moving to 4.3.
>>
>> I'll correct the public GitHub repo and the dependency during the
>> contribution submission.
>>
>>
>>       ------------------------------
>> *From:* Marek Wiewiorka <ma...@gmail.com>
>> *Sent:* Monday, April 06, 2015 10:31 PM
>>
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: hive external table
>>
>>   Thanks Nicolas - could you just let me know which branch I should
>> use to build your project against Phoenix 4.3.0?
>> The master of your repo seems to use 4.2.2...
>> Sure, I can live with epoch secs/millis instead of timestamp - however
>> it would be nice to have the timestamp mapping implemented.
>>
>>
>>  Regards,
>> Marek
>>
>> 2015-04-06 22:25 GMT+02:00 Nicolas Maillard <nm...@hortonworks.com>:
>>
>>>  Hello Marek
>>>
>>>
>>>  Your error seems to point to a conversion mismatch when bringing
>>> back the timestamp type from Phoenix; essentially, the timestamp is
>>> being brought back as a Text value in the Hive world. I'll fix that
>>> asap.
>>>
>>> As a not-so-nice workaround, try using a long; that should work, but I
>>> will fix the issue at hand.
>>>
>>>
>>>  Concerning your table: there is no problem in your declaration. Make
>>> sure your znode is hbase-unsecure, though; it is usually hbase (the
>>> default), but in my Hortonworks sandbox it is hbase-unsecure, which is
>>> why I specify it.
>>>
>>>
>>>
>>>
>>>       ------------------------------
>>> *From:* Marek Wiewiorka <ma...@gmail.com>
>>> *Sent:* Monday, April 06, 2015 10:05 PM
>>> *To:* user@phoenix.apache.org
>>> *Subject:* Re: hive external table
>>>
>>>   Hi Nicolas - thanks for your response!
>>>
>>> Where can I find the latest branch - I used master from this repo:
>>> https://github.com/nmaillard/Phoenix-Hive
>>>  and it is built against 4.2.2. I changed the dependency in the pom
>>> and tried to build it using 4.3; I got a series of errors like these:
>>>  [ERROR] COMPILATION ERROR :
>>> [INFO] -------------------------------------------------------------
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[29,33]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: package org.apache.phoenix.schema
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[93,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[102,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[116,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[128,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[164,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[216,13]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java:[41,33]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: package org.apache.phoenix.schema
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
>>> cannot find symbol
>>>
>>>
>>>  Could you also please advise how I should define an external table
>>> in Hive using Phoenix-Hive for a table created in Phoenix as follows:
>>>
>>>  create table se_dwh.homes_usage_hour
>>> (hu_ts time not null ,
>>> hu_ho_id integer not null ,
>>> hu_stream_id integer not null,
>>> hu_usage double constraint pk PRIMARY KEY(hu_ts,hu_ho_id,hu_stream_id) );
>>>
>>> I'm not sure how to set the column mapping correctly - I tried
>>> something like this:
>>> CREATE EXTERNAL TABLE phx_usage_hour(
>>>      hu_ts timestamp,
>>>     hu_ho_id int,
>>>     hu_stream_id int,
>>>     hu_usage double)
>>> STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>>> TBLPROPERTIES(
>>>     'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
>>>     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>>>     'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
>>> );
>>>
>>>  It was created without a problem - but when I ran a simple query:
>>>
>>>  hive> select * from phx_usage_hour limit 1;
>>> OK
>>> Failed with exception
>>> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>> org.apache.hadoop.hive.serde2.io.TimestampWritable
>>> Time taken: 0.084 seconds
>>>
>>>  I compiled your project using Phoenix 4.2.2, but I have version 4.3
>>> installed in HBase - could that be the root cause of the exception?
>>>
>>>  Thanks!
>>> Marek
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2015-04-06 20:57 GMT+02:00 Nicolas Maillard <nm...@hortonworks.com>:
>>>
>>>>  Hello Marek
>>>>
>>>>
>>>>  There are a couple of ways of reaching Phoenix through Hive:
>>>>
>>>> - One is calling out directly to the HBase layer with the Hive-HBase
>>>> connector, but this has some caveats.
>>>>
>>>> - Second is this project I am working on; the latest branch is built
>>>> against Phoenix 4.3, but building it against 4.2.2 should not create
>>>> any problems from a usage standpoint. If it does, I need to fix it.
>>>>
>>>> Could you explain the issue here, or file an issue on the project?
>>>>
>>>> - Third, if it is urgent, you could dump your Hive data in CSV
>>>> format and use the Phoenix bulk loader.
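>>>>
>>>> For the first option, the Hive mapping would look roughly like this
>>>> (a sketch only - the table and column family/qualifier names here are
>>>> hypothetical, and one caveat is that Hive's HBase handler does not
>>>> understand Phoenix's own encoding of row keys and values):
>>>>
>>>> ```sql
>>>> CREATE EXTERNAL TABLE hbase_my_table (
>>>>     rowkey STRING,
>>>>     my_col STRING)
>>>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>>> WITH SERDEPROPERTIES (
>>>>     'hbase.columns.mapping' = ':key,0:MY_COL')
>>>> TBLPROPERTIES ('hbase.table.name' = 'MY_PHOENIX_TABLE');
>>>> ```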
>>>>
>>>>
>>>>  Whichever way you decide to go, please do forward some details
>>>> about your error so I can fix it if needed.
>>>>
>>>>
>>>>
>>>>       ------------------------------
>>>> *From:* Marek Wiewiorka <ma...@gmail.com>
>>>> *Sent:* Monday, April 06, 2015 5:45 PM
>>>> *To:* user@phoenix.apache.org
>>>> *Subject:* hive external table
>>>>
>>>>   Hi All,
>>>> I am trying to create an external table mapped to a Phoenix table
>>>> and I cannot make it work.
>>>> I tried this project:
>>>> https://github.com/nmaillard/Phoenix-Hive
>>>>  but it fails to build against Phoenix 4.3.0.
>>>> I compiled it using 4.2.2, but it still throws an exception when I
>>>> try to access an external table.
>>>>
>>>>  Before I try to dig deeper - I would like to ask if there is any
>>>> other way of using Phoenix in a Hive environment?
>>>>  Maybe you can recommend a different approach for transferring data
>>>> from Phoenix to Hive?
>>>>
>>>>  Many thanks in advance!
>>>>
>>>>  Marek
>>>>
>>>
>>>
>>
>

>>   symbol:   class PDataType
>>   location: package org.apache.phoenix.schema
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
>> cannot find symbol
>>
>>
>>  Could you also please advise how should I define an external table in
>> Hive using Phoenix-hive for a table created in Phoenix as follows:
>>
>>  create table se_dwh.homes_usage_hour
>> (hu_ts time not null ,
>> hu_ho_id integer not null ,
>> hu_stream_id integer not null,
>> hu_usage double constraint pk PRIMARY KEY(hu_ts,hu_ho_id,hu_stream_id) );
>>
>>  I'm not sure how to set mapping columns correctly - I tried something
>> like this:
>> CREATE EXTERNAL TABLE phx_usage_hour(
>>      hu_ts timestamp,
>>     hu_ho_id int,
>>     hu_stream_id int,
>>     hu_usage double)
>> STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>> TBLPROPERTIES(
>>     'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
>>     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>>     'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
>> );
>>
>>  It was created without a problem - but when I run a simple query:
>>
>>  hive> select * from phx_usage_hour limit 1;
>> OK
>> Failed with exception
>> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>> org.apache.hadoop.hive.serde2.io.TimestampWritable
>> Time taken: 0.084 seconds
>>
>>  I compiled your project using Phoenix 4.2.2 but I got 4.3 version
>> installed in HBase - could it be the root cause of the exception ?
>>
>>  Thanks!
>> Marek
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 2015-04-06 20:57 GMT+02:00 Nicolas Maillard <nm...@hortonworks.com>:
>>
>>>  Hello Marek
>>>
>>>
>>>  There are a couple ways of reaching to phoenix through Hive
>>>
>>> - One is calling out directly the hbase layer with the Hive Hbase
>>> connector but this has some caveats.
>>>
>>> - Second is this project I am working on, the latest branch is built
>>> against phoenix 4.3, but building it against 4.2.2 should not create any
>>> problems from a usage part , if it does I need to fix it.
>>>
>>> Could you explain the issue or file an issue on the project.
>>>
>>> - Third if it is urgent you could dump your Hive data in a csv format
>>> and use the phoenix Bulkloader.
>>>
>>>
>>>  Whatever the way you decide to go please do forward some details about
>>> your error so I can fix if needed
>>>
>>>
>>>
>>>      [image: Hortonworks, Inc.] <http://hortonworks.com/>
>>> *Nicolas  <mj...@hortonworks.com>Maillard*  *Solutions Engineer -
>>> EMEA* <https://www.linkedin.com/pub/mats-johansson/0/64b/9>
>>>   +33 (0) 668176605|
>>> Skype: nicolas.maillard.hortonworks
>>>       ------------------------------
>>> *From:* Marek Wiewiorka <ma...@gmail.com>
>>> *Sent:* Monday, April 06, 2015 5:45 PM
>>> *To:* user@phoenix.apache.org
>>> *Subject:* hive external table
>>>
>>>   Hi All,
>>> I trying to create an external table mapped to a Phoenix table and I
>>> cannot make it working.
>>> I tried this project:
>>> https://github.com/nmaillard/Phoenix-Hive
>>>  but it fails to build against Phoenix 4.3.0.
>>> I compiled it using 4.2.2 but still it's throwing some exception while
>>> trying to access
>>> an external table.
>>>
>>>  Before I try to dig deeper - I would like to ask if  there is  any
>>> other way of using Phoenix in Hive environment?
>>>  Maybe you recommend a different approach of transferring data from
>>> Phoenix
>>> to Hive?
>>>
>>>  Many thanks in advance!
>>>
>>>  Marek
>>>
>>
>>
>

Re: hive external table

Posted by Nicolas Maillard <nm...@hortonworks.com>.
Hello Marek


I am currently using a private branch ahead of pushing the contribution upstream; I remember hitting your problem when moving to 4.3.

I'll correct the public GitHub repo and the dependency as part of the contribution submission.


[Hortonworks, Inc.]<http://hortonworks.com/>

Nicolas <ma...@hortonworks.com> Maillard
Solutions Engineer - EMEA<https://www.linkedin.com/pub/mats-johansson/0/64b/9>

+33 (0) 668176605|
Skype: nicolas.maillard.hortonworks




Re: hive external table

Posted by Marek Wiewiorka <ma...@gmail.com>.
Thanks Nicolas - could you just let me know which branch I should use to
build your project against Phoenix 4.3.0?
Master of your repo seems to use 4.2.2...
Sure, I can live with epoch seconds/millis instead of a timestamp - however,
it would be nice to have the timestamp mapping implemented.


Regards,
Marek


Re: hive external table

Posted by Nicolas Maillard <nm...@hortonworks.com>.
Hello Marek


Your error seems to point to a conversion mismatch when bringing back the
timestamp type from Phoenix: essentially, the timestamp is being brought back
as a Text value in the Hive world. I'll fix that asap.

It's not a nice workaround, but try using a long; it should work until I fix
the issue at hand.
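A minimal sketch of that workaround, assuming Marek's phx_usage_hour table from earlier in the thread: declare the key column as bigint on the Hive side and convert it back to a timestamp at query time. Whether the stored value comes back as epoch seconds or milliseconds depends on how the connector serializes the Phoenix TIME type, so the division by 1000 below is an assumption to verify against your data.

```sql
-- Sketch: map the Phoenix TIME column to a Hive bigint instead of timestamp.
CREATE EXTERNAL TABLE phx_usage_hour(
    hu_ts bigint,
    hu_ho_id int,
    hu_stream_id int,
    hu_usage double)
STORED BY "org.apache.phoenix.hive.PhoenixStorageHandler"
TBLPROPERTIES(
    'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
    'phoenix.zookeeper.znode.parent'='hbase-unsecure',
    'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
);

-- Convert back when reading; adjust the /1000 if the values are epoch seconds.
SELECT from_unixtime(CAST(hu_ts / 1000 AS bigint)) AS hu_ts, hu_usage
FROM phx_usage_hour LIMIT 10;
```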


Concerning your table, there is no problem in your declaration. Just make sure your znode is hbase-unsecure; it is usually hbase (the default), but in my Hortonworks sandbox it is hbase-unsecure, which is why I specify it.









Re: hive external table

Posted by Marek Wiewiorka <ma...@gmail.com>.
Hi Nicolas - thanks for your response!

Where can I find the latest branch - I used master from this repo:
https://github.com/nmaillard/Phoenix-Hive
and it is built against 4.2.2 - I changed the dependency in the pom and tried
to build it with 4.3, and I got a series of errors like these:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[29,33]
cannot find symbol
  symbol:   class PDataType
  location: package org.apache.phoenix.schema
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[93,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[102,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[116,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[128,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[164,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[216,13]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java:[41,33]
cannot find symbol
  symbol:   class PDataType
  location: package org.apache.phoenix.schema
[ERROR]
/home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
cannot find symbol


Could you also please advise how I should define an external table in Hive
using Phoenix-Hive for a table created in Phoenix as follows:

create table se_dwh.homes_usage_hour
(hu_ts time not null ,
hu_ho_id integer not null ,
hu_stream_id integer not null,
hu_usage double constraint pk PRIMARY KEY(hu_ts,hu_ho_id,hu_stream_id) );

I'm not sure how to set the column mapping correctly - I tried something like
this:
CREATE EXTERNAL TABLE phx_usage_hour(
     hu_ts timestamp,
    hu_ho_id int,
    hu_stream_id int,
    hu_usage double)
STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
TBLPROPERTIES(
    'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
    'phoenix.zookeeper.znode.parent'='hbase-unsecure',
    'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
);

It was created without a problem - but when I run a simple query:

hive> select * from phx_usage_hour limit 1;
OK
Failed with exception
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
org.apache.hadoop.hive.serde2.io.TimestampWritable
Time taken: 0.084 seconds

I compiled your project using Phoenix 4.2.2, but I have version 4.3 installed
in HBase - could that be the root cause of the exception?

Thanks!
Marek










Re: hive external table

Posted by Nicolas Maillard <nm...@hortonworks.com>.
Hello Marek


There are a couple of ways of reaching Phoenix through Hive:

- One is calling the HBase layer directly with the Hive-HBase connector, but this has some caveats.

- Second is this project I am working on. The latest branch is built against Phoenix 4.3, but building it against 4.2.2 should not create any usage problems; if it does, I need to fix it. Could you explain the issue, or file an issue on the project?

- Third, if it is urgent, you could dump your Hive data in CSV format and use the Phoenix bulk loader.
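A hedged sketch of that third option (the table name and HDFS paths here are illustrative, and the exact bulk-loader jar name depends on your Phoenix version): export the Hive table to CSV on HDFS, then load it with Phoenix's CsvBulkLoadTool.

```sql
-- Step 1 (HiveQL): dump the table as comma-separated text files on HDFS.
INSERT OVERWRITE DIRECTORY '/tmp/homes_usage_hour_csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM homes_usage_hour;

-- Step 2 (shell, shown as a comment): run the Phoenix bulk loader, e.g.
--   hadoop jar phoenix-<version>-client.jar \
--     org.apache.phoenix.mapreduce.CsvBulkLoadTool \
--     --table SE_DWH.HOMES_USAGE_HOUR --input /tmp/homes_usage_hour_csv
```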


Whichever way you decide to go, please do forward some details about your error so I can fix things if needed.





