Posted to user@hbase.apache.org by Juraj jiv <fa...@gmail.com> on 2014/03/03 12:52:38 UTC

Hive hbase handler composite key - hbase full scan on key

Hello,
I'm currently testing HBase integration with Hive. I want to use fast HBase
key lookups in Hive, but my HBase key is composite.
I found a way to create a table with the HBase key as a struct, which works
fine:

CREATE EXTERNAL TABLE table_tst(
key struct<a:string,b:string,c:string,d:string>, ....
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '_'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' ...

But if I run this select in Hive:
select * from table_tst where key.a = '1407273705';
it takes about 860 seconds to return 2 records, so it is doing a full scan :/

If I run a similar query through the Java HBase API:
            Scan scan = new Scan();
            scan.setStartRow("1407273705".getBytes());
            scan.setStopRow("1407273705~".getBytes());

Note: "~" is my end character because it has a high byte value; my composite
key delimiter is "_".
This returns 2 records in 2 seconds.
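
To see why the "~" stop row brackets every key that begins with this value, here is a minimal sketch in plain Java that only reproduces the unsigned byte-by-byte ordering HBase applies to row keys. It does not talk to HBase; the names PrefixRangeDemo, compareRows and inRange and the sample keys are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class PrefixRangeDemo {
    // Unsigned lexicographic comparison of two byte arrays,
    // which is how HBase orders row keys.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True when start <= row < stop (start inclusive, stop exclusive).
    static boolean inRange(String row, String start, String stop) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] lo = start.getBytes(StandardCharsets.UTF_8);
        byte[] hi = stop.getBytes(StandardCharsets.UTF_8);
        return compareRows(r, lo) >= 0 && compareRows(r, hi) < 0;
    }

    public static void main(String[] args) {
        String start = "1407273705";
        String stop = "1407273705~";
        // '_' is 0x5F and '~' is 0x7E, so "<value>_..." sorts before the stop row.
        System.out.println(inRange("1407273705_a_b_c", start, stop));  // true
        System.out.println(inRange("1407273706_a_b_c", start, stop));  // false
        // A longer first field sharing the same prefix is also inside the range.
        System.out.println(inRange("14072737050_a_b_c", start, stop)); // true
    }
}
```

The third sample key falls inside the range as well: a first field that merely starts with 1407273705 is matched too, because the start row does not include the "_" delimiter.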

How can I tell Hive to use a start/stop row scan for this key.a value?

JV

Re: Hive hbase handler composite key - hbase full scan on key

Posted by Navis류승우 <na...@nexr.com>.
https://issues.apache.org/jira/browse/HIVE-6411 is exactly for this case.

The bad news is that it does not seem to be included even in 0.13.0, so you
would have to implement your own predicate analyzer.
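
As a rough idea of the core step such an analyzer would need, the sketch below turns an equality condition on the leading key field into a start/stop row pair. It is plain Java and uses no Hive or HBase API; the class LeadingFieldRange and the method rangeForLeadingField are hypothetical names, and it assumes a single-byte ASCII delimiter that never appears inside a field value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LeadingFieldRange {
    // Hypothetical helper: for "key.a = value", return {startRow, stopRow}.
    // startRow is value + delimiter (inclusive); stopRow is the same bytes
    // with the trailing delimiter byte incremented by one (exclusive).
    // Assumption: the delimiter is a single ASCII byte below 0x7F.
    static byte[][] rangeForLeadingField(String value, char delimiter) {
        byte[] start = (value + delimiter).getBytes(StandardCharsets.UTF_8);
        byte[] stop = Arrays.copyOf(start, start.length);
        stop[stop.length - 1]++;
        return new byte[][] { start, stop };
    }

    public static void main(String[] args) {
        byte[][] range = rangeForLeadingField("1407273705", '_');
        // '_' is 0x5F, so the stop row ends with 0x60, the backtick character.
        System.out.println(new String(range[0], StandardCharsets.UTF_8)); // 1407273705_
        System.out.println(new String(range[1], StandardCharsets.UTF_8)); // 1407273705`
    }
}
```

The two arrays could then be passed to scan.setStartRow and scan.setStopRow as in the original question. Including the delimiter in the start row keeps keys whose first field is only a longer value with the same prefix out of the range.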

Thanks,
Navis

