You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by shashwat shriparv <dw...@gmail.com> on 2012/04/05 15:45:11 UTC

How to create external table for hbase

I am able to create external tables in hive of HBase, now i have a
requirement to create an external table which is having variable columns,
which means the columns in HBase are not fixed for the particular table,
the no of columns and can be created dynamically at the time of data
insertion, what should be the approach for handling such kind of situation.

Summery : How to create external tables in hive when the no of columns are
not fixed in HBase table.

Thanks in advance.

-- 
          ∞
Shashwat Shriparv

Re: How to create external table for hbase

Posted by shashwat shriparv <dw...@gmail.com>.
Thanks alot peter, you saved my life. what i understood that i need to
explore hive and hbase and hadoop in more detail :) thanx a ton...

On Fri, Apr 6, 2012 at 3:43 AM, Peter Vandenabeele
<pe...@vandenabeele.com>wrote:

> On Thu, Apr 5, 2012 at 2:45 PM, shashwat shriparv
> <dw...@gmail.com> wrote:
> > I am able to create external tables in hive of HBase, now i have a
> > requirement to create an external table which is having variable columns,
> > which means the columns in HBase are not fixed for the particular table,
> > the no of columns and can be created dynamically at the time of data
> > insertion, what should be the approach for handling such kind of
> situation.
> >
> > Summary : How to create external tables in hive when the no of columns
> are
> > not fixed in HBase table.
>
> Maybe this is more a question for the Hive user mailing list:
>
>  http://hive.apache.org/mailing_lists.html#Users
>
> If your "variable" columns (qualifiers in that case) are all in a column
> family that is known upfront, you could use the
>
>  map<string, string>
>
> structure in the definition of the EXTERNAL table.
>
> E.g. in a table with a column family 'demo' you could do in Hive:
>
> hive> CREATE EXTERNAL TABLE lrug(key int, value map<string, string>)
>    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,demo:")
>    > TBLPROPERTIES("hbase.table.name" = "LRUG_Mobile");
>
> (Note the specific SERDESPROPERTIES
> :key   for the default key
> demo:    to take all qualifiers in the 'demo' column family
>
> With this test data:
>
> hive> SELECT * from lrug;
> OK
> 1234    {"country_code":"UK","name":"Peter"}
> 2345    {"country_code":"US","name":"Shawn","state":"CA"}
>
> If initially you would only have thought of a qualifier
> "country_code" you could write:
>
> hive> SELECT value['name'] FROM lrug
>    > WHERE value['country_code'] = 'UK';
> Total MapReduce jobs = 1
> ...
> Ended Job = job_201204051132_0017
> OK
> 1234    Peter
> Time taken: 7.644 seconds
>
> If then later, you also have a column 'state' when,
> you could change the query to
>
> hive> SELECT key, value['name'] FROM lrug
>    > WHERE value['country_code'] = 'US' AND value['state'] = 'CA';
> Total MapReduce jobs = 1
>
> ...
>
> Ended Job = job_201204051132_0018
> OK
> 2345    Shawn
> Time taken: 6.864 seconds
>
> If the qualifiers of the columns are really the data itself, you
> can use a lateral view and explode(map_keys()) to convert the
> qualifiers into new data rows.
>
> HTH,
>
> Peter
>



-- 


∞
Shashwat Shriparv

Re: How to create external table for hbase

Posted by Peter Vandenabeele <pe...@vandenabeele.com>.
On Thu, Apr 5, 2012 at 2:45 PM, shashwat shriparv
<dw...@gmail.com> wrote:
> I am able to create external tables in hive of HBase, now i have a
> requirement to create an external table which is having variable columns,
> which means the columns in HBase are not fixed for the particular table,
> the no of columns and can be created dynamically at the time of data
> insertion, what should be the approach for handling such kind of situation.
>
> Summary : How to create external tables in hive when the no of columns are
> not fixed in HBase table.

Maybe this is more a question for the Hive user mailing list:

  http://hive.apache.org/mailing_lists.html#Users

If your "variable" columns (qualifiers in that case) are all in a column
family that is known upfront, you could use the

  map<string, string>

structure in the definition of the EXTERNAL table.

E.g. in a table with a column family 'demo' you could do in Hive:

hive> CREATE EXTERNAL TABLE lrug(key int, value map<string, string>)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,demo:")
    > TBLPROPERTIES("hbase.table.name" = "LRUG_Mobile");

(Note the specific SERDESPROPERTIES
:key   for the default key
demo:    to take all qualifiers in the 'demo' column family

With this test data:

hive> SELECT * from lrug;
OK
1234    {"country_code":"UK","name":"Peter"}
2345    {"country_code":"US","name":"Shawn","state":"CA"}

If initially you would only have thought of a qualifier
"country_code" you could write:

hive> SELECT value['name'] FROM lrug
    > WHERE value['country_code'] = 'UK';
Total MapReduce jobs = 1
...
Ended Job = job_201204051132_0017
OK
1234    Peter
Time taken: 7.644 seconds

If then later, you also have a column 'state' when,
you could change the query to

hive> SELECT key, value['name'] FROM lrug
    > WHERE value['country_code'] = 'US' AND value['state'] = 'CA';
Total MapReduce jobs = 1

...

Ended Job = job_201204051132_0018
OK
2345    Shawn
Time taken: 6.864 seconds

If the qualifiers of the columns are really the data itself, you
can use a lateral view and explode(map_keys()) to convert the
qualifiers into new data rows.

HTH,

Peter