You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Ibrahim Yakti <iy...@souq.com> on 2013/01/08 13:45:32 UTC
Mapping HBase table in Hive
Hello,
suppose I have the following table (orders) in MySQL:
*************************** 1. row ***************************
Field: id
Type: int(10) unsigned
Null: NO
Key: PRI
Default: NULL
Extra: auto_increment
*************************** 2. row ***************************
Field: value
Type: int(10) unsigned
Null: NO
Key:
Default: NULL
Extra:
*************************** 3. row ***************************
Field: date_lastchange
Type: timestamp
Null: NO
Key:
Default: CURRENT_TIMESTAMP
Extra: on update CURRENT_TIMESTAMP
*************************** 4. row ***************************
Field: date_inserted
Type: timestamp
Null: NO
Key:
Default: 0000-00-00 00:00:00
I imported it into HBase with column family "id"
I want to create an external table in Hive to query the HBase table, I am
not able to get the mapping parameters (*hbase.columns.mapping*), it is
confusing, if anybody can explain it to me please. I used the following
query:
CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
string, date_inserted string) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = " ? ? ? ? ? ?") TBLPROPERTIES ("hbase.table.name"
= "orders");
Is there any way to build the Hive tables automatically or I should go with
the same process with each table?
Thanks in advanced.
--
Ibrahim
Re: Mapping HBase table in Hive
Posted by Ibrahim Yakti <iy...@souq.com>.
Thanks Bejoy,
Seems it worked, in the mapping of the column family I used ":key" and
that's it, in addition as per some articles there should be no spaces in
mapping, below is the create table sample:
CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
> string, date_inserted string) STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
> ("hbase.columns.mapping" =
> ":key,date_inserted:bigint,date_lastchange:string,value:string")
> TBLPROPERTIES ("hbase.table.name" = "orders");
>
Thanks.
--
Ibrahim
On Sun, Jan 13, 2013 at 12:23 PM, <be...@yahoo.com> wrote:
> **
> Hi Ibrahim.
>
> SQOOP is used to import data from rdbms to hbase in your case.
>
> Please get the schema from hbase for your corresponding table and post it
> here.
>
> We can point out how your mapping could be.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ------------------------------
> *From: * Ibrahim Yakti <iy...@souq.com>
> *Date: *Sun, 13 Jan 2013 11:22:51 +0300
> *To: *user<us...@hive.apache.org>
> *ReplyTo: * user@hive.apache.org
> *Subject: *Re: Mapping HBase table in Hive
>
> Thanks Bejoy,
>
> what do you mean by:
>
>> If you need to map a full CF to a hive column, the data type of the hive
>> column should be a Map.
>>
>
> suppose I used sqoop to move data from mysql to hbase and used id as a
> column family, all the other columns will be QF then, right?
>
> The integration document is not clear, I think it needs more clarification
> or maybe I am still missing something.
>
> --
> Ibrahim
>
>
> On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:
>
>> data type of
>
>
>
Re: Mapping HBase table in Hive
Posted by be...@yahoo.com.
Hi Ibrahim.
SQOOP is used to import data from rdbms to hbase in your case.
Please get the schema from hbase for your corresponding table and post it here.
We can point out how your mapping could be.
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-----Original Message-----
From: Ibrahim Yakti <iy...@souq.com>
Date: Sun, 13 Jan 2013 11:22:51
To: user<us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: Mapping HBase table in Hive
Thanks Bejoy,
what do you mean by:
> If you need to map a full CF to a hive column, the data type of the hive
> column should be a Map.
>
suppose I used sqoop to move data from mysql to hbase and used id as a
column family, all the other columns will be QF then, right?
The integration document is not clear, I think it needs more clarification
or maybe I am still missing something.
--
Ibrahim
On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:
> data type of
Re: Mapping HBase table in Hive
Posted by Ibrahim Yakti <iy...@souq.com>.
Thanks Bejoy,
what do you mean by:
> If you need to map a full CF to a hive column, the data type of the hive
> column should be a Map.
>
suppose I used sqoop to move data from mysql to hbase and used id as a
column family, all the other columns will be QF then, right?
The integration document is not clear, I think it needs more clarification
or maybe I am still missing something.
--
Ibrahim
On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:
> data type of
Re: Mapping HBase table in Hive
Posted by be...@yahoo.com.
Hi Ibrahim
The hive hbase integration totally depends on the hbase table schema and not the schema of the source table in mysql.
You need to provide the column family qualifier mapping in there.
Get the hbase table's schema from hbase shell.
suppose you have the schema as
Id
CF1.qualifier1
CF1.qualifier2
CF1.qualifier3
You need to match each of these ColumnFamily:Qualifier to corresponding columns in hive.
So in hbase.columns.mapping you need to provide these CF:QL in order.
If you need to map a full CF to a hive column, the data type of the hive column should be a Map.
You can get detailed hbase to hive integration document from hive wiki .
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-----Original Message-----
From: Ibrahim Yakti <iy...@souq.com>
Date: Tue, 8 Jan 2013 15:45:32
To: user<us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Mapping HBase table in Hive
Hello,
suppose I have the following table (orders) in MySQL:
*************************** 1. row ***************************
Field: id
Type: int(10) unsigned
Null: NO
Key: PRI
Default: NULL
Extra: auto_increment
*************************** 2. row ***************************
Field: value
Type: int(10) unsigned
Null: NO
Key:
Default: NULL
Extra:
*************************** 3. row ***************************
Field: date_lastchange
Type: timestamp
Null: NO
Key:
Default: CURRENT_TIMESTAMP
Extra: on update CURRENT_TIMESTAMP
*************************** 4. row ***************************
Field: date_inserted
Type: timestamp
Null: NO
Key:
Default: 0000-00-00 00:00:00
I imported it into HBase with column family "id"
I want to create an external table in Hive to query the HBase table, I am
not able to get the mapping parameters (*hbase.columns.mapping*), it is
confusing, if anybody can explain it to me please. I used the following
query:
CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
string, date_inserted string) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = " ? ? ? ? ? ?") TBLPROPERTIES ("hbase.table.name"
= "orders");
Is there any way to build the Hive tables automatically or I should go with
the same process with each table?
Thanks in advanced.
--
Ibrahim