You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Ibrahim Yakti <iy...@souq.com> on 2013/01/08 13:45:32 UTC

Mapping HBase table in Hive

Hello,

suppose I have the following table (orders) in MySQL:

*************************** 1. row ***************************
  Field: id
   Type: int(10) unsigned
   Null: NO
    Key: PRI
Default: NULL
  Extra: auto_increment
*************************** 2. row ***************************
  Field: value
   Type: int(10) unsigned
   Null: NO
    Key:
Default: NULL
  Extra:
*************************** 3. row ***************************
  Field: date_lastchange
   Type: timestamp
   Null: NO
    Key:
Default: CURRENT_TIMESTAMP
  Extra: on update CURRENT_TIMESTAMP
*************************** 4. row ***************************
  Field: date_inserted
   Type: timestamp
   Null: NO
    Key:
Default: 0000-00-00 00:00:00

I imported it into HBase with column family "id"

I want to create an external table in Hive to query the HBase table, I am
not able to get the mapping parameters (*hbase.columns.mapping*), it is
confusing, if anybody can explain it to me please. I used the following
query:

CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
string, date_inserted string) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = " ? ? ? ? ? ?") TBLPROPERTIES ("hbase.table.name"
= "orders");

Is there any way to build the Hive tables automatically or I should go with
the same process with each table?


Thanks in advanced.

--
Ibrahim

Re: Mapping HBase table in Hive

Posted by Ibrahim Yakti <iy...@souq.com>.
Thanks Bejoy,

Seems it worked, in the mapping of the column family I used ":key" and
that's it, in addition as per some articles there should be no spaces in
mapping, below is the create table sample:

CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
> string, date_inserted string) STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
> ("hbase.columns.mapping" =
> ":key,date_inserted:bigint,date_lastchange:string,value:string")
> TBLPROPERTIES ("hbase.table.name" = "orders");
>

Thanks.


--
Ibrahim


On Sun, Jan 13, 2013 at 12:23 PM, <be...@yahoo.com> wrote:

> **
> Hi Ibrahim.
>
> SQOOP is used to import data from rdbms to hbase in your case.
>
> Please get the schema from hbase for your corresponding table and post it
> here.
>
> We can point out how your mapping could be.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ------------------------------
> *From: * Ibrahim Yakti <iy...@souq.com>
> *Date: *Sun, 13 Jan 2013 11:22:51 +0300
> *To: *user<us...@hive.apache.org>
> *ReplyTo: * user@hive.apache.org
> *Subject: *Re: Mapping HBase table in Hive
>
> Thanks Bejoy,
>
> what do you mean by:
>
>> If you need to map a full CF to a hive column, the data type of the hive
>> column should be a Map.
>>
>
> suppose I used sqoop to move data from mysql to hbase and used id as a
> column family, all the other columns will be QF then, right?
>
> The integration document is not clear, I think it needs more clarification
> or maybe I am still missing something.
>
> --
> Ibrahim
>
>
> On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:
>
>> data type of
>
>
>

Re: Mapping HBase table in Hive

Posted by be...@yahoo.com.
Hi Ibrahim.

 SQOOP is used to import data from rdbms to hbase in your case. 

Please get the schema from hbase for your corresponding table and post it here.

We can point out how your mapping could be.

Regards 
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Ibrahim Yakti <iy...@souq.com>
Date: Sun, 13 Jan 2013 11:22:51 
To: user<us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: Mapping HBase table in Hive

Thanks Bejoy,

what do you mean by:

> If you need to map a full CF to a hive column, the data type of the hive
> column should be a Map.
>

suppose I used sqoop to move data from mysql to hbase and used id as a
column family, all the other columns will be QF then, right?

The integration document is not clear, I think it needs more clarification
or maybe I am still missing something.

--
Ibrahim


On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:

> data type of


Re: Mapping HBase table in Hive

Posted by Ibrahim Yakti <iy...@souq.com>.
Thanks Bejoy,

what do you mean by:

> If you need to map a full CF to a hive column, the data type of the hive
> column should be a Map.
>

suppose I used sqoop to move data from mysql to hbase and used id as a
column family, all the other columns will be QF then, right?

The integration document is not clear, I think it needs more clarification
or maybe I am still missing something.

--
Ibrahim


On Tue, Jan 8, 2013 at 9:35 PM, <be...@yahoo.com> wrote:

> data type of

Re: Mapping HBase table in Hive

Posted by be...@yahoo.com.
Hi Ibrahim

The hive hbase integration totally depends on the hbase table schema and not the schema of the source table in mysql.

You need to provide the column family qualifier mapping in there.

Get the hbase table's schema from hbase shell.

suppose you have the schema as
Id
CF1.qualifier1
CF1.qualifier2
CF1.qualifier3

You need to match each of these ColumnFamily:Qualifier to corresponding columns in hive. 

So in hbase.columns.mapping you need to provide these CF:QL in order.

If you need to map a full CF to a hive column, the data type of the hive column should be a Map.

You can get detailed hbase to hive integration document from hive wiki .


Regards 
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Ibrahim Yakti <iy...@souq.com>
Date: Tue, 8 Jan 2013 15:45:32 
To: user<us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Mapping HBase table in Hive

Hello,

suppose I have the following table (orders) in MySQL:

*************************** 1. row ***************************
  Field: id
   Type: int(10) unsigned
   Null: NO
    Key: PRI
Default: NULL
  Extra: auto_increment
*************************** 2. row ***************************
  Field: value
   Type: int(10) unsigned
   Null: NO
    Key:
Default: NULL
  Extra:
*************************** 3. row ***************************
  Field: date_lastchange
   Type: timestamp
   Null: NO
    Key:
Default: CURRENT_TIMESTAMP
  Extra: on update CURRENT_TIMESTAMP
*************************** 4. row ***************************
  Field: date_inserted
   Type: timestamp
   Null: NO
    Key:
Default: 0000-00-00 00:00:00

I imported it into HBase with column family "id"

I want to create an external table in Hive to query the HBase table, I am
not able to get the mapping parameters (*hbase.columns.mapping*), it is
confusing, if anybody can explain it to me please. I used the following
query:

CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
string, date_inserted string) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = " ? ? ? ? ? ?") TBLPROPERTIES ("hbase.table.name"
= "orders");

Is there any way to build the Hive tables automatically or I should go with
the same process with each table?


Thanks in advanced.

--
Ibrahim