Posted to user@hive.apache.org by Madhusudhana Rao Podila <Ma...@infosys.com> on 2012/01/27 06:37:36 UTC

Problem with Hive/HBase integration

Hi

I have a problem creating a Hive table over an existing HBase table (as an external table) when mapping multiple columns from a column family as individual Hive columns (not as a Map).

Case 1
I created a table in HBase and was able to map it to Hive as an external table using only one column from the column family.

HBase
I created and populated the table in HBase using the following commands:


hbase(main):001:0> create 'hbasetohive', 'colfamily'
0 row(s) in 1.9700 seconds

hbase(main):002:0> put 'hbasetohive', '1s', 'colfamily:val','1strowval'
0 row(s) in 0.2240 seconds

hbase(main):003:0> scan 'hbasetohive'
ROW                    COLUMN+CELL
 1s                    column=colfamily:val, timestamp=1327676987075, value=1strowva
                       l
1 row(s) in 0.0840 seconds

Hive


hive> CREATE EXTERNAL TABLE hbase_hivetable_k(key string, value string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES("hbase.columns.mapping" = "colfamily:val")
    > TBLPROPERTIES("hbase.table.name" = "hbasetohive");
OK
Time taken: 10.808 seconds

hive> select * from hbase_hivetable_k;
OK
1s      1strowval
Time taken: 1.314 seconds



Case 2


I created a table in HBase with column family cf_cdr and two columns, caller_name and caller_number. I then tried to create a Hive table over this HBase table, specifying both columns from the column family, and it throws a MetaException. If I restrict the mapping to only one column, I am able to create the table in Hive properly.



HBase

hbase(main):004:0> create 'hb_cdr', 'cf_cdr'
0 row(s) in 1.4870 seconds

hbase(main):005:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_name', 'madhu'
0 row(s) in 0.0490 seconds

hbase(main):006:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_number', '08877232010'
0 row(s) in 0.0300 seconds

hbase(main):007:0> put 'hb_cdr', 'cdr_r2', 'cf_cdr:caller_name', 'bharat'
0 row(s) in 0.0170 seconds

hbase(main):008:0> scan 'hb_cdr'
ROW                    COLUMN+CELL
 cdr_r1                column=cf_cdr:caller_name, timestamp=1327677898993, value=mad
                       hu
 cdr_r1                column=cf_cdr:caller_number, timestamp=1327677912648, value=0
                       8877232010
 cdr_r2                column=cf_cdr:caller_name, timestamp=1327677919720, value=bha
                       rat
2 row(s) in 0.1020 seconds



Hive



hive> CREATE EXTERNAL TABLE hv_hb_cdr(key string, c_name string, c_number string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name, cf_cdr:caller_number")
    > TBLPROPERTIES("hbase.table.name" = "hb_cdr");
FAILED: Error in metadata: MetaException(message:Column Family  cf_cdr is not defined in hbase table hb_cdr)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask



Is there any issue with the above script?

Please suggest.

Regards
Madhusudhana Rao. Podila


**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely 
for the use of the addressee(s). If you are not the intended recipient, please 
notify the sender by e-mail and delete the original message. Further, you are not 
to copy, disclose, or distribute this e-mail or its contents to any other person and 
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken 
every reasonable precaution to minimize this risk, but is not liable for any damage 
you may sustain as a result of any virus in this e-mail. You should carry out your 
own virus checks before opening the e-mail or attachment. Infosys reserves the 
right to monitor and review the content of all messages sent to or from this e-mail 
address. Messages sent to or from this e-mail address may be stored on the 
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***

RE: Problem with Hive/HBase integration

Posted by Chinna Rao Lalam <ch...@huawei.com>.
Hi,



In the table definition below, the space inside hbase.columns.mapping is the problem:



WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name, cf_cdr:caller_number")



Remove the space between the two column entries, like this:

WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name,cf_cdr:caller_number")





hive> CREATE EXTERNAL TABLE hv_hb_cdr(key string, c_name string, c_number string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name,cf_cdr:caller_number")
    > TBLPROPERTIES("hbase.table.name" = "hb_cdr");
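
Why a single space matters here: the error message prints "Column Family  cf_cdr" with two spaces, which suggests the family being looked up is " cf_cdr" with a leading space. The Hive HBase integration documentation also warns that whitespace between mapping entries is treated as part of the column name. The Python sketch below is only a simplified model of that behaviour, assuming the mapping string is split on commas without trimming; it is not Hive's actual parsing code, and the function names are hypothetical.

```python
def split_mapping(mapping):
    """Split the mapping on commas only, WITHOUT trimming whitespace.

    Assumption: this mimics the behaviour seen in the thread; it is not
    taken from Hive's source.
    """
    return mapping.split(",")


def families(mapping):
    # The text before the first ':' in each entry is the column family name.
    return [entry.split(":", 1)[0] for entry in split_mapping(mapping)]


def undefined_families(mapping, defined):
    # Families named in the mapping that the HBase table does not define.
    return [f for f in families(mapping) if f not in defined]


# The only column family created on 'hb_cdr' in the thread.
hbase_families = {"cf_cdr"}

with_space = "cf_cdr:caller_name, cf_cdr:caller_number"
without_space = "cf_cdr:caller_name,cf_cdr:caller_number"

print(undefined_families(with_space, hbase_families))     # [' cf_cdr']
print(undefined_families(without_space, hbase_families))  # []
```

Under this model the second entry resolves to the family " cf_cdr" (with a leading space), which does not exist on hb_cdr, while the mapping without the space resolves both entries to cf_cdr.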


Hope It Helps,
Chinna Rao Lalam

RE: Problem with Hive/HBase integration

Posted by Madhusudhana Rao Podila <Ma...@infosys.com>.
Thanks for the quick reply.

I will download and try it.
I am curious about the cause of the issue I described: is the HBase/Hive version the problem, or is there an issue with the way I was creating the Hive table?

PS: I am using cdh3u2 (Cloudera VM).


Regards
Madhusudhana Rao. Podila


Re: Problem with Hive/HBase integration

Posted by shashwat shriparv <dw...@gmail.com>.
http://dl.dropbox.com/u/19454506/HadoopHIveHbaseReady.tar.gz
Download this; it is a preconfigured Hive and HBase bundle. You will need to
change some settings to match your Linux environment.





-- 
Shashwat Shriparv