Posted to user@hive.apache.org by Madhusudhana Rao Podila <Ma...@infosys.com> on 2012/01/27 06:37:36 UTC
Problem with Hive/HBase integration
Hi
I am having trouble creating a Hive table on top of an existing HBase table (as an external table) when I map multiple columns from a column family as individual Hive columns (not as a Map).
Case-1 :
I created a table in HBase and was able to map it to Hive as an external table using only one column from the column family.
HBase
Created the table in HBase using the following command
hbase(main):001:0> create 'hbasetohive', 'colfamily'
0 row(s) in 1.9700 seconds
hbase(main):002:0> put 'hbasetohive', '1s', 'colfamily:val','1strowval'
0 row(s) in 0.2240 seconds
hbase(main):003:0> scan 'hbasetohive'
ROW COLUMN+CELL
1s column=colfamily:val, timestamp=1327676987075, value=1strowval
1 row(s) in 0.0840 seconds
Hive
hive> CREATE EXTERNAL TABLE hbase_hivetable_k(key string, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES("hbase.columns.mapping" = "colfamily:val")
> TBLPROPERTIES("hbase.table.name" = "hbasetohive");
OK
Time taken: 10.808 seconds
hive> select * from hbase_hivetable_k;
OK
1s 1strowval
Time taken: 1.314 seconds
Case 2
I created a table in HBase with a column family cf_cdr and two columns, caller_name and caller_number. I then tried to create the Hive table on top of that HBase table, specifying both columns from the column family, and it throws a MetaException. If I restrict the mapping to only one column, I am able to create the table in Hive properly.
HBase
hbase(main):004:0> create 'hb_cdr', 'cf_cdr'
0 row(s) in 1.4870 seconds
hbase(main):005:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_name', 'madhu'
0 row(s) in 0.0490 seconds
hbase(main):006:0> put 'hb_cdr', 'cdr_r1', 'cf_cdr:caller_number', '08877232010'
0 row(s) in 0.0300 seconds
hbase(main):007:0> put 'hb_cdr', 'cdr_r2', 'cf_cdr:caller_name', 'bharat'
0 row(s) in 0.0170 seconds
hbase(main):008:0> scan 'hb_cdr'
ROW COLUMN+CELL
cdr_r1 column=cf_cdr:caller_name, timestamp=1327677898993, value=madhu
cdr_r1 column=cf_cdr:caller_number, timestamp=1327677912648, value=08877232010
cdr_r2 column=cf_cdr:caller_name, timestamp=1327677919720, value=bharat
2 row(s) in 0.1020 seconds
Hive
hive> CREATE EXTERNAL TABLE hv_hb_cdr(key string, c_name string, c_number string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name, cf_cdr:caller_number")
> TBLPROPERTIES("hbase.table.name" = "hb_cdr");
FAILED: Error in metadata: MetaException(message:Column Family cf_cdr is not defined in hbase table hb_cdr)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Is there anything wrong with the above script?
Please suggest.
Regards
Madhusudhana Rao. Podila
**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely
for the use of the addressee(s). If you are not the intended recipient, please
notify the sender by e-mail and delete the original message. Further, you are not
to copy, disclose, or distribute this e-mail or its contents to any other person and
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken
every reasonable precaution to minimize this risk, but is not liable for any damage
you may sustain as a result of any virus in this e-mail. You should carry out your
own virus checks before opening the e-mail or attachment. Infosys reserves the
right to monitor and review the content of all messages sent to or from this e-mail
address. Messages sent to or from this e-mail address may be stored on the
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***
RE: Problem with Hive/HBase integration
Posted by Chinna Rao Lalam <ch...@huawei.com>.
Hi,
In the statement below, the space inside hbase.columns.mapping is the problem:
WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name, cf_cdr:caller_number")
Remove the space between the two column entries, like this:
WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name,cf_cdr:caller_number")
hive> CREATE EXTERNAL TABLE hv_hb_cdr(key string, c_name string, c_number string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES("hbase.columns.mapping" = "cf_cdr:caller_name,cf_cdr:caller_number")
> TBLPROPERTIES("hbase.table.name" = "hb_cdr");
Hope It Helps,
Chinna Rao Lalam
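To illustrate why a single space can produce "Column Family cf_cdr is not defined", here is a minimal Python sketch. It is only a model, not Hive's actual parsing code: it assumes the mapping string is split on commas without trimming, so the second entry's family name becomes " cf_cdr" (with a leading space) and no longer matches the family defined in HBase. The function names parse_mapping and missing_families are hypothetical and exist only for this example.

```python
# Illustrative model only; this is NOT Hive's actual parsing code.
# Assumption: entries are split on "," and whitespace is kept as-is.

def parse_mapping(mapping, trim=False):
    """Split an hbase.columns.mapping string into (family, qualifier) pairs."""
    pairs = []
    for entry in mapping.split(","):
        if trim:
            entry = entry.strip()
        family, _, qualifier = entry.partition(":")
        pairs.append((family, qualifier))
    return pairs


def missing_families(mapping, table_families, trim=False):
    """Return mapped column families that the HBase table does not define."""
    return [fam for fam, _ in parse_mapping(mapping, trim)
            if fam not in table_families]


table_families = {"cf_cdr"}  # the only family of 'hb_cdr' in this thread
with_space = "cf_cdr:caller_name, cf_cdr:caller_number"
without_space = "cf_cdr:caller_name,cf_cdr:caller_number"

print(missing_families(with_space, table_families))     # [' cf_cdr']
print(missing_families(without_space, table_families))  # []
```

Under this model the failing family is " cf_cdr" rather than "cf_cdr", which is easy to miss when the name is printed inside an error message. Passing trim=True to the sketch makes the spaced mapping resolve as well, but with the Hive version discussed here the safe choice is simply to write the mapping without spaces.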
RE: Problem with Hive/HBase integration
Posted by Madhusudhana Rao Podila <Ma...@infosys.com>.
Thanks for the quick reply.
I will download it and try it.
I am curious about the cause of the issue: is the HBase/Hive version the problem, or is there an issue with the way I was creating the Hive table?
PS: I am using cdh3u2 (Cloudera-vm)
Regards
Madhusudhana Rao. Podila
Re: Problem with Hive/HBase integration
Posted by shashwat shriparv <dw...@gmail.com>.
http://dl.dropbox.com/u/19454506/HadoopHIveHbaseReady.tar.gz
Download this; it is a preconfigured Hive and HBase bundle. You will need to change
some settings to match your Linux setup...
--
Shashwat Shriparv