You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Daniel Pruckler <Da...@t8webware.com> on 2010/03/15 21:44:24 UTC

Problem with HBaseIntegration. Integers show up as NULL in hive

I create a table in hbase

create 'hbase_table', 'colfam'

 

Then run the following code to insert a row into the table:

public class Test 

{

                public static void main(String[] args) throws Exception 

                {

                                String key = "key";

                                String stringValue = "string";

                                int intValue = 1;

                                double doubleValue = 1.5;

                                boolean booleanValue = false;

 

                                Configuration conf = new Configuration();

                                HTable table = new HTable(new HBaseConfiguration(conf), "hbase_table");

                                

                                

                                Put put = new Put(Bytes.toBytes(key));

                                

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("string_value"), Bytes.toBytes(stringValue));

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("int_value"), Bytes.toBytes(intValue));

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("int_value_as_string"), Bytes.toBytes(Integer.toString(intValue)));

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("double_value"), Bytes.toBytes(doubleValue));

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("boolean_value"), Bytes.toBytes(booleanValue));

                                

                                table.put(put);

                }

}

 

 

scan 'hbase_table'

ROW                          COLUMN+CELL                                                                      

 key                         column=colfam:boolean_value, timestamp=1268684020790, value=\x00                 

 key                         column=colfam:double_value, timestamp=1268684020790, value=\x3F\xF8\x00\x00\x00\x

                             00\x00\x00                                                                       

 key                         column=colfam:int_value, timestamp=1268684020790, value=\x00\x00\x00\x01         

 key                         column=colfam:string_value, timestamp=1268684020790, value=string  

 

 

 

Then in hive I run

 

CREATE EXTERNAL TABLE hive_table(key string, string_value string, int_value int, int_value_as_string int, double_value double, boolean_value boolean) 

STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'

WITH SERDEPROPERTIES (

"hbase.columns.mapping" = "colfam:string_value,colfam:int_value,colfam:int_value_as_string,colfam:double_value,colfam:boolean_value",

"hbase.table.name" = "hbase_table"

);

 

 

Then when selecting back I get null values

select * from hive_table;

OK

key         string    NULL      1              NULL      NULL

 

 

The tables are all set to the type I expect:

describe hive_table;

OK

key         string    from deserializer

string_value       string    from deserializer

int_value             int          from deserializer

int_value_as_string        int          from deserializer

double_value    double from deserializer

boolean_value boolean               from deserializer

 

 

 

Only the strings and integer that was written to hbase as a string seem to be showing up in hive.  I'm not seeing any errors in the hive logs.  Do all values have to be stored as a string in hbase to work with HBaseIntegration?

 

 

 

Thanks


Re: Problem with HBaseIntegration. Integers show up as NULL in hive

Posted by John Sichi <js...@facebook.com>.
Thanks for the bug report.

So far, most of the testing has been round-tripping data from Hive into HBase and back (rather than accessing existing HBase tables).  I'll log a followup task for mapping in data created in other formats, with suggestions welcome on how the type-mapping specification should look.

JVS

On Mar 15, 2010, at 1:44 PM, Daniel Pruckler wrote:

I create a table in hbase
create 'hbase_table', 'colfam'

Then run the following code to insert a row into the table:
public class Test
{
                public static void main(String[] args) throws Exception
                {
                                String key = "key";
                                String stringValue = "string";
                                int intValue = 1;
                                double doubleValue = 1.5;
                                boolean booleanValue = false;

                                Configuration conf = new Configuration();
                                HTable table = new HTable(new HBaseConfiguration(conf), "hbase_table");


                                Put put = new Put(Bytes.toBytes(key));

                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("string_value"), Bytes.toBytes(stringValue));
                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("int_value"), Bytes.toBytes(intValue));
                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("int_value_as_string"), Bytes.toBytes(Integer.toString(intValue)));
                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("double_value"), Bytes.toBytes(doubleValue));
                                put.add(Bytes.toBytes("colfam"), Bytes.toBytes("boolean_value"), Bytes.toBytes(booleanValue));

                                table.put(put);
                }
}


scan 'hbase_table'
ROW                          COLUMN+CELL
 key                         column=colfam:boolean_value, timestamp=1268684020790, value=\x00
 key                         column=colfam:double_value, timestamp=1268684020790, value=\x3F\xF8\x00\x00\x00\x
                             00\x00\x00
 key                         column=colfam:int_value, timestamp=1268684020790, value=\x00\x00\x00\x01
 key                         column=colfam:string_value, timestamp=1268684020790, value=string



Then in hive I run

CREATE EXTERNAL TABLE hive_table(key string, string_value string, int_value int, int_value_as_string int, double_value double, boolean_value boolean)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = "colfam:string_value,colfam:int_value,colfam:int_value_as_string,colfam:double_value,colfam:boolean_value",
"hbase.table.name<http://hbase.table.name>" = "hbase_table"
);


Then when selecting back I get null values
select * from hive_table;
OK
key         string    NULL      1              NULL      NULL


The tables are all set to the type I expect:
describe hive_table;
OK
key         string    from deserializer
string_value       string    from deserializer
int_value             int          from deserializer
int_value_as_string        int          from deserializer
double_value    double from deserializer
boolean_value boolean               from deserializer



Only the strings and integer that was written to hbase as a string seem to be showing up in hive.  I'm not seeing any errors in the hive logs.  Do all values have to be stored as a string in hbase to work with HBaseIntegration?



Thanks