Posted to issues@hive.apache.org by "david (JIRA)" <ji...@apache.org> on 2016/01/12 07:11:40 UTC
[jira] [Updated] (HIVE-12844) hive-1.2.1 doesn't return correct value when run select count query
[ https://issues.apache.org/jira/browse/HIVE-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
david updated HIVE-12844:
-------------------------
Description:
In HBase 1.0.2, I created a table 'test1' containing the following rows and values:
hbase(main):027:0> scan 'test1'
ROW   COLUMN+CELL
 a1   column=df1:a2, timestamp=1452505991743, value=ddd
 a1   column=df1:a3, timestamp=1452506082723, value=eee
 a1   column=df1:c2, timestamp=1452505705391, value=bbb
 b1   column=df1:a2, timestamp=1452505838737, value=ccc
 b1   column=df1:a3, timestamp=1452506149461, value=fff
 r1   column=df1:a, timestamp=1452507261849, value=hhh
 r1   column=df1:a1, timestamp=1452507100774, value=ggg
 r1   column=df1:c1, timestamp=1451221711588, value=aaa
Then I created a table in Hive 1.2.1:
create external table test3(
  key string,
  coll string,
  col2 string,
  col3 string,
  col4 string,
  col5 string,
  col6 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES
  ("hbase.columns.mapping" =
   ":key,df1:a,df1:1,df1:a2,df1:a3,df1:c1,df1:c2")
TBLPROPERTIES("hbase.table.name" = "test1");
When I run this query in Hive:
hive> select * from test3;
OK
a1   NULL   NULL   ddd    eee    NULL   bbb
b1   NULL   NULL   ccc    fff    NULL   NULL
r1   hhh    NULL   NULL   NULL   aaa    NULL
The result is correct, but when I run:
select count(1) from test3;
Total MapReduce CPU Time Spent: 6 seconds 770 msec
OK
1
It returns "1", although the table has 3 rows. It looks as if the query does not count the rows where the first non-key column (mapped to df1:a) is NULL: only r1 has a value there.
Could you help analyze this?
By the way, the Hadoop version is 2.6.0.
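For reference, the expected numbers can be worked out from the scan output and the column mapping above. The following is only a plain Python model of that data, not Hive or HBase code; the names (CELLS, MAPPING, project_rows) are made up for illustration. It assumes standard SQL semantics, where count(1) counts every row regardless of NULL columns.

```python
from collections import OrderedDict

# Cells copied from the `scan 'test1'` output: (row key, column, value).
CELLS = [
    ("a1", "df1:a2", "ddd"),
    ("a1", "df1:a3", "eee"),
    ("a1", "df1:c2", "bbb"),
    ("b1", "df1:a2", "ccc"),
    ("b1", "df1:a3", "fff"),
    ("r1", "df1:a", "hhh"),
    ("r1", "df1:a1", "ggg"),
    ("r1", "df1:c1", "aaa"),
]

# Non-key entries of hbase.columns.mapping, exactly as written in the DDL.
MAPPING = ["df1:a", "df1:1", "df1:a2", "df1:a3", "df1:c1", "df1:c2"]


def project_rows(cells, mapping):
    """Group cells by row key and project them onto the mapped columns.

    Columns with no cell for a row become None (shown as NULL in Hive).
    """
    by_key = OrderedDict()
    for key, column, value in cells:
        by_key.setdefault(key, {})[column] = value
    return [[key] + [cols.get(c) for c in mapping] for key, cols in by_key.items()]


ROWS = project_rows(CELLS, MAPPING)

# One Hive row per distinct HBase row key, so count(1) should equal this.
EXPECTED_COUNT = len(ROWS)

# Rows whose first non-key column (df1:a) is not NULL.
ROWS_WITH_FIRST_COLUMN = sum(1 for row in ROWS if row[1] is not None)

if __name__ == "__main__":
    for row in ROWS:
        print("\t".join("NULL" if v is None else v for v in row))
    print("expected count(1):", EXPECTED_COUNT)
    print("rows with non-NULL df1:a:", ROWS_WITH_FIRST_COLUMN)
```

In this model the projected rows match the select * output, the expected count is the number of distinct row keys, and the number of rows with a non-NULL df1:a is the value that count(1) actually returned.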
> hive-1.2.1 doesn't return correct value when run select count query
> -------------------------------------------------------------------
>
> Key: HIVE-12844
> URL: https://issues.apache.org/jira/browse/HIVE-12844
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Reporter: david
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)