You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Laljo John Pullokkaran (JIRA)" <ji...@apache.org> on 2014/04/02 23:39:16 UTC
[jira] [Created] (HIVE-6819) Correctness issue with Hive limit
operator & predicate push down
Laljo John Pullokkaran created HIVE-6819:
--------------------------------------------
Summary: Correctness issue with Hive limit operator & predicate push down
Key: HIVE-6819
URL: https://issues.apache.org/jira/browse/HIVE-6819
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
Fix For: 0.13.0
Following query produces 0 rows with Predicate Push Down optimization turned on; the same query produces 130 rows with predicate push down turned off.
select t2.c_int from (select key, value, c_float, c_int from t1 order by key,value,c_float,c_int limit 10)t1 join t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float where t2.c_int>=1;
I could reproduce this on Apache Trunk.
Haven't checked if previous releases have the same issue.
hive> desc t1;
Query ID = jpullokkaran_20140401191515_36e441c6-074b-45ae-aff6-489e13a6f401
OK
key string
value string
c_int int
c_float float
c_boolean boolean
Time taken: 0.077 seconds, Fetched: 5 row(s)
hive> select distinct key, value, c_float, c_int from t1;
OK
1 1 1.0 1
1 1 1.0 1
1 1 1.0 1
1 1 1.0 1
null null NULL NULL
Time taken: 0.062 seconds, Fetched: 5 row(s)
hive> desc t2;
Query ID = jpullokkaran_20140401191616_dfbd14bb-b5b8-4165-8d01-e9a61a7f1c33
OK
key string
value string
c_int int
c_float float
c_boolean boolean
Time taken: 0.062 seconds, Fetched: 5 row(s)
hive> select distinct key, value, c_float, c_int from t2;
OK
1 1 1.0 1
1 1 1.0 1
1 1 1.0 1
1 1 1.0 1
2 2 2.0 2
null null NULL NULL
Time taken: 4.698 seconds, Fetched: 6 row(s)
hive> select t2.c_int from (select key, value, c_float, c_int from t1 order by key,value,c_float,c_int limit 10)t1 join t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float where t2.c_int>=1;
MapredLocal task succeeded
OK
Time taken: 13.029 seconds
hive>
hive> select t2.c_int from (select key, value, c_float, c_int from t1 order by key,value,c_float,c_int limit 10)t1 join t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float where t2.c_int>=1;
MapredLocal task succeeded
OK
...
1
1
1
1
1
1
1
1
1
1
1
Time taken: 9.317 seconds, Fetched: 130 row(s)
hive>
--
This message was sent by Atlassian JIRA
(v6.2#6252)