Posted to dev@hive.apache.org by "Jakob Homan (Created) (JIRA)" <ji...@apache.org> on 2011/12/23 22:16:30 UTC
[jira] [Created] (HIVE-2677) Joins involving where statements with a partition do not prune correctly in left outer join
Joins involving where statements with a partition do not prune correctly in left outer join
-------------------------------------------------------------------------------------------
Key: HIVE-2677
URL: https://issues.apache.org/jira/browse/HIVE-2677
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Jakob Homan
The following query:
{noformat}
select t1.id as m_id, count(t2.id) as totals
from table1 t1 left outer join table2 t2 on (t1.id = t2.id)
where (datepartition == '2011-11-08-00')
group by t1.id;{noformat}
should prune to just a single partition (datepartition == '2011-11-08-00'). However, the filter is applied in the reducer, so the map phase does a full table scan.
One can get the correct behavior by pushing the filter into a select statement within the join itself:
{noformat}select t1.id as m_id, count(t2.id) as totals
from table1 t1 left outer join (select * from table2 where datepartition == '2011-11-08-00') t2 on (t1.id = t2.id)
group by t1.id;{noformat}
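
The two queries above are not guaranteed to return the same rows, which may be why the filter is not moved below the join automatically. The following is a minimal sketch of that difference, not a reproduction of Hive's planner. It assumes that datepartition is a column of table2 only (as the workaround suggests), uses SQLite through Python's sqlite3 module as a stand-in for Hive, models the partition as an ordinary column, and uses made-up sample rows:

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption: this shows
# SQL semantics only and says nothing about Hive's partition pruning).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER, datepartition TEXT);
    INSERT INTO table1 VALUES (1), (2);
    -- id 1 has a row in the target partition; id 2 only in another one.
    INSERT INTO table2 VALUES (1, '2011-11-08-00'), (2, '2011-11-09-00');
""")

# Form 1: filter in the WHERE clause, evaluated after the outer join.
where_form = conn.execute("""
    SELECT t1.id AS m_id, COUNT(t2.id) AS totals
    FROM table1 t1 LEFT OUTER JOIN table2 t2 ON (t1.id = t2.id)
    WHERE (datepartition == '2011-11-08-00')
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

# Form 2: filter inside a subquery, evaluated before the outer join.
subquery_form = conn.execute("""
    SELECT t1.id AS m_id, COUNT(t2.id) AS totals
    FROM table1 t1 LEFT OUTER JOIN
         (SELECT * FROM table2 WHERE datepartition == '2011-11-08-00') t2
         ON (t1.id = t2.id)
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

print(where_form)     # [(1, 1)]
print(subquery_form)  # [(1, 1), (2, 0)]
```

With this data, the WHERE form drops id 2 entirely, while the subquery form keeps it with a count of 0. Which result is wanted depends on the report being written, so the workaround should be checked against the expected output before it is adopted.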
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2677) Joins involving where statements with a partition do not prune correctly in left outer join
Posted by "Navis (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212316#comment-13212316 ]
Navis commented on HIVE-2677:
-----------------------------
For a left outer join, only predicates on the left-side table can be pushed down.
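
As a rough illustration of the rule stated above, here is a small sketch based on general SQL outer-join semantics rather than on Hive's optimizer code. The names PUSHABLE_SIDES and can_push_where_predicate are hypothetical and exist only for this example. The idea is that a WHERE predicate can be evaluated before the join only on a side that the join never pads with NULLs:

```python
# Hypothetical lookup table: for each join type, the sides whose WHERE
# predicates can be applied before the join without changing the result.
# A side that the join may pad with NULLs (the right side of a left outer
# join, for example) is excluded.
PUSHABLE_SIDES = {
    "inner": {"left", "right"},
    "left outer": {"left"},
    "right outer": {"right"},
    "full outer": set(),
}


def can_push_where_predicate(join_type: str, side: str) -> bool:
    """Return True if a WHERE predicate that references only `side`
    ("left" or "right") can be moved below a join of `join_type`."""
    if side not in ("left", "right"):
        raise ValueError(f"unknown side: {side!r}")
    return side in PUSHABLE_SIDES[join_type]


# The query in this issue filters on table2, the right side of a
# left outer join, so the predicate stays above the join.
print(can_push_where_predicate("left outer", "right"))  # False
print(can_push_where_predicate("left outer", "left"))   # True
```

Under this rule, the filter on datepartition in the reported query cannot move into the scan of table2 unless the query is rewritten, for example by placing the filter in a subquery on table2.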
[jira] [Updated] (HIVE-2677) Joins involving where statements with a partition do not prune correctly in left outer join
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2677:
---------------------------------
Component/s: Query Processor
Labels: partition_pruner (was: )