Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/11/04 01:22:33 UTC
[jira] [Resolved] (HIVE-8702) Extra MapTask created but not connected [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szehon Ho resolved HIVE-8702.
-----------------------------
Resolution: Invalid
Took another look. Suhas had wired up two resolvers that need to be enabled. I had enabled only the first one (SparkMapJoinOptimizer). There is a second one, SparkReduceSinkMapJoinProc, that also needs to be wired. Once it's wired, the plan looks more appropriate.
> Extra MapTask created but not connected [Spark Branch]
> ------------------------------------------------------
>
> Key: HIVE-8702
> URL: https://issues.apache.org/jira/browse/HIVE-8702
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: Szehon Ho
>
> Based on Szehon's observation, a strange extra MapTask is generated but not connected. Here is a query that demonstrates it:
> {code}
> select * FROM
> (SELECT avg(key) as x1, value as x2 FROM src group by value) x
> JOIN
> (SELECT avg(key) as y1, value as y2 FROM src group by value) y ON (x1 = y1)
> JOIN
> (SELECT avg(key) as z1, value as z2 FROM src group by value) z ON (x1 = z1);
> {code}
> We shouldn't generate it in the first place.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)