Posted to dev@hive.apache.org by Rui Sun <ru...@intel.com> on 2013/12/20 14:01:06 UTC
Review Request 16422: HIVE-5891: Alias conflict when merging multiple
mapjoin tasks into their common child mapred task
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16422/
-----------------------------------------------------------
Review request for hive.
Bugs: HIVE-5891
https://issues.apache.org/jira/browse/HIVE-5891
Repository: hive
Description
-------
Use the query ID as a prefix for the join stream intermediate name to avoid conflicts, and add a sanity check in CommonJoinTaskDispatcher so that merging a MapJoin task into its child MapRed task is skipped if there is any alias conflict.
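The sanity check described above can be illustrated with a minimal sketch. This is not the code from the patch: the class name AliasConflictSketch and the method canMerge are hypothetical, and the sketch assumes each task's aliases are available as a plain set of strings.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration only; not the actual CommonJoinTaskDispatcher logic.
final class AliasConflictSketch {

    /**
     * Returns true only when the MapJoin task and its child MapRed task
     * share no alias, i.e. when merging them cannot cause an alias conflict.
     */
    static boolean canMerge(Set<String> mapJoinAliases, Set<String> childAliases) {
        // Collections.disjoint is true when the two collections have no element in common.
        return Collections.disjoint(mapJoinAliases, childAliases);
    }
}
```

With this shape, a caller would simply leave the two tasks unmerged whenever canMerge returns false.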
Diffs
-----
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1552475
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 1552475
/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java 1552475
/trunk/ql/src/test/queries/clientpositive/multiMapJoin2.q 1552475
/trunk/ql/src/test/results/clientpositive/multiMapJoin2.q.out 1552475
Diff: https://reviews.apache.org/r/16422/diff/
Testing
-------
Thanks,
Rui Sun
Re: Review Request 16422: HIVE-5891: Alias conflict when merging multiple
mapjoin tasks into their common child mapred task
Posted by Yin Huai <hu...@cse.ohio-state.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16422/#review30745
-----------------------------------------------------------
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
<https://reviews.apache.org/r/16422/#comment58851>
Let's just use Entry<String, List<String>>.
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
<https://reviews.apache.org/r/16422/#comment58850>
With the changes made in GenMapRedUtils, we do not expect this check (aliases.contains(mapJoinAlias)) to be true, right? If so, let's add a comment here to tell other readers of the code that they should not expect to find any alias conflicts.
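As a rough sketch of what the two comments above suggest, the loop could iterate with Entry<String, List<String>> and carry an explanatory comment on the check. The class name PathAliasCheckSketch, the method findConflictingPath, and the shape of the map (path to alias list) are assumptions made for illustration, not the code under review.

```java
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

// Hypothetical illustration only; names and map shape are assumed.
final class PathAliasCheckSketch {

    /**
     * Returns the first path whose alias list already contains mapJoinAlias,
     * or null when no path does.
     */
    static String findConflictingPath(Map<String, List<String>> pathToAliases,
                                      String mapJoinAlias) {
        for (Entry<String, List<String>> entry : pathToAliases.entrySet()) {
            List<String> aliases = entry.getValue();
            // Not expected to be true once intermediate names carry a prefix;
            // kept purely as a defensive sanity check.
            if (aliases.contains(mapJoinAlias)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```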
- Yin Huai
Re: Review Request 16422: HIVE-5891: Alias conflict when merging multiple
mapjoin tasks into their common child mapred task
Posted by Rui Sun <ru...@intel.com>.
(Updated Dec. 23, 2013, 3:05 a.m.)
Review request for hive.
Changes
-------
Updated the patch per the reviewer's comments.
Also refined the logic slightly: the QB ID is not used as a prefix for the intermediate name when it is null.
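The null handling mentioned above might look roughly like the following sketch. The class name IntermediateNameSketch, the base name "$INTNAME", and the ":" separator are assumptions chosen for illustration; the actual patch may build the name differently.

```java
// Hypothetical illustration only; base name and separator are assumed.
final class IntermediateNameSketch {

    // Assumed placeholder for the unprefixed intermediate name.
    static final String BASE_NAME = "$INTNAME";

    /**
     * Prefixes the intermediate name with the QB ID when one is available;
     * falls back to the bare name when the ID is null.
     */
    static String intermediateName(String qbId) {
        return qbId == null ? BASE_NAME : qbId + ":" + BASE_NAME;
    }
}
```

Two query blocks with different IDs then yield different intermediate names, while a query block with no ID keeps the unprefixed name.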
Bugs: HIVE-5891
https://issues.apache.org/jira/browse/HIVE-5891
Repository: hive
Description
-------
Use the query ID as a prefix for the join stream intermediate name to avoid conflicts, and add a sanity check in CommonJoinTaskDispatcher so that merging a MapJoin task into its child MapRed task is skipped if there is any alias conflict.
Diffs (updated)
-----
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1552475
/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 1552475
/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java 1552475
/trunk/ql/src/test/queries/clientpositive/multiMapJoin2.q 1552475
/trunk/ql/src/test/results/clientpositive/correlationoptimizer2.q.out 1552475
/trunk/ql/src/test/results/clientpositive/correlationoptimizer3.q.out 1552475
/trunk/ql/src/test/results/clientpositive/correlationoptimizer6.q.out 1552475
/trunk/ql/src/test/results/clientpositive/groupby_position.q.out 1552475
/trunk/ql/src/test/results/clientpositive/join22.q.out 1552475
/trunk/ql/src/test/results/clientpositive/multiMapJoin1.q.out 1552475
/trunk/ql/src/test/results/clientpositive/multiMapJoin2.q.out 1552475
/trunk/ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out 1552475
Diff: https://reviews.apache.org/r/16422/diff/
Testing
-------
Thanks,
Rui Sun