Posted to dev@hive.apache.org by Rui Sun <ru...@intel.com> on 2013/12/20 14:01:06 UTC

Review Request 16422: HIVE-5891: Alias conflict when merging multiple mapjoin tasks into their common child mapred task

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16422/
-----------------------------------------------------------

Review request for hive.


Bugs: HIVE-5891
    https://issues.apache.org/jira/browse/HIVE-5891


Repository: hive


Description
-------

Use the query ID as a prefix for the join stream intermediate name to avoid conflicts, and add a sanity check in CommonJoinTaskDispatcher so that the merge of a MapJoin task into its child MapRed task is skipped if there is any alias conflict.
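
For readers without the diff at hand, the sanity check could look roughly like the sketch below. It is only an illustration under assumptions: the class name AliasConflictSketch, the method hasAliasConflict, and the example alias strings are hypothetical and are not taken from the patch, which operates on Hive's own task and work objects.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not the code from the patch.
class AliasConflictSketch {

  // Returns true if at least one alias used by the MapJoin task is
  // already used by the child MapRed task.
  static boolean hasAliasConflict(Collection<String> childAliases,
                                  Collection<String> mapJoinAliases) {
    return !Collections.disjoint(childAliases, mapJoinAliases);
  }

  public static void main(String[] args) {
    // Made-up aliases: both tasks use "stream", so they conflict.
    List<String> childAliases = Arrays.asList("a", "stream");
    List<String> mapJoinAliases = Arrays.asList("b", "stream");

    if (hasAliasConflict(childAliases, mapJoinAliases)) {
      System.out.println("alias conflict: skip the merge");
    } else {
      System.out.println("no conflict: merge the MapJoin task into its child");
    }
  }
}
```

Skipping the merge on a conflict is the conservative choice: the plan stays correct and only loses the optimization.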


Diffs
-----

  /trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1552475 
  /trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 1552475 
  /trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java 1552475 
  /trunk/ql/src/test/queries/clientpositive/multiMapJoin2.q 1552475 
  /trunk/ql/src/test/results/clientpositive/multiMapJoin2.q.out 1552475 

Diff: https://reviews.apache.org/r/16422/diff/


Testing
-------


Thanks,

Rui Sun


Re: Review Request 16422: HIVE-5891: Alias conflict when merging multiple mapjoin tasks into their common child mapred task

Posted by Yin Huai <hu...@cse.ohio-state.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16422/#review30745
-----------------------------------------------------------



/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
<https://reviews.apache.org/r/16422/#comment58851>

    Let's just use Entry<String, List<String>>.
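
    As a rough illustration of the suggestion, iterating a map of paths to alias lists with Entry<String, List<String>> might look like the sketch below. The class EntryIterationSketch, the method collectAliases, and the sample paths are hypothetical; the real map in CommonJoinTaskDispatcher may use different concrete types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

// Hypothetical illustration only; names and data are not from the patch.
class EntryIterationSketch {

  // Collects every alias from a path-to-aliases style map, in iteration order.
  static List<String> collectAliases(Map<String, List<String>> pathToAliases) {
    List<String> aliases = new ArrayList<String>();
    for (Entry<String, List<String>> entry : pathToAliases.entrySet()) {
      aliases.addAll(entry.getValue());
    }
    return aliases;
  }

  public static void main(String[] args) {
    Map<String, List<String>> pathToAliases =
        new LinkedHashMap<String, List<String>>();
    pathToAliases.put("/tmp/path1", Arrays.asList("a", "b"));
    pathToAliases.put("/tmp/path2", Arrays.asList("c"));
    System.out.println(collectAliases(pathToAliases));
  }
}
```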



/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
<https://reviews.apache.org/r/16422/#comment58850>

    With the changes made in GenMapRedUtils, we do not expect this check (aliases.contains(mapJoinAlias)) to ever be true, right? If so, let's add a comment here telling other readers of the code that they should not expect to find any alias conflicts.


- Yin Huai




Re: Review Request 16422: HIVE-5891: Alias conflict when merging multiple mapjoin tasks into their common child mapred task

Posted by Rui Sun <ru...@intel.com>.

(Updated Dec. 23, 2013, 3:05 a.m.)


Review request for hive.


Changes
-------

Updated the patch per the reviewer's comments.

Also refined the logic slightly: the QB id is not used as a prefix for the intermediate name when it is null.
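
The null handling amounts to something like the following sketch. The class IntermediateNameSketch, the method intermediateName, the ":" separator, and the sample names are all assumptions made for illustration; the patch itself may build the name differently.

```java
// Hypothetical illustration only; the separator and names are assumptions.
class IntermediateNameSketch {

  // Prefixes the base name with the QB id, unless the id is null.
  static String intermediateName(String qbId, String baseName) {
    if (qbId == null) {
      return baseName;
    }
    return qbId + ":" + baseName;
  }

  public static void main(String[] args) {
    System.out.println(intermediateName("subq1", "intermediate"));
    System.out.println(intermediateName(null, "intermediate"));
  }
}
```

Guarding against null keeps a literal "null" from leaking into the generated name.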


Bugs: HIVE-5891
    https://issues.apache.org/jira/browse/HIVE-5891


Repository: hive


Description
-------

Use the query ID as a prefix for the join stream intermediate name to avoid conflicts, and add a sanity check in CommonJoinTaskDispatcher so that the merge of a MapJoin task into its child MapRed task is skipped if there is any alias conflict.


Diffs (updated)
-----

  /trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1552475 
  /trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 1552475 
  /trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java 1552475 
  /trunk/ql/src/test/queries/clientpositive/multiMapJoin2.q 1552475 
  /trunk/ql/src/test/results/clientpositive/correlationoptimizer2.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/correlationoptimizer3.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/correlationoptimizer6.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/groupby_position.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/join22.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/multiMapJoin1.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/multiMapJoin2.q.out 1552475 
  /trunk/ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out 1552475 

Diff: https://reviews.apache.org/r/16422/diff/


Testing
-------


Thanks,

Rui Sun