Posted to dev@hive.apache.org by Zoltan Haindrich <ki...@rxd.hu> on 2019/07/09 16:12:36 UTC
Review Request 71040: HIVE-21923 Vectorized MapJoin may miss results
when only the join key is selected
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/
-----------------------------------------------------------
Review request for hive and Jesús Camacho Rodríguez.
Bugs: HIVE-21923
https://issues.apache.org/jira/browse/HIVE-21923
Repository: hive-git
Description
-------
HIVE-21923
Vectorized MapJoin may miss results when only the join key is selected
Diffs
-----
common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java 70ee4266f45219fd81bf0d0df0a2c4380334e307
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java 35dddddb844f236f24d2f17f4a43d064c9ebaf8c
ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q d989ca7dc883fa071cf5772f358c68bff78f659f
ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out 45a646c948ec8b72710a6b8a3949fbe0203dd68e
ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 2305f87e45bd65152a6c77ce04f7b8efad4724d7
ql/src/test/results/clientpositive/spark/auto_join14.q.out 0c80c13889d134abe82bde30c98300620b1fd432
ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 4ee669fa7dd50e0373910030b35c8860383a3a70
ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out e28b15044503ea4bb5bd12b7caed6b105f337efd
Diff: https://reviews.apache.org/r/71040/diff/1/
Testing
-------
Thanks,
Zoltan Haindrich
Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss
results when only the join key is selected
Posted by Zoltan Haindrich <ki...@rxd.hu>.
> On July 11, 2019, 4 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java
> > Line 255 (original), 255 (patched)
> > <https://reviews.apache.org/r/71040/diff/1/?file=2154108#file2154108line255>
> >
> > Can we update this comment since it is not only the big table? Feel free to add any more info to understand better what is going on.
I've removed the "bigtable" keyword; I don't think extending the comment will help.
I feel that redesigning/reducing the 3-4 mapping things to 1 would make this code easier to understand, and that would also avoid bugs like this.
Most importantly, the part which puzzles these mappings together is hard to follow, and I think the problem arose from that.
> On July 11, 2019, 4 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q
> > Line 10 (original), 9 (patched)
> > <https://reviews.apache.org/r/71040/diff/1/?file=2154109#file2154109line10>
> >
> > Why is this disabled now? This is causing map join conversion to not be triggered below.
Oh damn... I was making a final check that it's working correctly, and it looks like I committed this. Fixed; all the other joins are mapjoins again.
- Zoltan
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/#review216516
-----------------------------------------------------------
Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss
results when only the join key is selected
Posted by Jesús Camacho Rodríguez <jc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/#review216516
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java
Line 255 (original), 255 (patched)
<https://reviews.apache.org/r/71040/#comment303733>
Can we update this comment since it is not only the big table? Feel free to add any more info to understand better what is going on.
ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q
Line 10 (original), 9 (patched)
<https://reviews.apache.org/r/71040/#comment303736>
Why is this disabled now? This is causing map join conversion to not be triggered below.
ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 83 (original), 84 (patched)
<https://reviews.apache.org/r/71040/#comment303734>
Map Join conversion not being triggered.
ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out
Line 241 (original), 255 (patched)
<https://reviews.apache.org/r/71040/#comment303735>
Map Join conversion not being triggered.
ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out
Line 5149 (original), 5149 (patched)
<https://reviews.apache.org/r/71040/#comment303737>
Cool.
- Jesús Camacho Rodríguez
Re: Review Request 71040: HIVE-21923 Vectorized MapJoin may miss
results when only the join key is selected
Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71040/
-----------------------------------------------------------
(Updated July 11, 2019, 9:48 a.m.)
Review request for hive and Jesús Camacho Rodríguez.
Changes
-------
patch#7
Bugs: HIVE-21923
https://issues.apache.org/jira/browse/HIVE-21923
Repository: hive-git
Description
-------
HIVE-21923
Vectorized MapJoin may miss results when only the join key is selected
Diffs (updated)
-----
common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java 70ee4266f4
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyGenerateResultOperator.java 35dddddb84
ql/src/test/queries/clientpositive/hybridgrace_hashjoin_2.q d989ca7dc8
ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out 45a646c948
ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 1ddc1ea1ec
ql/src/test/results/clientpositive/spark/auto_join14.q.out 0c80c13889
ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 4ee669fa7d
ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out 8e9bd0513e
Diff: https://reviews.apache.org/r/71040/diff/2/
Changes: https://reviews.apache.org/r/71040/diff/1-2/
Testing
-------
Thanks,
Zoltan Haindrich