Posted to dev@hive.apache.org by Zoltan Haindrich <ki...@rxd.hu> on 2018/07/30 16:13:53 UTC

Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/
-----------------------------------------------------------

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
-------

* expand IN into OR
* collapse ORs back into IN at a threshold of 2
* WIP patch
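
The round trip in the list above can be sketched on a toy expression tree. This is only an illustration: the tuple representation, the function names `expand_in` and `collapse_ors`, and the `min_ors` parameter are hypothetical and are not taken from the patch or from Calcite. It also assumes that "at 2" means the fold happens once at least two equalities share a column.

```python
# Toy expression nodes (hypothetical, not Calcite's RexNode):
#   ("col", name), ("lit", value), ("=", left, right),
#   ("in", left, [values...]), ("or", [operands...])

def expand_in(expr):
    """Rewrite `x IN (v1, v2, ...)` as `x = v1 OR x = v2 OR ...`."""
    if expr[0] != "in":
        return expr
    _, left, values = expr
    equalities = [("=", left, v) for v in values]
    return equalities[0] if len(equalities) == 1 else ("or", equalities)


def collapse_ors(expr, min_ors=2):
    """Fold `col = literal` operands of an OR back into one IN per column.

    Columns with fewer than `min_ors` equalities are left as they are.
    """
    if expr[0] != "or":
        return expr
    groups, others = {}, []
    for operand in expr[1]:
        is_col_eq_lit = (
            operand[0] == "="
            and operand[1][0] == "col"
            and operand[2][0] == "lit"
        )
        if is_col_eq_lit:
            groups.setdefault(operand[1], []).append(operand)
        else:
            others.append(operand)
    rewritten = []
    for left, eqs in groups.items():
        if len(eqs) >= min_ors:
            rewritten.append(("in", left, [e[2] for e in eqs]))
        else:
            rewritten.extend(eqs)
    rewritten.extend(others)
    return rewritten[0] if len(rewritten) == 1 else ("or", rewritten)
```

With `min_ors=2`, expanding `a IN (1, 2)` and collapsing the result gives back the original IN; with a higher threshold the OR form is kept.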


Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 39c77b3fe52cb7d7d255138bb71d77a170347b52 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f586321c2496e9f3bc3b428ae5689bf046a9 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2cba688897bac4cf29eaf9679b6de375 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c0181d697f08ac1aeb62ef57071602050 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2a58d83b0fd853c0008fc99f7063f0c 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out 29ef214ff0e8678fe25bd1793a9522ea0ae61d32 
  ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de767099aeb14beea586bca4cca0784e 
  ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out d12b5f64cc70e0039ce9179c5c6fb90d8beba0d1 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out 588d924e6af9b704577b9acf9a01d139c1b13af7 
  ql/src/test/results/clientpositive/druid_basic3.q.out 54719f751769722c6b099682b9742b1e21c6ebec 
  ql/src/test/results/clientpositive/druid_intervals.q.out a5203c31822ef6cb8c70a0caa9b682b6f73e88f5 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636f5b7bc101505bd9fd69b8f24719003 
  ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b796acc0cc80eef48ddecb97aeb96f7c5 
  ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb603458941dd7f79485c6cd8b3a105995c54 
  ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 6217adb5d4a1c04506d8020a9c9d4b7932c637a3 
  ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05bbbb0d5ec915220865c654e7596a8fe 
  ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out c7f5b887b6396cf60b02b70e87371a21a0052b57 
  ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a851d5c6ca6ff7fb21a8172788c34d55 
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fccefba6ee12b98424b785c0eaef4eed 
  ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ceed021a254491f62069e1b8e18c1541 
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e34179d0557ff1de63ca31cd7f1e3fe 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 708fa176170bb615d1032c810a8332fb64c9a23d 
  ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7a25da3b0db2efa397444dd6c5872c4 
  ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c0767b6c23f423ab1ad7e350eba7f85d1a 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3222314374d155fe5f7484a7e4628f2 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59e26a899129f1f816fc79d9a67b5c6b 
  ql/src/test/results/clientpositive/llap/vector_between_in.q.out 12ae1032eabb51a6c9c9f1a7204483128eb2848d 
  ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa6f1cfc569ee8237e83bfc345382e66 
  ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb6b9e1e5e074a31571a202720480aaa 
  ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6989e33f48f82a25742c1c020a9494b 
  ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a10ec96a999895b118aab3fa7ac9c653 
  ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e97168807da0718c75d371dcdfdc741dca1 
  ql/src/test/results/clientpositive/pcr.q.out 919b71234d56a63789c33470022bbf637b054444 
  ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c635f22594bfeb1baa5e8af61bb6b80fe 
  ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac777e68b65ddd1d21f888f8f53ab4b704 
  ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f07b0a0cfb10157ae917c233e1295de9 
  ql/src/test/results/clientpositive/perf/spark/query45.q.out d61f8b80521748965ba8b19992013ef028599c50 
  ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14575b2c3adf998a561c5a459eb2d2dc 
  ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea988efd767f18b8ff92792c0e97fde280 
  ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbfd317a8cc9d64fd211a4e366993aa79 
  ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf98463c8554e4ea2f00a94bb5f4def1 
  ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88dd07785fee487892c1338ce5c85596 
  ql/src/test/results/clientpositive/perf/spark/query8.q.out 6b14eb9bb007d4ebfaaddbd8ac7138fce1ca5e5b 
  ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f7873e7759c56f1830ea583044cf68e9a 
  ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc57766979ffbd1eb916f1a5e8cfff695bce69 
  ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51e6976efe61caecc05ec60ea5e2d328 
  ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3e1faa13cc2a5767ec43ad7347eb24c 
  ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b10bdc23903f8a6b9152e40c9eab2aa2 
  ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b795add63b11e0641550e913e4b6b02 
  ql/src/test/results/clientpositive/perf/tez/query45.q.out edb047d3f56017a52efafda67095504aadd7fe83 
  ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0dab253ce2494e1d94d78a6448183c9c2 
  ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4c8f161b1902605e96eacab996e9b7c 
  ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788db466def6df144de99dea33c40fcfc 
  ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f176abd89c989ecc1a8d2d511c74f31 
  ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e237ce35d4cae1f45a235fca2a3c917 
  ql/src/test/results/clientpositive/perf/tez/query8.q.out ee20e61ff4f0ad3cb226cca4ea36dd8dc976ec64 
  ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d697357e27eb9b2774d97ba3bd07f5331711 
  ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5ce2a3b1a6dc947d946df73782645380 
  ql/src/test/results/clientpositive/pointlookup.q.out 69ae098a418e09d70f0dd562c11a2dc86f95e2d6 
  ql/src/test/results/clientpositive/pointlookup2.q.out 1eba541ff0eaa2c11edd34fcd117f0bbd8046f0c 
  ql/src/test/results/clientpositive/pointlookup3.q.out 8835d4188c9abdf3d7439a9550d15f2107a2ee8f 
  ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a90393adf1939a0174ee20686b13fb3 
  ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371db45a668fad536d206535fe6843799 
  ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33bd51a315c64f038262cd47570b4bbd 
  ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac445ad32559b254ff0777c19b4e861419 
  ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a851d5c6ca6ff7fb21a8172788c34d55 
  ql/src/test/results/clientpositive/spark/pcr.q.out 83437e55936e05f7a5f6f2a116e4bbe980c9768e 
  ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e292169c92560f71302944eef209cc0 
  ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f584ad1308edcac626627421a102406d 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out c5d0d63f8c80a5b1c0e57cf0c0595b7229d8e315 
  ql/src/test/results/clientpositive/spark/vector_between_in.q.out 78bcd26f561eb2449ec7a45f99156f04794aed2f 
  ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa53b732777ee6d074146693df5ee5954 
  ql/src/test/results/clientpositive/stat_estimate_related_col.q.out 669adafda3a45f7846face3d99817cd1b9cb3664 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d267bb595a6c88c87bd2f8aed26ac92f7 
  ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad0258f7e80a27163a46f7a4e7071efe893 
  ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c616cb74cca2fb20de3dd4b9190c0718 


Diff: https://reviews.apache.org/r/68108/diff/1/


Testing
-------


Thanks,

Zoltan Haindrich


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206737
-----------------------------------------------------------




ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out
Line 22 (original), 22 (patched)
<https://reviews.apache.org/r/68108/#comment289830>

    These cases are not handled, because there are columns on both sides of the IN.
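
A minimal sketch of the kind of guard this comment implies, assuming the rewrite only applies when the left side is a column and the right side holds only literals. The tuple representation and the name `is_expandable_in` are hypothetical and are not taken from the patch.

```python
# Toy operands (hypothetical): ("col", name) or ("lit", value).

def is_expandable_in(left, values):
    """True when `left` is a column and every right-hand value is a literal."""
    return left[0] == "col" and all(v[0] == "lit" for v in values)


# a IN (1, 2): column on the left, literals on the right.
print(is_expandable_in(("col", "a"), [("lit", 1), ("lit", 2)]))
# a IN (b, c): columns on both sides, so the rewrite is skipped.
print(is_expandable_in(("col", "a"), [("col", "b"), ("col", "c")]))
```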


- Zoltan Haindrich


On July 30, 2018, 4:13 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68108/
> -----------------------------------------------------------
> 
> (Updated July 30, 2018, 4:13 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Gopal V.
> 
> 
> Bugs: HIVE-19097
>     https://issues.apache.org/jira/browse/HIVE-19097
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * expand IN into OR
> * collapse ORs back into IN at a threshold of 2
> * WIP patch
> 
> 
> Diff: https://reviews.apache.org/r/68108/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206599
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Line 2141 (original), 2141 (patched)
<https://reviews.apache.org/r/68108/#comment289618>

    Is it OK to lower this?
    It was introduced in https://issues.apache.org/jira/browse/HIVE-11573
    I think this feature was good as is, because it was originally designed to reduce computation for big ORs.



ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out
Line 197 (original)
<https://reviews.apache.org/r/68108/#comment289619>

    I suspect that the Calcite operator-tree-to-SQL converter somehow ran into problems while interpreting IN...



ql/src/test/results/clientpositive/llap/vectorized_case.q.out
Line 54 (original), 54 (patched)
<https://reviews.apache.org/r/68108/#comment289620>

    This filter is rewritten into an IN clause, which resulted in vectorization being dropped because of casting issues.


- Zoltan Haindrich




Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/
-----------------------------------------------------------

(Updated Aug. 3, 2018, 2:56 p.m.)


Review request for hive, Ashutosh Chauhan and Gopal V.


Changes
-------

patch#12


Bugs: HIVE-19097
    https://issues.apache.org/jira/browse/HIVE-19097


Repository: hive-git


Description
-------

* expand IN into OR; only a column can be on the left side
* collapse ORs back into IN at a threshold of 2
* small fix to the OR closer so that it can spot nested cases
* improve IN stats estimation; it could (highly) overestimate the selectivity in multi-column cases
* qtest updates for the latest IN stats estimation changes are pending
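
To illustrate how a multi-column IN estimate can be kept bounded, here is a small sketch. It is not the formula used in the patch: the function name `in_selectivity`, its parameters, and the per-column independence assumption are all hypothetical. Each column contributes the share of its distinct values that appear in the IN list, capped at 1 so that a long list on a low-cardinality column cannot inflate the estimate.

```python
def in_selectivity(ndv_per_column, in_rows):
    """Estimate the fraction of rows matching a (multi-column) IN.

    ndv_per_column: number of distinct values of each column (column stats).
    in_rows: the tuples listed in the IN clause, one value per column.
    """
    factor = 1.0
    for i, ndv in enumerate(ndv_per_column):
        if ndv <= 0:
            continue  # no usable stats for this column; leave the factor alone
        listed = len({row[i] for row in in_rows})
        # A column cannot match more distinct values than it contains.
        factor *= min(1.0, listed / ndv)
    return factor
```

An estimated row count would then be the input row count multiplied by this factor. For example, five listed values on a column with only two distinct values yield a factor of 1.0 rather than 2.5.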


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 093b4a73f3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 97e405970f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java 01ad41c497 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f58632 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 01179c805f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java fa941a1b25 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2c 
  ql/src/test/queries/clientpositive/pointlookup.q 1b65cec71c 
  ql/src/test/queries/clientpositive/pointlookup2.q fe19381368 
  ql/src/test/queries/clientpositive/pointlookup3.q f98feeb164 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c01 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out bafc6de51e 
  ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de 
  ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 2e7d79660b 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out 33f0e71080 
  ql/src/test/results/clientpositive/druid_intervals.q.out a5203c3182 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636 
  ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b79 
  ql/src/test/results/clientpositive/filter_in_or_dup.q.out b50027d01a 
  ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb60345 
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 7ec51ffd07 
  ql/src/test/results/clientpositive/implicit_cast_during_insert.q.out 5e974bf7bb 
  ql/src/test/results/clientpositive/join45.q.out 4365d521b2 
  ql/src/test/results/clientpositive/join47.q.out c04b94b911 
  ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 98ad3656e7 
  ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05b 
  ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out 1330a86426 
  ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fc 
  ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ce 
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 4db83c149d 
  ql/src/test/results/clientpositive/llap/explainuser_2.q.out 47941fa1ae 
  ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7 
  ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c076 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 65eec521a2 
  ql/src/test/results/clientpositive/llap/vector_between_in.q.out 801dda315a 
  ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa 
  ql/src/test/results/clientpositive/llap/vector_struct_in.q.out 3756a2f4ab 
  ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb 
  ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6 
  ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a1 
  ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e9716 
  ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out 54d5be0b02 
  ql/src/test/results/clientpositive/mapjoin47.q.out cf29fa06b9 
  ql/src/test/results/clientpositive/parquet_vectorization_0.q.out 01f89513c3 
  ql/src/test/results/clientpositive/pcr.q.out 919b71234d 
  ql/src/test/results/clientpositive/pcs.q.out c8819cc0dc 
  ql/src/test/results/clientpositive/perf/spark/query10.q.out 45dfc5391c 
  ql/src/test/results/clientpositive/perf/spark/query12.q.out ad7e91215e 
  ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c63 
  ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac77 
  ql/src/test/results/clientpositive/perf/spark/query16.q.out 2f51a71ef1 
  ql/src/test/results/clientpositive/perf/spark/query17.q.out 23f1e85927 
  ql/src/test/results/clientpositive/perf/spark/query18.q.out f8bec59c53 
  ql/src/test/results/clientpositive/perf/spark/query20.q.out 76fae0be21 
  ql/src/test/results/clientpositive/perf/spark/query23.q.out 08b0f937f1 
  ql/src/test/results/clientpositive/perf/spark/query27.q.out e7ed297f32 
  ql/src/test/results/clientpositive/perf/spark/query29.q.out b070fc038f 
  ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f0 
  ql/src/test/results/clientpositive/perf/spark/query36.q.out d3bea7698c 
  ql/src/test/results/clientpositive/perf/spark/query37.q.out 17c85a6ad4 
  ql/src/test/results/clientpositive/perf/spark/query45.q.out d61f8b8052 
  ql/src/test/results/clientpositive/perf/spark/query46.q.out ccce45c4d2 
  ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14 
  ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea98 
  ql/src/test/results/clientpositive/perf/spark/query56.q.out 47059878a4 
  ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbf 
  ql/src/test/results/clientpositive/perf/spark/query68.q.out faf5d991bb 
  ql/src/test/results/clientpositive/perf/spark/query69.q.out 83b55df61b 
  ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf 
  ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88 
  ql/src/test/results/clientpositive/perf/spark/query74.q.out 3678906bc0 
  ql/src/test/results/clientpositive/perf/spark/query79.q.out 9355239c52 
  ql/src/test/results/clientpositive/perf/spark/query82.q.out bc627f1ce3 
  ql/src/test/results/clientpositive/perf/spark/query83.q.out 6fad2cafff 
  ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f78 
  ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc577669 
  ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51 
  ql/src/test/results/clientpositive/perf/spark/query98.q.out c82607d9a1 
  ql/src/test/results/clientpositive/perf/tez/query10.q.out a8f097fb59 
  ql/src/test/results/clientpositive/perf/tez/query12.q.out d3d8df00cd 
  ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3 
  ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b1 
  ql/src/test/results/clientpositive/perf/tez/query16.q.out 5652f3b019 
  ql/src/test/results/clientpositive/perf/tez/query17.q.out e185775904 
  ql/src/test/results/clientpositive/perf/tez/query18.q.out 1b9b2fba02 
  ql/src/test/results/clientpositive/perf/tez/query20.q.out 7d126a8de9 
  ql/src/test/results/clientpositive/perf/tez/query23.q.out aab3f9360c 
  ql/src/test/results/clientpositive/perf/tez/query27.q.out 7ea13c8f9c 
  ql/src/test/results/clientpositive/perf/tez/query29.q.out 9bfcdfa9c1 
  ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b 
  ql/src/test/results/clientpositive/perf/tez/query36.q.out c86c9e42aa 
  ql/src/test/results/clientpositive/perf/tez/query37.q.out 2b3ae52aee 
  ql/src/test/results/clientpositive/perf/tez/query45.q.out edb047d3f5 
  ql/src/test/results/clientpositive/perf/tez/query46.q.out 708a852051 
  ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0da 
  ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4 
  ql/src/test/results/clientpositive/perf/tez/query56.q.out 0d8ac48fe4 
  ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788 
  ql/src/test/results/clientpositive/perf/tez/query64.q.out 6d3edd3173 
  ql/src/test/results/clientpositive/perf/tez/query68.q.out 24b250282f 
  ql/src/test/results/clientpositive/perf/tez/query69.q.out 738508a1a9 
  ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f 
  ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e 
  ql/src/test/results/clientpositive/perf/tez/query74.q.out 854e6dc3aa 
  ql/src/test/results/clientpositive/perf/tez/query79.q.out 105a7396a4 
  ql/src/test/results/clientpositive/perf/tez/query82.q.out bb5a9e9a0b 
  ql/src/test/results/clientpositive/perf/tez/query83.q.out f766e8dd9b 
  ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d69735 
  ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5c 
  ql/src/test/results/clientpositive/perf/tez/query91.q.out a53c7d796d 
  ql/src/test/results/clientpositive/perf/tez/query98.q.out 4915d2b236 
  ql/src/test/results/clientpositive/pointlookup.q.out 69ae098a41 
  ql/src/test/results/clientpositive/pointlookup2.q.out 1eba541ff0 
  ql/src/test/results/clientpositive/pointlookup3.q.out 8835d4188c 
  ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a 
  ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371 
  ql/src/test/results/clientpositive/smb_mapjoin_47.q.out 894ab3d3af 
  ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33 
  ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac44 
  ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/spark/groupby_multi_single_reducer3.q.out 7eff987d20 
  ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out 30b5a2e8c9 
  ql/src/test/results/clientpositive/spark/pcr.q.out 83437e5593 
  ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e 
  ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f5 
  ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out 4606a0a6f0 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 736321b369 
  ql/src/test/results/clientpositive/spark/vector_between_in.q.out 8b1a2be89b 
  ql/src/test/results/clientpositive/spark/vectorization_0.q.out c6b204f3df 
  ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa5 
  ql/src/test/results/clientpositive/stat_estimate_drill.q.out 8a008c8fbf 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d26 
  ql/src/test/results/clientpositive/vector_date_1.q.out cb952ecf27 
  ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad025 
  ql/src/test/results/clientpositive/vector_struct_in.q.out d073ec6bce 
  ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c6 
  ql/src/test/results/clientpositive/vectorized_timestamp.q.out be1891999c 


Diff: https://reviews.apache.org/r/68108/diff/5/

Changes: https://reviews.apache.org/r/68108/diff/4-5/


Testing
-------


Thanks,

Zoltan Haindrich


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.

> On Aug. 2, 2018, 6:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out
> > Line 918 (original), 918 (patched)
> > <https://reviews.apache.org/r/68108/diff/3/?file=2066403#file2066403line918>
> >
> >     This could be simplified further: c_int = c_int should become c_int is not null. This can be explored in a follow-up, since Calcite already has this simplification logic.

Actually, Calcite is missing this simple rule. The logic is already there and works when unknownAsFalse is true; the changes to 2/3-valued logic handling will probably enable it for unknownAsFalse=false as well.
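
To make the unknownAsFalse point concrete, here is a minimal sketch of SQL three-valued logic in plain Java. It is not Calcite code; the class and method names are made up for illustration, and UNKNOWN is modelled as a null Boolean. It shows that c_int = c_int and c_int IS NOT NULL differ for a NULL input (UNKNOWN vs. FALSE) but agree once UNKNOWN is treated as FALSE, as a filter does.

```java
public class ThreeValuedLogicSketch {
    // SQL equality: the result is UNKNOWN (modelled as null) if either side is NULL.
    public static Boolean sqlEquals(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // IS NOT NULL is two-valued: it never yields UNKNOWN.
    public static Boolean isNotNull(Integer a) {
        return a != null;
    }

    // A filter keeps a row only when the predicate is TRUE,
    // i.e. UNKNOWN is treated as FALSE.
    public static boolean asFilter(Boolean v) {
        return Boolean.TRUE.equals(v);
    }

    public static void main(String[] args) {
        for (Integer cInt : new Integer[] {1, null}) {
            Boolean eq = sqlEquals(cInt, cInt);
            Boolean nn = isNotNull(cInt);
            System.out.println("c_int=" + cInt
                + " | c_int = c_int: " + eq
                + " | c_int IS NOT NULL: " + nn
                + " | same as filter: " + (asFilter(eq) == asFilter(nn)));
        }
    }
}
```

In a projection the two expressions are not interchangeable, because the NULL row would show UNKNOWN for one and FALSE for the other; that is the case a rewrite would have to handle when unknownAsFalse is false.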


> On Aug. 2, 2018, 6:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out
> > Lines 382-384 (original)
> > <https://reviews.apache.org/r/68108/diff/3/?file=2066409#file2066409line382>
> >
> >     Loss of this will be regression. We need to fix this before committing it.

I think this will need a Hive dialect fix in Calcite, which usually arrives with a Calcite upgrade.

The underlying exception is coming from Calcite:
```
Caused by: java.lang.ClassCastException: org.apache.calcite.rex.RexCall cannot be cast to org.apache.calcite.rex.RexSubQuery
        at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:540) ~[calcite-core-1.17.0.jar:1.17.0]
```
The root cause is that Calcite has no `IN` calls of the kind Hive uses, because Calcite's SQL translator rewrites them into either ORs or a subquery. The code that throws this exception relies on that contract, namely that INs are *always* subqueries. To fix this I've opened CALCITE-2444.
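
The failure mode can be sketched with stand-in classes. Everything below is hypothetical: `Call` and `SubQuery` are not the real Calcite `RexCall` and `RexSubQuery`, and the checked variant is only one possible shape of a guard, not necessarily what CALCITE-2444 does. The sketch only shows why an unconditional cast breaks once a plain IN call with a value list reaches the translator.

```java
public class InContractSketch {
    // Hypothetical stand-ins; these are NOT the real Calcite classes.
    public static class Call {
        final String op;
        public Call(String op) { this.op = op; }
    }

    public static class SubQuery extends Call {
        public SubQuery() { super("IN"); }
    }

    // Mirrors the failing pattern: assumes every IN is a subquery.
    public static String toSqlUnchecked(Call call) {
        if (call.op.equals("IN")) {
            SubQuery sq = (SubQuery) call; // ClassCastException for a plain IN call
            return sq.op + " (subquery)";
        }
        return call.op;
    }

    // Checks the runtime type before deciding how to render the IN.
    public static String toSqlChecked(Call call) {
        if (call.op.equals("IN")) {
            return (call instanceof SubQuery) ? "IN (subquery)" : "IN (value list)";
        }
        return call.op;
    }

    public static void main(String[] args) {
        Call plainIn = new Call("IN");
        System.out.println(toSqlChecked(plainIn));
        try {
            toSqlUnchecked(plainIn);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```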


It seems this was introduced not so long ago (HIVE-19360, with fix version 4.0 / 3.2), so it is not yet a regression.


> On Aug. 2, 2018, 6:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query15.q.out
> > Line 74 (original), 74 (patched)
> > <https://reviews.apache.org/r/68108/diff/3/?file=2066441#file2066441line74>
> >
> >     In this case LHS is column and all elements in IN list are constants, so this should have been folded to IN again?

No, because the folding logic could not process this OR; HIVE-20296 will improve on it.


> On Aug. 2, 2018, 6:08 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query53.q.out
> > Line 133 (original), 133 (patched)
> > <https://reviews.apache.org/r/68108/diff/3/?file=2066444#file2066444line133>
> >
> >     Here the LHS of IN is a column ref and the RHS are all constants, so it should have folded back to IN.

Yes, this is folded back; my patch did not contain an updated version of this q.out.


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206814
-----------------------------------------------------------


On Aug. 2, 2018, 2:16 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68108/
> -----------------------------------------------------------
> 
> (Updated Aug. 2, 2018, 2:16 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Gopal V.
> 
> 
> Bugs: HIVE-19097
>     https://issues.apache.org/jira/browse/HIVE-19097
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * open in to or - only column can be on left side
> * close ors into in at 2
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 093b4a73f3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 97e405970f 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java 01ad41c497 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f58632 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java fa941a1b25 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2c 
>   ql/src/test/queries/clientpositive/pointlookup.q 1b65cec71c 
>   ql/src/test/queries/clientpositive/pointlookup2.q fe19381368 
>   ql/src/test/queries/clientpositive/pointlookup3.q f98feeb164 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c01 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out bafc6de51e 
>   ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de 
>   ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 2e7d79660b 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out 33f0e71080 
>   ql/src/test/results/clientpositive/druid_intervals.q.out a5203c3182 
>   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636 
>   ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b79 
>   ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb60345 
>   ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 98ad3656e7 
>   ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05b 
>   ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out 1330a86426 
>   ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a8 
>   ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fc 
>   ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ce 
>   ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 4db83c149d 
>   ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7 
>   ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c076 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 65eec521a2 
>   ql/src/test/results/clientpositive/llap/vector_between_in.q.out 801dda315a 
>   ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa 
>   ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb 
>   ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6 
>   ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a1 
>   ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e9716 
>   ql/src/test/results/clientpositive/pcr.q.out 919b71234d 
>   ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c63 
>   ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac77 
>   ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f0 
>   ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14 
>   ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea98 
>   ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbf 
>   ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf 
>   ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88 
>   ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f78 
>   ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc577669 
>   ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51 
>   ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3 
>   ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b1 
>   ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b 
>   ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0da 
>   ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4 
>   ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788 
>   ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f 
>   ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e 
>   ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d69735 
>   ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5c 
>   ql/src/test/results/clientpositive/perf/tez/query91.q.out a53c7d796d 
>   ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a 
>   ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371 
>   ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33 
>   ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac44 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a8 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 83437e5593 
>   ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e 
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f5 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 736321b369 
>   ql/src/test/results/clientpositive/spark/vector_between_in.q.out 8b1a2be89b 
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa5 
>   ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d26 
>   ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad025 
>   ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c6 
> 
> 
> Diff: https://reviews.apache.org/r/68108/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Ashutosh Chauhan <ha...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206814
-----------------------------------------------------------




ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out
Line 918 (original), 918 (patched)
<https://reviews.apache.org/r/68108/#comment289894>

    This could be simplified further: c_int = c_int should become c_int is not null. This can be explored in a follow-up, since Calcite already has this simplification logic.



ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out
Lines 382-384 (original)
<https://reviews.apache.org/r/68108/#comment289898>

    Loss of this will be regression. We need to fix this before committing it.



ql/src/test/results/clientpositive/perf/tez/query15.q.out
Line 74 (original), 74 (patched)
<https://reviews.apache.org/r/68108/#comment289896>

    In this case LHS is column and all elements in IN list are constants, so this should have been folded to IN again?



ql/src/test/results/clientpositive/perf/tez/query53.q.out
Line 133 (original), 133 (patched)
<https://reviews.apache.org/r/68108/#comment289897>

    Here the LHS of IN is a column ref and the RHS are all constants, so it should have folded back to IN.


- Ashutosh Chauhan




Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/
-----------------------------------------------------------

(Updated Aug. 2, 2018, 2:16 p.m.)


Review request for hive, Ashutosh Chauhan and Gopal V.


Changes
-------

patch#09


Bugs: HIVE-19097
    https://issues.apache.org/jira/browse/HIVE-19097


Repository: hive-git


Description
-------

* open in to or - only column can be on left side
* close ors into in at 2
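
As a rough illustration of the two rewrites listed above, the following self-contained Java sketch expands an IN into ORs and folds ORs of equalities on the same column back into an IN. It works on strings rather than on Hive or Calcite expression trees; `OrToInSketch`, `Eq`, `expand`, `fold` and `minCount` are made-up names, and reading "at 2" as "a minimum of two equalities" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrToInSketch {
    // One "column = constant" disjunct (hypothetical helper type).
    public static final class Eq {
        final String column;
        final String constant;
        public Eq(String column, String constant) {
            this.column = column;
            this.constant = constant;
        }
    }

    // Opens "col IN (v1, v2, ...)" into "col = v1 OR col = v2 OR ...".
    public static String expand(String column, List<String> constants) {
        List<String> parts = new ArrayList<>();
        for (String c : constants) {
            parts.add(column + " = " + c);
        }
        return String.join(" OR ", parts);
    }

    // Closes equalities on the same column into an IN once there are
    // at least minCount of them; smaller groups stay as equalities.
    public static String fold(List<Eq> disjuncts, int minCount) {
        Map<String, List<String>> byColumn = new LinkedHashMap<>();
        for (Eq eq : disjuncts) {
            byColumn.computeIfAbsent(eq.column, k -> new ArrayList<>()).add(eq.constant);
        }
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : byColumn.entrySet()) {
            List<String> values = e.getValue();
            if (values.size() >= minCount) {
                parts.add(e.getKey() + " IN (" + String.join(", ", values) + ")");
            } else {
                for (String v : values) {
                    parts.add(e.getKey() + " = " + v);
                }
            }
        }
        return String.join(" OR ", parts);
    }

    public static void main(String[] args) {
        System.out.println(expand("a", Arrays.asList("1", "2")));
        List<Eq> ors = Arrays.asList(new Eq("a", "1"), new Eq("a", "2"), new Eq("b", "3"));
        System.out.println(fold(ors, 2));
    }
}
```

With a threshold of 2, the three disjuncts a = 1, a = 2, b = 3 fold to `a IN (1, 2) OR b = 3`; the single equality on b is left alone.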


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 093b4a73f3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 97e405970f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java 01ad41c497 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f58632 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java fa941a1b25 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2c 
  ql/src/test/queries/clientpositive/pointlookup.q 1b65cec71c 
  ql/src/test/queries/clientpositive/pointlookup2.q fe19381368 
  ql/src/test/queries/clientpositive/pointlookup3.q f98feeb164 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c01 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out bafc6de51e 
  ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de 
  ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 2e7d79660b 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out 33f0e71080 
  ql/src/test/results/clientpositive/druid_intervals.q.out a5203c3182 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636 
  ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b79 
  ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb60345 
  ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 98ad3656e7 
  ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05b 
  ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out 1330a86426 
  ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fc 
  ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ce 
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 4db83c149d 
  ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7 
  ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c076 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 65eec521a2 
  ql/src/test/results/clientpositive/llap/vector_between_in.q.out 801dda315a 
  ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa 
  ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb 
  ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6 
  ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a1 
  ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e9716 
  ql/src/test/results/clientpositive/pcr.q.out 919b71234d 
  ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c63 
  ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac77 
  ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f0 
  ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14 
  ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea98 
  ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbf 
  ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf 
  ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88 
  ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f78 
  ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc577669 
  ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51 
  ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3 
  ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b1 
  ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b 
  ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0da 
  ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4 
  ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788 
  ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f 
  ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e 
  ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d69735 
  ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5c 
  ql/src/test/results/clientpositive/perf/tez/query91.q.out a53c7d796d 
  ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a 
  ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371 
  ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33 
  ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac44 
  ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/spark/pcr.q.out 83437e5593 
  ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e 
  ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f5 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 736321b369 
  ql/src/test/results/clientpositive/spark/vector_between_in.q.out 8b1a2be89b 
  ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa5 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d26 
  ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad025 
  ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c6 


Diff: https://reviews.apache.org/r/68108/diff/3/

Changes: https://reviews.apache.org/r/68108/diff/2-3/


Testing
-------


Thanks,

Zoltan Haindrich


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/
-----------------------------------------------------------

(Updated Aug. 2, 2018, 11:13 a.m.)


Review request for hive, Ashutosh Chauhan and Gopal V.


Changes
-------

patch#06


Bugs: HIVE-19097
    https://issues.apache.org/jira/browse/HIVE-19097


Repository: hive-git


Description (updated)
-------

* open IN into OR - only when a column is on the left side
* close ORs into IN at 2
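
As an illustration of the second bullet, here is a minimal sketch of folding equality disjuncts on the same column into a single IN once the group reaches a minimum size. This is not the Hive or Calcite implementation; the tuple-based expression model, the function name `close_ors_into_in`, and the constant `MIN_IN_ELEMENTS` are all assumptions made up for this example.

```python
from collections import OrderedDict

# Hypothetical mini-model: a disjunct is ("=", column_name, constant).
MIN_IN_ELEMENTS = 2  # assumed threshold, mirroring "at 2" in the description


def close_ors_into_in(disjuncts, min_elements=MIN_IN_ELEMENTS):
    """Group `col = const` disjuncts by column; emit IN when a group is big enough."""
    by_column = OrderedDict()
    for op, column, constant in disjuncts:
        if op != "=":
            raise ValueError("this sketch only models equality disjuncts")
        by_column.setdefault(column, []).append(constant)

    result = []
    for column, constants in by_column.items():
        if len(constants) >= min_elements:
            # Enough constants on one column: fold them into a single IN.
            result.append(("IN", column, tuple(constants)))
        else:
            # Below the threshold: keep the plain equalities.
            result.extend(("=", column, c) for c in constants)
    return result
```

With the threshold at 2, `a = 1 OR a = 2 OR b = 3` would fold the two comparisons on `a` and leave `b = 3` alone; passing `min_elements=31` leaves all three equalities untouched.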


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 093b4a73f3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 97e405970f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f58632 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java fa941a1b25 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2c 
  ql/src/test/queries/clientpositive/pointlookup.q 1b65cec71c 
  ql/src/test/queries/clientpositive/pointlookup2.q fe19381368 
  ql/src/test/queries/clientpositive/pointlookup3.q f98feeb164 
  ql/src/test/queries/clientpositive/stat_estimate_related_col.q 52da2f759a 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c01 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out bafc6de51e 
  ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de 
  ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 2e7d79660b 
  ql/src/test/results/clientpositive/cbo_simple_select.q.out 33f0e71080 
  ql/src/test/results/clientpositive/druid_intervals.q.out a5203c3182 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636 
  ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b79 
  ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb60345 
  ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 98ad3656e7 
  ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05b 
  ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out c7f5b887b6 
  ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fc 
  ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ce 
  ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 708fa17617 
  ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7 
  ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c076 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 65eec521a2 
  ql/src/test/results/clientpositive/llap/vector_between_in.q.out 801dda315a 
  ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa 
  ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb 
  ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6 
  ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a1 
  ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e9716 
  ql/src/test/results/clientpositive/pcr.q.out 919b71234d 
  ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c63 
  ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac77 
  ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f0 
  ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14 
  ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea98 
  ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbf 
  ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf 
  ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88 
  ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f78 
  ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc577669 
  ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51 
  ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3 
  ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b1 
  ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b 
  ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0da 
  ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4 
  ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788 
  ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f 
  ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e 
  ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d69735 
  ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5c 
  ql/src/test/results/clientpositive/perf/tez/query91.q.out a53c7d796d 
  ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a 
  ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371 
  ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33 
  ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac44 
  ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a8 
  ql/src/test/results/clientpositive/spark/pcr.q.out 83437e5593 
  ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e 
  ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f5 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out c5d0d63f8c 
  ql/src/test/results/clientpositive/spark/vector_between_in.q.out 8b1a2be89b 
  ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa5 
  ql/src/test/results/clientpositive/stat_estimate_related_col.q.out 669adafda3 
  ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d26 
  ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad025 
  ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c6 


Diff: https://reviews.apache.org/r/68108/diff/2/

Changes: https://reviews.apache.org/r/68108/diff/1-2/


Testing
-------


Thanks,

Zoltan Haindrich


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Zoltan Haindrich <ki...@rxd.hu>.

> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/cbo_simple_select.q.out
> > Line 866 (original), 866 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065277#file2065277line866>
> >
> >     This didn't get rewritten into IN. Is that expected?

No, this is a different class of comparison: there are columns on both sides, and I think that confuses the extraction logic.

Note: Calcite may simplify `x=x` to `true` or `x is not null`.
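
To show why `x=x` can become `x is not null` in a filter, here is a small sketch of SQL three-valued logic with NULL modeled as Python `None`. The helper names `sql_eq` and `passes_filter` are made up for this example and do not correspond to any Calcite API.

```python
def sql_eq(a, b):
    """SQL equality with NULL modeled as None: any NULL operand yields NULL."""
    if a is None or b is None:
        return None
    return a == b


def passes_filter(predicate_value):
    """A WHERE clause keeps a row only when the predicate is TRUE (not NULL, not FALSE)."""
    return predicate_value is True


rows = [1, None, 3]
# Filtering on x = x drops exactly the NULL rows ...
kept_by_x_eq_x = [x for x in rows if passes_filter(sql_eq(x, x))]
# ... which is the same set that x IS NOT NULL keeps.
kept_by_not_null = [x for x in rows if x is not None]
```

When the column is known to be non-nullable, no row is dropped, which is the case where the condition can collapse to `true`.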


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/druid_basic3.q.out
> > Line 280 (original), 280 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065278#file2065278line280>
> >
> >     No folding of OR into IN? For Druid also, IN is more performant.

Actually, this is a case of `UDF(x) IN (c1,c2)`, and it's not getting refolded because of the UDF.
That could probably be done later; for now, only INs that have a column on the left side are considered for opening.
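
A rough sketch of that opening rule, assuming a made-up tuple model for expressions (`("col", name)` for a column, `("udf", name, arg)` for a function call); the function name `open_in_to_or` is hypothetical and this is not the code in RexNodeConverter:

```python
# Hypothetical expression model: ("col", name) or ("udf", name, arg).

def open_in_to_or(left, constants):
    """Expand `left IN (c1, c2, ...)` into equality disjuncts, but only for a plain column."""
    if left[0] != "col":
        # e.g. UDF(x) IN (c1, c2): leave it as IN, since it could not be folded back
        return [("IN", left, tuple(constants))]
    # Plain column on the left: one equality per constant, to be OR-ed together.
    return [("=", left, c) for c in constants]
```

A plain column is expanded into one equality per constant, while an expression such as a UDF call on the left side is returned unchanged as a single IN.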


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/vectorized_case.q.out
> > Line 54 (original), 54 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065299#file2065299line54>
> >
> >     Yeah, I think that's because the constants are now of type integer. Note that in the OR clause they had the S suffix, which made them smallint.
> >     
> >     This used to happen because of https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L1157
> >     
> >     This is during parsing of expressions. We need to enhance this logic now for INs as well.

Updated the logic in TypeCheckProcFactory.


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query15.q.out
> > Line 74 (original), 74 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065316#file2065316line74>
> >
> >     No folding back to IN ?

The new patch will not `open` INs like this, but that's still not enough to refold `_col3 IN ('CA','WA','GA')`; some work is still needed, see HIVE-20296.


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query45.q.out
> > Line 81 (original), 81 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065318#file2065318line81>
> >
> >     No folding back to IN?

fixed in new patch


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query63.q.out
> > Line 135 (original), 135 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065321#file2065321line135>
> >
> >     No folding back to IN ?

this needed a little tweak in hivepointlookupoptimizer; now it notices some redundancies in this condition! :)


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/tez/query8.q.out
> > Line 337 (original), 337 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065324#file2065324line337>
> >
> >     No folding back to IN ?

This is not expanded anymore, since it can't be closed back right now;
the rule is to open an IN only if its left side is a column.


> On July 30, 2018, 6:11 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out
> > Line 22 (original), 22 (patched)
> > <https://reviews.apache.org/r/68108/diff/1/?file=2065343#file2065343line22>
> >
> >     These ORs didn't get folded into IN; expected?

There are columns on both sides; this will probably be taken care of later.


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206603
-----------------------------------------------------------


On Aug. 2, 2018, 11:13 a.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68108/
> -----------------------------------------------------------
> 
> (Updated Aug. 2, 2018, 11:13 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Gopal V.
> 
> 
> Bugs: HIVE-19097
>     https://issues.apache.org/jira/browse/HIVE-19097
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * open IN into OR - only when a column is on the left side
> * close ORs into IN at 2
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 093b4a73f3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 97e405970f 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f58632 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java fa941a1b25 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2c 
>   ql/src/test/queries/clientpositive/pointlookup.q 1b65cec71c 
>   ql/src/test/queries/clientpositive/pointlookup2.q fe19381368 
>   ql/src/test/queries/clientpositive/pointlookup3.q f98feeb164 
>   ql/src/test/queries/clientpositive/stat_estimate_related_col.q 52da2f759a 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c01 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out bafc6de51e 
>   ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de 
>   ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out 2e7d79660b 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out 33f0e71080 
>   ql/src/test/results/clientpositive/druid_intervals.q.out a5203c3182 
>   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636 
>   ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b79 
>   ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb60345 
>   ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 98ad3656e7 
>   ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05b 
>   ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out c7f5b887b6 
>   ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a8 
>   ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fc 
>   ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ce 
>   ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 708fa17617 
>   ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7 
>   ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c076 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 65eec521a2 
>   ql/src/test/results/clientpositive/llap/vector_between_in.q.out 801dda315a 
>   ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa 
>   ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb 
>   ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6 
>   ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a1 
>   ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e9716 
>   ql/src/test/results/clientpositive/pcr.q.out 919b71234d 
>   ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c63 
>   ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac77 
>   ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f0 
>   ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14 
>   ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea98 
>   ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbf 
>   ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf 
>   ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88 
>   ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f78 
>   ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc577669 
>   ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51 
>   ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3 
>   ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b1 
>   ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b 
>   ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0da 
>   ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4 
>   ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788 
>   ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f 
>   ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e 
>   ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d69735 
>   ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5c 
>   ql/src/test/results/clientpositive/perf/tez/query91.q.out a53c7d796d 
>   ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a 
>   ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371 
>   ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33 
>   ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac44 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a8 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 83437e5593 
>   ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e 
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f5 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out c5d0d63f8c 
>   ql/src/test/results/clientpositive/spark/vector_between_in.q.out 8b1a2be89b 
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa5 
>   ql/src/test/results/clientpositive/stat_estimate_related_col.q.out 669adafda3 
>   ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d26 
>   ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad025 
>   ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c6 
> 
> 
> Diff: https://reviews.apache.org/r/68108/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>


Re: Review Request 68108: HIVE-19097 related equals and in operators may cause inaccurate stats estimations

Posted by Ashutosh Chauhan <ha...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68108/#review206603
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Line 2141 (original), 2141 (patched)
<https://reviews.apache.org/r/68108/#comment289622>

    Yes, it's OK to lower this: IN is more performant at runtime, so we want to execute IN at runtime.



ql/src/test/results/clientpositive/cbo_simple_select.q.out
Line 866 (original), 866 (patched)
<https://reviews.apache.org/r/68108/#comment289623>

    This didn't get rewritten into IN. Is that expected?



ql/src/test/results/clientpositive/druid_basic3.q.out
Line 280 (original), 280 (patched)
<https://reviews.apache.org/r/68108/#comment289634>

    No folding of OR into IN? For Druid also, IN is more performant.



ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out
Line 197 (original)
<https://reviews.apache.org/r/68108/#comment289624>

    Yeah, likely AST generation for the IN token isn't present.



ql/src/test/results/clientpositive/llap/vectorized_case.q.out
Line 54 (original), 54 (patched)
<https://reviews.apache.org/r/68108/#comment289628>

    Yeah, I think that's because the constants are now of type integer. Note that in the OR clause they had the S suffix, which made them smallint.
    
    This used to happen because of https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L1157
    
    This is during parsing of expressions. We need to enhance this logic now for INs as well.



ql/src/test/results/clientpositive/perf/tez/query15.q.out
Line 74 (original), 74 (patched)
<https://reviews.apache.org/r/68108/#comment289629>

    No folding back to IN ?



ql/src/test/results/clientpositive/perf/tez/query45.q.out
Line 81 (original), 81 (patched)
<https://reviews.apache.org/r/68108/#comment289630>

    No folding back to IN?



ql/src/test/results/clientpositive/perf/tez/query63.q.out
Line 135 (original), 135 (patched)
<https://reviews.apache.org/r/68108/#comment289631>

    No folding back to IN ?



ql/src/test/results/clientpositive/perf/tez/query8.q.out
Line 337 (original), 337 (patched)
<https://reviews.apache.org/r/68108/#comment289632>

    No folding back to IN ?



ql/src/test/results/clientpositive/pointlookup.q.out
Line 43 (original), 43 (patched)
<https://reviews.apache.org/r/68108/#comment289625>

    These tests were written with the default of 31 in mind, so let's update the tests to set the config to that value.



ql/src/test/results/clientpositive/pointlookup2.q.out
Line 115 (original), 111 (patched)
<https://reviews.apache.org/r/68108/#comment289626>

    Let's set the config to 31 to retain the original intention of the tests.



ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out
Line 22 (original), 22 (patched)
<https://reviews.apache.org/r/68108/#comment289627>

    These ORs didn't get folded into IN; expected?


- Ashutosh Chauhan


On July 30, 2018, 4:13 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68108/
> -----------------------------------------------------------
> 
> (Updated July 30, 2018, 4:13 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Gopal V.
> 
> 
> Bugs: HIVE-19097
>     https://issues.apache.org/jira/browse/HIVE-19097
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * open in to or
> * close ors into in at 2
> * wip patch
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 39c77b3fe52cb7d7d255138bb71d77a170347b52 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java f544f586321c2496e9f3bc3b428ae5689bf046a9 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java b57b5ddc2cba688897bac4cf29eaf9679b6de375 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 5d033a3c0181d697f08ac1aeb62ef57071602050 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 54395886d2a58d83b0fd853c0008fc99f7063f0c 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 29ef214ff0e8678fe25bd1793a9522ea0ae61d32 
>   ql/src/test/results/clientpositive/auto_join19.q.out 3e07ec06de767099aeb14beea586bca4cca0784e 
>   ql/src/test/results/clientpositive/cbo_rp_simple_select.q.out d12b5f64cc70e0039ce9179c5c6fb90d8beba0d1 
>   ql/src/test/results/clientpositive/cbo_simple_select.q.out 588d924e6af9b704577b9acf9a01d139c1b13af7 
>   ql/src/test/results/clientpositive/druid_basic3.q.out 54719f751769722c6b099682b9742b1e21c6ebec 
>   ql/src/test/results/clientpositive/druid_intervals.q.out a5203c31822ef6cb8c70a0caa9b682b6f73e88f5 
>   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 97922c2636f5b7bc101505bd9fd69b8f24719003 
>   ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b84a2d4b796acc0cc80eef48ddecb97aeb96f7c5 
>   ql/src/test/results/clientpositive/fold_eq_with_case_when.q.out d06fb603458941dd7f79485c6cd8b3a105995c54 
>   ql/src/test/results/clientpositive/list_bucket_query_multiskew_2.q.out 6217adb5d4a1c04506d8020a9c9d4b7932c637a3 
>   ql/src/test/results/clientpositive/llap/bucketpruning1.q.out cc637db05bbbb0d5ec915220865c654e7596a8fe 
>   ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_7.q.out c7f5b887b6396cf60b02b70e87371a21a0052b57 
>   ql/src/test/results/clientpositive/llap/cbo_simple_select.q.out a35edb42a851d5c6ca6ff7fb21a8172788c34d55 
>   ql/src/test/results/clientpositive/llap/check_constraint.q.out 123a3e46fccefba6ee12b98424b785c0eaef4eed 
>   ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out 8f06ee58ceed021a254491f62069e1b8e18c1541 
>   ql/src/test/results/clientpositive/llap/enforce_constraint_notnull.q.out e03cd3437e34179d0557ff1de63ca31cd7f1e3fe 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 708fa176170bb615d1032c810a8332fb64c9a23d 
>   ql/src/test/results/clientpositive/llap/kryo.q.out 234bae89c7a25da3b0db2efa397444dd6c5872c4 
>   ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 88ddd9c0767b6c23f423ab1ad7e350eba7f85d1a 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb.q.out 1841f1f4d3222314374d155fe5f7484a7e4628f2 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_ssb_2.q.out d7c92d8c59e26a899129f1f816fc79d9a67b5c6b 
>   ql/src/test/results/clientpositive/llap/vector_between_in.q.out 12ae1032eabb51a6c9c9f1a7204483128eb2848d 
>   ql/src/test/results/clientpositive/llap/vector_string_decimal.q.out 54d9914caa6f1cfc569ee8237e83bfc345382e66 
>   ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out 725ed34acb6b9e1e5e074a31571a202720480aaa 
>   ql/src/test/results/clientpositive/llap/vector_windowing_navfn.q.out 74ac56d1c6989e33f48f82a25742c1c020a9494b 
>   ql/src/test/results/clientpositive/llap/vectorized_case.q.out d444ae86a10ec96a999895b118aab3fa7ac9c653 
>   ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out ba004e97168807da0718c75d371dcdfdc741dca1 
>   ql/src/test/results/clientpositive/pcr.q.out 919b71234d56a63789c33470022bbf637b054444 
>   ql/src/test/results/clientpositive/perf/spark/query13.q.out fb2a061c635f22594bfeb1baa5e8af61bb6b80fe 
>   ql/src/test/results/clientpositive/perf/spark/query15.q.out 3d6fbdac777e68b65ddd1d21f888f8f53ab4b704 
>   ql/src/test/results/clientpositive/perf/spark/query34.q.out b40081e4f07b0a0cfb10157ae917c233e1295de9 
>   ql/src/test/results/clientpositive/perf/spark/query45.q.out d61f8b80521748965ba8b19992013ef028599c50 
>   ql/src/test/results/clientpositive/perf/spark/query48.q.out 60a4767a14575b2c3adf998a561c5a459eb2d2dc 
>   ql/src/test/results/clientpositive/perf/spark/query53.q.out 2b1cdfea988efd767f18b8ff92792c0e97fde280 
>   ql/src/test/results/clientpositive/perf/spark/query63.q.out b506455dbfd317a8cc9d64fd211a4e366993aa79 
>   ql/src/test/results/clientpositive/perf/spark/query71.q.out bf9c06debf98463c8554e4ea2f00a94bb5f4def1 
>   ql/src/test/results/clientpositive/perf/spark/query73.q.out 20ec874e88dd07785fee487892c1338ce5c85596 
>   ql/src/test/results/clientpositive/perf/spark/query8.q.out 6b14eb9bb007d4ebfaaddbd8ac7138fce1ca5e5b 
>   ql/src/test/results/clientpositive/perf/spark/query85.q.out 572ba54f7873e7759c56f1830ea583044cf68e9a 
>   ql/src/test/results/clientpositive/perf/spark/query89.q.out 1acc57766979ffbd1eb916f1a5e8cfff695bce69 
>   ql/src/test/results/clientpositive/perf/spark/query91.q.out de8977da51e6976efe61caecc05ec60ea5e2d328 
>   ql/src/test/results/clientpositive/perf/tez/query13.q.out 5cd4e27de3e1faa13cc2a5767ec43ad7347eb24c 
>   ql/src/test/results/clientpositive/perf/tez/query15.q.out 3c7ae664b10bdc23903f8a6b9152e40c9eab2aa2 
>   ql/src/test/results/clientpositive/perf/tez/query34.q.out 9b7b482d3b795add63b11e0641550e913e4b6b02 
>   ql/src/test/results/clientpositive/perf/tez/query45.q.out edb047d3f56017a52efafda67095504aadd7fe83 
>   ql/src/test/results/clientpositive/perf/tez/query48.q.out 1cf8d5c0dab253ce2494e1d94d78a6448183c9c2 
>   ql/src/test/results/clientpositive/perf/tez/query53.q.out 3567534ac4c8f161b1902605e96eacab996e9b7c 
>   ql/src/test/results/clientpositive/perf/tez/query63.q.out a5b7b5a788db466def6df144de99dea33c40fcfc 
>   ql/src/test/results/clientpositive/perf/tez/query71.q.out 4521aabc9f176abd89c989ecc1a8d2d511c74f31 
>   ql/src/test/results/clientpositive/perf/tez/query73.q.out cfa5213b5e237ce35d4cae1f45a235fca2a3c917 
>   ql/src/test/results/clientpositive/perf/tez/query8.q.out ee20e61ff4f0ad3cb226cca4ea36dd8dc976ec64 
>   ql/src/test/results/clientpositive/perf/tez/query85.q.out 4e42d697357e27eb9b2774d97ba3bd07f5331711 
>   ql/src/test/results/clientpositive/perf/tez/query89.q.out ee3374ea5ce2a3b1a6dc947d946df73782645380 
>   ql/src/test/results/clientpositive/pointlookup.q.out 69ae098a418e09d70f0dd562c11a2dc86f95e2d6 
>   ql/src/test/results/clientpositive/pointlookup2.q.out 1eba541ff0eaa2c11edd34fcd117f0bbd8046f0c 
>   ql/src/test/results/clientpositive/pointlookup3.q.out 8835d4188c9abdf3d7439a9550d15f2107a2ee8f 
>   ql/src/test/results/clientpositive/ppd_transform.q.out b38088f16a90393adf1939a0174ee20686b13fb3 
>   ql/src/test/results/clientpositive/remove_exprs_stats.q.out a9c0051371db45a668fad536d206535fe6843799 
>   ql/src/test/results/clientpositive/spark/auto_join19.q.out d7d8caee33bd51a315c64f038262cd47570b4bbd 
>   ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out e07904ac445ad32559b254ff0777c19b4e861419 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out a35edb42a851d5c6ca6ff7fb21a8172788c34d55 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 83437e55936e05f7a5f6f2a116e4bbe980c9768e 
>   ql/src/test/results/clientpositive/spark/ppd_transform.q.out 4dfc0fed6e292169c92560f71302944eef209cc0 
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out 24202522f584ad1308edcac626627421a102406d 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out c5d0d63f8c80a5b1c0e57cf0c0595b7229d8e315 
>   ql/src/test/results/clientpositive/spark/vector_between_in.q.out 78bcd26f561eb2449ec7a45f99156f04794aed2f 
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out 0bf2a4bfa53b732777ee6d074146693df5ee5954 
>   ql/src/test/results/clientpositive/stat_estimate_related_col.q.out 669adafda3a45f7846face3d99817cd1b9cb3664 
>   ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 5a50431d267bb595a6c88c87bd2f8aed26ac92f7 
>   ql/src/test/results/clientpositive/vector_non_constant_in_expr.q.out 966edad0258f7e80a27163a46f7a4e7071efe893 
>   ql/src/test/results/clientpositive/vectorized_case.q.out 828131f8c616cb74cca2fb20de3dd4b9190c0718 
> 
> 
> Diff: https://reviews.apache.org/r/68108/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>