Posted to dev@hive.apache.org by Krisztian Kasa <kk...@hortonworks.com> on 2019/09/09 17:29:37 UTC

Review Request 71458: HIVE-22163

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/
-----------------------------------------------------------

Review request for hive, Gopal V, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet Garg.


Bugs: HIVE-22163
    https://issues.apache.org/jira/browse/HIVE-22163


Repository: hive-git


Description
-------

CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 1795ae5626 
  ql/src/test/queries/clientpositive/cbo_stats_estimation.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats_estimation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out fddffbb0a8 


Diff: https://reviews.apache.org/r/71458/diff/1/


Testing
-------

New qtest: cbo_stats_estimation.q


Thanks,

Krisztian Kasa


Re: Review Request 71458: HIVE-22163

Posted by Jesús Camacho Rodríguez <jc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/#review217662
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
Line 272 (patched)
<https://reviews.apache.org/r/71458/#comment304957>

    I think _needColStats_ is only true when we call this method from CBO? What happens with CBO if estimate stats is set to false and stats for some column are not present: does it fail completely, or does only the stage that depends on the stats (for instance, join reordering) get skipped? I think the desired behavior is the latter. We can probably confirm that by checking the logs for the join_reordering_no_stats.q test.
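
The "latter" behavior described above can be pictured with a minimal sketch. Everything in it is hypothetical (the class, the method names, and the placeholder row count are assumptions for illustration, not Hive code): when a table has no stats and estimation is off, only the stats-dependent step is skipped and the tables keep their written order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical illustration only; none of these names come from Hive. */
final class JoinReorderSketch {

    /** Assumed placeholder row count, used only when estimation is enabled. */
    static final long ASSUMED_ROW_COUNT = 1_000L;

    /** Returns the known row count, an estimate if allowed, or empty otherwise. */
    static OptionalLong rowCount(String table, Map<String, Long> knownStats,
                                 boolean estimateStats) {
        Long known = knownStats.get(table);
        if (known != null) {
            return OptionalLong.of(known);
        }
        return estimateStats ? OptionalLong.of(ASSUMED_ROW_COUNT) : OptionalLong.empty();
    }

    /**
     * Orders tables smallest-first when every table has a row count.
     * If any row count is missing, the written order is returned unchanged
     * instead of failing the whole planning step.
     */
    static List<String> joinOrder(List<String> tables, Map<String, Long> knownStats,
                                  boolean estimateStats) {
        for (String table : tables) {
            if (!rowCount(table, knownStats, estimateStats).isPresent()) {
                return new ArrayList<>(tables); // skip only the stats-dependent step
            }
        }
        List<String> ordered = new ArrayList<>(tables);
        ordered.sort(Comparator.comparingLong(
                (String t) -> rowCount(t, knownStats, estimateStats).getAsLong()));
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, Long> stats = new HashMap<>();
        stats.put("orders", 5_000L);
        stats.put("customers", 200L);
        List<String> tables = Arrays.asList("orders", "lineitem", "customers");

        System.out.println(joinOrder(tables, stats, true));  // "lineitem" gets the placeholder
        System.out.println(joinOrder(tables, stats, false)); // reordering is skipped
    }
}
```

Whether CBO actually degrades this way is exactly what the join_reordering_no_stats.q logs would need to confirm.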


- Jesús Camacho Rodríguez




Re: Review Request 71458: HIVE-22163

Posted by Jesús Camacho Rodríguez <jc...@hortonworks.com>.

> On Sept. 11, 2019, 4:29 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out
> > Line 278 (original), 258 (patched)
> > <https://reviews.apache.org/r/71458/diff/2/?file=2164654#file2164654line279>
> >
> >     'Data size: -1' should not happen. Can you explain why this is happening? I think this also leads to the Long.MAX data size below.

OK, it seems the Hive configuration property controls estimation of both basic stats and column stats, not only column stats. Dropping this comment.
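
A minimal sketch of that reading, assuming -1 is the "unknown" sentinel and that a single flag gates both kinds of stats. The class, the method names, and the fallback values (file size on disk, a fixed average column length) are hypothetical and only illustrate why 'Data size: -1' can show up when estimation is disabled and nothing was recorded.

```java
/** Hypothetical illustration only; the names and fallbacks are not Hive's. */
final class BasicStatsSketch {

    /** Assumed sentinel meaning "no value recorded". */
    static final long UNKNOWN = -1L;

    /** Assumed fallback average column length in bytes. */
    static final long ASSUMED_AVG_COL_LEN = 8L;

    /** Basic stat: keeps the recorded data size, else estimates or stays unknown. */
    static long resolveDataSize(long recordedDataSize, long fileSizeBytes,
                                boolean estimateStats) {
        if (recordedDataSize >= 0) {
            return recordedDataSize;
        }
        // Assumption: the estimate is simply the file size on disk.
        return estimateStats ? fileSizeBytes : UNKNOWN;
    }

    /** Column stat: the same flag decides whether a missing value is filled in. */
    static long resolveAvgColLen(long recordedAvgColLen, boolean estimateStats) {
        if (recordedAvgColLen >= 0) {
            return recordedAvgColLen;
        }
        return estimateStats ? ASSUMED_AVG_COL_LEN : UNKNOWN;
    }

    public static void main(String[] args) {
        // Nothing recorded, estimation disabled: both values stay at the sentinel.
        System.out.println("Data size: " + resolveDataSize(UNKNOWN, 4_096L, false));
        System.out.println("Avg col len: " + resolveAvgColLen(UNKNOWN, false));
        // Nothing recorded, estimation enabled: both values are filled in.
        System.out.println("Data size: " + resolveDataSize(UNKNOWN, 4_096L, true));
        System.out.println("Avg col len: " + resolveAvgColLen(UNKNOWN, true));
    }
}
```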


- Jesús


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/#review217687
-----------------------------------------------------------


On Sept. 11, 2019, 9:18 a.m., Krisztian Kasa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71458/
> -----------------------------------------------------------
> 
> (Updated Sept. 11, 2019, 9:18 a.m.)
> 
> 
> Review request for hive, Gopal V, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet Garg.
> 
> 
> Bugs: HIVE-22163
>     https://issues.apache.org/jira/browse/HIVE-22163
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 8d9718f2c8 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 1795ae5626 
>   ql/src/test/queries/clientpositive/cbo_stats_estimation.q PRE-CREATION 
>   ql/src/test/results/clientpositive/cbo_stats_estimation.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out fddffbb0a8 
> 
> 
> Diff: https://reviews.apache.org/r/71458/diff/2/
> 
> 
> Testing
> -------
> 
> New qtest: cbo_stats_estimation.q
> 
> 
> Thanks,
> 
> Krisztian Kasa
> 
>


Re: Review Request 71458: HIVE-22163

Posted by Jesús Camacho Rodríguez <jc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/#review217687
-----------------------------------------------------------




ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out
Line 278 (original), 258 (patched)
<https://reviews.apache.org/r/71458/#comment304997>

    'Data size: -1' should not happen. Can you explain why this is happening? I think this also leads to the Long.MAX data size below.


- Jesús Camacho Rodríguez




Re: Review Request 71458: HIVE-22163

Posted by Jesús Camacho Rodríguez <jc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/#review217689
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
Line 275 (original), 276 (patched)
<https://reviews.apache.org/r/71458/#comment304999>

    L276 and L277 (in the patched version) should be outside the _if_ _estimateStats_ block, since we want to obtain a better data size from column stats regardless of whether we estimate them or not.
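
The shape being asked for can be sketched as follows. This is not the StatsUtils code: the method, its parameters, and the assumed average length are hypothetical, and overflow handling is left out for brevity. The point is only that filling in missing column stats sits inside the estimateStats check, while deriving the data size from whatever column stats exist sits outside it.

```java
/** Hypothetical illustration only; this is not the code in StatsUtils. */
final class DataSizeSketch {

    /** Assumed average length (bytes) used when a missing column is estimated. */
    static final long ASSUMED_AVG_LEN = 10L;

    /**
     * @param numRows       row count, or -1 when unknown
     * @param basicDataSize data size from basic stats, used as the fallback
     * @param avgColLens    average length per column, -1 where stats are missing
     * @param estimateStats whether missing column stats may be estimated
     */
    static long refineDataSize(long numRows, long basicDataSize, long[] avgColLens,
                               boolean estimateStats) {
        long[] lens = avgColLens.clone();

        if (estimateStats) {
            // Only the filling-in of missing column stats depends on the flag.
            for (int i = 0; i < lens.length; i++) {
                if (lens[i] < 0) {
                    lens[i] = ASSUMED_AVG_LEN;
                }
            }
        }

        // Deliberately outside the estimateStats block: real column stats
        // should improve the data size even when estimation is disabled.
        if (numRows < 0 || lens.length == 0) {
            return basicDataSize;
        }
        long rowWidth = 0;
        for (long len : lens) {
            if (len < 0) {
                return basicDataSize; // a column still has no stats
            }
            rowWidth += len;
        }
        return numRows * rowWidth; // overflow is ignored in this sketch
    }

    public static void main(String[] args) {
        System.out.println(refineDataSize(100L, 5_000L, new long[] {4L, 8L}, false));
        System.out.println(refineDataSize(100L, 5_000L, new long[] {4L, -1L}, false));
        System.out.println(refineDataSize(100L, 5_000L, new long[] {4L, -1L}, true));
    }
}
```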


- Jesús Camacho Rodríguez




Re: Review Request 71458: HIVE-22163

Posted by Krisztian Kasa <kk...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/
-----------------------------------------------------------

(Updated Sept. 13, 2019, 6:45 a.m.)


Review request for hive, Gopal V, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet Garg.


Bugs: HIVE-22163
    https://issues.apache.org/jira/browse/HIVE-22163


Repository: hive-git


Description
-------

CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 43dfceef61 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 1795ae5626 
  ql/src/test/queries/clientpositive/cbo_stats_estimation.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats_estimation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out fddffbb0a8 


Diff: https://reviews.apache.org/r/71458/diff/4/

Changes: https://reviews.apache.org/r/71458/diff/3-4/


Testing
-------

New qtest: cbo_stats_estimation.q


Thanks,

Krisztian Kasa


Re: Review Request 71458: HIVE-22163

Posted by Krisztian Kasa <kk...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/
-----------------------------------------------------------

(Updated Sept. 12, 2019, 2:25 p.m.)


Review request for hive, Gopal V, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet Garg.


Bugs: HIVE-22163
    https://issues.apache.org/jira/browse/HIVE-22163


Repository: hive-git


Description
-------

CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 8d9718f2c8 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 1795ae5626 
  ql/src/test/queries/clientpositive/cbo_stats_estimation.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats_estimation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out fddffbb0a8 


Diff: https://reviews.apache.org/r/71458/diff/3/

Changes: https://reviews.apache.org/r/71458/diff/2-3/


Testing
-------

New qtest: cbo_stats_estimation.q


Thanks,

Krisztian Kasa


Re: Review Request 71458: HIVE-22163

Posted by Krisztian Kasa <kk...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71458/
-----------------------------------------------------------

(Updated Sept. 11, 2019, 9:18 a.m.)


Review request for hive, Gopal V, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet Garg.


Bugs: HIVE-22163
    https://issues.apache.org/jira/browse/HIVE-22163


Repository: hive-git


Description
-------

CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 8d9718f2c8 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 1795ae5626 
  ql/src/test/queries/clientpositive/cbo_stats_estimation.q PRE-CREATION 
  ql/src/test/results/clientpositive/cbo_stats_estimation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/join_reordering_no_stats.q.out fddffbb0a8 


Diff: https://reviews.apache.org/r/71458/diff/2/

Changes: https://reviews.apache.org/r/71458/diff/1-2/


Testing
-------

New qtest: cbo_stats_estimation.q


Thanks,

Krisztian Kasa