Posted to dev@hive.apache.org by j....@gmail.com on 2014/09/05 20:58:49 UTC

Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan, Gunther Hagleitner, and Harish Butani.


Bugs: HIVE-7990
    https://issues.apache.org/jira/browse/HIVE-7990


Repository: hive-git


Description
-------

see jira


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 01c1d30 
  ql/src/test/queries/clientpositive/annotate_stats_groupby.q e8e84c6 
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 9c37d9b 
  ql/src/test/results/clientpositive/groupby_cube1.q.out 4246744 
  ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out 1cd65f6 
  ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out f1ecbb4 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 5db5cd5 

Diff: https://reviews.apache.org/r/25395/diff/


Testing
-------


Thanks,

Prasanth_J


Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/#review52492
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91291>

    Remove the comment // map side no grouping set



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91292>

    Ideally we should take the NDV of the group-by columns into account rather than dividing by 2.


- Mostafa Mokhtar




Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/#review52494
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91295>

    Sorry, this comment is applicable. This is the map-side case where no grouping set is available (a normal group by). The multiplier is initialized to mapSideParallelism.


- Prasanth_J




Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/#review52493
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91294>

    Will fix it.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91293>

    This condition occurs only when column stats are not available. This is the worst case.
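
For readers following the thread, here is a minimal sketch of the kind of estimate under discussion. It is not the code in StatsRulesProcFactory.java: the class name, method name, and parameters are hypothetical, and the rules are assumptions drawn from the comments here (halve the input rows as a worst case when column stats are missing, cap by the NDV product when they are present, and scale by the number of grouping sets). It also assumes every grouping set has the same cardinality, which is a simplification.

```java
// Hypothetical illustration only; names and rules are assumptions, not Hive's API.
final class GroupByRowEstimate {
    private GroupByRowEstimate() {}

    /**
     * Estimates the number of rows emitted by a group-by operator.
     *
     * @param parentRows   rows flowing into the group-by operator
     * @param groupingSets number of grouping sets (1 for a normal GROUP BY)
     * @param ndvProduct   product of the NDVs of the group-by columns,
     *                     or a negative value when column stats are unavailable
     */
    static long estimateRows(long parentRows, int groupingSets, long ndvProduct) {
        long perSet;
        if (ndvProduct < 0) {
            // No column stats: worst-case guess that grouping halves the input.
            perSet = parentRows / 2;
        } else {
            // Column stats available: the number of groups cannot exceed the
            // number of distinct key combinations or the number of input rows.
            perSet = Math.min(parentRows, ndvProduct);
        }
        // Each grouping set contributes its own copy of the grouped rows.
        return perSet * Math.max(1, groupingSets);
    }
}
```

Under these assumptions, 1000 input rows with no column stats give an estimate of 500 rows for a plain group by and 2000 rows for four grouping sets.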


- Prasanth_J




Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/#review52527
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91322>

    Added in the next patch.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91323>

    I will address that in https://issues.apache.org/jira/browse/HIVE-7156


- Prasanth_J




Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/
-----------------------------------------------------------

(Updated Sept. 5, 2014, 10:34 p.m.)


Review request for hive, Ashutosh Chauhan, Gunther Hagleitner, and Harish Butani.


Changes
-------

Addressed Gunther's comments.


Bugs: HIVE-7990
    https://issues.apache.org/jira/browse/HIVE-7990


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java 0cac439 
  ql/src/test/queries/clientpositive/annotate_stats_groupby.q e8e84c6 
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 9c37d9b 
  ql/src/test/results/clientpositive/groupby_cube1.q.out 4246744 
  ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out 1cd65f6 
  ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out f1ecbb4 
  ql/src/test/results/clientpositive/groupby_rollup1.q.out 5db5cd5 

Diff: https://reviews.apache.org/r/25395/diff/


Testing
-------


Thanks,

Prasanth_J


Re: Review Request 25395: HIVE-7990: With fetch column stats disabled number of elements in grouping set is not taken into account

Posted by Gunther Hagleitner <gh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25395/#review52521
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91319>

    There's a new case now that has AppMasterEventOperator as a child of the group by.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
<https://reviews.apache.org/r/25395/#comment91320>

    We're doing map-side aggregation, which yields different cardinalities for each grouping set. Shouldn't the code account for that?


- Gunther Hagleitner

