Posted to dev@hive.apache.org by Harish Butani <rh...@gmail.com> on 2014/02/16 21:39:46 UTC

Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/
-----------------------------------------------------------

Review request for hive.


Bugs: HIVE-6439
    https://issues.apache.org/jira/browse/HIVE-6439


Repository: hive-git


Description
-------

This patch introduces a cost-based optimization (CBO) step in SemanticAnalyzer. For now, the CostBasedOptimizer is an empty shell.
The contract between the SemanticAnalyzer (SemAly) and CBO is:
- The CBO step is controlled by the 'hive.enable.cbo.flag' setting.
- When it is true, SemAly hands CBO a Hive Operator tree (with operators annotated with stats). If it can, CBO returns a better plan in Hive AST form.
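
A minimal sketch of the contract described above, under stated assumptions: OperatorTree, AstPlan, PlanOptimizer, EmptyShellOptimizer and CboStep are hypothetical stand-ins invented for illustration, not the classes in the patch, and signalling "no better plan" with null is also an assumption.

```java
// Hypothetical stand-ins for Hive's operator tree and AST; not the real Hive classes.
final class OperatorTree {
  final String description;

  OperatorTree(String description) {
    this.description = description;
  }
}

final class AstPlan {
  final String text;

  AstPlan(String text) {
    this.text = text;
  }
}

/** Assumed shape of the contract: returns null when no better plan is found. */
interface PlanOptimizer {
  AstPlan optimize(OperatorTree annotatedTree);
}

/** Mirrors the "empty shell" stage: it never proposes a plan. */
final class EmptyShellOptimizer implements PlanOptimizer {
  @Override
  public AstPlan optimize(OperatorTree annotatedTree) {
    return null;
  }
}

final class CboStep {
  /**
   * Returns the plan to continue with: the optimizer's AST if CBO is enabled and it
   * produced one, otherwise the original AST.
   */
  static AstPlan choosePlan(boolean cboEnabled, PlanOptimizer optimizer,
      OperatorTree annotatedTree, AstPlan originalAst) {
    if (!cboEnabled) {
      return originalAst;
    }
    AstPlan better = optimizer.optimize(annotatedTree);
    return better != null ? better : originalAst;
  }
}
```

With the empty shell plugged in, choosePlan always falls back to the original AST, so enabling the flag at this stage would not change query plans.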


Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  conf/hive-default.xml.template 0d08aa2 
  ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java 1ba5654 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PreCBOOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/CostBasedOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java 52c39c0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 

Diff: https://reviews.apache.org/r/18172/diff/


Testing
-------


Thanks,

Harish Butani


Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Gunther Hagleitner <gh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/#review34602
-----------------------------------------------------------



conf/hive-default.xml.template
<https://reviews.apache.org/r/18172/#comment64764>

    The max joins parameter is missing.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/CostBasedOptimizer.java
<https://reviews.apache.org/r/18172/#comment64765>

    If this is meant to be a generic contract, it shouldn't be in the optiq package.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/CostBasedOptimizer.java
<https://reviews.apache.org/r/18172/#comment64763>

    Why can't we use a generic type here?


- Gunther Hagleitner




Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/#review34599
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java
<https://reviews.apache.org/r/18172/#comment64759>

    indentation needs fixing (2 spaces)



ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java
<https://reviews.apache.org/r/18172/#comment64758>

    Hive's coding conventions require braces with if statements.
    
    Hive follows the Sun/Java code conventions, except for an indentation of 2 characters and a line limit of 100 characters. In the Eclipse formatter, you can select the Java conventions profile, edit it for these two settings, and save it as a Hive profile.
    
    Then highlight your section of new or edited code and right-click Source -> Format.
    I will add these instructions to HowToContribute once the wiki is working again.
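
    For illustration, a small sketch of the brace and indentation convention described above. BraceStyleExample and cap are hypothetical names, not code from the patch.

```java
final class BraceStyleExample {
  // Discouraged under the convention above: an if statement without braces, e.g.
  //   if (count > limit) return limit;

  /** Caps count at limit, using 2-space indentation and braces around the if body. */
  static int cap(int count, int limit) {
    if (count > limit) {
      return limit;
    }
    return count;
  }
}
```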
    



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/18172/#comment64760>

    indentation issues



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/18172/#comment64761>

    braces needed for if
    


- Thejas Nair




Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/#review34601
-----------------------------------------------------------



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/18172/#comment64762>

    I think it would be better to spell out what CBO stands for in this comment.


- Thejas Nair




Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Gunther Hagleitner <gh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/#review34603
-----------------------------------------------------------



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/18172/#comment64766>

    This shouldn't be part of the contract, should it?


- Gunther Hagleitner




Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Harish Butani <rh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/
-----------------------------------------------------------

(Updated Feb. 17, 2014, 7:35 a.m.)


Review request for hive.


Changes
-------

OK, added max.joins to conf; moved CBO to ql.optimizer.


Bugs: HIVE-6439
    https://issues.apache.org/jira/browse/HIVE-6439


Repository: hive-git


Description
-------

This patch introduces a cost-based optimization (CBO) step in SemanticAnalyzer. For now, the CostBasedOptimizer is an empty shell.
The contract between the SemanticAnalyzer (SemAly) and CBO is:
- The CBO step is controlled by the 'hive.enable.cbo.flag' setting.
- When it is true, SemAly hands CBO a Hive Operator tree (with operators annotated with stats). If it can, CBO returns a better plan in Hive AST form.


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  conf/hive-default.xml.template 0d08aa2 
  ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java 1ba5654 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CostBasedOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PreCBOOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java 52c39c0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 

Diff: https://reviews.apache.org/r/18172/diff/


Testing
-------


Thanks,

Harish Butani


Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/#review34624
-----------------------------------------------------------


1. In most query engines, joins of 20 or more tables are hard to reorder. So I would imagine that we need a flag along the lines of hive.cbo.max.joins.supported to limit the size of the join graph that is considered for reordering.
2. It's reasonable to introduce ql.optimizer.CostBasedOptimizer, which then calls into the Optiq-based optimizer.
   One thing to keep in mind: the Optiq-based optimizer would have both rule-based and cost-based portions.
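
The limit suggested in point 1 could be sketched roughly as follows. JoinReorderGate and shouldReorder are hypothetical names, and treating the configured value as an inclusive upper bound is an assumption; the thread does not define the flag's exact semantics.

```java
final class JoinReorderGate {
  /**
   * Decides whether a query's join graph is small enough to hand to the reordering step.
   *
   * @param joinCount number of joins in the query
   * @param maxJoinsSupported configured limit, assumed here to be inclusive
   */
  static boolean shouldReorder(int joinCount, int maxJoinsSupported) {
    if (joinCount <= 0) {
      // Nothing to reorder without at least one join.
      return false;
    }
    return joinCount <= maxJoinsSupported;
  }
}
```

A caller would read the limit from configuration and skip the cost-based reordering step whenever shouldReorder returns false, keeping the join order the query already has.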

- John Pullokkaran




Re: Review Request 18172: HIVE-6439 Introduce CBO step in Semantic Analyzer

Posted by Harish Butani <rh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18172/
-----------------------------------------------------------

(Updated Feb. 17, 2014, 6:49 a.m.)


Review request for hive.


Changes
-------

John,
I agree with Gunther's points. I propose that:
- we take the max joins parameter out of HiveConf for now, at least until it becomes clearer that this is one of the ways we want to control CBO use.
- we move CostBasedOptimizer to the ql.optimizer package.

Do you agree?


Bugs: HIVE-6439
    https://issues.apache.org/jira/browse/HIVE-6439


Repository: hive-git


Description
-------

This patch introduces a cost-based optimization (CBO) step in SemanticAnalyzer. For now, the CostBasedOptimizer is an empty shell.
The contract between the SemanticAnalyzer (SemAly) and CBO is:
- The CBO step is controlled by the 'hive.enable.cbo.flag' setting.
- When it is true, SemAly hands CBO a Hive Operator tree (with operators annotated with stats). If it can, CBO returns a better plan in Hive AST form.


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  conf/hive-default.xml.template 0d08aa2 
  ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java 1ba5654 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PreCBOOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/CostBasedOptimizer.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java 52c39c0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 

Diff: https://reviews.apache.org/r/18172/diff/


Testing
-------


Thanks,

Harish Butani