You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@quickstep.apache.org by hakanmemisoglu <gi...@git.apache.org> on 2016/08/01 22:29:13 UTC

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

GitHub user hakanmemisoglu opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/73

    Move hash join's probe and build node decision from ExecutionGenerator to PhysicalOptimizer

    Improvement for QUICKSTEP-37. ([https://issues.apache.org/jira/browse/QUICKSTEP-37]())


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep refactor-hashjoin-probe-build

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/73.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #73
    
----
commit 372902dd706f9e328dde3709532b5fbf111fdf24
Author: Hakan Memisoglu <ha...@gmail.com>
Date:   2016-08-01T21:39:07Z

    Implemented hashjoin optimization class and removed the logic from ExecutionGenerator.

commit a0905647b783873d8c5ff7b5f17c6fa66f357e36
Author: Hakan Memisoglu <ha...@gmail.com>
Date:   2016-08-01T22:24:47Z

    Added a field in Physical HashJoin to save right(build) side estimated cardinality.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73290304
  
    --- Diff: query_optimizer/rules/BottomUpRule.hpp ---
    @@ -80,6 +81,14 @@ class BottomUpRule : public Rule<TreeType> {
        */
       virtual TreeNodePtr applyToNode(const TreeNodePtr &node) = 0;
     
    +  /**
    +   * @brief Initialization code to be used for each node.
    --- End diff --
    
    The initialization is only applied with the tree's root node but not "each node".
    Other places look good. I'll do this minor revision and then merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73078969
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.cpp ---
    @@ -0,0 +1,72 @@
    +#include "query_optimizer/rules/SwapProbeBuild.hpp"
    +
    +#include <cstddef>
    +#include <vector>
    +
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "query_optimizer/expressions/AttributeReference.hpp"
    +#include "query_optimizer/physical/HashJoin.hpp"
    +#include "query_optimizer/physical/PatternMatcher.hpp"
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/physical/TopLevelPlan.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +P::PhysicalPtr SwapProbeBuild::applyToNode(const P::PhysicalPtr &input) {
    +  P::HashJoinPtr hash_join;
    +
    +  if (P::SomeHashJoin::MatchesWithConditionalCast(input, &hash_join)) {
    +    P::PhysicalPtr left = hash_join->left();
    +    P::PhysicalPtr right = hash_join->right();
    +
    +    P::TopLevelPlanPtr top_level;
    +    if (P::SomeTopLevelPlan::MatchesWithConditionalCast(input, &top_level)) {
    --- End diff --
    
    `input` should be already a `HashJoinPtr` here due to the outer `if` statement.
    
    In fact, `BottomUpRule` does not provide a place for initializing `cost_model_`. One solution is to add an `initialization()` virtual method in `class BottomUpRule`, and implement that method in `SwapProbeBuild`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73079256
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.hpp ---
    @@ -0,0 +1,46 @@
    +#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +#define QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +
    +#include <string>
    +
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +#include "query_optimizer/rules/BottomUpRule.hpp"
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "utility/Macros.hpp"
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +/** \addtogroup OptimizerRules
    + *  @{
    + */
    +
    +namespace P = ::quickstep::optimizer::physical;
    +namespace E = ::quickstep::optimizer::expressions;
    +namespace C = ::quickstep::optimizer::cost;
    +
    +/**
    + * @brief Rule that applies to a physical plan to arrange probe and
    + *        build side based on the cardinalities.
    + */
    +class SwapProbeBuild : public BottomUpRule<P::Physical> {
    + public:
    +  SwapProbeBuild() {
    +  }
    +
    +  std::string getName() const override { return "SwapProbeBuild"; }
    +
    + protected:
    +  P::PhysicalPtr applyToNode(const P::PhysicalPtr &input) override;
    +
    + private:
    +  DISALLOW_COPY_AND_ASSIGN(SwapProbeBuild);
    +
    +  std::unique_ptr<C::StarSchemaSimpleCostModel> cost_model_;
    --- End diff --
    
    Currently use `SimpleCostModel` instead of `StarSchemaSimpleCostModel` for deciding the build/probe sides.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #73: Move hash join's probe and build node decisio...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/73
  
    LGTM except for some minor places. Also fix the style check issue then we're good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #73: Move hash join's probe and build node decisio...

Posted by zuyu <gi...@git.apache.org>.
Github user zuyu commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/73
  
    @jianqiao Per workflow, please delete the branch after merged. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73204750
  
    --- Diff: query_optimizer/rules/BottomUpRule.hpp ---
    @@ -80,6 +81,8 @@ class BottomUpRule : public Rule<TreeType> {
        */
       virtual TreeNodePtr applyToNode(const TreeNodePtr &node) = 0;
     
    +  virtual void init(const TreeNodePtr &node) = 0;
    --- End diff --
    
    Let's provide a default no-op implementation for `init()` here (instead of pure virtual) so we don't have to implement it for every subclass.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by zuyu <gi...@git.apache.org>.
Github user zuyu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73080240
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.hpp ---
    @@ -0,0 +1,46 @@
    +#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +#define QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +
    +#include <string>
    +
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +#include "query_optimizer/rules/BottomUpRule.hpp"
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "utility/Macros.hpp"
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +/** \addtogroup OptimizerRules
    + *  @{
    + */
    +
    +namespace P = ::quickstep::optimizer::physical;
    +namespace E = ::quickstep::optimizer::expressions;
    +namespace C = ::quickstep::optimizer::cost;
    +
    +/**
    + * @brief Rule that applies to a physical plan to arrange probe and
    + *        build side based on the cardinalities.
    + */
    +class SwapProbeBuild : public BottomUpRule<P::Physical> {
    + public:
    +  SwapProbeBuild() {
    +  }
    +
    +  std::string getName() const override { return "SwapProbeBuild"; }
    +
    + protected:
    +  P::PhysicalPtr applyToNode(const P::PhysicalPtr &input) override;
    +
    + private:
    +  DISALLOW_COPY_AND_ASSIGN(SwapProbeBuild);
    +
    +  std::unique_ptr<C::StarSchemaSimpleCostModel> cost_model_;
    --- End diff --
    
    Actually, we should use the base class of the cost model, and initiate some implementation as @jianqiao suggested.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73078148
  
    --- Diff: query_optimizer/physical/HashJoin.hpp ---
    @@ -172,18 +192,21 @@ class HashJoin : public BinaryJoin {
           const std::vector<expressions::AttributeReferencePtr> &right_join_attributes,
           const expressions::PredicatePtr &residual_predicate,
           const std::vector<expressions::NamedExpressionPtr> &project_expressions,
    -      const JoinType join_type)
    +      const JoinType join_type,
    +      const std::size_t estimated_right_cardinality)
           : BinaryJoin(left, right, project_expressions),
             left_join_attributes_(left_join_attributes),
             right_join_attributes_(right_join_attributes),
             residual_predicate_(residual_predicate),
    -        join_type_(join_type) {
    +        join_type_(join_type),
    +        estimated_right_cardinality_(estimated_right_cardinality) {
       }
     
       std::vector<expressions::AttributeReferencePtr> left_join_attributes_;
       std::vector<expressions::AttributeReferencePtr> right_join_attributes_;
       expressions::PredicatePtr residual_predicate_;
       JoinType join_type_;
    +  std::size_t estimated_right_cardinality_;
    --- End diff --
    
    It might not be good to couple the cardinality information here. We can still use `CostModel` to estimate a physical plan's cardinality whenever needed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73118755
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.cpp ---
    @@ -0,0 +1,72 @@
    +#include "query_optimizer/rules/SwapProbeBuild.hpp"
    +
    +#include <cstddef>
    +#include <vector>
    +
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "query_optimizer/expressions/AttributeReference.hpp"
    +#include "query_optimizer/physical/HashJoin.hpp"
    +#include "query_optimizer/physical/PatternMatcher.hpp"
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/physical/TopLevelPlan.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +P::PhysicalPtr SwapProbeBuild::applyToNode(const P::PhysicalPtr &input) {
    +  P::HashJoinPtr hash_join;
    +
    +  if (P::SomeHashJoin::MatchesWithConditionalCast(input, &hash_join)) {
    +    P::PhysicalPtr left = hash_join->left();
    +    P::PhysicalPtr right = hash_join->right();
    +
    +    P::TopLevelPlanPtr top_level;
    +    if (P::SomeTopLevelPlan::MatchesWithConditionalCast(input, &top_level)) {
    --- End diff --
    
    @jianqiao , Thanks for the catch. 
    
    > In fact, BottomUpRule does not provide a place for initializing cost_model_.
    
    It is the reason why, I have implemented initialization of cost model in "applyToNode". 
    I will introduce init(const P::PhysicalPtr &top_level_plan) as you said.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73207435
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.cpp ---
    @@ -0,0 +1,64 @@
    +#include "query_optimizer/rules/SwapProbeBuild.hpp"
    +
    +#include <cstddef>
    +#include <vector>
    +
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "query_optimizer/expressions/AttributeReference.hpp"
    +#include "query_optimizer/physical/HashJoin.hpp"
    +#include "query_optimizer/physical/PatternMatcher.hpp"
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/physical/TopLevelPlan.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +P::PhysicalPtr SwapProbeBuild::applyToNode(const P::PhysicalPtr &input) {
    +  P::HashJoinPtr hash_join;
    +
    +  if (P::SomeHashJoin::MatchesWithConditionalCast(input, &hash_join)
    +      && hash_join->join_type() == P::HashJoin::JoinType::kInnerJoin) {
    +    P::PhysicalPtr left = hash_join->left();
    +    P::PhysicalPtr right = hash_join->right();
    +
    +
    +
    --- End diff --
    
    Remove the extra blank lines here (at most 1 line).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-quickstep/pull/73


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by jianqiao <gi...@git.apache.org>.
Github user jianqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73208331
  
    --- Diff: query_optimizer/rules/SwapProbeBuild.hpp ---
    @@ -0,0 +1,47 @@
    +#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +#define QUICKSTEP_QUERY_OPTIMIZER_RULES_SWAP_PROBE_BUILD_HPP_
    +
    +#include <string>
    --- End diff --
    
    Add `#include <memory>` as `std::unique_ptr` is used in this file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #73: Move hash join's probe and build node ...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/73#discussion_r73117361
  
    --- Diff: query_optimizer/physical/HashJoin.hpp ---
    @@ -172,18 +192,21 @@ class HashJoin : public BinaryJoin {
           const std::vector<expressions::AttributeReferencePtr> &right_join_attributes,
           const expressions::PredicatePtr &residual_predicate,
           const std::vector<expressions::NamedExpressionPtr> &project_expressions,
    -      const JoinType join_type)
    +      const JoinType join_type,
    +      const std::size_t estimated_right_cardinality)
           : BinaryJoin(left, right, project_expressions),
             left_join_attributes_(left_join_attributes),
             right_join_attributes_(right_join_attributes),
             residual_predicate_(residual_predicate),
    -        join_type_(join_type) {
    +        join_type_(join_type),
    +        estimated_right_cardinality_(estimated_right_cardinality) {
       }
     
       std::vector<expressions::AttributeReferencePtr> left_join_attributes_;
       std::vector<expressions::AttributeReferencePtr> right_join_attributes_;
       expressions::PredicatePtr residual_predicate_;
       JoinType join_type_;
    +  std::size_t estimated_right_cardinality_;
    --- End diff --
    
    @jianqiao, I will remove it. Then I need to reintroduce the hashtable size estimation logic into ExecutionGenerator.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---