You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@quickstep.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/01/31 17:23:51 UTC

[jira] (QUICKSTEP-69) Query optimization with ExactFilter.

    [ https://issues.apache.org/jira/browse/QUICKSTEP-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847157#comment-15847157 ] 

ASF GitHub Bot commented on QUICKSTEP-69:
-----------------------------------------

Github user hbdeshmukh commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/172#discussion_r98722409
  
    --- Diff: query_optimizer/rules/InjectJoinFilters.hpp ---
    @@ -0,0 +1,115 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + **/
    +
    +#ifndef QUICKSTEP_QUERY_OPTIMIZER_RULES_INJECT_JOIN_FILTERS_HPP_
    +#define QUICKSTEP_QUERY_OPTIMIZER_RULES_INJECT_JOIN_FILTERS_HPP_
    +
    +#include <cstdint>
    +#include <memory>
    +#include <string>
    +
    +#include "query_optimizer/cost_model/StarSchemaSimpleCostModel.hpp"
    +#include "query_optimizer/expressions/AttributeReference.hpp"
    +#include "query_optimizer/physical/LIPFilterConfiguration.hpp"
    +#include "query_optimizer/physical/FilterJoin.hpp"
    +#include "query_optimizer/physical/HashJoin.hpp"
    +#include "query_optimizer/physical/Physical.hpp"
    +#include "query_optimizer/rules/Rule.hpp"
    +#include "utility/Macros.hpp"
    +
    +namespace quickstep {
    +namespace optimizer {
    +
    +/** \addtogroup OptimizerRules
    + *  @{
    + */
    +
    +/**
    + * @brief Rule that applies to a physical plan to transform HashJoin nodes into
    + *        FilterJoin nodes.
    + * 
    + * This is an optimization that strength-reduces HashJoins to FilterJoins
    + * (implemented as LIPFilters attached to some anchoring operators where the
    + * filters get applied). Briefly speaking, the idea is that in the case that
    + * (1) the join attribute has consecutive integer values bounded in a reasonably
    + * small range AND (2) the output attributes are all from the probe-side table,
    + * we can eliminate the HashJoin by building a BitVector on the build-side
    + * attribute and using the BitVector to filter the probe-side table.
    + */
    +class InjectJoinFilters : public Rule<physical::Physical> {
    + public:
    +  /**
    +   * @brief Constructor.
    +   */
    +  InjectJoinFilters() {}
    +
    +  ~InjectJoinFilters() override {}
    +
    +  std::string getName() const override {
    +    return "TransformFilterJoins";
    +  }
    +
    +  physical::PhysicalPtr apply(const physical::PhysicalPtr &input) override;
    +
    + private:
    +  // Check whether a HashJoin can be transformed into a FilterJoin.
    +  bool isTransformable(const physical::HashJoinPtr &hash_join) const;
    +
    +  // Transform applicable HashJoin nodes into FilterJoin nodes.
    +  physical::PhysicalPtr transformHashJoinToFilters(
    +      const physical::PhysicalPtr &input) const;
    +
    +  // Push down FilterJoin nodes to be evaluated early.
    +  physical::PhysicalPtr pushDownFilters(const physical::PhysicalPtr &input) const;
    +
    +  // Add Selection node, if necessary, for anchoring the LIP filters built by
    +  // FilterJoin nodes.
    +  physical::PhysicalPtr addFilterAnchors(const physical::PhysicalPtr &input,
    +                                         const bool ancestor_can_anchor_filter) const;
    +
    +  // Setup lip_filter_configuration_ with the transformed plan tree.
    +  void concretizeAsLIPFilters(const physical::PhysicalPtr &input,
    +                              const physical::PhysicalPtr &anchor_node) const;
    +
    +  physical::PhysicalPtr pushDownFiltersInternal(
    +      const physical::PhysicalPtr &probe_child,
    +      const physical::PhysicalPtr &build_child,
    +      const physical::FilterJoinPtr &filter_join) const;
    +
    +  bool findExactMinMaxValuesForAttributeHelper(
    +      const physical::PhysicalPtr &physical_plan,
    +      const expressions::AttributeReferencePtr &attribute,
    +      std::int64_t *min_cpp_value,
    +      std::int64_t *max_cpp_value) const;
    +
    +  std::unique_ptr<cost::StarSchemaSimpleCostModel> cost_model_;
    +  std::unique_ptr<physical::LIPFilterConfiguration> lip_filter_configuration_;
    +
    +  // 1G bits = 128MB
    --- End diff --
    
    This could be made a GFLAG variable in a later refactoring. You can add a TODO for this. 


> Query optimization with ExactFilter.
> ------------------------------------
>
>                 Key: QUICKSTEP-69
>                 URL: https://issues.apache.org/jira/browse/QUICKSTEP-69
>             Project: Apache Quickstep
>          Issue Type: Improvement
>          Components: Query Optimizer, Relational Operators, Utility
>            Reporter: Jianqiao Zhu
>            Assignee: Jianqiao Zhu
>
> This is a follow-up optimization based on the facility provided by LIPFilters. Note that LIP (lookahead information passing) filter is an optimization that we can inject bloom filters (created somewhere else) into Select/HashJoin/Aggregate operators to pre-filter the input relations.
> This PR strength-reduces HashJoins (including inner/semi/anti joins) into FilterJoins. The semantics of a FilterJoin is simple: if certain conditions are met, we can build a bit vector from the build side and use the bit vector to FILTER the probe side.
> The execution part is slightly more optimized: a FIlterJoin will not always be converted into a SelectOperator plus a LIPFilter as its semantics indicates. Instead, in most situations we can avoid creating the SelectOperator and attach the LIPFilter properly to some downstream operators -- thus avoid unnecessary materialization of intermediate relations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)