Posted to dev@calcite.apache.org by mark pasterkamp <ma...@hotmail.com> on 2018/10/15 12:05:49 UTC

Query rewriting materialized view rules

Hello,

I have some confusion regarding the query rewriting rules in Calcite and I was hoping someone could help me with that.
Looking at the documentation of materialized views http://calcite.apache.org/docs/materialized_views#materialized-views-maintained-by-calcite
and at the source code, I found there are 2 systems in place for rewriting queries to use materialized views. There are the unify rules found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java with an extension for materialized views found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/MaterializedViewSubstitutionVisitor.java
And there are the materialized view rules found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/AbstractMaterializedViewRule.java

Since I am not that experienced with Calcite, I was wondering if someone could shed some light on these 2 query rewriting systems. Are they supposed to be used independently or in conjunction? Do they try to solve the same thing, or is one rewriting system better for some queries than others? (For instance, if I want a query to be rewritten to use materialized views, would that better fit the MaterializedViewSubstitutionVisitor or the AbstractMaterializedViewRule?)

If someone could help me better understand how Calcite does its rewritings, that would be great.

Re: Query rewriting materialized view rules

Posted by Jesus Camacho Rodriguez <jc...@hortonworks.com>.
Hi Mark,

In principle both rewriting systems have the same objective. They can be enabled together (as in Calcite) or used independently (for instance, Hive uses AbstractMaterializedViewRule). 

Both rewriting mechanisms support SPJA (Select-Project-Join-Aggregate) materialized views, though MaterializedViewSubstitutionVisitor also supports definitions with the Union operator, while AbstractMaterializedViewRule does not.

As mentioned in the documentation, one important difference is that the MaterializedViewSubstitutionVisitor relies on other transformation rules to create equivalences between the expressions in the original plan and the materialized view definition. Adding these additional rules to the planning phase makes the rewriting powerful but expensive, and it may not scale for some queries, e.g., a query and a materialized view with a large number of joins, combined with rules that find all the possible join permutations in a plan.
In turn, the rules in AbstractMaterializedViewRule rely on structural information extracted from the subplan after a match is found, hence they do not need to exhaustively enumerate all equivalent expressions in a plan to produce a rewriting.

AbstractMaterializedViewRule can also produce partial rewritings if the result of a query is partially contained in the MV (including for MVs with an Aggregate operator, by adding a rollup operation on top of the MV), which I believe MaterializedViewSubstitutionVisitor cannot do right now. (There are some examples of partial rewritings in the documentation.)
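
To make the rollup idea concrete, here is a minimal sketch in plain Java. It does not use any Calcite API; the emps table, its columns, and all class and method names are hypothetical and only serve the illustration. It assumes a materialized view defined as SELECT deptno, job, SUM(salary) FROM emps GROUP BY deptno, job and a query that groups by deptno only. Because SUM can be re-aggregated, the query can be answered from the (smaller) view by summing its partial sums again, which is the kind of additional rollup step referred to above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustration only: plain Java, not Calcite's implementation. */
public class RollupSketch {
  /** One row of a hypothetical emps table. */
  public static final class Emp {
    final int deptno; final String job; final long salary;
    Emp(int deptno, String job, long salary) {
      this.deptno = deptno; this.job = job; this.salary = salary;
    }
  }

  /** One row of the hypothetical MV: GROUP BY deptno, job with SUM(salary). */
  public static final class MvRow {
    final int deptno; final String job; final long sumSalary;
    MvRow(int deptno, String job, long sumSalary) {
      this.deptno = deptno; this.job = job; this.sumSalary = sumSalary;
    }
  }

  public static List<Emp> sampleRows() {
    List<Emp> emps = new ArrayList<>();
    emps.add(new Emp(10, "clerk", 1000));
    emps.add(new Emp(10, "clerk", 1200));
    emps.add(new Emp(10, "manager", 3000));
    emps.add(new Emp(20, "clerk", 900));
    emps.add(new Emp(20, "analyst", 2500));
    return emps;
  }

  /** Builds the MV contents: one row per (deptno, job) group. */
  public static List<MvRow> materialize(List<Emp> emps) {
    Map<String, MvRow> groups = new TreeMap<>();
    for (Emp e : emps) {
      String key = e.deptno + "/" + e.job;
      MvRow old = groups.get(key);
      long sum = (old == null ? 0L : old.sumSalary) + e.salary;
      groups.put(key, new MvRow(e.deptno, e.job, sum));
    }
    return new ArrayList<>(groups.values());
  }

  /** The query evaluated directly: SUM(salary) GROUP BY deptno over emps. */
  public static Map<Integer, Long> queryOverBase(List<Emp> emps) {
    Map<Integer, Long> out = new TreeMap<>();
    for (Emp e : emps) {
      out.merge(e.deptno, e.salary, Long::sum);
    }
    return out;
  }

  /** The rewritten query: roll up the MV's partial sums by deptno. */
  public static Map<Integer, Long> queryOverMv(List<MvRow> mv) {
    Map<Integer, Long> out = new TreeMap<>();
    for (MvRow r : mv) {
      out.merge(r.deptno, r.sumSalary, Long::sum);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Emp> emps = sampleRows();
    System.out.println(queryOverBase(emps));            // {10=5200, 20=3400}
    System.out.println(queryOverMv(materialize(emps))); // {10=5200, 20=3400}
  }
}
```

Both calls print the same result, while the second one reads four pre-aggregated rows instead of five base rows. The same trick would not work directly for an aggregate such as AVG, which is why the sketch sticks to SUM.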

Depending on your specific use case, you may also be interested in lattices: http://calcite.apache.org/docs/lattice.html.

-Jesús

