Posted to issues@calcite.apache.org by "jin xing (Jira)" <ji...@apache.org> on 2019/10/08 08:48:01 UTC

[jira] [Comment Edited] (CALCITE-3334) Refinement for Substitution-Based MV Matching

    [ https://issues.apache.org/jira/browse/CALCITE-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946625#comment-16946625 ] 

jin xing edited comment on CALCITE-3334 at 10/8/19 8:47 AM:
------------------------------------------------------------

Hi, Haisheng

There are two strategies for materialized view matching:
 # substitution based (SubstitutionVisitor.java) [1]
 # plan structural information based (AbstractMaterializedViewRule.java) [2]

Both strategies are controlled by a single connection config, "materializationsEnabled". Calcite applies strategy-1 first and then strategy-2. All the tests are in MaterializationTest.java.

In MaterializationTest.java, some tests check the plan by comparing 'expected' and 'actual' strings. Before this PR, some materialization matching tests were supported by strategy-2 but not by strategy-1. After this PR, those tests are supported by both. The resulting plans are equivalent but slightly different.

Take MaterializationTest#testJoinAggregateMaterializationNoAggregateFuncs1 as an example. Previously it was supported only by strategy-2, and the optimized plan was
{code:java}
EnumerableCalc(expr#0..1=[{inputs}], expr#2=[20], expr#3=[<($t2, $t1)], empid=[$t0], $condition=[$t3])
  EnumerableTableScan(table=[[hr, m0]])
{code}
With this PR, strategy-1 now optimizes it as
{code:java}
EnumerableCalc(expr#0..1=[{inputs}], expr#2=[20], expr#3=[>($t1, $t2)], empid=[$t0], $condition=[$t3])
  EnumerableTableScan(table=[[hr, m0]])
{code}
Both result plans are correct and equivalent, but they are not textually identical.
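To illustrate why the two plan strings are equivalent, here is a minimal sketch that rewrites a binary comparison so the lower-numbered $t reference always comes first, flipping the operator when the operands are swapped. ComparisonCanonicalizer and its methods are hypothetical names for this illustration only; they are not part of Calcite, and the tests do not compare plans this way.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Calcite: puts comparisons in plan text into one form. */
final class ComparisonCanonicalizer {
  // Matches a binary comparison such as "<($t2, $t1)" whose operands are $tN references.
  private static final Pattern CMP =
      Pattern.compile("(<=|>=|<|>)\\((\\$t\\d+), (\\$t\\d+)\\)");

  private ComparisonCanonicalizer() {}

  /** Returns the operator that expresses the same condition once the operands are swapped. */
  static String flip(String op) {
    switch (op) {
      case "<": return ">";
      case ">": return "<";
      case "<=": return ">=";
      case ">=": return "<=";
      default: throw new IllegalArgumentException("Unsupported operator: " + op);
    }
  }

  /** Rewrites each comparison so that the lower-numbered $t reference comes first. */
  static String canonicalize(String plan) {
    Matcher m = CMP.matcher(plan);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String op = m.group(1);
      String left = m.group(2);
      String right = m.group(3);
      int l = Integer.parseInt(left.substring(2));   // strip the "$t" prefix
      int r = Integer.parseInt(right.substring(2));
      String replacement = l <= r
          ? op + "(" + left + ", " + right + ")"
          : flip(op) + "(" + right + ", " + left + ")";
      // quoteReplacement keeps the literal '$' from being read as a group reference.
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(canonicalize("expr#3=[<($t2, $t1)]"));
    System.out.println(canonicalize("expr#3=[>($t1, $t2)]"));
  }
}
```

Under this normalization the strategy-2 condition <($t2, $t1) and the strategy-1 condition >($t1, $t2) map to the same text, which is the sense in which the two plans above are equivalent.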

I hesitate to modify the tests directly, since they were originally created for strategy-2. Thus I need an option/config to control whether matching runs with or without SubstitutionVisitor.

I'm not sure whether it's a good idea to add a config directly in SubstitutionVisitor. Should I make it a connection property, just like CalciteConnectionProperty#materializationsEnabled?
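As a rough sketch of what such a switch could look like, the following models the two strategies behind separate flags. Everything here is hypothetical: MatchingToggleSketch, Settings and substitutionEnabled are illustration-only names, not existing Calcite classes or connection properties. Only the idea of a global "materializationsEnabled" flag, and the order strategy-1 then strategy-2, comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Calcite code: gating the two matching strategies separately. */
final class MatchingToggleSketch {
  private MatchingToggleSketch() {}

  /** Hypothetical settings holder. */
  static final class Settings {
    final boolean materializationsEnabled; // models the existing global switch
    final boolean substitutionEnabled;     // assumed new switch for strategy-1 only

    Settings(boolean materializationsEnabled, boolean substitutionEnabled) {
      this.materializationsEnabled = materializationsEnabled;
      this.substitutionEnabled = substitutionEnabled;
    }
  }

  /** Returns the names of the strategies that would run, in application order. */
  static List<String> strategiesToRun(Settings s) {
    List<String> order = new ArrayList<>();
    if (!s.materializationsEnabled) {
      return order; // the global switch disables all MV matching
    }
    if (s.substitutionEnabled) {
      order.add("substitution");    // strategy-1 is applied first
    }
    order.add("plan-structural");   // strategy-2 is applied afterwards
    return order;
  }

  public static void main(String[] args) {
    System.out.println(strategiesToRun(new Settings(true, true)));
    System.out.println(strategiesToRun(new Settings(true, false)));
  }
}
```

With a switch of this kind, the tests written for strategy-2 could run with substitution turned off and keep their expected plan strings, while other tests exercise both strategies.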

 

[1] [https://calcite.apache.org/docs/materialized_views.html#substitution-via-rules-transformation]

[2] [https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information]


> Refinement for Substitution-Based MV Matching
> ---------------------------------------------
>
>                 Key: CALCITE-3334
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3334
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: jin xing
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Substitution-based MV matching is an effective approach because of its simplicity and extensibility.
> This JIRA proposes to refine the existing implementation in several ways:
>  # Canonicalize before MV matching -- such canonicalization can significantly simplify the algebra tree and make materialization matching easier.
>  # Separate matching rules into two categories and enumerate common matching patterns which need to be covered by rules.
> Please check the design doc: [Design Doc|https://docs.google.com/document/d/1JpwGNFE3hw3yXb7W3-95-jXKClZC5UFPKbuhgYDuEu4/edit#]


