Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2018/02/15 18:59:00 UTC

[jira] [Commented] (CALCITE-2179) General improvements for materialized view rewriting rule

    [ https://issues.apache.org/jira/browse/CALCITE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366108#comment-16366108 ] 

Julian Hyde commented on CALCITE-2179:
--------------------------------------

If join orderings are a problem, have you considered using lattices? Queries are rewritten to use the virtual "star" table underlying the lattice, so you avoid the combinatorial problem of matching join orders.

Your example query uses date constants, but I presume you intend this to work for date-time columns or expressions.

Rolling up FLOOR seems to be safe because FLOOR is monotonic (or, more precisely, {{FLOOR(t TO timeUnit)}} is monotonic in {{t}}, for any given timeUnit). Are these optimizations applicable to other monotonic operations, for example division by a positive constant ({{x / 10}} is monotonic in {{x}})?
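
To make the roll-up property concrete, here is a small Python sketch (not Calcite code). It checks that flooring to year gives the same result whether or not the value was first floored to month, which is what lets a month-level materialization answer a year-level query. The helper names floor_to_month, floor_to_year, rollup_holds and div_rollup_holds are hypothetical and exist only for this illustration. The same check is run for division by a positive constant, on non-negative integers and using Python's floor division. Note that the sketch tests the composition property directly; monotonicity on its own may not be enough to guarantee it.

```python
from datetime import datetime


def floor_to_month(t: datetime) -> datetime:
    """Hypothetical stand-in for FLOOR(t TO MONTH)."""
    return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def floor_to_year(t: datetime) -> datetime:
    """Hypothetical stand-in for FLOOR(t TO YEAR)."""
    return t.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


def rollup_holds(samples) -> bool:
    """True if the year bucket can be derived from the month bucket alone."""
    return all(floor_to_year(floor_to_month(t)) == floor_to_year(t) for t in samples)


def div_rollup_holds(values) -> bool:
    """Same check for division by a positive constant (floor division)."""
    return all((x // 10) // 10 == x // 100 for x in values)


# Illustrative timestamps only; they are not taken from the issue.
SAMPLES = [
    datetime(1997, 1, 20, 13, 45),
    datetime(1997, 12, 31, 23, 59, 59),
    datetime(2000, 2, 29),
]

if __name__ == "__main__":
    print(rollup_holds(SAMPLES))
    print(div_rollup_holds(range(0, 1000)))
```

If the check holds, rows grouped by the finer bucket can be re-grouped by the coarser one without going back to the base table.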

> General improvements for materialized view rewriting rule
> ---------------------------------------------------------
>
>                 Key: CALCITE-2179
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2179
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>             Fix For: 1.16.0
>
>
> This issue is for extending the {{AbstractMaterializedViewRule}} rule:
> - Support for rolling up date nodes. For instance, rewrite in the following case:
> {code}
> Materialization:
> select "empid", floor(cast('1997-01-20' as timestamp) to month), count(*) + 1 as c, sum("empid") as s
> from "emps" group by "empid", floor(cast('1997-01-20' as timestamp) to month);
> Query:
> select floor(cast('1997-01-20' as timestamp) to year), sum("empid") as s
> from "emps" group by floor(cast('1997-01-20' as timestamp) to year);
> {code}
> - Add a flag to enable/disable fast bail out for joins. By default it is enabled, and thus we only create the rewriting in the minimal subtree of plan operators. For instance:
> {code}
> View: (A JOIN B) JOIN C
> Query: (((A JOIN B) JOIN D) JOIN C) JOIN E
> {code}
> We produce it at:
> {code}
> ((A JOIN B) JOIN D) JOIN C
> {code}
> But not at:
> {code}
> (((A JOIN B) JOIN D) JOIN C) JOIN E
> {code}
> This is important when the rule is used with the Volcano planner together with other rules, e.g. join reordering, as it prevents the search space from growing unnecessarily. However, if we use the rewriting rule in isolation, fast bail out can lead to missed rewriting opportunities (e.g. for bushy join trees).
> - Possibility to provide a HepProgram to optimize the query branch in union rewritings. Note that when we produce a partial rewriting with a Union, the branch that executes the (partial) query can be fully rewritten, so we can add the compensation predicate. (We cannot do the same for views, because the expression might not be computable if the needed subexpressions are not available in the view output.) If we use Volcano with a determined set of rules, this might not be needed; hence, providing this program is optional.
> - Multiple small fixes discovered while testing.
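
As a rough illustration of the union rewriting mentioned in the description, the following Python sketch uses an in-memory SQLite database rather than Calcite. The table layout, the table name mv, the predicates and the function run_union_rewriting_demo are all assumptions made up for this example. The view only covers part of the rows the query needs, so the rewriting reads the view and adds a second branch over the base table, restricted by a compensation predicate that excludes rows the view already returns.

```python
import sqlite3


def run_union_rewriting_demo():
    """Compare a query with a hypothetical view-plus-union rewriting of it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emps (empid INTEGER PRIMARY KEY, deptno INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO emps VALUES (?, ?)",
        [(1, 5), (2, 10), (3, 15), (4, 20), (5, 25), (6, 30)],
    )
    # Assumed materialized view: only rows with deptno > 20.
    conn.execute(
        "CREATE TABLE mv AS SELECT empid, deptno FROM emps WHERE deptno > 20"
    )

    # Original query: needs rows with deptno > 10, more than the view holds.
    original = conn.execute(
        "SELECT empid, deptno FROM emps WHERE deptno > 10 ORDER BY empid"
    ).fetchall()

    # Partial rewriting: view branch UNION ALL query branch.
    # NOT (deptno > 20) is the compensation predicate on the query branch.
    rewritten = conn.execute(
        "SELECT empid, deptno FROM ("
        " SELECT empid, deptno FROM mv"
        " UNION ALL"
        " SELECT empid, deptno FROM emps"
        " WHERE deptno > 10 AND NOT (deptno > 20)"
        ") ORDER BY empid"
    ).fetchall()
    conn.close()
    return original, rewritten


if __name__ == "__main__":
    original, rewritten = run_union_rewriting_demo()
    print(original == rewritten)
```

Because the compensation predicate makes the two branches disjoint, UNION ALL is enough and no duplicate elimination is needed in this sketch.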



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)