You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "Hans Zeller (JIRA)" <ji...@apache.org> on 2016/10/20 18:20:00 UTC
[jira] [Commented] (TRAFODION-2127) enhance Trafodion implementation of WITH clause

    [ https://issues.apache.org/jira/browse/TRAFODION-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15592568#comment-15592568 ] 

Hans Zeller commented on TRAFODION-2127:
----------------------------------------

This is a first set of changes to allow us to make use of CTEs
(Common Table Expressions) declared in WITH clauses and to create
a temp table for them that is then read multiple times in the query.

h4. Summary of changes

- Adding a unique statement number in CmpContext
- Moving the execHiveSQL method from the OSIM code to CmpContext
- Adding a list of common subexpressions and their references
  to CmpStatement
- Adding the ability to the Hive Truncate operator to drop the
  table when the TCB gets deallocated
- Adding the ability to the HDFS scan to compute scan ranges at
  runtime. Those are usually determined in the compiler. This is
  only supported for simple, non-partitioned, delimited tables.
  We need this because we populate the temp table and read from
  in in the same statement, without the option of compiling
  after we inserted into the temp table.
- Special handling in the MapValueIds node of common subexpressions.
  See the comment in MapValueId::preCodeGen().
- Moved the binder code to create a FastExtract node into a new
  method FastExtract::makeFastExtractTree(), to be able to call
  it from another place.
- MapValueIds no longer looks at the "used by MVQR flag" to determine
  the method for VEGRewrite. Instead it checks whether a list of
  values has been provided to do this.
- Added a new method, RelExpr::prepareMeForCSESharing, that is
  kind of an "unnormalizer", undoing some of the normalizer
  transformations.
- Implemented the steps for common subexpression materializations
  described below.
- Added the ability to suppress the Hive timestamp modification
  check when truncating a Hive table
- Added an optimizer rule to eliminate CommonSubExprRef nodes.
  These nodes should not normally survive past the SQO phase, but
  if the SQO phase gets interrupted by an exception, that could
  happen, since we then fall back to a copy of the tree before
  SQO. In the future, we can consider cost-based decision on
  what to do with common subexpressions.
- Add CommonSubExprRef nodes in the parser whenever we expand
  a CTE reference.
- Got rid of the now meaningless "seamonster" display in EXPLAIN.

Other changes contained in this change set (file OptLogRelExpr.cpp):

- Optimization for empty scans, like select * from t where 1=0
  This now generates a cardinality constraint with 0 rows, which
  can be used later to eliminate parts of a tree.
- [TRAFODION-2280] Need to remove salt columns from uniqueness
  constraints generated on salted tables.

h4. Generated Plans

The resulting query plan for a query Q with n common
subexpressions CSE1 ... CSEn looks like this:

{noformat}
                         Root
                           |
                      MapValueIds
                           |
                      BlockedUnion
                       /        \
                    Union        Q
                    /   \
                  ...    CTn
                  /
               Union
               /   \
             CT1   CT2
{noformat}


Each of the CTi variables looks like the following, an
INSERT OVERWRITE TABLE tempi ...

{noformat}
         BlockedUnion
          /        \
      Truncate  FastExtract TEMPi
        TEMPi        |
                    CSEi
{noformat}

The original query Q has the common subexpressions replaced
with the following:

{noformat}
           MapValueIds
               |
             scan TEMPi
{noformat}

Here is a simple query and its explain:

{noformat}
prepare s from
with cse1 as (select d_date_sk, d_date, d_year, d_dow, d_moy from date_dim)
select x.d_year, y.d_date
from cse1 x join cse1 y on x.d_date_sk = y.d_date_sk
where x.d_moy = 3;
{noformat}

{noformat}
>>explain options 'f' s;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

11   .    12   root                                                  1.46E+005
5    10   11   blocked_union                                         1.46E+005
7    9    10   merge_join                                            7.30E+004
8    .    9    sort                                                  1.00E+002
.    .    8    hive_scan                       CSE_TEMP_CSE1_MXID11  1.00E+002
6    .    7    sort                                                  5.00E+001
.    .    6    hive_scan                       CSE_TEMP_CSE1_MXID11  5.00E+001
1    4    5    blocked_union                                         7.30E+004
2    .    4    hive_insert                     CSE_TEMP_CSE1_MXID11  7.30E+004
.    .    2    hive_scan                       DATE_DIM              7.30E+004
.    .    1    hive_truncate                                         1.00E+000

--- SQL operation complete.
>>
{noformat}

CQDs to control common subexpressions
-------------------------------------

{noformat}
CQD                    Value      Default  Behavior
---------------------  ---------  -------  ---------------------------------------
CSE_DEBUG_WARNINGS       OFF        Y      No change
                         ON                Emit diagnostic warnings that show why
                                           we didn't create temp tables for
                                           common subexpressions

CSE_FOR_WITH             OFF        Y      Keep current behavior
                         ON                Insert a CommonSubExprRef node in the
                                           tree whenever we reference a CTE
                                           (table defined in a WITH clause)

CSE_HIVE_TEMP_TABLE      OFF               Meaningless at this point
                         ON         Y      Create Hive temp tables

CSE_USE_TEMP             OFF        Y      Disable creation of temp tables
                                           for common subexpressions
                         SYSTEM            Same as OFF for now
                         ON                Always create a temp table for
                                           common subexpressions where possible
{noformat}


h4. CommonSubExprRef relational operators

This is a new RelExpr class that is introduced. It marks the common
subexpressions in a RelExpr tree. This operator remembers the name of
a common subexpression (e.g. the name used in the WITH clause).
Multiple such operators can reference to the same name. Each of
these references has a copy of the tree.

Right now, these operators are created in the parser when we expand a
CTE (Common Table Expression), declared in a WITH clause.  If the CTE
is referenced only once, then the CommonSubExprRef operator is removed
in the binder - it also doesn't live up to its name in this case.

The remaining CommonSubExprRef operators keep track of various changes
to their child trees, during the binder and normalizer phases.  In
particular, it tracks which predicates are pushed down into the child
tree and which outputs are eliminated.

The CmpStatement object keeps a global list of all the
CommonSubExprRef operators in a statement, so the individual operators
have a way to communicate with their siblings:

- A statement can have zero or more named common subexpressions.
- Each reference to a common subexpression is marked in the RelExpr
  tree with a CommonSubExprRef node.
- In the binder and normalizer, common subexpressions are expanded,
  meaning that multiple copies of them exist, one copy per
  CommonSubExprRef.
- Common subexpressions can reference other common subexpressions,
  so they, together with the main query, for a DAG (directed
  acyclic graph) of dependencies.
- Note that CTEs declared in a WITH clause but not referenced are
  ignored and are not part of the query tree.

In the semantic query optimization phase (SQO), the current code makes
a heuristic decision what to do with common subexpressions - to
evaluate them multiple times (expand) or to create a temporary table
once and read that table multiple times.

If we decide to expand, the action is simple: Remove the
CommonSubExprRef operator from the tree and replace it with its child.

If we decide to create a temp table, things become much more difficult.
We need to do several steps:

- Pick one of the child trees of the CommonSubExprRefs as the one to
  materialize.
- Undo any normalization steps that aren't compatible with the other
  CommonSubExprRefs. That means pulling out predicates that are not
  common among the references and adding back outputs that are
  required by other references. If that process fails, we fall back
  and expand the expressions.
- Create a temp table tmp.
- Prepare an INSERT OVERWRITE TABLE tmp SELECT * FROM cse tree
  that materializes the common subexpression in a table.
- Replace the CommonSubExprRef nodes with scans of the temp table.
- Hook up the insert query tree with the main query, such that it
  is executed before the main query starts.

h4. Temporary tables

At this time, temporary tables are created as Hive tables, with a
fabricated, unique name, including the session id, a unique statement
number within the session, and a unique identifier of the common
subexpression within the statement. The temporary table is created at
compile time. The query plan contains an operator to truncate the
table before populating it. The "temporary" Hive table is dropped when
the executor TCB is deallocated.

Several issues are remaining with this approach:

- If the process exits before executing and deallocating the statement,
  the Hive table is not cleaned up.
  Solution (TBD): Clean up these tables like we clean up left-over
  volatile tables. Both are identified by the session id.
- If the executor runs into memory pressure and deallocates the TCB,
  then allocates it again at a later time, the temp table is no longer
  there.
  Solution (TBD): Use AQR to recompile the query and create a new table.
- Query cache: If we cache a query, multiple queries may end up with
  the same temporary table. This works ok as long as these queries are
  executed serially, but it fails if both queries are executed at the
  same time (e.g. open two cursors and fetch from both, alternating).
  Solution (TBD): Add a CQD that disables caching queries with temp tables
  for common subexpressions.

In the future we also want to support volatile tables. However, those also
have issues:

- Volatile tables aren't cleaned up until the session ends. If we run
  many statements with common subexpressions, that is undesirable. So,
  we have a similar cleanup problem as with Hive tables.
- Volatile tables take a relatively long time to create.
- Insert and scan rates on volatile Trafodion tables are slower than
  those on Hive tables.

To-do items are marked with "Todo: CSE: " in the code.


> enhance Trafodion implementation of WITH clause
> -----------------------------------------------
>
>                 Key: TRAFODION-2127
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2127
>             Project: Apache Trafodion
>          Issue Type: Improvement
>            Reporter: liu ming
>            Assignee: Hans Zeller
>
> TRAFODION-1673 described some details about how to support WITH clause in Trafodion.
> As initial implementation, we use a simple pure-parser method.
> That way, Trafodion can support WITH clause functionally, but not good from performance point of view,
> also need to enhance the parser to be more strict in syntax.
> This JIRA is a follow up JIRA, to track following effort to support Trafodion WITH clause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)