You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-commits@db.apache.org by Apache Wiki <wi...@apache.org> on 2005/11/29 23:02:35 UTC

[Db-derby Wiki] Update of "LanguageOptimize" by DanDebrunner

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Db-derby Wiki" for change notification.

The following page has been changed by DanDebrunner:
http://wiki.apache.org/db-derby/LanguageOptimize

New page:
= Derby SQL Optimization =

Two sub-phases, pre-process and then optimization.

== Pre-process ==

=== Transformations ===

Preparation, obvious rewrite/optimization

 * Convert
   {{{c1 = 10 or c1 = 2  ====>>>>  c1 in (2, 10)}}}
 * For
   {{{c1 in (2, 10) ====ADD>>>>  c1 >= 2 AND c1 <= 10}}}
 * Eliminate {{{NOT}}} (push down, logic conversion).
 * Convert predicates to normalized CNF (in a sense):
   * TRUE is CNF, or Top node is AND.
   * Left sub-tree is DNF, right sub-tree is CNF.
   * DNF is defined symmetrically: FALSE or top is OR, left CNF, right DNF.
 * LIKE
   * {{{ c1 LIKE 'asdf%' ====>>>> c1 LIKE 'asdf%' AND c1 >= 'asdf' AND c1 < 'asdg‘ }}}
   * {{{ c1 LIKE ? ====>>>> c1 LIKE ? and c1 >= ? }}}
 * BETWEEN: convert to {{{ >= AND <=}}}
 * Subquery flattening, join node flattening:
   * Eg. For select subquery, conditions:
      * Under top AND node (not under OR)
      * IN, ANY, EXISTS, binary comparison
   * Flatten to normal join, or exists join, depending on uniqueness: pull up and append to outer query level
   * If not flattened, at least try to convert to binary comparison and push it down:
      * {{{ expression QuantifiedOperator (select x from ...)
====>>>>
(select true from .. where expression <BinaryComparisonOperator> x) IS [NOT] NULL }}}
 * For !ResultSetNode (!FromBaseTable etc.):
      * Generate referenced table map.
      * Convert WHERE and HAVING clauses to !PredicateLists, classify them.
      * Generate !ProjectRestrictNode on top of !FromBaseTable and !FromSubquery.
      * Push single table predicates to new !ProjectRestrictNodes
 * Outer join reordering.
 * Union (without ALL) eliminates ORDER BY (prefix columns).
 * And of course, a lot more…


== Optimization ==

 * Cost based optimization.
 * Purpose: select join order, join method, access method, least operation/cost.
 * Iterate through all permutations of join orders.
 * Cost different join methods (nested loop, hash).
 * For base table, cost each access method (heap, indexes).
 * Row count info from store, cardinality from SYSSTATISTICS.
 * Cost is based on selectivity, row count, index uniqueness, etc.
 * According to join order, table reference map, push down predicates, decide if an index can be used, start/stop keys & operators, simple qualifiers further pushed down to store for heap access.
 * Decide if the plan can avoid sorting (merge sort cost).
 * Consider covering/non-covering index costs
 * Cost index access according to start/stop keys or predicate’s estimated selectivity (or statistics for the index).
 * If optimizer timeout (too many plans), choose the best one so far

== References ==

http://db.apache.org/derby/papers/optimizer.html Derby Optimizer Design