You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Satheesh Bandaram (JIRA)" <de...@db.apache.org> on 2006/03/07 15:24:40 UTC

[jira] Commented: (DERBY-1007) Optimizer can return incorrect "best cost" estimates with nested subqueries, which leads to generation of sub-optimal plans.

    [ http://issues.apache.org/jira/browse/DERBY-1007?page=comments#action_12369235 ] 

Satheesh Bandaram commented on DERBY-1007:
------------------------------------------

Patch submitted to trunk.

> Optimizer can return incorrect "best cost" estimates with nested subqueries, which leads to generation of sub-optimal plans.
> ----------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-1007
>          URL: http://issues.apache.org/jira/browse/DERBY-1007
>      Project: Derby
>         Type: Bug
>   Components: Performance
>     Versions: 10.2.0.0
>     Reporter: A B
>     Assignee: A B
>     Priority: Minor
>  Attachments: d1007_v1.patch, d1007_v1.stat
>
> When optimizing a query that has nested subqueries in it, it's possible that the optimizer for the subqueries will return cost estimates that are lower than what they were actually calculated to be.  The result is that the outer query can pick an access plan that is sub-optimal.
> Filing this jira issue based on the thread "[OPTIMIZER] OptimizerImpl "best plans" for subqueries?" from derby-dev.  Description that follows is pasted from that email:
> http://article.gmane.org/gmane.comp.apache.db.derby.devel/14836
> Following example of what I saw when tracing through the code demonstrates the problem.
> select x1.j, x2.b from
>   (select distinct i,j from t1) x1,
>   (select distinct a,b from t3) x2
> where x1.i = x2.a;
> During optimization of this query we will create three instancesof OptimizerImpl:
>    OI_0: For "select x1.j, x2.b from x1, x2 where x1.i = x2.a"
>    OI_1: For "select distinct i,j from t1"
>    OI_2: For "select distinct a,b from t3"
> Query ran against a clean codeline when T1 had 1 row and T3 had 50,000.
>    -- Top-level call is made to the optimize() method of the
>      outermost SelectNode, which creates OI_0.
>    -- OI_0: picks join order {X1, X2} and calls X1.optimizeIt()
>    -- X1: *creates* OI_1 and makes calls to optimize it.
>    -- OI_1: picks join order {T1} and calls T1.optimizeIt()
>    -- T1: returns a cost of 20.
>    -- OI_1: saves 20 as new best cost and tells T1 to save it.
>    -- X1: calls OI_1.getOptimizedCost(), which returns 20.  X1
>      then returns 20 to OI_0.
>    -- OI_0: calls X2.optimizeIt()
>    -- X2: *creates* OI_2 and makes calls to optimize it.
>    -- OI_2: picks join order {T3} and calls T3.optimizeIt()
>    -- T3: returns a cost of 64700.
>    -- OI_2: saves 64700 as new best cost and tells T3 to save it.
>    -- X2: calls OI_2.getOptimizedCost(), which returns 64700. X2
>      then returns 64700 to OI_0.
>    -- OI_0: saves 20 + 64700 = 64720 as new best cost and tells
>      X1 to save 20 and X2 to save 64700.
>    -- OI_0: picks join order {X2, X1} and calls X2.optimizeIt()
>    -- X2: *fetches* OI_2 and makes calls to optimize it.
>    -- OI_2: picks join order {T3} and calls T3.optimizeIt()
>    -- T3: returns a cost of 10783.
>    -- OI_2: saves 10783 as new best cost and tells T3 to save it.
>    -- X2: calls OI_2.getOptimizedCost(), which returns 10783.  X2
>      then returns 10783 to OI_0.
>    -- OI_0: calls X1.optimizeIt()
>    -- X1: *fetches* OI_1 and makes calls to optimize it.
>    -- OI_1: picks join order {T1} and calls T1.optimizeIt()
>    -- T1: returns a cost of *1 MILLION!*.
>    -- OI_1: rejects new cost (1 mil > 20) and does nothing.
>    -- X1: calls OI_1.getOptimizedCost(), which returns *20*.  X1
>      then returns 20 to OI_0...this seems WRONG!
>    -- OI_0: saves 10783 + 20 = 10803 as new best cost and tells
>      X2 to save 10783 and X1 to save 20.
> So in the end, the outer-most OptimizerImpl chooses join order {X2, X1} because it thought the cost of this join order was only 10783, which is better than  64720.  However, the _actual_ cost of the join order was really estimated at 1 million--so the outer OptimizerImpl chose (and will generate) a plan that, according to the estimates, was (hugely) sub-optimal.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira