You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Army <qo...@sbcglobal.net> on 2006/02/17 19:37:05 UTC

[OPTIMIZER] OptimizerImpl "best plans" for subqueries?

As I was tracing through code for some work that I'm doing for DERBY-805, I came 
across a scenario in the Optimizer that looks to me like it could be a bug.  I'm 
wondering if anyone out there can either confirm that this is a bug or else help 
correct my understanding of how this is supposed to work...

An example of a scenario that's causing me to wonder is that of unflattened 
subqueries.  Ex:

select x1.j, x2.b from
	(select distinct i,j from t1) x1,
	(select distinct a,b from t3) x2
where x1.i = x2.a;

The first thing to note is that during optimization of this query we will create 
three instancesof OptimizerImpl:

   OI_0: For "select x1.j, x2.b from x1, x2 where x1.i = x2.a"
   OI_1: For "select distinct i,j from t1"
   OI_2: For "select distinct a,b from t3"

These OptimizerImpls are "created" from calls to the getOptimizer() method on 
the respective SelectNode.  I put "created" in quotes because the 
OptimizerImpl's are only created on the first call to SelectNode.getOptimizer(); 
thereafer, any calls to that method will return the same instance of the 
OptimizerImpl that was created the first time.

Each OptimizerImpl stores information about the best plan that it has found so 
far, namely best join order, best cost (for that join order), and best plan type 
("normal" or "sort avoidance").  I use the term "bestPlan" to refer these three 
values collectively, though that particular field doesn't actually exist in the 
code.

All of that said, I _think_ that bestPlan is intended to hold the best plan seen 
by the OptimizerImpl for this round of optimizing, where "this round" means 
"from the last time a call to getOptimizer() returned this OptimizerImpl until a 
call to this OptimizerImpl's getNextPermutation() method returns FALSE."  For 
every bestPlan that the OptimizerImpl finds, it calls "rememberAsBest" on each 
Optimizable in its list, which means that trulyTheBestAccessPath for each 
Optimizable will be updated to hold that Optimizable's best plan with respect to 
the OptimizerImpl's current join order.

When I was tracing through code for pushing predicates, though, I came to the 
realization that either:

  1) my definition of a "round" is incorrect, or
  2) the definition is correct, but the implementation is incorrect.

The reason I believe this to be the case is that bestPlan is never
reset.  For non-nested queries this isn't problem because the outer
SelectNode only calls "getOptimizer()" one time and then fetches the
bestPlan after that:

	optimizer = getOptimizer(fromList,
			wherePredicates,
			dataDictionary,
			orderByList);
	optimizer.setOuterRows(outerRows);

	/* Optimize this SelectNode */
	while (optimizer.getNextPermutation())
	{
		while (optimizer.getNextDecoratedPermutation())
		{
			optimizer.costPermutation();
		}
	}

	/* Get the cost */
	costEstimate = optimizer.getOptimizedCost();

With subqueries, though, a SelectNode will call getOptimizer() once for _every_ 
complete join order analyzed by the outer OptimizerImpl.  I think the inner 
OptimizerImpl is then supposed to "start fresh" and return the best permutation 
for the inner OptimizerImpl with respect to the _current_ permutation of the 
_outer_ query.  That's what my above definition of "this round" translates to.

However, since we never reset the inner OptimizerImpl's bestPlan, it's possible 
that _all_ permutations of the inner OptimizerImpl will be rejected because they 
are more expensive than a previous bestPlan from a previous permutation of the 
outer query.  That means that the call to optimizer.getOptimizedCost() will 
return the bestPlan found for an outer query join order that is _not_ the 
current join order.  As explained below, this can cause the outer-most optimizer 
to choose a plan that is not optimal.

Let's walk through this using the above example:

select x1.j, x2.b from
	(select distinct i,j from t1) x1,
	(select distinct a,b from t3) x2
where x1.i = x2.a;

Note that I ran this query against a clean codeline and did actually witness the 
behavior described here (T1 had 1 row, T3 had 50,000).

   -- Top-level call is made to the optimize() method of the
	outermost SelectNode, which creates OI_0.

   -- OI_0: picks join order {X1, X2} and calls X1.optimizeIt()
   -- X1: *creates* OI_1 and makes calls to optimize it.
   -- OI_1: picks join order {T1} and calls T1.optimizeIt()
   -- T1: returns a cost of 20.
   -- OI_1: saves 20 as new best cost and tells T1 to save it.
   -- X1: calls OI_1.getOptimizedCost(), which returns 20.  X1
	then returns 20 to OI_0.
   -- OI_0: calls X2.optimizeIt()
   -- X2: *creates* OI_2 and makes calls to optimize it.
   -- OI_2: picks join order {T3} and calls T3.optimizeIt()
   -- T3: returns a cost of 64700.
   -- OI_2: saves 64700 as new best cost and tells T3 to save it.
   -- X2: calls OI_2.getOptimizedCost(), which returns 64700. X2
	then returns 64700 to OI_0.
   -- OI_0: saves 20 + 64700 = 64720 as new best cost and tells
	X1 to save 20 and X2 to save 64700.

   -- OI_0: picks join order {X2, X1} and calls X2.optimizeIt()
   -- X2: *fetches* OI_2 and makes calls to optimize it.
   -- OI_2: picks join order {T3} and calls T3.optimizeIt()
   -- T3: returns a cost of 10783.
   -- OI_2: saves 10783 as new best cost and tells T3 to save it.
   -- X2: calls OI_2.getOptimizedCost(), which returns 10783.  X2
	then returns 10783 to OI_0.
   -- OI_0: calls X1.optimizeIt()
   -- X1: *fetches* OI_1 and makes calls to optimize it.
   -- OI_1: picks join order {T1} and calls T1.optimizeIt()
   -- T1: returns a cost of *1 MILLION!*.
   -- OI_1: rejects new cost (1 mil > 20) and does nothing.
   -- X1: calls OI_1.getOptimizedCost(), which returns *20*.  X1
	then returns 20 to OI_0...this seems WRONG!
   -- OI_0: saves 10783 + 20 = 10803 as new best cost and tells
	X2 to save 10783 and X1 to save 20.

So in the end, the outer-most OptimizerImpl chooses join order {X2, X1} because 
it thought the cost of this join order was only 10783, which is better than 
64720.  However, the _actual_ cost of the join order was really estimated at 1 
million--so the outer OptimizerImpl chose (and will generate) a plan that, 
according to the estimates, was (hugely) sub-optimal.

As it turned out, the actual difference between {X1,X2} and {X2,X1} for this 
particular query was minimal, so the behavior wasn't a big deal (I guess the 
estimates were off--but that's not the point ;)

But in the grand scheme of things, this behavior looks WRONG to me.  I think the 
correct thing is to reset an OptimizerImpl's best plan whenever it is "fetched" 
as part of a getOptimizer() call.  If we did that, then instead of rejecting the 
new cost of 1 million, OI_1 would have accepted it because it was the first plan 
that it found for "this round".  Then when X1 called "OI_1.getOptimizedCost()", 
it would have seen how bad the plan really was and propagated that information 
back to OI_0, who then would have rejected join order {X2,X1} in favor of {X1,X2}.

That said, I made what I thought was a simple change to address this 
issue--namely, resetting the bestPlan for an Optimizer on every call to 
getOptimizer()--and ran derbylang.  I saw four diffs: two seemed okay (rows had 
changed order, which was fine) but two were definite failures: one in 
refActions1.sql and one in views.sql; both failed because with "ERROR 42Y69: No 
valid execution plan was found for this statement."

Keep in mind: this was on a _clean_ codeline, so my predicate pushing logic for 
DERBY-805 isn't an issue here.

Can anyone comment on this behavior?  Namely, can I get feedback as to whether 
my interpretation of what makes a "bestPlan"--meaning the best plan in "this 
round"--is correct?  And if so, is this a bug in the current Derby optimizer? 
And if so, any ideas as to how this could be causing the two new failures in 
view.sql and refActions1.sql?

Any input/reactions/feedback/corrections at all would be greatly appreciated...

Thanks,
Army


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Army <qo...@sbcglobal.net>.
Army wrote:
> Okay, so maybe that'not as clear as I'd like it to be.  However, when I 
> post these changes to DERBY-805 (as Phase 1 "v2"), I will also update 
> the DERBY-805 document to include this description and a walk-through 
> example showing how it's all supposed to work.

I just attached the new patch with a corresponding description (and walk-through 
example) to DERBY-805.  The new changes match what I proposed in my earlier 
email, with one minor exception: the call to addOrLoadBestPlan() when placing an 
Optimizable was removed, as it turns out that it wasn't needed.

Thanks,
Army


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Army <qo...@sbcglobal.net>.
Jeffrey Lichtman wrote:
> I think it may be necessary when remembering "truly the best" access 
> path at any level of subquery to tell each table subquery to remember 
> its best access path as "truly the best" also, and to recurse through 
> all the nested subqueries to do this. I don't think this would be very 
> hard to implement. 

I was actually toying around with something like this yesterday, so it's nice to 
hear you suggest a similar approach :)  I wrote up a solution that tries to 
implement it; I ran derbyall last night with the changes and there were no 
failures.  I also ran my predicate pushing code locally with these changes 
applied (in place of the Phase 1 patch) and everything worked there, too.  So I 
think this is the right way to go--and it feels more robust to me than the first 
Phase 1 patch for DERBY-805.  So thanks for the ideas and reassurance...

> I only worry about what would happen with deeply-nested subqueries - this 
> could create an order 2**n algorithm (that is, one whose time would grow 
> exponentially with the number of levels of nesting).

I guess my answer to that is if we're dealing with deeply nested queries, the 
caller will probably be preparing the query once and then executing it multiple 
times, in which case the extra optimization cost would be an up-front, one time 
thing while the benefits from doing the extra work could lead to huge 
performance savings for every execution of the query thereafter (when combined 
with predicate pushdown, that is).  Depending on just how much overhead we're 
adding to the optimization phase, this seems like an acceptable trade-off to me...

The solution I've coded tries to do what you describe.  I'll outline it here and 
post the patch as a new "Phase 1" patch for DERBY-805 since as I said earlier, I 
think this issue is really the issue that Phase 1 is trying to address.

First, I added a new field called "optimizerToBestPlanMap" in FromTable, which 
is the parent class to all Optimizables.  This field (which is a HashMap) holds 
"truly the best access path" (TTBP) for the Optimizable with respect to every 
OptimizerImpl at or above it, where "at" refers to the OptimizerImpl to which 
the Optimizable directly belongs.

I then added a new method called "addOrLoadBestPlanMapping" to the Optimizable 
interface, implemented in FromTable.  The signature for this method is:

+	public void addOrLoadBestPlanMapping(boolean doAdd,
+		Optimizer optimizer) throws StandardException
+	{

If "doAdd" is true then we will take the Optimizable's trulyTheBestAccessPath 
field and put a copy of it into optimizerToBestPlanMap, with the key being the 
received optimizer.  If "doAdd" is false, then we will search for the hash map 
for a TTBP that corresponds to the received optimizer, and if we find one we 
will load that path information into the Optimizable's trulyTheBestAccessPath field.

This new method is over-written by the two Optimizable classes that can have 
children: namely, SingleChildResultSetNode and TableOperatorNode.  In each case 
the method will first call super.addOrLoadBestPlanMapping(), which adds/loads 
TTBP for the node itself, and then it will recursively call that method on the 
child result in two cases: 1) if the child is another Optimizable, then the call 
is made directly on that child; 2) if the child is a SelectNode, then we call a 
new method on SelectNode that will iterate through all Optimizables in its FROM 
list and recursively call addOrLoadBestPlanMapping().  By doing so we "recurse 
through all nested subqueries", as Jeff said. For example, in 
SingleChildResultSetNode we have the following:

+	/** @see Optimizable#addOrLoadBestPlanMapping */
+	public void addOrLoadBestPlanMapping(boolean doAdd,
+		Optimizer optimizer) throws StandardException
+	{
+		super.addOrLoadBestPlanMapping(doAdd, optimizer);
+		if (childResult instanceof Optimizable)
+		{
+			((Optimizable)childResult).
+				addOrLoadBestPlanMapping(doAdd, optimizer);
+		}
+		else if (childResult instanceof SelectNode)
+		{
+			((SelectNode)childResult).
+				addOrLoadBestPlanMappings(doAdd, optimizer);
+		}
+	}

The third thing I've done is change the "rememberAsBest()" method on the 
Optimizable interface to take an instance of Optimizer as an argument.  Then 
whenever an OptimizerImpl decides it has found a best overall path and calls 
rememberAsBest(), we make an additional call (within rememberAsBest()) to the 
new addOrLoadBestPlanMapping() method with "doAdd" set to true and passing in 
the optimizer.  This means that whenever an OptimizerImpl finds a best plan, 
every Optimizable in its list (including all Optimizables found recursively 
through subqueries) will remember what its TTBP was with respect to that 
OptimizerImpl.

And as a last step, I've added two calls to addOrLoadBestPlanMapping() to the 
OptimizerImpl class itself.  The first is made with "doAdd" set to true and is 
called whenever we place an Optimizable at a join position, just before we call 
"startOptimizing()".  We need this call in order to remember what TTBP for each 
Optimizable (including those within subqueries) is _before_ we begin optimizing, 
so that if the join order for the current OptimizerImpl ends up being worse than 
a previous join order, we can "revert" the Optimizable (and recursively any 
Optimizables from nested subqueries) back to its TTBP for the OptimizerImpl's 
previously-determined best join order.

The second call to addOrLoadBestPlanMapping() comes when we pull an Optimizable 
from its current join position, in which case doAdd is "false", which we means 
we take the Optimizable's TTBP corresponding to the current OptimizerImpl and 
load it into the Optimizable's trulyTheBestAccessPath field.  If the 
OptimizerImpl's most recent join order was the best one so far, TTBP will end up 
holding the plan that was saved for that join order; otherwise, TTBP will end up 
holding the "reverted" plan, which is the plan the Optimizable had for whatever 
join order was previously considered best.

Okay, so maybe that'not as clear as I'd like it to be.  However, when I post 
these changes to DERBY-805 (as Phase 1 "v2"), I will also update the DERBY-805 
document to include this description and a walk-through example showing how it's 
all supposed to work.  In the meantime, if anyone understands this description 
well enough to point out any glaring problems, that'd be great.  Otherwise, I'll 
work on cleaning up the patch and adding an example to DERBY-805.html, and will 
post both later today...

Many thanks to all who continue to follow these optimizer threads,
Army


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Jeffrey Lichtman <sw...@rcn.com>.
>>I believe the optimizer currently keeps track of only two "best 
>>plans" - the best access path for each table in a query (or 
>>subquery) as it's currently being considered in a join order, and 
>>the best overall join order and path for each table for the best 
>>plan it's found so far.
>
>Am I right in thinking that the first of these two "best plans" is 
>stored in "bestAccessPath", and the second is stored in 
>"trulyTheBestAccessPath"?  That's how I read it, but I'd just like 
>to make sure.

Yes, that's right.

. . .
<a bunch of stuff deleted>
. . .

>Does that sound right, or am I being too optimistic?

I think it's right as far as it goes, but still doesn't fix the other 
issues discussed below:

>I've been struggling to wrap my head around this question, as well 
>(at least, I *think* this is the question with which I've been 
>struggling ;)  And ultimately, I think this ties back to what I was 
>trying to do with the Phase 1 patch for DERBY-805--namely, keep 
>track of "the plan to use with the best plan found for the outer 
>query".  But based on this discussion, I wonder if the patch I 
>posted goes far enough--if there are subqueries beneath subqueries, 
>such as can happen if we have unions beneath unions beneath unions, 
>how do we get each of those subqueries to remember its best plan 
>with respect to _every_ level of outer query (there could be 
>several) that sits above it?

I think it may be necessary when remembering "truly the best" access 
path at any level of subquery to tell each table subquery to remember 
its best access path as "truly the best" also, and to recurse through 
all the nested subqueries to do this. I don't think this would be 
very hard to implement. I only worry about what would happen with 
deeply-nested subqueries - this could create an order 2**n algorithm 
(that is, one whose time would grow exponentially with the number of 
levels of nesting).

>That's the question I've been pondering for the past two days; is 
>that also what you are referring to by the above "I'll have to think 
>about" statement?  Or am I taking this in a different direction?

Yes, I believe we're thinking about the same thing.


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Army <qo...@sbcglobal.net>.
> I think you've found a bug, but the solution you're trying may not be 
> correct. I believe the optimizer currently keeps track of only two "best 
> plans" - the best access path for each table in a query (or subquery) as 
> it's currently being considered in a join order, and the best overall 
> join order and path for each table for the best plan it's found so far. 

Am I right in thinking that the first of these two "best plans" is stored in 
"bestAccessPath", and the second is stored in "trulyTheBestAccessPath"?  That's 
how I read it, but I'd just like to make sure.

> As for the problems you're seeing in view.sql and refActions1.sql, I
> suspect it's due to the fact that in some queries there are certain join
> orders that won't work (e.g. if you pass a column from one table to the
> constructor for a VTI, the VTI must come after the other table in the
> join order). Your attempted fix is probably clobbering the original best
> plan for the subquery in a case where no subsequent best plan is
> possible.

Okay, this definitely makes sense.  I went back and looked at my quick-fix and I 
see that I was a bit overzealous--I reset not only the bestCost for the 
OptimizerImpl, but also the "foundABestPlan" flag and the "bestJoinOrder" field. 
  The fact that I cleared the foundABestPlan flag was what caused the error I 
mentioned.

That said, I then changed the code so that it only resets the bestCost field (by 
setting it to the max value possible); I left foundABestPlan and bestJoinOrder 
untouched.  When I did this, views.sql and refActions1.sql both passed.  At 
first I thought this may have just been dumb luck, but the more I think about 
it, the more I think this is the correct thing to do.  Since, as you said, it's 
possible that "there is no subsequent best plan" for the subquery, we need to 
keep hold of the fact that we did, at some point, find a best plan, and we need 
to know what the corresponding join order is.  That said, resetting bestCost to 
the max value does two things:

  1. It ensures that if at least one best plan is found for the current subquery 
with respect to the current join order of the outer query, that best plan will 
be saved as the best plan for the subquery and it's cost will be returned 
accordingly.

  2. It ensures that if no best plan is found, then the cost for the subquery 
will be outrageously high (Double.MAX_VALUE) which means the outer query will 
not choose its current join order--which is good, because the subquery can't be 
executed with that outer join order.  Further, we will still know that there was 
at some point some outer join order for which the subquery had a valid best plan 
(because foundABestPlan will still be set) and we will still know the subquery's 
join order for that best plan (because bestJoinOrder will still be set).

Number 1 would, I believe, fix the issue that started this thread, because the 
"1 million" plan would be saved (1 million is less than Double.MAX_VALUE) and 
then rejected as too expensive.  Number 2 would, I believe, address your comment 
that there may in fact be "no subsequent best plan", which means the subquery 
must know that it did have a valid plan at one point, and it must know what the 
join order for that plan was.  The only information we lose is the cost of that 
earlier plan--but the cost should already have been viewed and handled by the 
outer query at the time it was first returned, so I don't think it's required 
any more...?  The fact that views.sql and refActions1.sql pass with this change 
suggests that this is along the right lines; I would of course like to have 
someone confirm if I'm seeing this correctly...

Does that sound right, or am I being too optimistic?

> For subqueries it may need to keep track of another "best plan" - the 
> plan to use with the best plan found for the outer query.
> 
> I'll have to think about whether it gets more complicated than this if 
> subqueries are nested. Would the optimizer have to keep track of even 
> more levels of "best plan" in cases like this (I hope not).

I've been struggling to wrap my head around this question, as well (at least, I 
*think* this is the question with which I've been struggling ;)  And ultimately, 
I think this ties back to what I was trying to do with the Phase 1 patch for 
DERBY-805--namely, keep track of "the plan to use with the best plan found for 
the outer query".  But based on this discussion, I wonder if the patch I posted 
goes far enough--if there are subqueries beneath subqueries, such as can happen 
if we have unions beneath unions beneath unions, how do we get each of those 
subqueries to remember its best plan with respect to _every_ level of outer 
query (there could be several) that sits above it?

That's the question I've been pondering for the past two days; is that also what 
you are referring to by the above "I'll have to think about" statement?  Or am I 
taking this in a different direction?

In any event, since it appears that this is in fact an issue with the optimizer, 
I'll file a Jira entry for tracking purposes.

Thanks for the reply, Jeff.  This was very helpful...
Army


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Jeffrey Lichtman <sw...@rcn.com>.
>The optimizer will have to remember this best plan even when it 
>considers a different join order for the outer query, because it's 
>possible that the subquery's best plan for that optimization will be 
>the best one overall.

It occurs to me that the code structure already supports this. Every 
time the optimizer figures out the best access path for an 
optimizable in the context of a join order it's considering, it tells 
the optimizable to remember that access path as the "best," and when 
it finds a complete plan that's the best so far, it tells the 
optimizable to remember it as "truly the best." When optimizing an 
outer query containing a table subquery, the "best" and "truly the 
best" access paths for the subquery should be the complete plan for 
the subquery from the current (or latest) optimization of it.


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?

Posted by Jeffrey Lichtman <sw...@rcn.com>.
>Can anyone comment on this behavior?  Namely, can I get feedback as 
>to whether my interpretation of what makes a "bestPlan"--meaning the 
>best plan in "this round"--is correct?  And if so, is this a bug in 
>the current Derby optimizer? And if so, any ideas as to how this 
>could be causing the two new failures in view.sql and refActions1.sql?

When optimizing a table subquery (a subquery in the FROM clause), we 
must re-optimize the subquery every time it is put into a new 
position in the outer query's join order. Once this is done, there 
will (probably) be a best plan for the subquery in that context. The 
optimizer will have to remember this best plan even when it considers 
a different join order for the outer query, because it's possible 
that the subquery's best plan for that optimization will be the best 
one overall. This means that there should be two different best plans 
for the subquery: the best plan found for the current optimization of 
the subquery, and the best plan for all optimizations of the subquery.

I think you've found a bug, but the solution you're trying may not be 
correct. I believe the optimizer currently keeps track of only two 
"best plans" - the best access path for each table in a query (or 
subquery) as it's currently being considered in a join order, and the 
best overall join order and path for each table for the best plan 
it's found so far. For subqueries it may need to keep track of 
another "best plan" - the plan to use with the best plan found for 
the outer query.

I'll have to think about whether it gets more complicated than this 
if subqueries are nested. Would the optimizer have to keep track of 
even more levels of "best plan" in cases like this (I hope not).

BTW, this problem doesn't happen with regular (non-table) subqueries, 
because Derby defers their optimization until the optimization of the 
outer query is finished.

As for the problems you're seeing in view.sql and refActions1.sql, I 
suspect it's due to the fact that in some queries there are certain 
join orders that won't work (e.g. if you pass a column from one table 
to the constructor for a VTI, the VTI must come after the other table 
in the join order). Your attempted fix is probably clobbering the 
original best plan for the subquery in a case where no subsequent 
best plan is possible. It would help if you posted the queries that 
got these errors.


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/