You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Army <qo...@sbcglobal.net> on 2006/02/17 19:37:05 UTC
[OPTIMIZER] OptimizerImpl "best plans" for subqueries?
As I was tracing through code for some work that I'm doing for DERBY-805, I came
across a scenario in the Optimizer that looks to me like it could be a bug. I'm
wondering if anyone out there can either confirm that this is a bug or else help
correct my understanding of how this is supposed to work...
An example of a scenario that's causing me to wonder is that of unflattened
subqueries. Ex:
select x1.j, x2.b from
(select distinct i,j from t1) x1,
(select distinct a,b from t3) x2
where x1.i = x2.a;
The first thing to note is that during optimization of this query we will create
three instancesof OptimizerImpl:
OI_0: For "select x1.j, x2.b from x1, x2 where x1.i = x2.a"
OI_1: For "select distinct i,j from t1"
OI_2: For "select distinct a,b from t3"
These OptimizerImpls are "created" from calls to the getOptimizer() method on
the respective SelectNode. I put "created" in quotes because the
OptimizerImpl's are only created on the first call to SelectNode.getOptimizer();
thereafer, any calls to that method will return the same instance of the
OptimizerImpl that was created the first time.
Each OptimizerImpl stores information about the best plan that it has found so
far, namely best join order, best cost (for that join order), and best plan type
("normal" or "sort avoidance"). I use the term "bestPlan" to refer these three
values collectively, though that particular field doesn't actually exist in the
code.
All of that said, I _think_ that bestPlan is intended to hold the best plan seen
by the OptimizerImpl for this round of optimizing, where "this round" means
"from the last time a call to getOptimizer() returned this OptimizerImpl until a
call to this OptimizerImpl's getNextPermutation() method returns FALSE." For
every bestPlan that the OptimizerImpl finds, it calls "rememberAsBest" on each
Optimizable in its list, which means that trulyTheBestAccessPath for each
Optimizable will be updated to hold that Optimizable's best plan with respect to
the OptimizerImpl's current join order.
When I was tracing through code for pushing predicates, though, I came to the
realization that either:
1) my definition of a "round" is incorrect, or
2) the definition is correct, but the implementation is incorrect.
The reason I believe this to be the case is that bestPlan is never
reset. For non-nested queries this isn't problem because the outer
SelectNode only calls "getOptimizer()" one time and then fetches the
bestPlan after that:
optimizer = getOptimizer(fromList,
wherePredicates,
dataDictionary,
orderByList);
optimizer.setOuterRows(outerRows);
/* Optimize this SelectNode */
while (optimizer.getNextPermutation())
{
while (optimizer.getNextDecoratedPermutation())
{
optimizer.costPermutation();
}
}
/* Get the cost */
costEstimate = optimizer.getOptimizedCost();
With subqueries, though, a SelectNode will call getOptimizer() once for _every_
complete join order analyzed by the outer OptimizerImpl. I think the inner
OptimizerImpl is then supposed to "start fresh" and return the best permutation
for the inner OptimizerImpl with respect to the _current_ permutation of the
_outer_ query. That's what my above definition of "this round" translates to.
However, since we never reset the inner OptimizerImpl's bestPlan, it's possible
that _all_ permutations of the inner OptimizerImpl will be rejected because they
are more expensive than a previous bestPlan from a previous permutation of the
outer query. That means that the call to optimizer.getOptimizedCost() will
return the bestPlan found for an outer query join order that is _not_ the
current join order. As explained below, this can cause the outer-most optimizer
to choose a plan that is not optimal.
Let's walk through this using the above example:
select x1.j, x2.b from
(select distinct i,j from t1) x1,
(select distinct a,b from t3) x2
where x1.i = x2.a;
Note that I ran this query against a clean codeline and did actually witness the
behavior described here (T1 had 1 row, T3 had 50,000).
-- Top-level call is made to the optimize() method of the
outermost SelectNode, which creates OI_0.
-- OI_0: picks join order {X1, X2} and calls X1.optimizeIt()
-- X1: *creates* OI_1 and makes calls to optimize it.
-- OI_1: picks join order {T1} and calls T1.optimizeIt()
-- T1: returns a cost of 20.
-- OI_1: saves 20 as new best cost and tells T1 to save it.
-- X1: calls OI_1.getOptimizedCost(), which returns 20. X1
then returns 20 to OI_0.
-- OI_0: calls X2.optimizeIt()
-- X2: *creates* OI_2 and makes calls to optimize it.
-- OI_2: picks join order {T3} and calls T3.optimizeIt()
-- T3: returns a cost of 64700.
-- OI_2: saves 64700 as new best cost and tells T3 to save it.
-- X2: calls OI_2.getOptimizedCost(), which returns 64700. X2
then returns 64700 to OI_0.
-- OI_0: saves 20 + 64700 = 64720 as new best cost and tells
X1 to save 20 and X2 to save 64700.
-- OI_0: picks join order {X2, X1} and calls X2.optimizeIt()
-- X2: *fetches* OI_2 and makes calls to optimize it.
-- OI_2: picks join order {T3} and calls T3.optimizeIt()
-- T3: returns a cost of 10783.
-- OI_2: saves 10783 as new best cost and tells T3 to save it.
-- X2: calls OI_2.getOptimizedCost(), which returns 10783. X2
then returns 10783 to OI_0.
-- OI_0: calls X1.optimizeIt()
-- X1: *fetches* OI_1 and makes calls to optimize it.
-- OI_1: picks join order {T1} and calls T1.optimizeIt()
-- T1: returns a cost of *1 MILLION!*.
-- OI_1: rejects new cost (1 mil > 20) and does nothing.
-- X1: calls OI_1.getOptimizedCost(), which returns *20*. X1
then returns 20 to OI_0...this seems WRONG!
-- OI_0: saves 10783 + 20 = 10803 as new best cost and tells
X2 to save 10783 and X1 to save 20.
So in the end, the outer-most OptimizerImpl chooses join order {X2, X1} because
it thought the cost of this join order was only 10783, which is better than
64720. However, the _actual_ cost of the join order was really estimated at 1
million--so the outer OptimizerImpl chose (and will generate) a plan that,
according to the estimates, was (hugely) sub-optimal.
As it turned out, the actual difference between {X1,X2} and {X2,X1} for this
particular query was minimal, so the behavior wasn't a big deal (I guess the
estimates were off--but that's not the point ;)
But in the grand scheme of things, this behavior looks WRONG to me. I think the
correct thing is to reset an OptimizerImpl's best plan whenever it is "fetched"
as part of a getOptimizer() call. If we did that, then instead of rejecting the
new cost of 1 million, OI_1 would have accepted it because it was the first plan
that it found for "this round". Then when X1 called "OI_1.getOptimizedCost()",
it would have seen how bad the plan really was and propagated that information
back to OI_0, who then would have rejected join order {X2,X1} in favor of {X1,X2}.
That said, I made what I thought was a simple change to address this
issue--namely, resetting the bestPlan for an Optimizer on every call to
getOptimizer()--and ran derbylang. I saw four diffs: two seemed okay (rows had
changed order, which was fine) but two were definite failures: one in
refActions1.sql and one in views.sql; both failed because with "ERROR 42Y69: No
valid execution plan was found for this statement."
Keep in mind: this was on a _clean_ codeline, so my predicate pushing logic for
DERBY-805 isn't an issue here.
Can anyone comment on this behavior? Namely, can I get feedback as to whether
my interpretation of what makes a "bestPlan"--meaning the best plan in "this
round"--is correct? And if so, is this a bug in the current Derby optimizer?
And if so, any ideas as to how this could be causing the two new failures in
view.sql and refActions1.sql?
Any input/reactions/feedback/corrections at all would be greatly appreciated...
Thanks,
Army
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Army <qo...@sbcglobal.net>.
Army wrote:
> Okay, so maybe that'not as clear as I'd like it to be. However, when I
> post these changes to DERBY-805 (as Phase 1 "v2"), I will also update
> the DERBY-805 document to include this description and a walk-through
> example showing how it's all supposed to work.
I just attached the new patch with a corresponding description (and walk-through
example) to DERBY-805. The new changes match what I proposed in my earlier
email, with one minor exception: the call to addOrLoadBestPlan() when placing an
Optimizable was removed, as it turns out that it wasn't needed.
Thanks,
Army
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Army <qo...@sbcglobal.net>.
Jeffrey Lichtman wrote:
> I think it may be necessary when remembering "truly the best" access
> path at any level of subquery to tell each table subquery to remember
> its best access path as "truly the best" also, and to recurse through
> all the nested subqueries to do this. I don't think this would be very
> hard to implement.
I was actually toying around with something like this yesterday, so it's nice to
hear you suggest a similar approach :) I wrote up a solution that tries to
implement it; I ran derbyall last night with the changes and there were no
failures. I also ran my predicate pushing code locally with these changes
applied (in place of the Phase 1 patch) and everything worked there, too. So I
think this is the right way to go--and it feels more robust to me than the first
Phase 1 patch for DERBY-805. So thanks for the ideas and reassurance...
> I only worry about what would happen with deeply-nested subqueries - this
> could create an order 2**n algorithm (that is, one whose time would grow
> exponentially with the number of levels of nesting).
I guess my answer to that is if we're dealing with deeply nested queries, the
caller will probably be preparing the query once and then executing it multiple
times, in which case the extra optimization cost would be an up-front, one time
thing while the benefits from doing the extra work could lead to huge
performance savings for every execution of the query thereafter (when combined
with predicate pushdown, that is). Depending on just how much overhead we're
adding to the optimization phase, this seems like an acceptable trade-off to me...
The solution I've coded tries to do what you describe. I'll outline it here and
post the patch as a new "Phase 1" patch for DERBY-805 since as I said earlier, I
think this issue is really the issue that Phase 1 is trying to address.
First, I added a new field called "optimizerToBestPlanMap" in FromTable, which
is the parent class to all Optimizables. This field (which is a HashMap) holds
"truly the best access path" (TTBP) for the Optimizable with respect to every
OptimizerImpl at or above it, where "at" refers to the OptimizerImpl to which
the Optimizable directly belongs.
I then added a new method called "addOrLoadBestPlanMapping" to the Optimizable
interface, implemented in FromTable. The signature for this method is:
+ public void addOrLoadBestPlanMapping(boolean doAdd,
+ Optimizer optimizer) throws StandardException
+ {
If "doAdd" is true then we will take the Optimizable's trulyTheBestAccessPath
field and put a copy of it into optimizerToBestPlanMap, with the key being the
received optimizer. If "doAdd" is false, then we will search for the hash map
for a TTBP that corresponds to the received optimizer, and if we find one we
will load that path information into the Optimizable's trulyTheBestAccessPath field.
This new method is over-written by the two Optimizable classes that can have
children: namely, SingleChildResultSetNode and TableOperatorNode. In each case
the method will first call super.addOrLoadBestPlanMapping(), which adds/loads
TTBP for the node itself, and then it will recursively call that method on the
child result in two cases: 1) if the child is another Optimizable, then the call
is made directly on that child; 2) if the child is a SelectNode, then we call a
new method on SelectNode that will iterate through all Optimizables in its FROM
list and recursively call addOrLoadBestPlanMapping(). By doing so we "recurse
through all nested subqueries", as Jeff said. For example, in
SingleChildResultSetNode we have the following:
+ /** @see Optimizable#addOrLoadBestPlanMapping */
+ public void addOrLoadBestPlanMapping(boolean doAdd,
+ Optimizer optimizer) throws StandardException
+ {
+ super.addOrLoadBestPlanMapping(doAdd, optimizer);
+ if (childResult instanceof Optimizable)
+ {
+ ((Optimizable)childResult).
+ addOrLoadBestPlanMapping(doAdd, optimizer);
+ }
+ else if (childResult instanceof SelectNode)
+ {
+ ((SelectNode)childResult).
+ addOrLoadBestPlanMappings(doAdd, optimizer);
+ }
+ }
The third thing I've done is change the "rememberAsBest()" method on the
Optimizable interface to take an instance of Optimizer as an argument. Then
whenever an OptimizerImpl decides it has found a best overall path and calls
rememberAsBest(), we make an additional call (within rememberAsBest()) to the
new addOrLoadBestPlanMapping() method with "doAdd" set to true and passing in
the optimizer. This means that whenever an OptimizerImpl finds a best plan,
every Optimizable in its list (including all Optimizables found recursively
through subqueries) will remember what its TTBP was with respect to that
OptimizerImpl.
And as a last step, I've added two calls to addOrLoadBestPlanMapping() to the
OptimizerImpl class itself. The first is made with "doAdd" set to true and is
called whenever we place an Optimizable at a join position, just before we call
"startOptimizing()". We need this call in order to remember what TTBP for each
Optimizable (including those within subqueries) is _before_ we begin optimizing,
so that if the join order for the current OptimizerImpl ends up being worse than
a previous join order, we can "revert" the Optimizable (and recursively any
Optimizables from nested subqueries) back to its TTBP for the OptimizerImpl's
previously-determined best join order.
The second call to addOrLoadBestPlanMapping() comes when we pull an Optimizable
from its current join position, in which case doAdd is "false", which we means
we take the Optimizable's TTBP corresponding to the current OptimizerImpl and
load it into the Optimizable's trulyTheBestAccessPath field. If the
OptimizerImpl's most recent join order was the best one so far, TTBP will end up
holding the plan that was saved for that join order; otherwise, TTBP will end up
holding the "reverted" plan, which is the plan the Optimizable had for whatever
join order was previously considered best.
Okay, so maybe that'not as clear as I'd like it to be. However, when I post
these changes to DERBY-805 (as Phase 1 "v2"), I will also update the DERBY-805
document to include this description and a walk-through example showing how it's
all supposed to work. In the meantime, if anyone understands this description
well enough to point out any glaring problems, that'd be great. Otherwise, I'll
work on cleaning up the patch and adding an example to DERBY-805.html, and will
post both later today...
Many thanks to all who continue to follow these optimizer threads,
Army
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Jeffrey Lichtman <sw...@rcn.com>.
>>I believe the optimizer currently keeps track of only two "best
>>plans" - the best access path for each table in a query (or
>>subquery) as it's currently being considered in a join order, and
>>the best overall join order and path for each table for the best
>>plan it's found so far.
>
>Am I right in thinking that the first of these two "best plans" is
>stored in "bestAccessPath", and the second is stored in
>"trulyTheBestAccessPath"? That's how I read it, but I'd just like
>to make sure.
Yes, that's right.
. . .
<a bunch of stuff deleted>
. . .
>Does that sound right, or am I being too optimistic?
I think it's right as far as it goes, but still doesn't fix the other
issues discussed below:
>I've been struggling to wrap my head around this question, as well
>(at least, I *think* this is the question with which I've been
>struggling ;) And ultimately, I think this ties back to what I was
>trying to do with the Phase 1 patch for DERBY-805--namely, keep
>track of "the plan to use with the best plan found for the outer
>query". But based on this discussion, I wonder if the patch I
>posted goes far enough--if there are subqueries beneath subqueries,
>such as can happen if we have unions beneath unions beneath unions,
>how do we get each of those subqueries to remember its best plan
>with respect to _every_ level of outer query (there could be
>several) that sits above it?
I think it may be necessary when remembering "truly the best" access
path at any level of subquery to tell each table subquery to remember
its best access path as "truly the best" also, and to recurse through
all the nested subqueries to do this. I don't think this would be
very hard to implement. I only worry about what would happen with
deeply-nested subqueries - this could create an order 2**n algorithm
(that is, one whose time would grow exponentially with the number of
levels of nesting).
>That's the question I've been pondering for the past two days; is
>that also what you are referring to by the above "I'll have to think
>about" statement? Or am I taking this in a different direction?
Yes, I believe we're thinking about the same thing.
- Jeff Lichtman
swazoo@rcn.com
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Army <qo...@sbcglobal.net>.
> I think you've found a bug, but the solution you're trying may not be
> correct. I believe the optimizer currently keeps track of only two "best
> plans" - the best access path for each table in a query (or subquery) as
> it's currently being considered in a join order, and the best overall
> join order and path for each table for the best plan it's found so far.
Am I right in thinking that the first of these two "best plans" is stored in
"bestAccessPath", and the second is stored in "trulyTheBestAccessPath"? That's
how I read it, but I'd just like to make sure.
> As for the problems you're seeing in view.sql and refActions1.sql, I
> suspect it's due to the fact that in some queries there are certain join
> orders that won't work (e.g. if you pass a column from one table to the
> constructor for a VTI, the VTI must come after the other table in the
> join order). Your attempted fix is probably clobbering the original best
> plan for the subquery in a case where no subsequent best plan is
> possible.
Okay, this definitely makes sense. I went back and looked at my quick-fix and I
see that I was a bit overzealous--I reset not only the bestCost for the
OptimizerImpl, but also the "foundABestPlan" flag and the "bestJoinOrder" field.
The fact that I cleared the foundABestPlan flag was what caused the error I
mentioned.
That said, I then changed the code so that it only resets the bestCost field (by
setting it to the max value possible); I left foundABestPlan and bestJoinOrder
untouched. When I did this, views.sql and refActions1.sql both passed. At
first I thought this may have just been dumb luck, but the more I think about
it, the more I think this is the correct thing to do. Since, as you said, it's
possible that "there is no subsequent best plan" for the subquery, we need to
keep hold of the fact that we did, at some point, find a best plan, and we need
to know what the corresponding join order is. That said, resetting bestCost to
the max value does two things:
1. It ensures that if at least one best plan is found for the current subquery
with respect to the current join order of the outer query, that best plan will
be saved as the best plan for the subquery and it's cost will be returned
accordingly.
2. It ensures that if no best plan is found, then the cost for the subquery
will be outrageously high (Double.MAX_VALUE) which means the outer query will
not choose its current join order--which is good, because the subquery can't be
executed with that outer join order. Further, we will still know that there was
at some point some outer join order for which the subquery had a valid best plan
(because foundABestPlan will still be set) and we will still know the subquery's
join order for that best plan (because bestJoinOrder will still be set).
Number 1 would, I believe, fix the issue that started this thread, because the
"1 million" plan would be saved (1 million is less than Double.MAX_VALUE) and
then rejected as too expensive. Number 2 would, I believe, address your comment
that there may in fact be "no subsequent best plan", which means the subquery
must know that it did have a valid plan at one point, and it must know what the
join order for that plan was. The only information we lose is the cost of that
earlier plan--but the cost should already have been viewed and handled by the
outer query at the time it was first returned, so I don't think it's required
any more...? The fact that views.sql and refActions1.sql pass with this change
suggests that this is along the right lines; I would of course like to have
someone confirm if I'm seeing this correctly...
Does that sound right, or am I being too optimistic?
> For subqueries it may need to keep track of another "best plan" - the
> plan to use with the best plan found for the outer query.
>
> I'll have to think about whether it gets more complicated than this if
> subqueries are nested. Would the optimizer have to keep track of even
> more levels of "best plan" in cases like this (I hope not).
I've been struggling to wrap my head around this question, as well (at least, I
*think* this is the question with which I've been struggling ;) And ultimately,
I think this ties back to what I was trying to do with the Phase 1 patch for
DERBY-805--namely, keep track of "the plan to use with the best plan found for
the outer query". But based on this discussion, I wonder if the patch I posted
goes far enough--if there are subqueries beneath subqueries, such as can happen
if we have unions beneath unions beneath unions, how do we get each of those
subqueries to remember its best plan with respect to _every_ level of outer
query (there could be several) that sits above it?
That's the question I've been pondering for the past two days; is that also what
you are referring to by the above "I'll have to think about" statement? Or am I
taking this in a different direction?
In any event, since it appears that this is in fact an issue with the optimizer,
I'll file a Jira entry for tracking purposes.
Thanks for the reply, Jeff. This was very helpful...
Army
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Jeffrey Lichtman <sw...@rcn.com>.
>The optimizer will have to remember this best plan even when it
>considers a different join order for the outer query, because it's
>possible that the subquery's best plan for that optimization will be
>the best one overall.
It occurs to me that the code structure already supports this. Every
time the optimizer figures out the best access path for an
optimizable in the context of a join order it's considering, it tells
the optimizable to remember that access path as the "best," and when
it finds a complete plan that's the best so far, it tells the
optimizable to remember it as "truly the best." When optimizing an
outer query containing a table subquery, the "best" and "truly the
best" access paths for the subquery should be the complete plan for
the subquery from the current (or latest) optimization of it.
- Jeff Lichtman
swazoo@rcn.com
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/
Re: [OPTIMIZER] OptimizerImpl "best plans" for subqueries?
Posted by Jeffrey Lichtman <sw...@rcn.com>.
>Can anyone comment on this behavior? Namely, can I get feedback as
>to whether my interpretation of what makes a "bestPlan"--meaning the
>best plan in "this round"--is correct? And if so, is this a bug in
>the current Derby optimizer? And if so, any ideas as to how this
>could be causing the two new failures in view.sql and refActions1.sql?
When optimizing a table subquery (a subquery in the FROM clause), we
must re-optimize the subquery every time it is put into a new
position in the outer query's join order. Once this is done, there
will (probably) be a best plan for the subquery in that context. The
optimizer will have to remember this best plan even when it considers
a different join order for the outer query, because it's possible
that the subquery's best plan for that optimization will be the best
one overall. This means that there should be two different best plans
for the subquery: the best plan found for the current optimization of
the subquery, and the best plan for all optimizations of the subquery.
I think you've found a bug, but the solution you're trying may not be
correct. I believe the optimizer currently keeps track of only two
"best plans" - the best access path for each table in a query (or
subquery) as it's currently being considered in a join order, and the
best overall join order and path for each table for the best plan
it's found so far. For subqueries it may need to keep track of
another "best plan" - the plan to use with the best plan found for
the outer query.
I'll have to think about whether it gets more complicated than this
if subqueries are nested. Would the optimizer have to keep track of
even more levels of "best plan" in cases like this (I hope not).
BTW, this problem doesn't happen with regular (non-table) subqueries,
because Derby defers their optimization until the optimization of the
outer query is finished.
As for the problems you're seeing in view.sql and refActions1.sql, I
suspect it's due to the fact that in some queries there are certain
join orders that won't work (e.g. if you pass a column from one table
to the constructor for a VTI, the VTI must come after the other table
in the join order). Your attempted fix is probably clobbering the
original best plan for the subquery in a case where no subsequent
best plan is possible. It would help if you posted the queries that
got these errors.
- Jeff Lichtman
swazoo@rcn.com
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/