You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Jeff Lichtman (JIRA)" <de...@db.apache.org> on 2006/02/16 09:12:03 UTC

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366598 ] 

Jeff Lichtman commented on DERBY-805:
-------------------------------------

I have been reading Army's (A B''s) document, and I have some questions.

How could a predicate be pushable to only one side of a union? Can you provide an example of a predicate that can be pushed only to one side?

Why are you only dealing with join predicates? It would also be useful to push simple search arguments (i.e. a column compared to a constant), and this case might  be more common than join predicates:

create view v as select * from t1 union select * from t2

select * from v where c = 1

I would prefer to see any type of predicate pushed into a union - even those containing complex expressions. This might be hard to implement, though, as I don't know whether the cloning methods are implemented for the entire ValueNode hierarchy.


> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Satheesh Bandaram <sa...@Sourcery.Org>.

Army wrote:

> Jeff Lichtman (JIRA) wrote:
>
>> Why are you only dealing with join predicates? It would also be
>> useful to push simple search arguments (i.e. a column compared to a
>> constant), and this case might  be more common than join predicates:
>
>
> <Jeff Lichtman>
> > BTW, the business of pushing and pulling predicates during
> optimization can be
> > hard to understand and debug, so maybe it's best to only handle the
> simple
> > cases and do it during preprocessing.
>
> <Satheesh>
> The pushing is done before optimization... during preprocessing. [ ...
> ] You bring up a *great *point about pushing join predicates. I am not
> implementing this for UnionNode.

Right... I think more appropriate place to push single-table scalar
predicates is in preprocessing stage. My current changes don't yet
address all possible scalar predicate push downs. It should be possible
to push expressions like upper(name)='DERBY' or T1.a+T1.b=5. Even if
they don't help in choosing index, if available, evaluating expressions
closer to data could save large amount of processing.

That said, one advantage of pushing scalar predicates later during
optimization may be that more of join predicates may qualify to be
pushed. It may be possible to push T1.a+T1.b = T2.a+T2.b into T1 or T2
depending on join order. We may still be able to achieve this, but may
duplicate a lot of code from preprocessing time.

>> I would prefer to see any type of predicate pushed into a union -
>> even those containing complex expressions. This might be hard to
>> implement, though, as I don't know whether the cloning methods are
>> implemented for the entire ValueNode hierarchy.
>
>
> Sounds like an "itch" to me :) While it might indeed be nice to push
> predicates containing complex expressions, that's another enhancement
> of its own.  I won't be doing that with my DERBY-805 changes.

Right... I didn't find any existing code in Derby that clones an entire
expression for pushing. I cooked up some predicate copying logic that
only handles very simple cases. That improvement could be taken up
separately.

>
> Thanks again--I can't say that enough--for reading the document.  It's
> a huge one and I'm grateful for your time and feedback.

Thank you for working on this. This is a good improvement to Derby.

Satheesh



Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Daniel John Debrunner <dj...@apache.org>.
Army wrote:

> Jeff Lichtman (JIRA) wrote:

>> I would prefer to see any type of predicate pushed into a union - even
>> those containing complex expressions. This might be hard to implement,
>> though, as I don't know whether the cloning methods are implemented
>> for the entire ValueNode hierarchy.
> 
> 
> Sounds like an "itch" to me :) While it might indeed be nice to push
> predicates containing complex expressions, that's another enhancement of
> its own.  I won't be doing that with my DERBY-805 changes.

Right, sounds like a good approach, don't overload a single contained
change with other changes that add complexity and potentially make it
harder to review, and therefore harder to catch bugs.

Incremental development :-)

Dan.


Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Jeffrey Lichtman <sw...@rcn.com>.
>>How could a predicate be pushable to only one side of a union? Can 
>>you provide an example of a predicate that can be pushed only to one side?
>
>If we take something along the lines of:
>
>select ... from
>   t2,
>   (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
>where X1.a = t2.i;
>
>In this case the predicate X1.a = t2.i could be pushed to the left 
>("select * from t1") and used when reading T1, but couldn't be 
>pushed to the VALUES clause  because there's no underlying table.

OK. One way to deal with this is to put a ProjectRestrictNode between 
the union node and the values clause as a place to park the 
predicate. To make things simple, you might want to always put 
ProjectRestrictNodes under both sides of the union during 
preprocessing (i.e. after binding but before optimization). In some 
cases the extra nodes won't be needed, but ProjectRestrictNodes (and 
the corresponding ProjectRestrictResultSets) are cheap. Also, you 
could eliminate unnecessary ProjectRestrictNodes at the end of 
optimization (possibly in modifyAccessPaths()).

This approach would give better performance in some cases, and could 
simplify the code (since you wouldn't have to figure out when the 
predicates are pushable).


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Army <qo...@sbcglobal.net>.
Jeff Lichtman (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366598 ] 
> 
> Jeff Lichtman commented on DERBY-805:
> -------------------------------------
> 
> I have been reading Army's (A B''s) document, and I have some questions.

Thank you very much for taking the time to read it.  I really appreciate it.

> How could a predicate be pushable to only one side of a union? Can you 
> provide an example of a predicate that can be pushed only to one side?

If we take something along the lines of:

select ... from
   t2,
   (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
where X1.a = t2.i;

In this case the predicate X1.a = t2.i could be pushed to the left ("select * 
from t1") and used when reading T1, but couldn't be pushed to the VALUES clause 
  because there's no underlying table.  If we pushed to the left but not to the 
right, then removed it from the UnionNode's predList--which is the 
restrictionList of the ProjectRestrictNode above the UnionNode--the rows from 
the right would remain unqualified and thus we'd return incorrect results (more 
rows that intended).

> Why are you only dealing with join predicates? It would also be useful to push simple 
> search arguments (i.e. a column compared to a constant), and this case might  be more 
> common than join predicates:

Actually, when I first made the changes described in the document, I pushed any 
predicate that was a binary relational operator with a column reference on at 
least one side and a query-invariant value (i.e. constant or parameter) on 
whichever side was not a column reference (if either).  This covered the case of 
a column compared to a constant.  All of my changes worked, so from a 
logical/coding perspective we could indeed do just that.  However, I then put in 
the join predicate limitation because it seemed to me (based on very brief 
inspection) that the case of a comparison with a constant was covered by 
Satheesh's fix for DERBY-649, so I thought it might be extra unnecessary work to 
continually push/pull those predicates throughout the optimization process.  The 
following comments re: DERBY-649 made me think I didn't need to worry about 
one-sided predicates:

<Jeff Lichtman>
 > BTW, the business of pushing and pulling predicates during optimization can be
 > hard to understand and debug, so maybe it's best to only handle the simple
 > cases and do it during preprocessing.

<Satheesh>
The pushing is done before optimization... during preprocessing. [ ... ] You 
bring up a *great *point about pushing join predicates. I am not implementing 
this for UnionNode.

And of the course, the "summary" of DERBY-805 itself says "Push join predicates 
into union and other set operations. DERBY-649 implemented scalar (single table) 
predicate pushdown. Adding join predicate push down could improve performance 
significantly."

So given that, I figured the goal for DERBY-805 was to focus on pushing join 
predicates--and that's what I've done.  One final comment from OptimizerImpl 
further prompted me lean toward this limitation:

	/*
	** Pull the predicates at from the optimizable and put
	** them back in the predicate list.
	**
	** NOTE: This is a little inefficient because it pulls the
	** single-table predicates, which are guaranteed to always
	** be pushed to the same optimizable.  We could make this
	** leave the single-table predicates where they are.
	*/
	pullMe.pullOptPredicates(predicateList);

So it seemed like pushing more single-sided predicates would be adding to the 
"inefficiency" mentioned here, and since the predicates are (as I understand it) 
already handled in preprocessing for DERBY-649, I didn't think we'd benefit from 
pushing them during optimization.  Perhaps I'm missing something somewhere or 
drawing the wrong conclusion?

> I would prefer to see any type of predicate pushed into a union - even those 
> containing complex expressions. This might be hard to implement, though, as I 
> don't know whether the cloning methods are implemented for the entire ValueNode 
> hierarchy.

Sounds like an "itch" to me :) While it might indeed be nice to push predicates 
containing complex expressions, that's another enhancement of its own.  I won't 
be doing that with my DERBY-805 changes.

Thanks again--I can't say that enough--for reading the document.  It's a huge 
one and I'm grateful for your time and feedback.

Army