You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-dev@db.apache.org by "Satheesh Bandaram (JIRA)" <de...@db.apache.org> on 2006/01/09 21:16:22 UTC

[jira] Created: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Key: DERBY-805
URL: http://issues.apache.org/jira/browse/DERBY-805
Project: Derby
Type: Sub-task
Components: SQL
Versions: 10.1.2.0, 10.2.0.0
Environment: generic
Reporter: Satheesh Bandaram
Fix For: 10.2.0.0

Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.

For example,

create view V1 as select i, j from T1 union all select i,j from T2;
create view V2 as select a,b from T3 union all select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like

select * from V1, V2 where V1.j = V2.b and V1.i =1;

If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.

Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)

If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.

For example:

create view v as select * from t1 union all select * from t2;

select * from v, t3 where v.c1 = t3.c2;

In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.

If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.

There's a comment in UnionNode.optimizeIt() saying:

/* RESOLVE - don't try to push predicated through for now */

This is where I'd expect to see something for pushing predicates into the union during optimization.

BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Comment: was deleted

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366831 ] 

A B commented on DERBY-805:
---------------------------

 <Jeff>
> It's OK with me if you check in this change with the current algorithm. If you do this,
> please add comments to the code saying what could be changed to make it possible
> to apply a pushed predicate to one side only.

Okay, will do. These comments will apply to the Phase 2 patch. Thanks for describing how the changes could work--that'll certain be a big help when it comes time to implement them.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366731 ] 

A B commented on DERBY-805:
---------------------------

Based on the 6 steps that I've outlined in DERBY-805.html, I plan to post four separate patches for this issue, in the following order:

-- Phase 1 -- The first patch will implement the changes as described in "Step 5" of the document, namely:

	5 - Ensure that the best access path for a UnionNode that pushes
		predicates is correctly saved during optimization and correctly
		retrieved when it comes time to finalize the query's overall
		access path.

The phase 1 patch shouldn't have any functional effect on the codeline; it's just an incremental step toward the complete fix.

-- Phase 2 -- The second patch will add code for "Step 1", which is:

	1 - Add the ability to take a predicate and scope it to a target
		result set so that it can be pushed to that result set. 

The phase 2 patch will add code to the codeline that won't actually get executed until phase 3.  Thus, like the phase 1 patch, the phase 2 patch should have no functional effects on the codeline.

-- Phase 3 -- The third patch will be the one that does the core of the predicate pushing/pulling.  This patch will cover Steps 2, 3, 4, and 6 of the steps outlined in the document.  Once this patch is committed, the changes for phase 1 and phase 2 will take effect--so phase 3 is also where I'll add tests to make sure predicates are being pushed correctly.  Note that after phase 3, any join predicates which are targeted for UNIONS will _always_ be pushed down into the union (assuming it's possible to do so).  This means that, as discussed in the document, there will be cases where Derby originally chose to do a Hash join using the predicate but will now (after phase 3) push the predicate and do a nested loop join.  This problem will then be addressed in phase 4.

-- Phase 4 -- The phase 4 patch will address any unresolved issues from the first three phases, which right now means that it will include the necessary changes to allow the optimizer to make a cost-based decision about whether or not it should push the predicates, instead of always pushing them.

The vast majority of the changes will go into the code as described in DERBY-805.html, with potential variations due to review comments and/or improvements/alterations that I find to be necessary as I work.  So anyone who might have the opportunity to review any of the phase 1-4 patches will hopefully find it useful to read the respective parts of the document...

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase1_v3.patch
                d805_phase1_v3.stat

Attaching an updated patch d805_phase1_v3.patch to address the issue in my last comment.  The differences between d805_phase1_v3.patch and d805_phase1_v2.patch are summarized here:

1) There was a slight off-by-one error in the optimizer JUMP logic that was causing it to pull Optimizables prematurely.  This error is what led me to realize that phase1_v2 was incomplete--however, the JUMP error _is_ still an error, so phas1_v3 fixes it (one-line change).

2) phase1_v3 adds logic to only reload best plans when required, instead of doing it every single time we pull (which is what phase1_v2 did).  Since plan "reload" can be relatively expensive--especially for deeply-nested subqueries--it's better to only do it when it's required.

3) phase1_v3 adds logic to remember nested OptimizerImpl "best join orders" in addition to nested Optimizable "best access paths".  The join orders have to saved with respect to outer queries, just like the access paths, but phase1_v2 did not do that.  phase1_v3 does.

I ran derbyall on Red Hat Linux with IBM 1.4 and saw one diff in predicatesIntoViews that matches the diff from DERBY-407.  In that issue the diff was occuring with slower machines but not with faster machines.  With my phase1_v3.patch the diff occurs consistently regardless of the speed of my CPU, and I believe the diff is correct. So I've included a master update for that test as part of d805_phase1_v3.patch.

If anyone has time to review, I'd be grateful...

**NOTE to commiters: There are now two patches pending for this issue: d805_phase1_v3.patch and d805_phase2_v1.patch.  The patches are independent of each other and can be committed in either order.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: phase2_javadocFix.patch

Attaching a one-line patch to fix a javadoc typo.  If any commiters out there can check this in quick, that'd be great.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, phase2_javadocFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Closed: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]
     
Satheesh Bandaram closed DERBY-805:
-----------------------------------


Fix has been verified.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0, 10.1.3.0, 10.1.2.4
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_1007_1073_portTo10_1.patch, d805_1007_1073_portTo10_1.stat, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch, test_805.htm
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_1007_1073_portTo10_1.patch
                d805_1007_1073_portTo10_1.stat

Attaching patch to port all phases of this issue plus changes for DERBY-1007 and DERBY-1073 to 10.1.  For more details (including why the three different issues are included a single patch), see here:

http://article.gmane.org/gmane.comp.apache.db.derby.devel/19330

Attaching here for ease of download and tracking.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_1007_1073_portTo10_1.patch, d805_1007_1073_portTo10_1.stat, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch, test_805.htm
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Assigned: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B reassigned DERBY-805:
-------------------------

    Assign To: A B

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0

>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Other Info:   (was: [Patch available])

While working with other changes for this issue, I came to realize that d805_phase1_v2 has one small problem.  That patch assumes that an OptimizerImpl will always find a "best join order" before it attempts to "pull" any Optimizables and re-position them for another join order.  But with the "JUMPING" functionality that the Optimizer does for queries with a large number of tables, it turns out that it is in fact possible to "pull" an Optimizable before finding a best join order.  So I need to update the Phase 1 patch to account for this.

I already have the required fixes locally; I want to run derbylang to make sure nothing breaks, and then I will post another version of the patch.   In the meantime, I'm unchecking the "patch available" box...

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366729 ] 

A B commented on DERBY-805:
---------------------------

Pasting my reply to Jeff's question here (Jira was down this morning, so I just sent the reply to derby-dev):

Jeff Lichtman (JIRA) wrote:

>     [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366598 ]
> Jeff Lichtman commented on DERBY-805:
> -------------------------------------
>
> I have been reading Army's (A B''s) document, and I have some questions.

Thank you very much for taking the time to read it.  I really appreciate it.

> How could a predicate be pushable to only one side of a union? Can you provide an example of a predicate that can be pushed only to one side?

If we take something along the lines of:

select ... from
  t2,
  (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
where X1.a = t2.i;

In this case the predicate X1.a = t2.i could be pushed to the left ("select * from t1") and used when reading T1, but couldn't be pushed to the VALUES clause  because there's no underlying table.  If we pushed to the left but not to the right, then removed it from the UnionNode's predList--which is the restrictionList of the ProjectRestrictNode above the UnionNode--the rows from the right would remain unqualified and thus we'd return incorrect results (more rows that intended).

> Why are you only dealing with join predicates? It would also be useful to push simple search arguments (i.e. a column compared to a constant), and this case might  be more common than join predicates:

Actually, when I first made the changes described in the document, I pushed any predicate that was a binary relational operator with a column reference on at least one side and a query-invariant value (i.e. constant or parameter) on whichever side was not a column reference (if either).  This covered the case of a column compared to a constant.  All of my changes worked, so from a logical/coding perspective we could indeed do just that.  However, I then put in the join predicate limitation because it seemed to me (based on very brief inspection) that the case of a comparison with a constant was covered by Satheesh's fix for DERBY-649, so I thought it might be extra unnecessary work to continually push/pull those predicates throughout the optimization process.  The following comments re: DERBY-649 made me think I didn't need to worry about one-sided predicates:

<Jeff Lichtman>
> BTW, the business of pushing and pulling predicates during optimization can be
> hard to understand and debug, so maybe it's best to only handle the simple
> cases and do it during preprocessing.

<Satheesh>
The pushing is done before optimization... during preprocessing. [ ... ] You bring up a *great *point about pushing join predicates. I am not implementing this for UnionNode.

And of the course, the "summary" of DERBY-805 itself says "Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly."

So given that, I figured the goal for DERBY-805 was to focus on pushing join predicates--and that's what I've done.  One final comment from OptimizerImpl further prompted me lean toward this limitation:

    /*
    ** Pull the predicates at from the optimizable and put
    ** them back in the predicate list.
    **
    ** NOTE: This is a little inefficient because it pulls the
    ** single-table predicates, which are guaranteed to always
    ** be pushed to the same optimizable.  We could make this
    ** leave the single-table predicates where they are.
    */
    pullMe.pullOptPredicates(predicateList);

So it seemed like pushing more single-sided predicates would be adding to the "inefficiency" mentioned here, and since the predicates are (as I understand it) already handled in preprocessing for DERBY-649, I didn't think we'd benefit from pushing them during optimization.  Perhaps I'm missing something somewhere or drawing the wrong conclusion?

> I would prefer to see any type of predicate pushed into a union - even those containing complex expressions. This might be hard to implement, though, as I don't know whether the cloning methods are implemented for the entire ValueNode hierarchy.

Sounds like an "itch" to me :) While it might indeed be nice to push predicates containing complex expressions, that's another enhancement of its own.  I won't be doing that with my DERBY-805 changes.

Thanks again--I can't say that enough--for reading the document.  It's a huge one and I'm grateful for your time and feedback.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Satheesh Bandaram <sa...@Sourcery.Org>.

Army wrote:

> Jeff Lichtman (JIRA) wrote:
>
>> Why are you only dealing with join predicates? It would also be
>> useful to push simple search arguments (i.e. a column compared to a
>> constant), and this case might  be more common than join predicates:
>
>
> <Jeff Lichtman>
> > BTW, the business of pushing and pulling predicates during
> optimization can be
> > hard to understand and debug, so maybe it's best to only handle the
> simple
> > cases and do it during preprocessing.
>
> <Satheesh>
> The pushing is done before optimization... during preprocessing. [ ...
> ] You bring up a *great *point about pushing join predicates. I am not
> implementing this for UnionNode.

Right... I think more appropriate place to push single-table scalar
predicates is in preprocessing stage. My current changes don't yet
address all possible scalar predicate push downs. It should be possible
to push expressions like upper(name)='DERBY' or T1.a+T1.b=5. Even if
they don't help in choosing index, if available, evaluating expressions
closer to data could save large amount of processing.

That said, one advantage of pushing scalar predicates later during
optimization may be that more of join predicates may qualify to be
pushed. It may be possible to push T1.a+T1.b = T2.a+T2.b into T1 or T2
depending on join order. We may still be able to achieve this, but may
duplicate a lot of code from preprocessing time.

>> I would prefer to see any type of predicate pushed into a union -
>> even those containing complex expressions. This might be hard to
>> implement, though, as I don't know whether the cloning methods are
>> implemented for the entire ValueNode hierarchy.
>
>
> Sounds like an "itch" to me :) While it might indeed be nice to push
> predicates containing complex expressions, that's another enhancement
> of its own.  I won't be doing that with my DERBY-805 changes.

Right... I didn't find any existing code in Derby that clones an entire
expression for pushing. I cooked up some predicate copying logic that
only handles very simple cases. That improvement could be taken up
separately.

>
> Thanks again--I can't say that enough--for reading the document.  It's
> a huge one and I'm grateful for your time and feedback.

Thank you for working on this. This is a good improvement to Derby.

Satheesh

Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Daniel John Debrunner <dj...@apache.org>.

Army wrote:

> Jeff Lichtman (JIRA) wrote:

>> I would prefer to see any type of predicate pushed into a union - even
>> those containing complex expressions. This might be hard to implement,
>> though, as I don't know whether the cloning methods are implemented
>> for the entire ValueNode hierarchy.
> 
> 
> Sounds like an "itch" to me :) While it might indeed be nice to push
> predicates containing complex expressions, that's another enhancement of
> its own.  I won't be doing that with my DERBY-805 changes.

Right, sounds like a good approach, don't overload a single contained
change with other changes that add complexity and potentially make it
harder to review, and therefore harder to catch bugs.

Incremental development :-)

Dan.

Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Jeffrey Lichtman <sw...@rcn.com>.

>>How could a predicate be pushable to only one side of a union? Can 
>>you provide an example of a predicate that can be pushed only to one side?
>
>If we take something along the lines of:
>
>select ... from
>   t2,
>   (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
>where X1.a = t2.i;
>
>In this case the predicate X1.a = t2.i could be pushed to the left 
>("select * from t1") and used when reading T1, but couldn't be 
>pushed to the VALUES clause  because there's no underlying table.

OK. One way to deal with this is to put a ProjectRestrictNode between 
the union node and the values clause as a place to park the 
predicate. To make things simple, you might want to always put 
ProjectRestrictNodes under both sides of the union during 
preprocessing (i.e. after binding but before optimization). In some 
cases the extra nodes won't be needed, but ProjectRestrictNodes (and 
the corresponding ProjectRestrictResultSets) are cheap. Also, you 
could eliminate unnecessary ProjectRestrictNodes at the end of 
optimization (possibly in modifyAccessPaths()).

This approach would give better performance in some cases, and could 
simplify the code (since you wouldn't have to figure out when the 
predicates are pushable).


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/

Re: [jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Army <qo...@sbcglobal.net>.

Jeff Lichtman (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366598 ] 
> 
> Jeff Lichtman commented on DERBY-805:
> -------------------------------------
> 
> I have been reading Army's (A B''s) document, and I have some questions.

Thank you very much for taking the time to read it.  I really appreciate it.

> How could a predicate be pushable to only one side of a union? Can you 
> provide an example of a predicate that can be pushed only to one side?

If we take something along the lines of:

select ... from
   t2,
   (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
where X1.a = t2.i;

In this case the predicate X1.a = t2.i could be pushed to the left ("select * 
from t1") and used when reading T1, but couldn't be pushed to the VALUES clause 
  because there's no underlying table.  If we pushed to the left but not to the 
right, then removed it from the UnionNode's predList--which is the 
restrictionList of the ProjectRestrictNode above the UnionNode--the rows from 
the right would remain unqualified and thus we'd return incorrect results (more 
rows that intended).

> Why are you only dealing with join predicates? It would also be useful to push simple 
> search arguments (i.e. a column compared to a constant), and this case might  be more 
> common than join predicates:

Actually, when I first made the changes described in the document, I pushed any 
predicate that was a binary relational operator with a column reference on at 
least one side and a query-invariant value (i.e. constant or parameter) on 
whichever side was not a column reference (if either).  This covered the case of 
a column compared to a constant.  All of my changes worked, so from a 
logical/coding perspective we could indeed do just that.  However, I then put in 
the join predicate limitation because it seemed to me (based on very brief 
inspection) that the case of a comparison with a constant was covered by 
Satheesh's fix for DERBY-649, so I thought it might be extra unnecessary work to 
continually push/pull those predicates throughout the optimization process.  The 
following comments re: DERBY-649 made me think I didn't need to worry about 
one-sided predicates:

<Jeff Lichtman>
 > BTW, the business of pushing and pulling predicates during optimization can be
 > hard to understand and debug, so maybe it's best to only handle the simple
 > cases and do it during preprocessing.

<Satheesh>
The pushing is done before optimization... during preprocessing. [ ... ] You 
bring up a *great *point about pushing join predicates. I am not implementing 
this for UnionNode.

And of the course, the "summary" of DERBY-805 itself says "Push join predicates 
into union and other set operations. DERBY-649 implemented scalar (single table) 
predicate pushdown. Adding join predicate push down could improve performance 
significantly."

So given that, I figured the goal for DERBY-805 was to focus on pushing join 
predicates--and that's what I've done.  One final comment from OptimizerImpl 
further prompted me lean toward this limitation:

	/*
	** Pull the predicates at from the optimizable and put
	** them back in the predicate list.
	**
	** NOTE: This is a little inefficient because it pulls the
	** single-table predicates, which are guaranteed to always
	** be pushed to the same optimizable.  We could make this
	** leave the single-table predicates where they are.
	*/
	pullMe.pullOptPredicates(predicateList);

So it seemed like pushing more single-sided predicates would be adding to the 
"inefficiency" mentioned here, and since the predicates are (as I understand it) 
already handled in preprocessing for DERBY-649, I didn't think we'd benefit from 
pushing them during optimization.  Perhaps I'm missing something somewhere or 
drawing the wrong conclusion?

> I would prefer to see any type of predicate pushed into a union - even those 
> containing complex expressions. This might be hard to implement, though, as I 
> don't know whether the cloning methods are implemented for the entire ValueNode 
> hierarchy.

Sounds like an "itch" to me :) While it might indeed be nice to push predicates 
containing complex expressions, that's another enhancement of its own.  I won't 
be doing that with my DERBY-805 changes.

Thanks again--I can't say that enough--for reading the document.  It's a huge 
one and I'm grateful for your time and feedback.

Army

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Jeff Lichtman (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366598 ] 

Jeff Lichtman commented on DERBY-805:
-------------------------------------

I have been reading Army's (A B''s) document, and I have some questions.

How could a predicate be pushable to only one side of a union? Can you provide an example of a predicate that can be pushed only to one side?

Why are you only dealing with join predicates? It would also be useful to push simple search arguments (i.e. a column compared to a constant), and this case might  be more common than join predicates:

create view v as select * from t1 union select * from t2

select * from v where c = 1

I would prefer to see any type of predicate pushed into a union - even those containing complex expressions. This might be hard to implement, though, as I don't know whether the cloning methods are implemented for the entire ValueNode hierarchy.


> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase4_v2.patch
                DERBY-805_v5.html

[ Attaching d805_phase4_v2.patch, which only has a 1-line diff from the v1 patch ]

After stepping away from the code for the weekend and then re-reading the document for Phase 4, I noticed a small (one-line) error in the first version of my Phase 4 changes.

In the document I correctly wrote the following regarding the need to revert access paths for subtrees when an Optimizable's current permutation is found to be "not the best":

<begin_quote>

In order to fix this behavior, we have to add logic to save the "best plans" for an Optimizable's subtree before each new permutation, and then if the current permutation isn't the best one, we need to revert the entire subtree's plans back to what they were for the "best" permutation. 

<end_quote>

But the code that I wrote didn't quite match this statement.  In the following diff:

+        // If the final path that we considered for curOpt was _not_ the best
+        // path for this round, then we need to revert back to whatever the
+        // best plan for curOpt was this round.  Note that the cost estimate
+        // for bestAccessPath could be null here if the last path that we
+        // checked was the only one possible for this round.
+        if (!retval &&
+            (curOpt.getBestAccessPath().getCostEstimate() != null) &&
+            (curOpt.getCurrentAccessPath().getCostEstimate() != null))
+        {

the check for "!retval" means that we will only revert the plans if we have exhausted all permutations for the current Optimizable.  But as I wrote in the document, we need to check (and potentially revert) the subtree's plans after _each_  new permutation, not just after the final one.

To show why this matters, assume we have some Optimizable with three possible permutations P1, P2, and P3, of which P1 is the "best".  With the code as I wrote it in d805_phase4_v1.patch, we would end up doing the following:

  -- Pick permutation P1.
  -- Since P1 is the first permutation, there is no previous "best" path
     so there's nothing to save.
  -- Optimize P1 and get the cost; say it's 25.
  -- Since P1 isn't our final permutation, we don't revert.  So the plans
     corresponding to P1 are still saved at the Optimizable and its subtree.
  -- Pick permutation P2.
  -- Before optimizing P2, save the best paths for the current Optimizable's
     subtree; this means we'll save off the paths corresponding to P1.
  -- Optimize P2 and get the cost; say it's 50.
  -- Since P2 isn't our final permutation, we *don't* revert. So the plans
     corresponding to P2 are still saved at the Optimizable and its subtree.
  -- Pick permutation P3.
  -- Before optimizing P3, save the best paths for the current Optimizable's
     subtree; this means we'll save off the paths corresponding to *P2*.
     (This is wrong.)
  -- Optimize P3 and get the cost; say it's 75.
  -- Since P3 is our final permutation, and since it is NOT the best permutation
     that we found, we'll now revert the paths of this Optimizable's subtree.
     That mean's we'll load the plans that we most recently saved and
     generate the query plan based on those.  But that will give us the
     plans for _P2_, when what we wanted was the plans for _P1_.

So d805_phase4_v2.patch fixes this by removing the check for "!retval" in the above diff.  Then after we optimize P2 and get the cost, we'll see that it's not the best so we'll immediately revert the paths back to what they were for P1.  The P1 paths (instead of the P2 paths) will then get saved off before optimization of P3 starts, and thus when we do the "revert" after P3, we'll load the correct paths (P1).

As it turns out this particular change doesn't change any of the other Phase 4 behavior.  The reason is that an Optimizable that is not a FromBaseTable only (currently) has two permutations--nested loop join and hash join--and thus the above scenario can't currently happen. FromBaseTable's can have more than two permutations, but since they don't have subtrees beneath them this whole save/revert logic is not required.  Thus even though the v1 patch was technically incorrect, everything ran as expected.  For accuracy, though, I think this small error should be fixed, so that's what the second version of the phase 4 patch (d805_phase4_v2.patch) does.

I've also added a _v5.html document that is almost identical to _v4 except for two things: 1) the small change that I just described, and 2) I've included a note with the info about index statistics that Andrew posted to derby-dev (thanks to Bryan for asking the question and to Andrew for providing the answer).

Just to be safe I re-ran derbyall on Red Hat Linux with ibm142 sane jars and saw no failures.


> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase2_v1.patch
                d805_phase2_v1.stat
                DERBY-805_v3.html

I've finalized changes for d805_phase1_v3.patch but still need to run derbyall tonight to make sure there are no new diffs (I meant to do it last night but ended up tracing through code well into the night instead).

In the meantime, I'm posting the first version of the Phase 2 patch here: d805_phase2_v1.patch.  This patch handles the "scoping" logic for predicates as described in "Step 1" of DERBY-805_v3.html, which I'm also posting here as there have been some updates since the first version.

The Phase 2 patch (d805_phase2_v1.patch) is independent of the Phase 1 patch and therefore can be committed before or after Phase 1--the order in which the two are committed doesn't actually matter so long as both are committed before Phase 3.

The Phase 2 patch does the following:

  1. Adds the necessary logic/methods to scope a predicate
  2. Adds some utility methods and one new class (BaseTableNumbersVisitor)
	used for scoping predicates
  3. Adds some other simple utility methods in preparation for predicate
	pushdown with Phase 3.

None of the new methods will ever get called until Phase 3 is committed, so d805_phase2_v1.patch should have no affect on the codeline--it's just part of incremental development.

Thanks to anyone who has a chance to review and/or commit...

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase2_v1.patch, d805_phase2_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase1_v2.patch
                d805_phase1_v2.stat
                DERBY-805_v2.html

Attaching a "version 2" of the Phase 1 patch, which does things in a more general and more robust way, based on discussed here:

http://article.gmane.org/gmane.comp.apache.db.derby.devel/14934

The new patch is "d805_phase1_v2.patch".  I'm also attaching DERBY-805_v2.html, which is a new version of the document that has been updated with details to explain the new approach for Phase 1 patch--see "Step 5" of section VI.  This new description matches what was said on derby-dev, with one minor change: the call to addOrLoadBestPlan() when placing an Optimizable was removed, as it turns out that it wasn't needed.  The new description also has a walk-through of what the new code does for a given example.

Note that even though d805_phase1_v2.patch is a different approach, it is still the case none of the existing Derby functionality should be affected by this change.  The new code is exercised, but until we actually start pushing predicates, there should be no behavioral differences.

I ran derbylang with this patch and IBM 1.4.2 with no failures.  I haven't had time to run derbyall yet, but hope to do so tonight (my poor machine is working around the clock lately, so night-time is the only time I can run the full suite).

Thanks in advance for review comments.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Jeff Lichtman (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366736 ] 

Jeff Lichtman commented on DERBY-805:
-------------------------------------

Here is my reply to Army's reply (also previously posted to derby-dev):

> > How could a predicate be pushable to only one side of a union? Can you provide an example of a predicate that can be pushed only to one side?
> > 
> If we take something along the lines of:
> 
> 
> select ... from
>   t2,
>   (select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
> where X1.a = t2.i;
> 
> 
> In this case the predicate X1.a = t2.i could be pushed to the left ("select * from t1") and used when reading T1, but couldn't be pushed to the VALUES clause  because there's no underlying table.


OK. One way to deal with this is to put a ProjectRestrictNode between the union node and the values clause as a place to park the predicate. To make things simple, you might want to always put ProjectRestrictNodes under both sides of the union during preprocessing (i.e. after binding but before optimization). In some cases the extra nodes won't be needed, but ProjectRestrictNodes (and the corresponding ProjectRestrictResultSets) are cheap. Also, you could eliminate unnecessary ProjectRestrictNodes at the end of optimization (possibly in modifyAccessPaths()).

This approach would give better performance in some cases, and could simplify the code (since you wouldn't have to figure out when the predicates are pushable).

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Re: [jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Army <qo...@gmail.com>.

Andrew McIntyre wrote:
> 
> The only time the statistics for an index are guaranteed to be up to
> date is when the index is created. See:
> 
> https://issues.apache.org/jira/browse/DERBY-737
> https://issues.apache.org/jira/browse/DERBY-269
> 
> Creating the index before loading the table with data means that the
> statistics for the table will correspond to an empty table. The table
> needs to be loaded with data, and then the index created afterward.

Thanks for asking the question about indexes, Bryan.  And thanks to Andrew for 
answering the question and for providing Jira entries.  I've included the above 
text as a footnote in the latest html document for DERBY-805 (_v5.html).

And many thanks to both of you for taking the time to read this rather daunting 
document; I do appreciate the feedback...

Army

Re: [jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Andrew McIntyre <mc...@gmail.com>.

On 4/7/06, Bryan Pendleton <bp...@amberpoint.com> wrote:
>
> I love reading your documents! Even if 85% of it is too
> optimizer-y for me to understand, I learn a lot each time I
> read through your work. So thanks!

I also find Army's work on the optimizer very interesting. The
documents attached to DERBY-805 have helped shed some light on what is
not a very well understood component of Derby.

> One particular sentence from your most recent writeup stuck out
> for me, as I had missed this in previous discussions about DERBY-805:
>
>     The second thing was the fact that I created indexes on T3 and
>     T4 _before_ inserting the data, which (it turns out) leads to
>     outdated (if not altogether incorrect) statistics for the indexes.
>
> Are the statistics always incorrect/outdated after inserting lots
> of data into a table which has one or more indexes?
>
> Or was there something specific about the way in which you inserted
> your data which caused the indexes to have outdated/incorrect statistics?

The only time the statistics for an index are guaranteed to be up to
date is when the index is created. See:

https://issues.apache.org/jira/browse/DERBY-737
https://issues.apache.org/jira/browse/DERBY-269

Creating the index before loading the table with data means that the
statistics for the table will correspond to an empty table. The table
needs to be loaded with data, and then the index created afterward.

andrew

Re: [jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by Bryan Pendleton <bp...@amberpoint.com>.

Hi Army,

I love reading your documents! Even if 85% of it is too
optimizer-y for me to understand, I learn a lot each time I
read through your work. So thanks!

One particular sentence from your most recent writeup stuck out
for me, as I had missed this in previous discussions about DERBY-805:

    The second thing was the fact that I created indexes on T3 and
    T4 _before_ inserting the data, which (it turns out) leads to
    outdated (if not altogether incorrect) statistics for the indexes.

Are the statistics always incorrect/outdated after inserting lots
of data into a table which has one or more indexes?

Or was there something specific about the way in which you inserted
your data which caused the indexes to have outdated/incorrect statistics?

thanks,

bryan

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: DERBY-805_v4.html
                d805_phase4_v1.patch
                d805_phase4_v1.stat

Attaching two things to this issue:

1 - DERBY-805_v4.html: an updated document that includes a description of my proposed changes for Phase 4.  See "Section VII".

2 - d805_phase4_v1.patch: a patch that implements the Phase 4 changes described in Section VII of the _v4.html document.

The phase4_v1 patch also fixes a small error in BaseTableNumberVisitor.java and updates the lang/predicatePushdown.sql test to account for the new Phase 4 behavior.  As part of the changes for predicatePushdown.sql I have added a lot more data to the tables, which means the test queries take longer to run.  So where this test used to run in under a minute on my machine, it now takes over 4 minutes to complete.  That's expected given my changes, but I thought I should state that explicitly in case anyone notices.

I ran derbyall on Linux Redhat using sane jars with IBM 1.4.2 and saw no new failures.  I also ran the updated lang/predicatePushdown.sql test against IBM wsdd5.6 and Sun 1.5 and the test passed without problems.

If anyone has time to review and commit, that would be great. Note that while the patch is around 5500 lines, only about 600 lines of it are actual code changes; the rest is the result of the updated predicatePushdown.sql test and the corresponding output (a lot of query plans are printed in this test, so the output is quite large).

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment:     (was: DERBY-805.html)

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366907 ] 

A B commented on DERBY-805:
---------------------------

Note to reviewers/committers: I plan to post a more robust version of the Phase 1 patch sometime soon, so it might be good to hold off on reviewing/committing d805_phase1_v1.patch until the new version can be posted and reviewed.  If it's agreed that the newer version is better (when I post it), then that version should be the one we commit.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12367563 ] 

Satheesh Bandaram commented on DERBY-805:
-----------------------------------------

I am starting to review these patches, with the aim of submitting them to trunk. Let me know if any one has any comments on the patches so far.

Thanks Army for all the good work!


> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: DERBY-805.html

Attaching a description of the changes I plan to submit for this issue.  This is a rather complicated enhancement so the description of the changes is pretty long.  In short, though, I outline a 6-step approach to pushing join predicates down into Unions:

1 - Add the ability to take a predicate and scope it to a target result set so that it can be pushed to that result set.

2 - Implement the "pushOptPredicate()" and "optimizeIt()" methods for UnionNodes.  The former method should take predicates that are passed into the UnionNode from outer queries, scope them (per step 1) for the left and right children of the UnionNode, and store them locally.  The latter method should then pass the scoped predicates down to both children so that they can use the predicates in their own optimize()/optimizeIt() calls.

3 - Take scoped predicates (created in step 1) that are pushed into the children result sets of a UnionNode (per step 2) and allow the the children to push the scoped predicates even further down the tree, until we eventually get them to a base table.

4 - Make sure predicates that are pushed down into subqueries of a UnionNode are correctly "pulled" back up (if they are unscoped) or discarded (if they are scoped) for every permutation seen during optimization.

5 - Ensure that the best access path for a UnionNode that pushes predicates is correctly saved during optimization and correctly retrieved when it comes time to finalize the query's overall access path.

6 - And finally, when optimization is complete, make sure all relevant predicates are pushed down the tree one last time and left there, in preparation for code generation.

See DERBY-805.html for all the gory details.

I have made the changes described in this document locally and they all seem to work, with a couple of exceptions as noted at the end of the document.  I plan to break the changes down into separate patches where it's possible to do so, and will be posting those patches in the coming days.  In the meantime, if anyone has time to review this document and provide feedback/direction/suggestion, I would be grateful.  As I myself am still trying to learn all the subtleties of Derby optimization, the more feedback I get, the better...

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Other Info: [Patch available]

I ran derbyall on Windows 2000 using IBM JDK 1.4.2 with d805_phase1_v2.patch applied and saw no new failures--only xaSimplePositive and jdbcapi.SURTest.junit failed, neither of which is caused by these changes.  So I'm marking this as "Patch available" for review/commit.  Thanks.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Manjula Kutty (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12376646 ] 

Manjula Kutty commented on DERBY-805:
-------------------------------------

Testing effect of DERBY-805 changes 

various  complex queries were created and run against 10.1 and 10.2  versions and
the results and the test descriptions are being attached to this issue. 

The overall result after running these tests looks promising and 10.2 performs way better than 10.1. Still it will be good to have an improvement to the existing execution model to use multiple indexes for table scan. I have filed seperate jira entry for that.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: DERBY-805.html

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366748 ] 

A B commented on DERBY-805:
---------------------------

> OK. One way to deal with this is to put a ProjectRestrictNode between the union node and 
> the values clause as a place to park the predicate. To make things simple, you might want 
> to always put ProjectRestrictNodes under both sides of the union during preprocessing 
> (i.e. after binding but before optimization). In some cases the extra nodes won't be needed, 
> but ProjectRestrictNodes (and the corresponding ProjectRestrictResultSets) are cheap. 
> Also, you could eliminate unnecessary ProjectRestrictNodes at the end of optimization 
> (possibly in modifyAccessPaths()). 

Yes, that all sounds about right :)  But I wonder if that couldn't be done later as follow-up work?  Would you say that the changes I've proposed--namely, see if we can push the predicates to both sides and skip the push if not--are incorrect, or just that we could build on them to make them better?  If they're incorrect, then I'll have to look at this more.  But my feeling is that this is a(nother) way in which the pushing of join predicates could be made better once we have it working--i.e. something that would work well as a follow-up patch...?

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Andrew McIntyre (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

Andrew McIntyre updated DERBY-805:
----------------------------------

    Other Info:   (was: [Patch available])
    Derby Info: [Patch Available]

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Jeff Lichtman (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366756 ] 

Jeff Lichtman commented on DERBY-805:
-------------------------------------

> But I wonder if that couldn't be done later as follow-up work? Would you say that the changes I've proposed--namely, see if we can push the predicates to both sides and skip the push if not--are incorrect, or just that we could build on them to make them better?

Your changes are not incorrect. I just wonder whether they wouldn't be both simpler and more complete if you didn't have to worry about whether predicates were pushed to both sides.

I have the feeling you're anxious to finish this phase of the project. It's OK with me if you check in this change with the current algorithm. If you do this, please add comments to the code saying what could be changed to make it possible to apply a pushed predicate to one side only.


> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Andrew McIntyre (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12368839 ] 

Andrew McIntyre commented on DERBY-805:
---------------------------------------

Committed predPushdown_testFix.patch with revision 383062.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.2.0.0, 10.1.2.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Other Info: [Patch available]

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase2_v1.patch, d805_phase2_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Resolved: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]
     
Satheesh Bandaram resolved DERBY-805:
-------------------------------------

    Fix Version: 10.1.3.0
                 10.1.2.4
     Resolution: Fixed
     Derby Info:   (was: [Patch Available])

Fix has been merged into 10.2 (trunk), and 10.1 branches.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0, 10.1.3.0, 10.1.2.4
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_1007_1073_portTo10_1.patch, d805_1007_1073_portTo10_1.stat, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch, test_805.htm
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: predPushdown_testFix.patch

Small update to the lang/predicatePushdown.sql test to avoid truncation of query plans.  This should solve the diff reported in the TinderBox results here:

http://www.multinet.no/~solberg/public/Apache/TinderBox_Derby/Limited/testSummary-382964.html 

This is a small change to a test only--no code changes--so if any commiter out there could commit, that'd be great.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.2.0.0, 10.1.2.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_followup_v1.patch

Attaching a follow-up patch, d805_followup_v1.patch, that addresses some issues which remained after Phase 4 was committed. In particular:

1) Added logic to skip predicate pushdown when either of the predicate's column references does not point to a base table.  This can happen if, for example, the column reference points to a literal or an aggregate expression.  Further work is required for such situations in order to correctly "remap" the column reference to its source (or at least, to figure out what exactly it means to remap a ColumnReference that doesn't point to a base table, and then to implement the appropriate changes)--so in the meantime, I've just decided to skip pushing the predicate for now.

2) Added logic to correctly set the column number of a "scoped" reference based on whether or not the reference points to a base table.  Existing comments in the relevant sections of code describe why we need to set the column numbers for references pointing to base tables, but the code itself didn't actually check for the base table condition--it set the column number for all scoped references, which wasn't always correct.

3) In cases where a ColumnReference's source ResultColumn's expression is not another ColumnReference, made it so that the scope operation will return a clone of ColumnReference (instead of the ColumnReference itself) since that ColumnReference will be pushed to two result sets.

4) Added corresponding test cases to the lang/predicatePushdown.sql test and updated the master file accordingly.

I ran derbyall on Red Hat Linux with ibm142 and saw no new failures.

If anyone has time to review, I'd be grateful.  Thanks.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366829 ] 

A B commented on DERBY-805:
---------------------------

I ran derbyall last night (Windows 2000, Sun 1.4.2) with d805_phase1_v1.patch applied to my codeline and saw no new failures (only failure was xaSimplePositive.sql, which is not related to my changes).

That said, I should point out that these phase 1 changes may need to be enhanced/updated at some point in the near future, depending on the results of the thread I started on derby-dev:

http://article.gmane.org/gmane.comp.apache.db.derby.devel/14836

The link between that thread and d805_phase1_v1.patch is that the phase 1 patch expects "trulyTheBestAccessPath" for an Optimizable to be set correctly, but the issue described in the given thread may lead to situations where trulyTheBestAccessPath doesn't hold the value that it should.

Note, though, the d805_phase1_v1.patch changes alone do not not affect the functionality of the codeline, so I think the patch can still be committed.  It's only when we get to Phase 3 that the issue described on derby-dev could start influencing the behavior of the codeline with my changes for predicate pushdown.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12374391 ] 

Satheesh Bandaram commented on DERBY-805:
-----------------------------------------

All four patches for each phase have been committed to trunk. Thanks Army.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Manjula Kutty (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

Manjula Kutty updated DERBY-805:
--------------------------------

    Attachment: test_805.htm

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch, test_805.htm
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase3_v1.patch
                d805_phase3_v1.stat

Attaching a "Phase 3" patch, d805_phase3_v1.patch, to implement the changes described in steps 2, 3, 4 and 6 of DERBY-805_v3.html.  In short, this patch does the following:

1. Implements the changes as described in the HTML document attached to this issue (steps 2, 3, 4 and 6).

2. Adds a check in PredicateList.java to skip scoped predicates when trying to find a predicate to use for a hash join.  This check is needed because if the predicate was scoped and pushed down from an outer query, then one of the operands points to the outer query and the other (scoped) operand points down into a subquery, which means we can't do the hash join.  Without this check the optimizer might choose to do a hash scan using the predicate even though it won't actually be able to do the corresponding hash join.  That means we would lose the opportunity to use the predicate for an index scan, and thus we could miss out on a much better (index-based) plan.

3. Makes a slight change to OptimizerImpl.addOrLoadBestPlanMappings to avoid doing extra work where possible.  More specifically, skips the logic to save/restore bestJoinOrder if there is only one Optimizable in the list, as there's only one possible join order in that case and so we don't need to keep track of which join order is best for each of the outer queries.

4. Adds a new test, lang/predicatePushdown.sql, to the harness as part of the derbylang suite.  This test runs a bunch of queries that relate to the pushing of predicates down into unions.  Most of these test cases failed (either with compilation/execution errors or incorrect results) at one point or another while I worked on this issue, so I've chosen to include them all as part of the new test.  The only way (that I know of) to tell if predicate pushdown is actually occuring is to print out the query plans, so the master file for this test is very large--almost 10,000 lines.  For that reason, combined with the fact that I think additional predicate pushdown tests will be needed as support for pushdown grows, I decided to create a new test instead of adding the test cases to an existing test.

Once the Phase 1, Phase 2, and Phase 3 patches have been applied, along with the patch for DERBY-1007, the basic predicate pushdown functionality will be in place.  That said, there are still two outstanding issues into which I'm looking:

A - As described in DERBY-805_v3.html, Phases 1 thru 3 will make it so that join predicates targeted for UNION nodes will always get pushed if it's possible to do so, which in turn means that the optimizer will no longer consider the cost of doing a hash join with the UNION (the hash join would require that the predicate _not_ be pushed into the UNION).  So I still need to investigate this to see how I can make the optimizer cost pushing predicates and using a nested loop join verses not pushing them and using a hash join.

B - The changes I've made for this issue--esp. for Phase 1--add a good amount of code to the optimization code path.  This means that it's now possible for the optimizer to timeout "sooner" than it would have prior to these changse.  By "sooner" I mean that the optimizer might not have a chance to look at as many join orders as it used to (because it takes longer for it to cost each join order, esp. with deeply nested subqueries).  This means that there are situations where the optimizer pre-DERBY-805 may have found one plan but post-DERBY-805 may never get to try that plan, and so it could end up choosing a worse plan than it used to. So I need to investigate the optimizer "timeout" mechanism to see how I can address this issue.

As originally stated when I broke this enhancement down into phases, the issues that remain after Phase 3 will be addressed in Phase 4 (and whatever follow-up phases are required thereafter). In the meantime, the surest way to get the full effect of predicate pushdown is to disable optimizer timeout, which can be done with the property derby.optimizer.noTimeout=true.  Note that I set this property for the new predicatePushdown test, to ensure that the optimizer will choose the same plans across all machines.

I ran derbyall on Red Hat Linux with IBM 1.4.2 and saw no new failures.  ** NOTE ** though that this patch, d805_phase3_v1.patch, has a dependency on d805_phase2_v1.patch and also on the patch for DERBY-1007, attached to that issue and called d1007_v1.patch.  So both of those patches must be applied before d805_phase3_v1.patch can be applied.

I know the patch looks long, but keep in mind that a good 9,000 lines of the diff is just for the new test.  I thought about posting the new test separately before the Phase 3 patch and then submitting an updated master file for Phase 3, but then the diff there would be even larger as most of the query plans would change.  So I've just included the new test as part of the Phase 3 patch.

So in short, the current patches for review-and-commit are as follow:

1. d805_phase1_v3.patch (already committed)
2. d805_phase2_v1.patch
3. d1007_v1.patch (for DERBY-1007; dependent on #1)
4. d805_phase3_v1.patch (dependent on all of the previous patches).

If anyone has the time to review/comment on any of these patches, I would most certainly appreciate it...

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.2.0.0, 10.1.2.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12366830 ] 

A B commented on DERBY-805:
---------------------------

<Jeft>
> It's OK with me if you check in this change with the current algorithm. If you do this, 
> please add comments to the code saying what could be changed to make it possible 
> to apply a pushed predicate to one side only. 

Okay, will do.  These comments will apply to the Phase 2 patch.  Thanks for describing how the changes could work--that'll certain be a big help when it comes time to implement them.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "A B (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-805?page=all ]

A B updated DERBY-805:
----------------------

    Attachment: d805_phase1_v1.patch
                d805_phase1_v1.stat

Attaching d805_phase1_v1.patch, which is the patch for "Phase 1" as defined in my earlier comment.  The changes are mostly just what I described in the html document, except that I've added logic to recurse down a UNION chain where necessary.

I meant to run derbyall with these changes last night, but I messed up my environment, which means the test results were not meaningful.  I ran "derbylang" this evening and everything passed, so I'm posting the patch.  I will run the full derbyall suite tonight and will post results tomorrow.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task
>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, d805_phase1_v1.patch, d805_phase1_v1.stat
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-805) Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-805?page=comments#action_12376859 ] 

Satheesh Bandaram commented on DERBY-805:
-----------------------------------------

I have committed this patch to 10.1 branch. Thanks for pulling all relavent checkins.

> Push join predicates into union and other set operations. DERBY-649 implemented scalar (single table) predicate pushdown. Adding join predicate push down could improve performance significantly.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-805
>          URL: http://issues.apache.org/jira/browse/DERBY-805
>      Project: Derby
>         Type: Sub-task

>   Components: SQL
>     Versions: 10.1.2.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram
>     Assignee: A B
>      Fix For: 10.2.0.0
>  Attachments: DERBY-805.html, DERBY-805_v2.html, DERBY-805_v3.html, DERBY-805_v4.html, DERBY-805_v5.html, d805_1007_1073_portTo10_1.patch, d805_1007_1073_portTo10_1.stat, d805_followup_v1.patch, d805_phase1_v1.patch, d805_phase1_v1.stat, d805_phase1_v2.patch, d805_phase1_v2.stat, d805_phase1_v3.patch, d805_phase1_v3.stat, d805_phase2_v1.patch, d805_phase2_v1.stat, d805_phase3_v1.patch, d805_phase3_v1.stat, d805_phase4_v1.patch, d805_phase4_v1.stat, d805_phase4_v2.patch, phase2_javadocFix.patch, predPushdown_testFix.patch, test_805.htm
>
> Fix for DERBY-649 implemented scalar (single table) predicate push down into UNIONs. While this improves performance for one set of queries, ability to push join-predicates further improves Derby performance by enabling use of indices where possible.
> For example,
> create view V1 as select i, j from T1 union all select i,j from T2; 
> create view V2 as select a,b from T3 union all select a,b from T4; 
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5); 
> For a query like
> select * from V1, V2 where V1.j = V2.b and V1.i =1;
> If the join order choosen is V1,V2, V1 can use index on V1.i (if present) following fix for DERBY-649. But if there is a index on V2.b also, Derby currently can't use that index. By pushing join predicate, Derby would be able to use the index and improve performance. Some of the queries I have seen (not the one shown here...) could improve from 70-120 seconds to about one second.
> Note there is a good comment by Jeff Lichtman about join-predicate push down. I am copying parts of it here for completeness of this report: (Modified)
> If predicate push down is done during optimization, it would be possible to push joins into the union as long as it's in the right place in the join order.
> For example:
> create view v as select * from t1 union all select * from t2;
> select * from v, t3 where v.c1 = t3.c2;
> In this select, if t3 is the outer table then the qualification could be pushed into the union and optimized there, but if t3 is the inner table the qualification can't be pushed into the union.
> If the pushing is done at preprocess time (i.e. before optimization) it is impossible to know whether a join qualification like this can be safely pushed.
> There's a comment in UnionNode.optimizeIt() saying:
> /* RESOLVE - don't try to push predicated through for now */
> This is where I'd expect to see something for pushing predicates into the union during optimization.
> BTW, the business of pushing and pulling predicates during optimization can be hard to understand and debug, so maybe it's best to only handle the simple cases and do it during preprocessing.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira