You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-commits@db.apache.org by mi...@apache.org on 2006/09/26 21:19:41 UTC

svn commit: r450155 - in /db/derby/code/trunk/java: engine/org/apache/derby/impl/sql/compile/ testing/org/apache/derbyTesting/functionTests/master/ testing/org/apache/derbyTesting/functionTests/tests/lang/

Author: mikem
Date: Tue Sep 26 12:19:40 2006
New Revision: 450155

URL: http://svn.apache.org/viewvc?view=rev&rev=450155
Log:
DERBY-1866
contributed by Army Brown
patch: d1866_v1.patch

Attaching a first patch for this issue, d1866_v1.patch. In short, the problem was that, when pushing predicates to subqueries beneath UNIONs, the predicates were always being pushed to the *first* table in the subquery's FROM list, regardless of whether or not that was actually the correct table. Thus it was possible to push a predicate down to a base table to which it didn't apply, thereby leading to an assertion failure in sane mode and an index out of bounds exception in insane mode.

For details on how this occurred and what the fix is, please refer to the code comments in the patch. The d1866_v1 patch does the following:

1. Adds logic to ensure scoped predicates are only pushed
to the appropriate base tables.

2. Adds one line to OptimizerImpl to solve the hang that
was occuring for the second query shown in repro.sql.
The problem there was just that one variable was not
being properly reset when beginning a new round of
optimization.

3. Adds some test cases to verify the changes for #1 and
#2.

Note that the patch is mostly just explanatory comments for existing and new logic, plus the test cases.


Modified:
    db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java
    db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out
    db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/predicatePushdown.sql

Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java?view=diff&rev=450155&r1=450154&r2=450155
==============================================================================
--- db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java (original)
+++ db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java Tue Sep 26 12:19:40 2006
@@ -343,6 +343,17 @@
 			timeOptimizationStarted = System.currentTimeMillis();
 			timeExceeded = false;
 		}
+
+		/* If user specified the optimizer override for a fixed
+		 * join order, then desiredJoinOrderFound could be true
+		 * when we get here.  We have to reset it to false in
+		 * prep for the next round of optimization.  Otherwise
+		 * we'd end up quitting the optimization before ever
+		 * finding a plan for this round, and that could, among
+		 * other things, lead to a never-ending optimization
+		 * phase in certain situations.  DERBY-1866.
+		 */
+		desiredJoinOrderFound = false;
 	}
 
     public int getMaxMemoryPerTable()
@@ -1236,9 +1247,13 @@
 		** RESOLVE - We do not push predicates with subqueries not materializable.
 		*/
 
-		int		numPreds = predicateList.size();
-		JBitSet	predMap = new JBitSet(numTablesInQuery);
-		OptimizablePredicate pred;
+		int		                numPreds        = predicateList.size();
+		JBitSet	                predMap         = new JBitSet(numTablesInQuery);
+		JBitSet                 curTableNums    = null;
+		BaseTableNumbersVisitor btnVis          = null;
+		boolean                 pushPredNow     = false;
+		int                     tNum;
+		Predicate               pred;
 
 		/* Walk the OptimizablePredicateList.  For each OptimizablePredicate,
 		 * see if it can be assigned to the Optimizable at the current join
@@ -1249,7 +1264,7 @@
 		 */
 		for (int predCtr = numPreds - 1; predCtr >= 0; predCtr--)
 		{
-			pred = predicateList.getOptPredicate(predCtr);
+			pred = (Predicate)predicateList.getOptPredicate(predCtr);
 
 			/* Skip over non-pushable predicates */
 			if (! isPushable(pred))
@@ -1281,12 +1296,108 @@
 			*/
 			predMap.and(nonCorrelatedTableMap);
 
+			/* At this point what we've done is figure out what FromTables
+			 * the predicate references (using the predicate's "referenced
+			 * map") and then: 1) unset the table numbers for any FromTables
+			 * that have already been optimized, 2) unset the table number
+			 * for curTable, which we are about to optimize, and 3) cleared
+			 * out any remaining table numbers which do NOT directly
+			 * correspond to UN-optimized FromTables in this OptimizerImpl's
+			 * optimizableList.
+			 *
+			 * Note: the optimizables in this OptImpl's optimizableList are
+			 * called "non-correlated".
+			 *
+			 * So at this point predMap holds a list of tableNumbers which
+			 * correspond to "non-correlated" FromTables that are referenced
+			 * by the predicate but that have NOT yet been optimized.  If any
+			 * such FromTable exists then we canNOT push the predicate yet.  
+			 * We can only push the predicate if every FromTable that it
+			 * references either 1) has already been optimized, or 2) is
+			 * about to be optimized (i.e. the FromTable is curTable itself).
+			 * We can check for this condition by seeing if predMap is empty,
+			 * which is what the following line does.
+			 */
+			pushPredNow = (predMap.getFirstSetBit() == -1);
+
+			/* If the predicate is scoped, there's more work to do. A
+			 * scoped predicate's "referenced map" may not be in sync
+			 * with its actual column references.  Or put another way,
+			 * the predicate's referenced map may not actually represent
+			 * the tables that are referenced by the predicate.  For
+			 * example, assume the query tree is something like:
+			 *
+			 *      SelectNode0
+			 *     (PRN0, PRN1)
+			 *       |     |
+			 *       T1 UnionNode
+			 *           /   |
+			 *         PRN2  PRN3
+			 *          |     |
+			 *  SelectNode1   SelectNode2
+			 *   (PRN4, PRN5)    (PRN6)
+			 *     |     |         |
+			 *     T2    T3        T4
+			 *
+			 * Assume further that we have an equijoin predicate between
+			 * T1 and the Union node, and that the column reference that
+			 * points to the Union ultimately maps to T3.  The predicate
+			 * will then be scoped to PRN2 and PRN3 and the newly-scoped
+			 * predicates will get passed to the optimizers for SelectNode1
+			 * and SelectNode2--which brings us here.  Assume for this
+			 * example that we're here for SelectNode1 and that "curTable"
+			 * is PRN4.  Since the predicate has been scoped to SelectNode1,
+			 * its referenced map will hold the table numbers for T1 and
+			 * PRN2--it will NOT hold the table number for PRN5, even
+			 * though PRN5 (T3) is the actual target for the predicate.
+			 * Given that, the above logic will determine that the predicate
+			 * should be pushed to curTable (PRN4)--but that's not correct.
+			 * We said at the start that the predicate ultimately maps to
+			 * T3--so we should NOT be pushing it to T2.  And hence the
+			 * need for some additional logic.  DERBY-1866.
+			 */
+			if (pushPredNow && pred.isScopedForPush() && (numOptimizables > 1))
+			{
+				if (btnVis == null)
+				{
+					curTableNums = new JBitSet(numTablesInQuery);
+					btnVis       = new BaseTableNumbersVisitor(curTableNums);
+				}
+
+				/* What we want to do is find out if the scoped predicate
+				 * is really supposed to be pushed to curTable.  We do
+				 * that by getting the base table numbers referenced by
+				 * curTable along with curTable's own table number.  Then
+				 * we get the base table numbers referenced by the scoped
+				 * predicate. If the two sets have at least one table
+				 * number in common, then we know that the predicate
+				 * should be pushed to curTable.  In the above example
+				 * predMap will end up holding the base table number
+				 * for T3, and thus this check will fail when curTable
+				 * is PRN4 but will pass when it is PRN5, which is what
+				 * we want.
+				 */
+				tNum = ((FromTable)curTable).getTableNumber();
+				curTableNums.clearAll();
+				btnVis.setTableMap(curTableNums);
+				((FromTable)curTable).accept(btnVis);
+				if (tNum >= 0)
+					curTableNums.set(tNum);
+
+				btnVis.setTableMap(predMap);
+				pred.accept(btnVis);
+
+				predMap.and(curTableNums);
+				if ((predMap.getFirstSetBit() == -1))
+					pushPredNow = false;
+			}
+
 			/*
 			** Finally, push the predicate down to the Optimizable at the
 			** end of the current proposed join order, if it can be evaluated
 			** there.
 			*/
-			if (predMap.getFirstSetBit() == -1)
+			if (pushPredNow)
 			{
 				/* Push the predicate and remove it from the list */
 				if (curTable.pushOptPredicate(pred))

Modified: db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out?view=diff&rev=450155&r1=450154&r2=450155
==============================================================================
--- db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out (original)
+++ db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out Tue Sep 26 12:19:40 2006
@@ -1332,6 +1332,56 @@
       union select 'i','j','j',i from t2
       union select c1, c2, c3, c from tc;
 0 rows inserted/updated/deleted
+ij> -- For DERBY-1866.  The problem for DERBY-1866 was that,
+-- when pushing predicates to subqueries beneath UNIONs,
+-- the predicates were always being pushed to the *first*
+-- table in the FROM list, regardless of whether or not
+-- that was actually the correct table.  For the test
+-- query that uses this view (see below) the predicate
+-- is supposed to be pushed to TC, so in order to repro
+-- the DERBY-1866 failure we want to make sure that TC
+-- is *not* the first table in the FROM list.  Thus we
+-- use the optimizer override to fix the join order so
+-- that TC is the second table.
+create view vz5a (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+0 rows inserted/updated/deleted
+ij> -- Same as above but target FromTable in subquery is
+-- itself another subquery.
+create view vz5b (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select distinct * from tc) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+0 rows inserted/updated/deleted
+ij> -- Same as above but target FromTable in subquery is
+-- another union node between two subqueries.
+create view vz5c (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select * from tc union select * from tc) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+0 rows inserted/updated/deleted
+ij> -- Same as above but target FromTable in subquery is
+-- another full query with unions and subqueries.
+create view vz5d (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select * from tc
+         union select z1 c1, z2 c2, z3 c3, z4 c from vz5b) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+0 rows inserted/updated/deleted
 ij> -- Both sides of predicate reference aggregates.
 select x1.c1 from
   (select count(*) from t1 union select count(*) from t2) x1 (c1),
@@ -1540,6 +1590,97 @@
 6          |6          
 2          |2          
 4          |4          
+ij> -- Push outer where predicate down into a UNION having a
+-- a Select child with more than one table in its FROM
+-- list.  The predicate should be pushed to the correct
+-- table in the Select's FROM list.  Prior to the fix for
+-- DERBY-1866 the predicate was always being pushed to
+-- the *first* table, regardless of whether or not that
+-- was actually the correct table.  Thus the predicate
+-- "t1.i = vz5.z4" was getting pushed to table T2 even
+-- though it doesn't apply there.  The result was an
+-- ASSERT failure in sane mode and an IndexOutOfBounds
+-- exception in insane mode.  NOTE: Use of NESTEDLOOP
+-- join strategy ensures the predicate will be pushed
+-- (otherwise optimizer might choose to do a hash join
+-- and we wouldn't be testing what we want to test).
+select t1.i, vz5a.* from
+  t1
+   left outer join
+     vz5a --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5a.z4;
+I          |Z1  |Z2  |Z3     |Z4         
+-----------------------------------------
+1          |i   |j   |j      |1          
+2          |i   |j   |j      |2          
+3          |i   |j   |j      |3          
+4          |i   |j   |j      |4          
+5          |i   |j   |j      |5          
+ij> -- Same query as above, but without the optimizer override.
+-- In this case there was another error where optimizer
+-- state involving the "joinOrder" override (see the
+-- definition of vz5a) was not properly reset, which could
+-- lead to an infinite loop.  This problem was fixed as
+-- part of DERBY-1866, as well.
+select t1.i, vz5a.* from
+  t1
+   left outer join
+     vz5a
+   on
+    t1.i = vz5a.z4;
+I          |Z1  |Z2  |Z3     |Z4         
+-----------------------------------------
+1          |i   |j   |j      |1          
+2          |i   |j   |j      |2          
+3          |i   |j   |j      |3          
+4          |i   |j   |j      |4          
+5          |i   |j   |j      |5          
+ij> -- More tests for DERBY-1866 using more complicated views.
+select t1.i, vz5b.* from
+  t1
+   left outer join
+     vz5b --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5b.z4;
+I          |Z1  |Z2  |Z3     |Z4         
+-----------------------------------------
+1          |i   |j   |j      |1          
+2          |i   |j   |j      |2          
+3          |i   |j   |j      |3          
+4          |i   |j   |j      |4          
+5          |i   |j   |j      |5          
+ij> select t1.i, vz5c.* from
+  t1
+   left outer join
+     vz5c --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5c.z4;
+I          |Z1  |Z2  |Z3     |Z4         
+-----------------------------------------
+1          |i   |j   |j      |1          
+2          |i   |j   |j      |2          
+3          |i   |j   |j      |3          
+4          |i   |j   |j      |4          
+5          |i   |j   |j      |5          
+ij> select t1.i, vz5d.* from
+  t1
+   left outer join
+     vz5d --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5d.z4;
+I          |Z1  |Z2  |Z3     |Z4         
+-----------------------------------------
+1          |i   |j   |bokibob|1          
+1          |i   |j   |j      |1          
+2          |i   |j   |bokibob|2          
+2          |i   |j   |j      |2          
+3          |i   |j   |bokibob|3          
+3          |i   |j   |j      |3          
+4          |i   |j   |bokibob|4          
+4          |i   |j   |j      |4          
+5          |i   |j   |bokibob|5          
+5          |i   |j   |j      |5          
 ij> -- Queries with Select->Union->Select chains having differently-
 -- ordered result column lists with some non-column reference
 -- expressions.  In all of these queries we specify LEFT join
@@ -1730,6 +1871,14 @@
 ij> drop view vz3;
 0 rows inserted/updated/deleted
 ij> drop view vz4;
+0 rows inserted/updated/deleted
+ij> drop view vz5a;
+0 rows inserted/updated/deleted
+ij> drop view vz5d;
+0 rows inserted/updated/deleted
+ij> drop view vz5b;
+0 rows inserted/updated/deleted
+ij> drop view vz5c;
 0 rows inserted/updated/deleted
 ij> drop table tc;
 0 rows inserted/updated/deleted

Modified: db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/predicatePushdown.sql
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/predicatePushdown.sql?view=diff&rev=450155&r1=450154&r2=450155
==============================================================================
--- db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/predicatePushdown.sql (original)
+++ db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/predicatePushdown.sql Tue Sep 26 12:19:40 2006
@@ -113,6 +113,56 @@
       union select 'i','j','j',i from t2
       union select c1, c2, c3, c from tc;
 
+-- For DERBY-1866.  The problem for DERBY-1866 was that,
+-- when pushing predicates to subqueries beneath UNIONs,
+-- the predicates were always being pushed to the *first*
+-- table in the FROM list, regardless of whether or not
+-- that was actually the correct table.  For the test
+-- query that uses this view (see below) the predicate
+-- is supposed to be pushed to TC, so in order to repro
+-- the DERBY-1866 failure we want to make sure that TC
+-- is *not* the first table in the FROM list.  Thus we
+-- use the optimizer override to fix the join order so
+-- that TC is the second table.
+create view vz5a (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+
+-- Same as above but target FromTable in subquery is
+-- itself another subquery.
+create view vz5b (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select distinct * from tc) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+
+-- Same as above but target FromTable in subquery is
+-- another union node between two subqueries.
+create view vz5c (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select * from tc union select * from tc) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+
+-- Same as above but target FromTable in subquery is
+-- another full query with unions and subqueries.
+create view vz5d (z1, z2, z3, z4) as
+  select distinct xx1.c1, xx1.c2, 'bokibob' bb, xx1.c from
+    (select c1, c2, c3, c
+      from --DERBY-PROPERTIES joinOrder=FIXED
+        t2, (select * from tc
+         union select z1 c1, z2 c2, z3 c3, z4 c from vz5b) tc
+      where tc.c = t2.i) xx1
+  union select 'i','j','j',i from t2;
+
 -- Both sides of predicate reference aggregates.
 select x1.c1 from
   (select count(*) from t1 union select count(*) from t2) x1 (c1),
@@ -282,6 +332,62 @@
   (select distinct j from t2 union select j from t1) x2 (c2)
 where x1.z4 = x2.c2;
 
+-- Push outer where predicate down into a UNION having a
+-- a Select child with more than one table in its FROM
+-- list.  The predicate should be pushed to the correct
+-- table in the Select's FROM list.  Prior to the fix for
+-- DERBY-1866 the predicate was always being pushed to
+-- the *first* table, regardless of whether or not that
+-- was actually the correct table.  Thus the predicate
+-- "t1.i = vz5.z4" was getting pushed to table T2 even
+-- though it doesn't apply there.  The result was an
+-- ASSERT failure in sane mode and an IndexOutOfBounds
+-- exception in insane mode.  NOTE: Use of NESTEDLOOP
+-- join strategy ensures the predicate will be pushed
+-- (otherwise optimizer might choose to do a hash join
+-- and we wouldn't be testing what we want to test).
+select t1.i, vz5a.* from
+  t1
+   left outer join
+     vz5a --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5a.z4;
+
+-- Same query as above, but without the optimizer override.
+-- In this case there was another error where optimizer
+-- state involving the "joinOrder" override (see the
+-- definition of vz5a) was not properly reset, which could
+-- lead to an infinite loop.  This problem was fixed as
+-- part of DERBY-1866, as well.
+select t1.i, vz5a.* from
+  t1
+   left outer join
+     vz5a
+   on
+    t1.i = vz5a.z4;
+
+-- More tests for DERBY-1866 using more complicated views.
+select t1.i, vz5b.* from
+  t1
+   left outer join
+     vz5b --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5b.z4;
+
+select t1.i, vz5c.* from
+  t1
+   left outer join
+     vz5c --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5c.z4;
+
+select t1.i, vz5d.* from
+  t1
+   left outer join
+     vz5d --DERBY-PROPERTIES joinStrategy=NESTEDLOOP
+   on
+    t1.i = vz5d.z4;
+
 -- Queries with Select->Union->Select chains having differently-
 -- ordered result column lists with some non-column reference
 -- expressions.  In all of these queries we specify LEFT join
@@ -396,6 +502,10 @@
 drop view vz2;
 drop view vz3;
 drop view vz4;
+drop view vz5a;
+drop view vz5d;
+drop view vz5b;
+drop view vz5c;
 drop table tc;
 
 -- Now bump up the size of tables T3 and T4 to the point where