You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-commits@db.apache.org by bp...@apache.org on 2007/03/19 19:34:02 UTC

svn commit: r520038 - in /db/derby/code/trunk/java: engine/org/apache/derby/impl/sql/compile/ testing/org/apache/derbyTesting/functionTests/master/ testing/org/apache/derbyTesting/functionTests/tests/lang/

Author: bpendleton
Date: Mon Mar 19 11:34:00 2007
New Revision: 520038

URL: http://svn.apache.org/viewvc?view=rev&rev=520038
Log:
DERBY-1861: ASSERT when combining references and expressions in ORDER BY

An ORDER BY clause wihch combines both column references and expressions
causes the sort engine to throw an ASSERT failure in sane builds.

The data structure problems that are exposed by DERBY-1861 both have to do
with the duplicate elimination processing. When the duplicate pulled-up
columns are eliminated from the result column list, the OrderByColumn and
ResultColumn instances may both end up with incorrect values.

The OrderByColumn class contains a field named addedColumnOffset. This
field records the offset of this particular OrderByColumn within the
portion of the result column list which contains pulled-up columns.
Each time a column is pulled up into the result column list, its
addedColumnOffset is set; thus the first pulled-up column has
addedColumnOffset = 0, the second pulled-up column has
addedColumnOffset = 1, etc.

However, later, when duplicate pulled-up result columns are detected
and removed by bind processing, the addedColumnOffset field is not
re-adjusted, and ends up with an invalid value. 

The ResultColumn class contains a field named virtualColumnId. For columns
which are not directly from the underlying table, but rather are the result
of expressions that are computed at runtime, the columns are assigned a
virtualColumnId. For reasons similar to those of the addedColumnOffset,
this field also ends up wiht an invalid value when the duplicate
pulled-up columns are detected and removed from the result column list.

I decided that the best thing was to arrange to call each of the
OrderByColumn instances and ResultColumn instances at the point that
the duplicate result column is detected and removed, to give each of
those objects a chance to adjust its addedColumnOffset and
virtualColumnId value to reflect the removed column. Although this change
required a number of small changes, none of them was terribly complicated,
and the effect of the fix is that the data structures are as desired.

Modified:
    db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java
    db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByList.java
    db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
    db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java
    db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/orderby.out
    db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/orderby.sql

Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java (original)
+++ db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByColumn.java Mon Mar 19 11:34:00 2007
@@ -46,6 +46,7 @@
 	private ResultColumn	resultCol;
 	private boolean			ascending = true;
 	private ValueNode expression;
+	private OrderByList     list;
     /**
      * If this sort key is added to the result column list then it is at result column position
      * 1 + resultColumnList.size() - resultColumnList.getOrderBySelect() + addedColumnOffset
@@ -140,14 +141,25 @@
 	/**
 	 * Bind this column.
 	 *
+	 * During binding, we may discover that this order by column was pulled
+	 * up into the result column list, but is now a duplicate, because the
+	 * actual result column was expanded into the result column list when "*"
+	 * expressions were replaced with the list of the table's columns. In such
+	 * a situation, we will end up calling back to the OrderByList to
+	 * adjust the addedColumnOffset values of the columns; the "oblist"
+	 * parameter exists to allow that callback to be performed.
+	 *
 	 * @param target	The result set being selected from
+	 * @param oblist    OrderByList which contains this column
 	 *
 	 * @exception StandardException		Thrown on error
 	 * @exception StandardException		Thrown when column not found
 	 */
-	public void bindOrderByColumn(ResultSetNode target)
+	public void bindOrderByColumn(ResultSetNode target, OrderByList oblist)
 				throws StandardException 
 	{
+		this.list = oblist;
+
 		if(expression instanceof ColumnReference){
 		
 			ColumnReference cr = (ColumnReference) expression;
@@ -201,8 +213,8 @@
 
 			ColumnReference cr = (ColumnReference) expression;
 
-			resultCol = targetCols.getOrderByColumn(cr.getColumnName(),
-                                                    cr.getTableNameNode());
+			resultCol = targetCols.findResultColumnForOrderBy(
+                    cr.getColumnName(), cr.getTableNameNode());
 
 			if(resultCol == null){
 				resultCol = (ResultColumn) getNodeFactory().getNode(C_NodeTypes.RESULT_COLUMN,
@@ -333,9 +345,10 @@
 
 		ResultColumnList	targetCols = target.getResultColumns();
 
-		resultCol = targetCols.getOrderByColumn(cr.getColumnName(),
+		resultCol = targetCols.getOrderByColumnToBind(cr.getColumnName(),
 							cr.getTableNameNode(),
-							sourceTableNumber);
+							sourceTableNumber,
+							this);
         /* Search targetCols before using addedColumnOffset because select list wildcards, '*',
          * are expanded after pullUpOrderByColumn is called. A simple column reference in the
          * order by clause may be found in the user specified select list now even though it was
@@ -353,4 +366,36 @@
 
 	}
 
+	/**
+	 * Reset addedColumnOffset to indicate that column is no longer added
+	 *
+	 * An added column is one which was artificially added to the result
+	 * column list due to its presence in the ORDER BY clause, as opposed to
+	 * having been explicitly selected by the user. Since * is not expanded
+	 * until after the ORDER BY columns have been pulled up, we may add a
+	 * column, then later decide it is a duplicate of an explicitly selected
+	 * column. In that case, this method is called, and it does the following:
+	 * - resets addedColumnOffset to -1 to indicate this is not an added col
+	 * - calls back to the OrderByList to adjust any other added cols
+	 */
+	void clearAddedColumnOffset()
+	{
+		list.closeGap(addedColumnOffset);
+		addedColumnOffset = -1;
+	}
+	/**
+	 * Adjust addedColumnOffset to reflect that a column has been removed
+	 *
+	 * This routine is called when a previously-added result column has been
+	 * removed due to being detected as a duplicate. If that added column had
+	 * a lower offset than our column, we decrement our offset to reflect that
+	 * we have just been moved down one slot in the result column list.
+	 *
+	 * @param gap   offset of the column which has just been removed from list
+	 */
+	void collapseAddedColumnGap(int gap)
+	{
+		if (addedColumnOffset > gap)
+			addedColumnOffset--;
+	}
 }

Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByList.java
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByList.java?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByList.java (original)
+++ db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OrderByList.java Mon Mar 19 11:34:00 2007
@@ -150,7 +150,7 @@
 		for (int index = 0; index < size; index++)
 		{
 			OrderByColumn obc = (OrderByColumn) elementAt(index);
-			obc.bindOrderByColumn(target);
+			obc.bindOrderByColumn(target, this);
 
 			/*
 			** Always sort if we are ordering on an expression, and not
@@ -161,6 +161,30 @@
 			{
 				alwaysSort = true;
 			}
+		}
+	}
+	
+	/**
+	 * Adjust addedColumnOffset values due to removal of a duplicate column
+	 *
+	 * This routine is called by bind processing when it identifies and
+	 * removes a column from the result column list which was pulled up due
+	 * to its presence in the ORDER BY clause, but which was later found to
+	 * be a duplicate. The OrderByColumn instance for the removed column
+	 * has been adjusted to point to the true column in the result column
+	 * list and its addedColumnOffset has been reset to -1. This routine
+	 * finds any other OrderByColumn instances which had an offset greater
+	 * than that of the column that has been deleted, and decrements their
+	 * addedColumOffset to account for the deleted column's removal.
+	 *
+	 * @param gap   column which has been removed from the result column list
+	 */
+	void closeGap(int gap)
+	{
+		for (int index = 0; index < size(); index++)
+		{
+			OrderByColumn obc = (OrderByColumn) elementAt(index);
+			obc.collapseAddedColumnGap(gap);
 		}
 	}
 

Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java (original)
+++ db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java Mon Mar 19 11:34:00 2007
@@ -467,6 +467,26 @@
 	}
 
 	/**
+	 * Adjust this virtualColumnId to account for the removal of a column
+	 *
+	 * This routine is called when bind processing finds and removes
+	 * duplicate columns in the result list which were pulled up due to their
+	 * presence in the ORDER BY clause, but were later found to be duplicate.
+	 * 
+	 * If this column is a virtual column, and if this column's virtual
+	 * column id is greater than the column id which is being removed, then
+	 * we must logically shift this column to the left by decrementing its
+	 * virtual column id.
+	 *
+	 * @param removedColumnId   id of the column being removed.
+	 */
+	public void collapseVirtualColumnIdGap(int removedColumnId)
+	{
+		if (columnDescriptor == null && virtualColumnId > removedColumnId)
+			virtualColumnId--;
+	}
+
+	/**
 	 * Generate a unique (across the entire statement) column name for unnamed
 	 * ResultColumns
 	 *

Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java (original)
+++ db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java Mon Mar 19 11:34:00 2007
@@ -89,8 +89,8 @@
      * statement uses the "*" token to select all the columns from a table,
      * then during ORDER BY parsing we redundantly generate the columns
      * mentioned in the ORDER BY clause into the ResultColumnlist, but then
-     * later in getOrderByColumn we determine that these are duplicates and
-     * we take them back out again.
+     * later in getOrderByColumnToBind we determine that these are
+     * duplicates and we take them back out again.
      */
 
 	/*
@@ -385,18 +385,47 @@
 	}
 
 	/**
-	 * For order by, get a ResultColumn that matches the specified 
+	 * For order by column bind, get a ResultColumn that matches the specified 
 	 * columnName.
 	 *
+	 * This method is called during bind processing, in the special
+	 * "bind the order by" call that is made by CursorNode.bindStatement().
+	 * The OrderByList has a special set of bind processing routines
+	 * that analyzes the columns in the ORDER BY list and verifies that
+	 * each column is one of:
+	 * - a direct reference to a column explicitly mentioned in
+	 *   the SELECT list
+	 * - a direct reference to a column implicitly mentioned as "SELECT *"
+	 * - a direct reference to a column "pulled up" into the result
+	 *   column list
+	 * - or a valid and fully-bound expression ("c+2", "YEAR(hire_date)", etc.)
+	 *
+	 * At this point in the processing, it is possible that we'll find
+	 * the column present in the RCL twice: once because it was pulled
+	 * up during statement compilation, and once because it was added
+	 * when "SELECT *" was expanded into the table's actual column list.
+	 * If we find such a duplicated column, we can, and do, remove the
+	 * pulled-up copy of the column and point the OrderByColumn
+	 * to the actual ResultColumn from the *-expansion.
+	 *
+	 * Note that the association of the OrderByColumn with the
+	 * corresponding ResultColumn in the RCL occurs in
+	 * OrderByColumn.resolveAddedColumn.
+	 *
 	 * @param columnName	The ResultColumn to get from the list
 	 * @param tableName	The table name on the OrderByColumn, if any
 	 * @param tableNumber	The tableNumber corresponding to the FromTable with the
 	 *						exposed name of tableName, if tableName != null.
+	 * @param obc           The OrderByColumn we're binding.
 	 *
 	 * @return	the column that matches that name.
 	 * @exception StandardException thrown on ambiguity
 	 */
-	public ResultColumn getOrderByColumn(String columnName, TableName tableName, int tableNumber)
+	public ResultColumn getOrderByColumnToBind(
+            String columnName,
+            TableName tableName,
+            int tableNumber,
+            OrderByColumn obc)
 		throws StandardException
 	{
 		int				size = size();
@@ -455,6 +484,9 @@
 				{// remove the column due to pullup of orderby item
 					removeElement(resultColumn);
 					decOrderBySelect();
+					obc.clearAddedColumnOffset();
+					collapseVirtualColumnIdGap(
+							resultColumn.getColumnPosition());
 					break;
 				}
 			}
@@ -462,18 +494,58 @@
 		return retVal;
 	}
 
+	/**
+	 * Adjust virtualColumnId values due to result column removal
+	 *
+	 * This method is called when a duplicate column has been detected and
+	 * removed from the list. We iterate through each of the other columns
+	 * in the list and notify them of the column removal so they can adjust
+	 * their virtual column id if necessary.
+	 *
+	 * @param gap   id of the column which was just removed.
+	 */
+	private void collapseVirtualColumnIdGap(int gap)
+	{
+		for (int index = 0; index < size(); index++)
+			((ResultColumn) elementAt(index)).collapseVirtualColumnIdGap(gap);
+	}
+
 
 	/**
 	 * For order by, get a ResultColumn that matches the specified 
 	 * columnName.
 	 *
+	 * This method is called during pull-up processing, at the very
+	 * start of bind processing, as part of
+	 * OrderByList.pullUpOrderByColumns. Its job is to figure out
+	 * whether the provided column (from the ORDER BY list) already
+	 * exists in the ResultColumnList or not. If the column does
+	 * not exist in the RCL, we return NULL, which signifies that
+	 * a new ResultColumn should be generated and added ("pulled up")
+	 * to the RCL by our caller.
+	 *
+	 * Note that at this point in the processing, we should never
+	 * find this column present in the RCL multiple times; if the
+	 * column is already present in the RCL, then we don't need to,
+	 * and won't, pull a new ResultColumn up into the RCL.
+	 *
+	 * If the caller specified "SELECT *", then the RCL at this
+	 * point contains a special AllResultColumn object. This object
+	 * will later be expanded and replaced by the actual set of
+	 * columns in the table, but at this point we don't know what
+	 * those columns are, so we may pull up an OrderByColumn
+	 * which duplicates a column in the *-expansion; such
+	 * duplicates will be removed at the end of bind processing
+	 * by OrderByList.bindOrderByColumns.
+	 *
 	 * @param columnName	The ResultColumn to get from the list
 	 * @param tableName	The table name on the OrderByColumn, if any
 	 *
-	 * @return	the column that matches that name.
+	 * @return	the column that matches that name, or NULL if pull-up needed
 	 * @exception StandardException thrown on ambiguity
 	 */
-	public ResultColumn getOrderByColumn(String columnName, TableName tableName)
+	public ResultColumn findResultColumnForOrderBy(
+                            String columnName, TableName tableName)
 		throws StandardException
 	{
 		int				size = size();
@@ -513,10 +585,10 @@
 					throw StandardException.newException(SQLState.LANG_DUPLICATE_COLUMN_FOR_ORDER_BY, columnName);
 				}
 				else if (index >= size - orderBySelect)
-				{// remove the column due to pullup of orderby item
-					removeElement(resultColumn);
-					decOrderBySelect();
-					break;
+				{
+					SanityManager.THROWASSERT(
+							"Unexpectedly found ORDER BY column '" +
+							columnName + "' pulled up at position " +index);
 				}
 			}
 		}
@@ -3973,7 +4045,7 @@
 		orderBySelect++;
 	}
 
-	public void decOrderBySelect()
+	private void decOrderBySelect()
 	{
 		orderBySelect--;
 	}

Modified: db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/orderby.out
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/orderby.out?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/orderby.out (original)
+++ db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/orderby.out Mon Mar 19 11:34:00 2007
@@ -1440,4 +1440,49 @@
 -----------------------
 1          |2          
 40         |40         
+ij> -- Tests which verify the handling of expressions in the ORDER BY list
+-- related to DERBY-1861. The issue in DERBY-1861 has to do with how the
+-- compiler handles combinations of expressions and simple columns in the
+-- ORDER BY clause, so we try a number of such combinations
+
+create table derby1861 (a int, b int, c int, d int);
+0 rows inserted/updated/deleted
+ij> insert into derby1861 values (1, 2, 3, 4);
+1 row inserted/updated/deleted
+ij> select * from derby1861 order by a, b, c+2;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
+ij> select a, c from derby1861 order by a, b, c-4;
+A          |C          
+-----------------------
+1          |3          
+ij> select t.* from derby1861 t order by t.a, t.b, t.c+2;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
+ij> select a, b, a, c, d from derby1861 order by b, c-1, a;
+A          |B          |A          |C          |D          
+-----------------------------------------------------------
+1          |2          |1          |3          |4          
+ij> select * from derby1861 order by a, c+2, a;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
+ij> select * from derby1861 order by c-1, c+1, a, b, c * 6;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
+ij> select t.*, t.c+2 from derby1861 t order by a, b, c+2;
+A          |B          |C          |D          |5          
+-----------------------------------------------------------
+1          |2          |3          |4          |5          
+ij> select * from derby1861 order by 3, 1;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
+ij> select * from derby1861 order by 2, a-2;
+A          |B          |C          |D          
+-----------------------------------------------
+1          |2          |3          |4          
 ij> 

Modified: db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/orderby.sql
URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/orderby.sql?view=diff&rev=520038&r1=520037&r2=520038
==============================================================================
--- db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/orderby.sql (original)
+++ db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/orderby.sql Mon Mar 19 11:34:00 2007
@@ -517,3 +517,20 @@
 -- select * from derby147_b order by b, a+2;
 -- Verify that correlation names match the table names properly:
 select t.a, sum(t.a) from derby147_a t group by t.a order by t.a;
+
+-- Tests which verify the handling of expressions in the ORDER BY list
+-- related to DERBY-1861. The issue in DERBY-1861 has to do with how the
+-- compiler handles combinations of expressions and simple columns in the
+-- ORDER BY clause, so we try a number of such combinations
+
+create table derby1861 (a int, b int, c int, d int);
+insert into derby1861 values (1, 2, 3, 4);
+select * from derby1861 order by a, b, c+2;
+select a, c from derby1861 order by a, b, c-4;
+select t.* from derby1861 t order by t.a, t.b, t.c+2;
+select a, b, a, c, d from derby1861 order by b, c-1, a;
+select * from derby1861 order by a, c+2, a;
+select * from derby1861 order by c-1, c+1, a, b, c * 6;
+select t.*, t.c+2 from derby1861 t order by a, b, c+2;
+select * from derby1861 order by 3, 1;
+select * from derby1861 order by 2, a-2;