You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Army <qo...@gmail.com> on 2007/04/21 00:56:48 UTC

Change in behavior caused by proposed DERBY-2370 changes: is this okay?

While looking into DERBY-2370 I noticed the following behavior in Derby:

create table onecol (j int);
create table twocol (i int, j int);

-- These first two fail as expected since left and right children of
-- the UNION do not have the same number of columns.

ij> select j from onecol union select i, j from twocol;
ERROR 42X58: The number of columns on the left and right sides of the UNION must 
be the same.

ij> select j from onecol union select * from twocol;
ERROR 42X58: The number of columns on the left and right sides of the UNION must 
be the same.

-- This one also fails as expected since left and right children of the
-- nested UNION do not have the same number of columns.

ij> select j from onecol where exists
       (select j from onecol union select i, j from twocol);
ERROR 42X58: The number of columns on the left and right sides of the UNION must 
be the same.

-- But if an asterisk ("*") is used within the nested UNION, Derby does *not*
-- throw an error:

ij> select j from onecol where exists
       (select j from onecol union select * from twocol);
J
-----------

0 rows selected

The following comments in SubqueryNode.java seem to suggest that this behavior 
is intentional:

         /* For an EXISTS subquery:
          *    o  If the SELECT list is a "*", then we convert it to a true.
          *       (We need to do the conversion since we don't want the "*" to
          *       get expanded.)
          *  o  We then must bind the expression under the SELECT list to
          *       verify that it is a valid expression.  (We must do this as a
          *       separate step because we need to validate the expression and
          *       we need to handle EXISTS (select * ... union all select 1 ...)
          *       without getting a type compatability error.)
          */

Or put differently, Derby transforms the result column lists of the union's 
children into TRUE constants, so the "*" becomes "TRUE" and thus is never expanded:

   select j from onecol where exists
       (select TRUE from onecol union select TRUE from twocol);

Because it is never expanded Derby only "sees" one column from each child and 
thus does not throw an error.

With the patch for DERBY-2370, though, this query will fail with error 42X58, 
like all of the others.  The reason is that we no longer transform the result 
column lists for the UNION's children into constants (which was wrong), and thus 
the "*" survives the transformation and is later expanded into the proper 
columns.  Then as a result of the expansion we end up with one result column for 
the left child of the UNION and two result columns for the right.  I ran this 
query against DB2 and it fails there, as well.

My feeling is that the behavior with the patch (i.e. error 42X58) is actually 
the correct behavior.  That will of course affect existing applications, but 
given that a) it seems right (to me) to throw an error, and b) the new behavior 
leads to different (correct) results for the queries reported in DERBY-2370, 
which means existing applications are already going to be affected (in a good 
way), I'm thinking this is the right way to go.

But I'm curious to know what others may think about this particular change in 
behavior...comments/feedback?

Army


Re: Change in behavior caused by proposed DERBY-2370 changes: is this okay?

Posted by Bryan Pendleton <bp...@amberpoint.com>.
Hi Army,

I read through your notes on DERBY-2370, and I reviewed the
behavior change that you noted.

I think the behavior change is fine, and I think your fix for
DERBY-2370 is sound.

Thanks very much for the attention to detail in writing the
notes, and for the clear examples. It is very helpful, and
very informative!

thanks,

bryan