You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "A B (JIRA)" <de...@db.apache.org> on 2006/09/14 21:23:22 UTC
[jira] Updated: (DERBY-1852) Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.

     [ http://issues.apache.org/jira/browse/DERBY-1852?page=all ]

A B updated DERBY-1852:
-----------------------

    Attachment: d1852_quick_doNOTcommit.diff

Attaching a quick patch to fix this issue.  I stumbled across this problem while investigating DERBY-1777 and so I was able to tell what the issue is and how to fix it.  The attached patch fixes the problem and should be correct code-wise, but I haven't added any comments nor explanations, nor have I run derbyall yet.  So this is NOT to be committed.

But since I'm not sure how high this is on my priority list, I figured I'd post the patch and if anyone else wants to finalize it by adding comments/explanations and a test case (could just take the one from the description), feel free!

> Wrong results: duplicate rows returned for nested UNIONs when they should be eliminated.
> ----------------------------------------------------------------------------------------
>
>                 Key: DERBY-1852
>                 URL: http://issues.apache.org/jira/browse/DERBY-1852
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1, 10.1.3.0, 10.1.3.1
>            Reporter: A B
>         Attachments: d1852_quick_doNOTcommit.diff
>
>
> Regarding UNIONs, the Derby documentation says that if the "ALL" keyword is not specified, the default behavior is to remove duplicate rows from the result.   And for simple cases of UNIONs, that is indeed the case.  However, there are certain queries with nested UNIONs where Derby is supposed to remove duplicates from the result set, but does not.
> This is *NOT* a regression.  I have been able to reproduce the behavior in early versions of 10.1.2 and also back to 10.0.2.2, which is the oldest client I have.
> To show the problem:
> create table t1 (i int, j int);
> create table t2 (i int, j int);
> insert into t1 values (1, 2), (2, 4), (3, 6), (4, 8), (5, 10);
> insert into t2 values (1, 2), (2, -4), (3, 6), (4, -8), (5, 10);
> insert into t2 values (3, 6), (4, 8), (3, -6), (4, -8);
> -- Following query correctly returns 13 rows; the first union
> -- does not specify "ALL" and should therefore remove
> -- duplicates.
> select * from t1 union select * from t2 union all select * from t1;
> -- But if we do a join with the same query, we end up with
> -- 19 rows instead of 13.  This is because the duplicate entries
> -- are not correctly removed.
> select * from t1 left outer join
>   (select * from t1 union select * from t2 union all select * from t1) x2
> on t1.i = x2.i;
> Just for sanity, I also ran these statements against DB2 and the result was of the last query was indeed 13 rows, not 19.  So I think this is a case where Derby is returning incorrect results.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira