You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org> on 2005/08/12 11:08:54 UTC

[jira] Created: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

SELECT DISTINCT returns duplicates when selecting from subselects
-----------------------------------------------------------------

         Key: DERBY-504
         URL: http://issues.apache.org/jira/browse/DERBY-504
     Project: Derby
        Type: Bug
  Components: SQL  
    Versions: 10.2.0.0    
 Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
    Reporter: Knut Anders Hatlen
    Priority: Minor


When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:

ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
0 rows inserted/updated/deleted
ij> INSERT INTO names (id, name) VALUES
       (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
       (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
6 rows inserted/updated/deleted
ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
NAME      
----------
Anna      
Ben       
Carl      
Carl      
Ben       
Anna      

Six names are returned, although only three names should have been returned.

When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:

ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
NAME      
----------
Anna      
Ben       
Carl      

3 rows selected
ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
NAME      
----------
Anna      
Ben       
Carl      

3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320856 ] 

Knut Anders Hatlen commented on DERBY-504:
------------------------------------------

After this last patch is applied, the only difference between the main canons and the j9_22 canons is the eol-style. This means that the committer could just remove the entire java/testing/org/apache/derbyTesting/functionTests/master/j9_22 directory if he or she pleases.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Reopened: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
     
Knut Anders Hatlen reopened DERBY-504:
--------------------------------------


Reopen the bug so the fix can be included in 10.1.2. Since the bug was reported on derby-user before 10.1.1 was released, I think it should be fixed in 10.1.2.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.1.2.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>      Fix For: 10.1.2.0
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
     
Knut Anders Hatlen closed DERBY-504:
------------------------------------

    Fix Version: 10.1.1.2
     Resolution: Fixed

Fixed in trunk (revision 267239) and 10.1 (revision 306964).

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.1.2.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>      Fix For: 10.2.0.0, 10.1.2.0, 10.1.1.2
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Re: [jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by Daniel John Debrunner <dj...@debrunners.com>.

Knut Anders Hatlen (JIRA) wrote:

>      [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
> 
> Knut Anders Hatlen updated DERBY-504:
> -------------------------------------
> 
>     Fix Version: 10.1.2.0
>                      (was: 10.2.0.0)
>         Version: 10.1.2.0
>                      (was: 10.2.0.0)
> 
> Changed fix version.

Jira allows multiple fix versions, so if this bug was already fixed in
10.2.0.0 that fix version should remain, and 10.1.2.0 added.

Dan.

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Fix Version: 10.1.2.0
                     (was: 10.2.0.0)
        Version: 10.1.2.0
                     (was: 10.2.0.0)

Changed fix version.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.1.2.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>      Fix For: 10.1.2.0
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320834 ] 

Rick Hillegas commented on DERBY-504:
-------------------------------------

Looks great. Derball passes. Ready for a committer to check in.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Daniel John Debrunner (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322857 ] 

Daniel John Debrunner commented on DERBY-504:
---------------------------------------------

Couple of questions on the 10.1 patch:

1) Does this include the cleanup patch posted 07/Sep/05 11:17 AM], or just the committed change?
2) Is it a clean merge, generated from an svn merge command,  or did you need to make changes?

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment:     (was: DERBY-504.diff)

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor

>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Assigned: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen reassigned DERBY-504:
----------------------------------------

    Assign To: Knut Anders Hatlen

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor

>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504_c.diff
                DERBY-504_c.stat

Changed the patch as suggested by Rick.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Fix Version: 10.2.0.0

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.1.2.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>      Fix For: 10.2.0.0, 10.1.2.0
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504.distinct.diff

Fixes the optimization bug which caused the trouble. Disables some optimization, so maybe we should fix it in another way?

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.distinct.diff
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504_c-CRLF.diff

Uploaded a new patch with windows line terminators.

Perhaps one of the committers could set svn:eol-style=native on the canon files in the j9_22 directory?

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Re: [jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.

"Rick Hillegas (JIRA)" <de...@db.apache.org> writes:

>     [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320805 ] 
>
> Rick Hillegas commented on DERBY-504:
> -------------------------------------
>
> The patch seems to be broken. I can't apply the changes to the j2me canons. Here's the output from applying the patch:
>
> patching file java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/SelectNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
> patching file java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java
> patching file java/testing/org/apache/derbyTesting/functionTests/tests/lang/distinct.sql
> patching file java/testing/org/apache/derbyTesting/functionTests/master/groupBy.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/distinct.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/distinct.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/groupBy.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out
> Hunk #1 FAILED at 2477.
> 1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinc
> t.out.rej
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy.out
> Hunk #1 FAILED at 293.
> 1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy
> .out.rej

I think this is a windows/unix newline issue. All the files that were
successfully patched have the svn:eol-style property set to 'native'
(and since I created the patch on Solaris, they have unix
newlines). The two that failed do not have that property, and they
have windows (cr-lf) newlines. I guess your (windows?) patch tool got
confused when it found lines with different end-of-line style in the
same diff.

I'll submit a new patch with windows line terminators. If that doesn't
work, we'll just have to wait until Dyre has fixed DERBY-330.

-- 
Knut Anders

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320805 ] 

Rick Hillegas commented on DERBY-504:
-------------------------------------

The patch seems to be broken. I can't apply the changes to the j2me canons. Here's the output from applying the patch:

patching file java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/SelectNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
patching file java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java
patching file java/testing/org/apache/derbyTesting/functionTests/tests/lang/distinct.sql
patching file java/testing/org/apache/derbyTesting/functionTests/master/groupBy.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/distinct.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/distinct.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/groupBy.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out
Hunk #1 FAILED at 2477.
1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinc
t.out.rej
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy.out
Hunk #1 FAILED at 293.
1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy
.out.rej


> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504.diff

I have attached a patch which fixes the optimization bug by checking that a subquery has the same columns as the top-level query before pushing the duplicate elimination into the subquery. This patch supersedes the previously submitted patch.

This patch does not remove the optimization of SELECT DISTINCT in other cases than those that were incorrectly optimized in the original Derby code.

I have run derbyall successfully with the exception of two tests:
   - lang/groupBy.sql fails because of DERBY-519
   - store/encryptionKey.sql fails, but I have seen that others have had trouble with this test too, so I believe it is not related to my patch

I will submit tests for this bug later.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504-cleanup.diff
                DERBY-504-cleanup.stat

Attached a clean-up patch against trunk which removes a redundant check pointed out by Satheesh. I have run derbyall successfully.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
     
Knut Anders Hatlen closed DERBY-504:
------------------------------------

    Fix Version: 10.2.0.0
     Resolution: Fixed

Bug fixed in revision 267239.

The clean-up patch removing redundant checking and the backport are still not committed.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>      Fix For: 10.2.0.0
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen closed DERBY-504.
------------------------------------

    Resolution: Fixed

Committed DERBY-504-cleanup.diff to trunk with revision 495171.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>                 Key: DERBY-504
>                 URL: https://issues.apache.org/jira/browse/DERBY-504
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.2.1
>         Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>            Reporter: Knut Anders Hatlen
>         Assigned To: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.2.1.6, 10.1.2.1
>
>         Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322909 ] 

Knut Anders Hatlen commented on DERBY-504:
------------------------------------------

About the 10.1 patch:

1) Yes, it includes the cleanup patch.
2) No, it is not generated from an svn merge command, but it could be done. I incorrectly assumed that it wouldn't work since applying the trunk patch directly failed. Seems like subversion is smarter than me... ;)

Applying this command will merge the committed changes into 10.1:

     svn merge -r 267238:267239 mytrunkdir my10.1dir

This doesn't include the cleanup patch, but it will fix the bug. The cleanup patch could be applied directly, but it is not important in the 10.1 backport.

There is one difference between my 10.1 patch and the svn merge. One of the j9_22 canons (java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out) is different. Since I haven't tested the j9_{13,22} files, that canon must be checked by someone else anyway.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320737 ] 

Rick Hillegas commented on DERBY-504:
-------------------------------------

This is a clean, compact bug fix. It could be made a little more compact: SelectNode and ProjectRestrictNode have almost identical while loops for extracting the base column reference. Please don't miss the opportunity to abstract out a common method here.

Once this change is made, I will apply the patch on my machine and run derbyall against jdk1.4.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504_c-CRLF.diff

Uploaded a new patch with windows line terminators.

Perhaps one of the committers could set svn:eol-style=native on the canon files in the j9_22 directory?

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Reopened: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen reopened DERBY-504:
--------------------------------------


Reopen to commit cleanup patch.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>                 Key: DERBY-504
>                 URL: https://issues.apache.org/jira/browse/DERBY-504
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.2.1
>         Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>            Reporter: Knut Anders Hatlen
>         Assigned To: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.1.2.1, 10.2.1.6
>
>         Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment:     (was: DERBY-504.distinct.diff)

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor

>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320917 ] 

Satheesh Bandaram commented on DERBY-504:
-----------------------------------------

I have submitted this patch to trunk. I wonder if the change in SelectNode.java may be doing redundant work by getting BaseColumnColumnNode and checking for simpleColumns. Since we have already matched (resultColumns.countNumberOfSimpleColumnReferences() == resultColumns.size()), doesn't that guarantee resultColumns to be only simpleColumns?

Sending        java\engine\org\apache\derby\impl\sql\compile\FromBaseTable.java
Sending        java\engine\org\apache\derby\impl\sql\compile\ProjectRestrictNode.java
Sending        java\engine\org\apache\derby\impl\sql\compile\ResultColumn.java
Sending        java\engine\org\apache\derby\impl\sql\compile\ResultSetNode.java
Sending        java\engine\org\apache\derby\impl\sql\compile\SelectNode.java
Sending        java\testing\org\apache\derbyTesting\functionTests\master\distinct.out
Sending        java\testing\org\apache\derbyTesting\functionTests\master\groupBy.out
Sending        java\testing\org\apache\derbyTesting\functionTests\master\j9_13\distinct.out
Sending        java\testing\org\apache\derbyTesting\functionTests\master\j9_13\groupBy.out
Sending        java\testing\org\apache\derbyTesting\functionTests\master\j9_22\distinct.out
Sending        java\testing\org\apache\derbyTesting\functionTests\master\j9_22\groupBy.out
Sending        java\testing\org\apache\derbyTesting\functionTests\tests\lang\distinct.sql
Transmitting file data ............
Committed revision 267239.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504.diff
                DERBY-504.stat

Added tests for the bug. No code changes from the previous patch.

The following tests were modified:

   1) lang/distinct.sql: Tests for this bug was added and the master file was updated. There are master files for this test in the j9_13 and j9_22 subdirectories, but I haven't changed those files.

   2) lang/groupBy.sql: No changes in the test, but the master files were updated to reflect the new behaviour (see DERBY-519). The j9_13 and j9_22 versions of the master file were also updated, but I haven't actually tested them.

Could someone with access to the j9_13 and j9_22 platforms have a look at these tests and update the master files?

This patch is now ready for review.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

     [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]

Knut Anders Hatlen updated DERBY-504:
-------------------------------------

    Attachment: DERBY-504-10.1-unix.diff
                DERBY-504-10.1-windows.diff
                DERBY-504-10.1.stat

I have attached a patch for 10.1. Because of the svn:eol-style issue in some of the tests, I have included one patch which can be applied on windows and one which can be applied on unix-like systems. The j9_13 and j9_22 canons are modified but not tested.

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects

Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322650 ] 

Knut Anders Hatlen commented on DERBY-504:
------------------------------------------

It is true that the change in SelectNode is doing redundant work, but not because (resultColumns.countNumberOfSimpleColumnReferences() == resultColumns.size()) guarantees that all result columns are simple columns (if we by "simple columns" mean a column in a base table, no aggregates etc). E.g. in the query 'SELECT a FROM (SELECT AVG(age) AS a FROM names) AS n', resultColumns.countNumberOfSimpleColumnReferences() equals resultColumns.size(), but the result column is not simple. The redundancy is the other way around: If (but not only if) all colums are simple, then (countNumberOfSimpleColumnReferences() == size()) is true.

I can submit a patch which removes this redundant checking. It doesn't seem like ResultColumnList.countNumberOfSimpleColumnReferences() is used anywhere else in the code. If I remove the call to countNumberOfSimpleColumnReferences() and it is not used anywhere else, should I then also remove the definition of the method to make the code cleaner, or should I leave the method in case it would be needed in the future?

> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
>          Key: DERBY-504
>          URL: http://issues.apache.org/jira/browse/DERBY-504
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.2.0.0
>  Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
>     Reporter: Knut Anders Hatlen
>     Assignee: Knut Anders Hatlen
>     Priority: Minor
>  Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
>        (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
>        (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> Carl      
> Ben       
> Anna      
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME      
> ----------
> Anna      
> Ben       
> Carl      
> 3 rows selected

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira