You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org> on 2005/08/12 11:08:54 UTC
[jira] Created: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
SELECT DISTINCT returns duplicates when selecting from subselects
-----------------------------------------------------------------
Key: DERBY-504
URL: http://issues.apache.org/jira/browse/DERBY-504
Project: Derby
Type: Bug
Components: SQL
Versions: 10.2.0.0
Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
Reporter: Knut Anders Hatlen
Priority: Minor
When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
0 rows inserted/updated/deleted
ij> INSERT INTO names (id, name) VALUES
(1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
(4, 'Carl'), (5, 'Ben'), (6, 'Anna');
6 rows inserted/updated/deleted
ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
NAME
----------
Anna
Ben
Carl
Carl
Ben
Anna
Six names are returned, although only three names should have been returned.
When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
NAME
----------
Anna
Ben
Carl
3 rows selected
ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
NAME
----------
Anna
Ben
Carl
3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320856 ]
Knut Anders Hatlen commented on DERBY-504:
------------------------------------------
After this last patch is applied, the only difference between the main canons and the j9_22 canons is the eol-style. This means that the committer could just remove the entire java/testing/org/apache/derbyTesting/functionTests/master/j9_22 directory if he or she pleases.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Reopened: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen reopened DERBY-504:
--------------------------------------
Reopen the bug so the fix can be included in 10.1.2. Since the bug was reported on derby-user before 10.1.1 was released, I think it should be fixed in 10.1.2.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.1.2.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.1.2.0
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen closed DERBY-504:
------------------------------------
Fix Version: 10.1.1.2
Resolution: Fixed
Fixed in trunk (revision 267239) and 10.1 (revision 306964).
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.1.2.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.2.0.0, 10.1.2.0, 10.1.1.2
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
Re: [jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates
when selecting from subselects
Posted by Daniel John Debrunner <dj...@debrunners.com>.
Knut Anders Hatlen (JIRA) wrote:
> [ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
>
> Knut Anders Hatlen updated DERBY-504:
> -------------------------------------
>
> Fix Version: 10.1.2.0
> (was: 10.2.0.0)
> Version: 10.1.2.0
> (was: 10.2.0.0)
>
> Changed fix version.
Jira allows multiple fix versions, so if this bug was already fixed in
10.2.0.0 that fix version should remain, and 10.1.2.0 added.
Dan.
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Fix Version: 10.1.2.0
(was: 10.2.0.0)
Version: 10.1.2.0
(was: 10.2.0.0)
Changed fix version.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.1.2.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.1.2.0
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320834 ]
Rick Hillegas commented on DERBY-504:
-------------------------------------
Looks great. Derball passes. Ready for a committer to check in.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Daniel John Debrunner (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322857 ]
Daniel John Debrunner commented on DERBY-504:
---------------------------------------------
Couple of questions on the 10.1 patch:
1) Does this include the cleanup patch posted 07/Sep/05 11:17 AM], or just the committed change?
2) Is it a clean merge, generated from an svn merge command, or did you need to make changes?
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: (was: DERBY-504.diff)
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Assigned: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen reassigned DERBY-504:
----------------------------------------
Assign To: Knut Anders Hatlen
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504_c.diff
DERBY-504_c.stat
Changed the patch as suggested by Rick.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Fix Version: 10.2.0.0
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.1.2.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.2.0.0, 10.1.2.0
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504.distinct.diff
Fixes the optimization bug which caused the trouble. Disables some optimization, so maybe we should fix it in another way?
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.distinct.diff
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504_c-CRLF.diff
Uploaded a new patch with windows line terminators.
Perhaps one of the committers could set svn:eol-style=native on the canon files in the j9_22 directory?
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
Re: [jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates
when selecting from subselects
Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
"Rick Hillegas (JIRA)" <de...@db.apache.org> writes:
> [ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320805 ]
>
> Rick Hillegas commented on DERBY-504:
> -------------------------------------
>
> The patch seems to be broken. I can't apply the changes to the j2me canons. Here's the output from applying the patch:
>
> patching file java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/SelectNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
> patching file java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
> patching file java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java
> patching file java/testing/org/apache/derbyTesting/functionTests/tests/lang/distinct.sql
> patching file java/testing/org/apache/derbyTesting/functionTests/master/groupBy.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/distinct.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/distinct.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/groupBy.out
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out
> Hunk #1 FAILED at 2477.
> 1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinc
> t.out.rej
> patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy.out
> Hunk #1 FAILED at 293.
> 1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy
> .out.rej
I think this is a windows/unix newline issue. All the files that were
successfully patched have the svn:eol-style property set to 'native'
(and since I created the patch on Solaris, they have unix
newlines). The two that failed do not have that property, and they
have windows (cr-lf) newlines. I guess your (windows?) patch tool got
confused when it found lines with different end-of-line style in the
same diff.
I'll submit a new patch with windows line terminators. If that doesn't
work, we'll just have to wait until Dyre has fixed DERBY-330.
--
Knut Anders
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320805 ]
Rick Hillegas commented on DERBY-504:
-------------------------------------
The patch seems to be broken. I can't apply the changes to the j2me canons. Here's the output from applying the patch:
patching file java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/SelectNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
patching file java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
patching file java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java
patching file java/testing/org/apache/derbyTesting/functionTests/tests/lang/distinct.sql
patching file java/testing/org/apache/derbyTesting/functionTests/master/groupBy.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/distinct.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/distinct.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_13/groupBy.out
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out
Hunk #1 FAILED at 2477.
1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinc
t.out.rej
patching file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy.out
Hunk #1 FAILED at 293.
1 out of 1 hunk FAILED -- saving rejects to file java/testing/org/apache/derbyTesting/functionTests/master/j9_22/groupBy
.out.rej
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504.diff
I have attached a patch which fixes the optimization bug by checking that a subquery has the same columns as the top-level query before pushing the duplicate elimination into the subquery. This patch supersedes the previously submitted patch.
This patch does not remove the optimization of SELECT DISTINCT in other cases than those that were incorrectly optimized in the original Derby code.
I have run derbyall successfully with the exception of two tests:
- lang/groupBy.sql fails because of DERBY-519
- store/encryptionKey.sql fails, but I have seen that others have had trouble with this test too, so I believe it is not related to my patch
I will submit tests for this bug later.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504-cleanup.diff
DERBY-504-cleanup.stat
Attached a clean-up patch against trunk which removes a redundant check pointed out by Satheesh. I have run derbyall successfully.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen closed DERBY-504:
------------------------------------
Fix Version: 10.2.0.0
Resolution: Fixed
Bug fixed in revision 267239.
The clean-up patch removing redundant checking and the backport are still not committed.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.2.0.0
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Closed: (DERBY-504) SELECT DISTINCT returns duplicates when
selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Knut Anders Hatlen closed DERBY-504.
------------------------------------
Resolution: Fixed
Committed DERBY-504-cleanup.diff to trunk with revision 495171.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: https://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.1.2.1
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assigned To: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.2.1.6, 10.1.2.1
>
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322909 ]
Knut Anders Hatlen commented on DERBY-504:
------------------------------------------
About the 10.1 patch:
1) Yes, it includes the cleanup patch.
2) No, it is not generated from an svn merge command, but it could be done. I incorrectly assumed that it wouldn't work since applying the trunk patch directly failed. Seems like subversion is smarter than me... ;)
Applying this command will merge the committed changes into 10.1:
svn merge -r 267238:267239 mytrunkdir my10.1dir
This doesn't include the cleanup patch, but it will fix the bug. The cleanup patch could be applied directly, but it is not important in the 10.1 backport.
There is one difference between my 10.1 patch and the svn merge. One of the j9_22 canons (java/testing/org/apache/derbyTesting/functionTests/master/j9_22/distinct.out) is different. Since I haven't tested the j9_{13,22} files, that canon must be checked by someone else anyway.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320737 ]
Rick Hillegas commented on DERBY-504:
-------------------------------------
This is a clean, compact bug fix. It could be made a little more compact: SelectNode and ProjectRestrictNode have almost identical while loops for extracting the base column reference. Please don't miss the opportunity to abstract out a common method here.
Once this change is made, I will apply the patch on my machine and run derbyall against jdk1.4.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504_c-CRLF.diff
Uploaded a new patch with windows line terminators.
Perhaps one of the committers could set svn:eol-style=native on the canon files in the j9_22 directory?
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Reopened: (DERBY-504) SELECT DISTINCT returns duplicates
when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Knut Anders Hatlen reopened DERBY-504:
--------------------------------------
Reopen to commit cleanup patch.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: https://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.1.2.1
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assigned To: Knut Anders Hatlen
> Priority: Minor
> Fix For: 10.1.2.1, 10.2.1.6
>
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: (was: DERBY-504.distinct.diff)
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12320917 ]
Satheesh Bandaram commented on DERBY-504:
-----------------------------------------
I have submitted this patch to trunk. I wonder if the change in SelectNode.java may be doing redundant work by getting BaseColumnColumnNode and checking for simpleColumns. Since we have already matched (resultColumns.countNumberOfSimpleColumnReferences() == resultColumns.size()), doesn't that guarantee resultColumns to be only simpleColumns?
Sending java\engine\org\apache\derby\impl\sql\compile\FromBaseTable.java
Sending java\engine\org\apache\derby\impl\sql\compile\ProjectRestrictNode.java
Sending java\engine\org\apache\derby\impl\sql\compile\ResultColumn.java
Sending java\engine\org\apache\derby\impl\sql\compile\ResultSetNode.java
Sending java\engine\org\apache\derby\impl\sql\compile\SelectNode.java
Sending java\testing\org\apache\derbyTesting\functionTests\master\distinct.out
Sending java\testing\org\apache\derbyTesting\functionTests\master\groupBy.out
Sending java\testing\org\apache\derbyTesting\functionTests\master\j9_13\distinct.out
Sending java\testing\org\apache\derbyTesting\functionTests\master\j9_13\groupBy.out
Sending java\testing\org\apache\derbyTesting\functionTests\master\j9_22\distinct.out
Sending java\testing\org\apache\derbyTesting\functionTests\master\j9_22\groupBy.out
Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\distinct.sql
Transmitting file data ............
Committed revision 267239.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504.diff
DERBY-504.stat
Added tests for the bug. No code changes from the previous patch.
The following tests were modified:
1) lang/distinct.sql: Tests for this bug was added and the master file was updated. There are master files for this test in the j9_13 and j9_22 subdirectories, but I haven't changed those files.
2) lang/groupBy.sql: No changes in the test, but the master files were updated to reflect the new behaviour (see DERBY-519). The j9_13 and j9_22 versions of the master file were also updated, but I haven't actually tested them.
Could someone with access to the j9_13 and j9_22 platforms have a look at these tests and update the master files?
This patch is now ready for review.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Updated: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=all ]
Knut Anders Hatlen updated DERBY-504:
-------------------------------------
Attachment: DERBY-504-10.1-unix.diff
DERBY-504-10.1-windows.diff
DERBY-504-10.1.stat
I have attached a patch for 10.1. Because of the svn:eol-style issue in some of the tests, I have included one patch which can be applied on windows and one which can be applied on unix-like systems. The j9_13 and j9_22 canons are modified but not tested.
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504-10.1-unix.diff, DERBY-504-10.1-windows.diff, DERBY-504-10.1.stat, DERBY-504-cleanup.diff, DERBY-504-cleanup.stat, DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (DERBY-504) SELECT DISTINCT returns duplicates when selecting from subselects
Posted by "Knut Anders Hatlen (JIRA)" <de...@db.apache.org>.
[ http://issues.apache.org/jira/browse/DERBY-504?page=comments#action_12322650 ]
Knut Anders Hatlen commented on DERBY-504:
------------------------------------------
It is true that the change in SelectNode is doing redundant work, but not because (resultColumns.countNumberOfSimpleColumnReferences() == resultColumns.size()) guarantees that all result columns are simple columns (if we by "simple columns" mean a column in a base table, no aggregates etc). E.g. in the query 'SELECT a FROM (SELECT AVG(age) AS a FROM names) AS n', resultColumns.countNumberOfSimpleColumnReferences() equals resultColumns.size(), but the result column is not simple. The redundancy is the other way around: If (but not only if) all colums are simple, then (countNumberOfSimpleColumnReferences() == size()) is true.
I can submit a patch which removes this redundant checking. It doesn't seem like ResultColumnList.countNumberOfSimpleColumnReferences() is used anywhere else in the code. If I remove the call to countNumberOfSimpleColumnReferences() and it is not used anywhere else, should I then also remove the definition of the method to make the code cleaner, or should I leave the method in case it would be needed in the future?
> SELECT DISTINCT returns duplicates when selecting from subselects
> -----------------------------------------------------------------
>
> Key: DERBY-504
> URL: http://issues.apache.org/jira/browse/DERBY-504
> Project: Derby
> Type: Bug
> Components: SQL
> Versions: 10.2.0.0
> Environment: Latest development sources (SVN revision 232227), Sun JDK 1.5, Solaris/x86
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: DERBY-504.diff, DERBY-504.stat, DERBY-504_b.diff, DERBY-504_b.stat, DERBY-504_c-CRLF.diff, DERBY-504_c-CRLF.diff, DERBY-504_c.diff, DERBY-504_c.stat
>
> When one performs a select distinct on a table generated by a subselect, there sometimes are duplicates in the result. The following example shows the problem:
> ij> CREATE TABLE names (id INT PRIMARY KEY, name VARCHAR(10));
> 0 rows inserted/updated/deleted
> ij> INSERT INTO names (id, name) VALUES
> (1, 'Anna'), (2, 'Ben'), (3, 'Carl'),
> (4, 'Carl'), (5, 'Ben'), (6, 'Anna');
> 6 rows inserted/updated/deleted
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> Carl
> Ben
> Anna
> Six names are returned, although only three names should have been returned.
> When the result is explicitly sorted (using ORDER BY) or the id column is removed from the subselect, the query returns three names as expected:
> ij> SELECT DISTINCT(name) FROM (SELECT name, id FROM names) AS n ORDER BY name;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
> ij> SELECT DISTINCT(name) FROM (SELECT name FROM names) AS n;
> NAME
> ----------
> Anna
> Ben
> Carl
> 3 rows selected
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira