You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Andrew Shuttlewood <an...@futureroute.co.uk> on 2005/06/27 13:47:39 UTC

Distinct subselect bug?

While trying out some queries recently, we came across some weird
issues.

We sometimes work off of queries that are provided by the user, so we
heavily use subselects. This is a recent one

select distinct(col1) from (select col1,col2,col3,col4 from table) t

Interestingly enough, this doesn't actually work.

I've attached a sample CSV file to demonstrate some sample data.

With this query

select distinct name from (select name,ID from names) n

And a schema of ID:string, Name: String

I get

"Carl"
"Anna"
"Julian"
"Julian"
"Julian"
"Andrew"
"Andrew"
"Andrew"
"Ben"
"George"
"George"
"George"
etc.

if I omit the other column (ie, 'select distinct name from (select name
from names) n' then it works fine, and it also works okay if I flatten
the query.

It ALSO works okay if I do an ORDER BY on the top entry. ie

select distinct name from (select name,ID from names) n ORDER BY name

I assume this is a bug?
names.csv: attached

Re: Distinct subselect bug?

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
Hi,

I browsed the derby user archives and didn't see any responses to your
(very old) article. This is definitely a bug and should be filed in
Jira. I wasn't able to find a Jira issue for this particular bug, so
I'll look into it tomorrow and file an issue if no one points out
something obvious that I have missed.

For the record: I can verify that the reported behaviour also is
present in the latest development version of Derby.

-- 
Knut Anders

Andrew Shuttlewood <an...@futureroute.co.uk> writes:

> While trying out some queries recently, we came across some weird
> issues.
>
> We sometimes work off of queries that are provided by the user, so we
> heavily use subselects. This is a recent one
>
> select distinct(col1) from (select col1,col2,col3,col4 from table) t
>
> Interestingly enough, this doesn't actually work.
>
> I've attached a sample CSV file to demonstrate some sample data.
>
> With this query
>
> select distinct name from (select name,ID from names) n
>
> And a schema of ID:string, Name: String
>
> I get
>
> "Carl"
> "Anna"
> "Julian"
> "Julian"
> "Julian"
> "Andrew"
> "Andrew"
> "Andrew"
> "Ben"
> "George"
> "George"
> "George"
> etc.
>
> if I omit the other column (ie, 'select distinct name from (select name
> from names) n' then it works fine, and it also works okay if I flatten
> the query.
>
> It ALSO works okay if I do an ORDER BY on the top entry. ie
>
> select distinct name from (select name,ID from names) n ORDER BY name
>
> I assume this is a bug?
> names.csv: attached