You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Daniel Noll <da...@nuix.com> on 2008/02/04 00:08:33 UTC

Nested query in conjunction with ordering - unreasonably slow

Hi all.

I've been trying to determine the cause for a slowness in a certain query.  
I've managed to strip everything away and get to the core of what's making it 
slow, but I'd like to understand why.

The query is of this form:

  SELECT jobs.id AS id,
         jobs.name AS name,
         (SELECT COUNT(1) FROM jobitems
           WHERE jobitems.jobid = jobs.id) AS count
    FROM jobs
   ORDER BY jobs.name

This takes around 3000ms to complete if there are 200,000 rows in jobitems.

If I remove the "count" subquery, then it runs in 1ms.  If I remove the ORDER 
BY, it runs in 1ms.

There is only one row in jobs, thus I would expect the same performance 
whether I order it or not.  The PK on jobitems is (jobid,itemid); prior posts 
on this forum have led me to believe that this creates an implicit UNIQUE 
INDEX on (jobid,itemid) and also an implicit INDEX on (jobid).  Whether this 
is the case or not, creating an additional INDEX on (jobid) itself doesn't 
speed things up.

If I remove the count part and then execute it by itself, I get a total query 
time of 3ms for the two queries.  No nested query should take longer than it 
takes to execute its parts separately.

I discovered this issue on version 10.3.1.4 but it still performs the same way 
under version 10.3.2.1.

The obvious workaround is not to order by name and sort the results on the 
Java side.  But is there a way to rewrite this query that will make Derby 
happier?

Daniel

Re: Nested query in conjunction with ordering - unreasonably slow

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
Daniel Noll <da...@nuix.com> writes:

> Hi all.
>
> I've been trying to determine the cause for a slowness in a certain query.  
> I've managed to strip everything away and get to the core of what's making it 
> slow, but I'd like to understand why.
>
> The query is of this form:
>
>   SELECT jobs.id AS id,
>          jobs.name AS name,
>          (SELECT COUNT(1) FROM jobitems
>            WHERE jobitems.jobid = jobs.id) AS count
>     FROM jobs
>    ORDER BY jobs.name
>
> This takes around 3000ms to complete if there are 200,000 rows in jobitems.
>
> If I remove the "count" subquery, then it runs in 1ms.  If I remove the ORDER 
> BY, it runs in 1ms.
>
> There is only one row in jobs, thus I would expect the same performance 
> whether I order it or not.  The PK on jobitems is (jobid,itemid); prior posts 
> on this forum have led me to believe that this creates an implicit UNIQUE 
> INDEX on (jobid,itemid) and also an implicit INDEX on (jobid).  Whether this 
> is the case or not, creating an additional INDEX on (jobid) itself doesn't 
> speed things up.
>
> If I remove the count part and then execute it by itself, I get a total query 
> time of 3ms for the two queries.  No nested query should take longer than it 
> takes to execute its parts separately.
>
> I discovered this issue on version 10.3.1.4 but it still performs the
> same way under version 10.3.2.1.

It sounds like the optimizer is making a bad decision for some reason,
but I don't know why (perhaps the statistics are out of date? See
http://thread.gmane.org/gmane.comp.apache.db.derby.user/8100/focus=8101).
Since you seem to have a small reproducible test case for this problem,
it would be great if you could file a JIRA issue and upload one sql
script that creates the tables/indices and insert enough sample data to
show the problem, and one sql script that runs the different queries.

> The obvious workaround is not to order by name and sort the results on the 
> Java side.  But is there a way to rewrite this query that will make Derby 
> happier?

I don't know if it makes Derby happier, but I think this query does the
same thing without the nested query:

SELECT jobs.id, jobs.name, count(*) FROM jobs, jobitems
    WHERE jobs.id = jobitems.jobid
    GROUP BY jobs.id, jobs.name
    ORDER BY jobs.name

-- 
Knut Anders