You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Andrew Alsup <bl...@gmail.com> on 2009/11/10 04:50:20 UTC

Derby 10.5.3: subquery performance using "IN"

I am experiencing a dramatic performance hit with a subquery utilizing
a "IN" predicate.  The query involves a total of 4 tables.  If I run
the query (with subquery) it takes ~4.5sec to complete.  If I execute
the subquery by itself, it takes ~0.03sec to complete.  If I then copy
the results of the subquery directly into the "IN" clause of the
parent query, it takes only ~0.02sec to run.  So, I would expect the
combined query (with subquery) to take ~0.5sec (give or take a few
millis).  I'm experiencing an 800% slowdown by performing the
subquery.

Is there anything I can do to restructure my query?

Thanks for any help you can provide.  More information provided below.

FWIW, the performance problem was originally identified with an UPDATE
statement (with subquery), but I've rewritten it as a SELECT here.

-- Andy



The 4 tables are:
end_item_meter: 461,617 rows
meter_reading: 2,776,955 rows
end_item_event: 1,704,171 rows
end_Item_inventory: 233,057 rows

--------------------------------------------
--  This is the real query (with subquery)
--------------------------------------------

SELECT em.end_item_meter_id
FROM   end_item_meter em
WHERE  em.end_Item_meter_id IN
    (
    SELECT
        mr.end_item_meter_id
    FROM
        Meter_Reading mr
            INNER JOIN End_Item_Event eie
                ON eie.end_Item_Event_Id = mr.end_Item_event_Id
            INNER JOIN End_Item_Inventory eii
                ON eii.end_Item_inventory_Id = eie.end_Item_inventory_id
    WHERE
        eii.End_Item_Inventory_Id = 6061799
    );


=================================================================================================
Query 1 of 1 elapsed time (seconds) - Total: 4.458, SQL query: 0.025,
Building output: 4.433
=================================================================================================

END_ITEM_METER_ID
-----------------
6162998
6162999
6163002
6163003

--------------------------------------------
--  Run subquery by itself
--------------------------------------------
SELECT
    mr.end_item_meter_id
FROM
    Meter_Reading mr
        INNER JOIN End_Item_Event eie
            ON eie.end_Item_Event_Id = mr.end_Item_event_Id
        INNER JOIN End_Item_Inventory eii
            ON eii.end_Item_inventory_Id = eie.end_Item_inventory_id
WHERE
    eii.End_Item_Inventory_Id = 6061799;


=================================================================================================
Query 1 of 1 elapsed time (seconds) - Total: 0.031, SQL query: 0.022,
Building output: 0.00
=================================================================================================

END_ITEM_METER_ID
-----------------
6163002
6163003
6162999
6162998
6162999
6162998
6162999
6162998
6162999
6162998


--------------------------------------------
-- Run outer query with explicit values returned by inner query
--------------------------------------------
SELECT em.end_item_meter_id
FROM   end_item_meter em
WHERE  em.end_Item_meter_id IN
    (
        6163002,
        6163003,
        6162999,
        6162998,
        6162999,
        6162998,
        6162999,
        6162998,
        6162999,
        6162998
    );

=================================================================================================
Query 1 of 1 elapsed time (seconds) - Total: 0.021, SQL query: 0.013,
Building output: 0.008
=================================================================================================

END_ITEM_METER_ID
-----------------
6162998
6162999
6163002
6163003

Re: Derby 10.5.3: subquery performance using "IN"

Posted by Bryan Pendleton <bp...@amberpoint.com>.
> --------------------------------------------
> --  This is the real query (with subquery)
> --------------------------------------------
> 
> SELECT em.end_item_meter_id
> FROM   end_item_meter em
> WHERE  em.end_Item_meter_id IN
>     (
>     SELECT
>         mr.end_item_meter_id
>     FROM
>         Meter_Reading mr
>             INNER JOIN End_Item_Event eie
>                 ON eie.end_Item_Event_Id = mr.end_Item_event_Id
>             INNER JOIN End_Item_Inventory eii
>                 ON eii.end_Item_inventory_Id = eie.end_Item_inventory_id
>     WHERE
>         eii.End_Item_Inventory_Id = 6061799
>     );

Is it possible to rewrite this as a top-level join, without using a subquery?

Something along the lines of:

  SELECT em.end_item_meter_id
  FROM   end_item_meter em
      inner join  Meter_Reading mr
               on em.end_Item_meter_id = mr.end_item_meter_id
      INNER JOIN End_Item_Event eie
               ON eie.end_Item_Event_Id = mr.end_Item_event_Id
      INNER JOIN End_Item_Inventory eii
               ON eii.end_Item_inventory_Id = eie.end_Item_inventory_id
      WHERE
          eii.End_Item_Inventory_Id = 6061799

Does that query produce the right results for you?

thanks,

bryan