You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Jeffrey Lichtman <sw...@rcn.com> on 2005/11/11 20:13:02 UTC

Re: Poor query optimizer choices is making Derby unusable for large tables

>I've described the problem in detail below, and I'd appreciate any 
>assistance. Specifically:
>
>i) Does anyone have any plans to fix this problem?

I believe the real problem is that Derby doesn't have any strategy 
for doing multiple scans for OR/IN clauses. This is a useful feature, 
but I don't know if anyone has entered an enhancement request.

>ii) In the meantime, are there any work-arounds? I'd appreciate any 
>suggestions that would decrease the execution time of my second 
>query below (the one with with two search terms). Likewise, any 
>general strategies for avoiding this problem with IN clauses would 
>be appreciated.

You could re-write the query to use UNION:

SELECT ObjectId, SUM(WordLocation) AS Score
FROM tblSearchDictionary
WHERE Word = 'CONTACT' OR Word = 'ADD'
GROUP BY ObjectId;

would become:

SELECT ObjectId, SUM(WordLocation) AS Score
FROM
     (SELECT ObjectId, WordLocation
         FROM tblSearchDictionary
         WHERE Word = 'CONTACT'
      UNION ALL
      SELECT ObjectId, WordLocation
         FROM tblSearchDictionary
         WHERE Word = 'ADD') t
GROUP BY ObjectId;


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: Poor query optimizer choices is making Derby unusable for large tables

Posted by Kevin Hore <kh...@araxis.com>.
Hi Jeffrey,

Thank you for your response. I think you are probably right about Derby
not having a strategy for doing multiple scans. I think DERBY-47
probably covers this, but I'll perhaps add a note to that covering my
circumstances.

I had thought of using UNION but, unfortunately the IN clause must cope
with a varying number of values, and I don't want the system to be
constructing complex statements on-the-fly to avoid this particular
Derby problem. We have quite a number of queries affected by this, not
just the (simple) example under consideration, so using UNION would
require major surgery to our application.

Thanks again.

Kind regards

Kevin Hore

Jeffrey Lichtman wrote:
> 
>> I've described the problem in detail below, and I'd appreciate any 
>> assistance. Specifically:
>>
>> i) Does anyone have any plans to fix this problem?
> 
> 
> I believe the real problem is that Derby doesn't have any strategy for 
> doing multiple scans for OR/IN clauses. This is a useful feature, but I 
> don't know if anyone has entered an enhancement request.
> 
>> ii) In the meantime, are there any work-arounds? I'd appreciate any 
>> suggestions that would decrease the execution time of my second query 
>> below (the one with with two search terms). Likewise, any general 
>> strategies for avoiding this problem with IN clauses would be 
>> appreciated.
> 
> 
> You could re-write the query to use UNION:
> 
> SELECT ObjectId, SUM(WordLocation) AS Score
> FROM tblSearchDictionary
> WHERE Word = 'CONTACT' OR Word = 'ADD'
> GROUP BY ObjectId;
> 
> would become:
> 
> SELECT ObjectId, SUM(WordLocation) AS Score
> FROM
>     (SELECT ObjectId, WordLocation
>         FROM tblSearchDictionary
>         WHERE Word = 'CONTACT'
>      UNION ALL
>      SELECT ObjectId, WordLocation
>         FROM tblSearchDictionary
>         WHERE Word = 'ADD') t
> GROUP BY ObjectId;
> 
> 
>                        -        Jeff Lichtman
>                                 swazoo@rcn.com
>                                 Check out Swazoo Koolak's Web Jukebox at
>                                 http://swazoo.com/