You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Tim Dudgeon <td...@informaticsmatters.com> on 2008/10/05 22:07:09 UTC

Perfroamnce of IN ( ... list ... ) statements

If I run a statement like this is Derby:
SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
it seems unexpectedly slow. Its actually faster to retrieve each row 
individually using a for loop that it is to use the IN ( ..  list ... ) 
clause. This seems strange.

Thanks

Tim


Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
derby@segel.com wrote:
> There was an issue with earlier versions of Derby, that was supposed to 
> have been fixed.
> 
>  
> 
> Even so, the performance of the IN clause where you have a large list is 
> going to be an issue in most databases.

I'm running against 3 different databases. With Oracle and MySQL IN 
clause performance is satisfactory. With Derby it is not and its much 
faster to retrieve each row individually, which seems to indicate that 
something is amiss.

Tim

> 
>  
> 
>  
> 
> ------------------------------------------------------------------------
> 
> *From:* Peter Ondruška 
> [mailto:peter.ondruska@gmail.com]
> *Sent:* Sunday, October 05, 2008 3:12 PM
> *To:* Derby Discussion
> *Subject:* Re: Perfroamnce of IN ( ... list ... ) statements
> 
>  
> 
> What Derby version is it?
> 
> How does it perform if you re-create index on foo_id column?
> 
>  
> 
> p. 
> 
> On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon 
> <tdudgeon@informaticsmatters.com 
> <ma...@informaticsmatters.com>> wrote:
> 
> If I run a statement like this is Derby:
> SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
> it seems unexpectedly slow. Its actually faster to retrieve each row 
> individually using a for loop that it is to use the IN ( ..  list ... ) 
> clause. This seems strange.
> 
> Thanks
> 
> Tim
> 
>  
> 


RE: Perfroamnce of IN ( ... list ... ) statements

Posted by de...@segel.com.
Tim,

The issue is that even though in using other databases, you get "adequate"
performance, the key is that in all the databases, the IN clause can be the
one clause that doesn't perform as well as others. An IN clause using a sub
select could perform much better than an IN clause using a list of values.
Especially if the list gets too long.

If you are using a large list of elements in the IN clause, you may want to
consider trying to put the values in to a temp table and then do an inner
select statement. Or you could do a join. 
Performance here too is going to vary by database. Oracle's temp tables are
a royal pain. Informix's temp tables (Now IBM IDS) are much easier to create
on the fly and you can even index them.

The point I'm trying to make is that even if the IN clause works, you may
want to consider a possible redesign to allow for scalability.

-Mike


> -----Original Message-----
> From: news [mailto:news@ger.gmane.org] On Behalf Of Tim Dudgeon
> Sent: Monday, October 06, 2008 7:02 AM
> To: derby-user@db.apache.org
> Subject: Re: Perfroamnce of IN ( ... list ... ) statements
> 
> derby@segel.com wrote:
> > There was an issue with earlier versions of Derby, that was supposed to
> > have been fixed.
> 
> 
> You are right. I tested with Derby 10.4 and it now looks much better.
> Sorry for the confusion.
> 
> 
> 
> Tim
> 
> >
> >
> >
> > Even so, the performance of the IN clause where you have a large list is
> > going to be an issue in most databases.
> >
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Peter Ondruška
> > [mailto:peter.ondruska@gmail.com]
> > *Sent:* Sunday, October 05, 2008 3:12 PM
> > *To:* Derby Discussion
> > *Subject:* Re: Perfroamnce of IN ( ... list ... ) statements
> >
> >
> >
> > What Derby version is it?
> >
> > How does it perform if you re-create index on foo_id column?
> >
> >
> >
> > p.
> >
> > On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon
> > <tdudgeon@informaticsmatters.com
> > <ma...@informaticsmatters.com>> wrote:
> >
> > If I run a statement like this is Derby:
> > SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
> > it seems unexpectedly slow. Its actually faster to retrieve each row
> > individually using a for loop that it is to use the IN ( ..  list ... )
> > clause. This seems strange.
> >
> > Thanks
> >
> > Tim
> >
> >
> >




Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
derby@segel.com wrote:
> There was an issue with earlier versions of Derby, that was supposed to 
> have been fixed.


You are right. I tested with Derby 10.4 and it now looks much better.
Sorry for the confusion.



Tim

> 
>  
> 
> Even so, the performance of the IN clause where you have a large list is 
> going to be an issue in most databases.
> 
>  
> 
>  
> 
> ------------------------------------------------------------------------
> 
> *From:* Peter Ondruška 
> [mailto:peter.ondruska@gmail.com]
> *Sent:* Sunday, October 05, 2008 3:12 PM
> *To:* Derby Discussion
> *Subject:* Re: Perfroamnce of IN ( ... list ... ) statements
> 
>  
> 
> What Derby version is it?
> 
> How does it perform if you re-create index on foo_id column?
> 
>  
> 
> p. 
> 
> On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon 
> <tdudgeon@informaticsmatters.com 
> <ma...@informaticsmatters.com>> wrote:
> 
> If I run a statement like this is Derby:
> SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
> it seems unexpectedly slow. Its actually faster to retrieve each row 
> individually using a for loop that it is to use the IN ( ..  list ... ) 
> clause. This seems strange.
> 
> Thanks
> 
> Tim
> 
>  
> 


Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
Kristian Waagan wrote:
> On 06.10.08 10:53, Tim Dudgeon wrote:
>> Peter Ondruška wrote:
>>> What Derby version is it?
>>> How does it perform if you re-create index on foo_id column?
>>
>> The column in the primary key column, so has an index.
> 
> Hello Tim,
> 
> It would be helpful to know if compressing the table helps (or 
> re-creating the index). The idea is to update the statistics used by the 
> optimizer.
> 
> I saw the question in an earlier post, but no reply. Which version of 
> Derby are you using?

As I mentioned in separate response, this is in fact much improved in 
letest Derby compared with what I was used to.

No need to persue any further.
Sorry for the confusion.

Tim


> 
> 
> regards,


Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Kristian Waagan <Kr...@Sun.COM>.
On 06.10.08 10:53, Tim Dudgeon wrote:
> Peter Ondruška wrote:
>> What Derby version is it?
>> How does it perform if you re-create index on foo_id column?
> 
> The column in the primary key column, so has an index.

Hello Tim,

It would be helpful to know if compressing the table helps (or 
re-creating the index). The idea is to update the statistics used by the 
optimizer.

I saw the question in an earlier post, but no reply. Which version of 
Derby are you using?


regards,
-- 
Kristian

> 
> Tim
> 
>>
>> p.
>> On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon 
>> <tdudgeon@informaticsmatters.com 
>> <ma...@informaticsmatters.com>> wrote:
>>
>>     If I run a statement like this is Derby:
>>     SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
>>     it seems unexpectedly slow. Its actually faster to retrieve each row
>>     individually using a for loop that it is to use the IN ( ..  list
>>     ... ) clause. This seems strange.
>>
>>     Thanks
>>
>>     Tim
>>
>>
> 


Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
Peter Ondruška wrote:
> What Derby version is it?
> How does it perform if you re-create index on foo_id column?

The column in the primary key column, so has an index.

Tim

> 
> p. 
> 
> On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon 
> <tdudgeon@informaticsmatters.com 
> <ma...@informaticsmatters.com>> wrote:
> 
>     If I run a statement like this is Derby:
>     SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
>     it seems unexpectedly slow. Its actually faster to retrieve each row
>     individually using a for loop that it is to use the IN ( ..  list
>     ... ) clause. This seems strange.
> 
>     Thanks
> 
>     Tim
> 
> 


RE: Perfroamnce of IN ( ... list ... ) statements

Posted by de...@segel.com.
There was an issue with earlier versions of Derby, that was supposed to have
been fixed.

 

Even so, the performance of the IN clause where you have a large list is
going to be an issue in most databases.

 

 

  _____  

From: Peter Ondruška [mailto:peter.ondruska@gmail.com] 
Sent: Sunday, October 05, 2008 3:12 PM
To: Derby Discussion
Subject: Re: Perfroamnce of IN ( ... list ... ) statements

 

What Derby version is it?

How does it perform if you re-create index on foo_id column?

 

p. 

On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon
<td...@informaticsmatters.com> wrote:

If I run a statement like this is Derby:
SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
it seems unexpectedly slow. Its actually faster to retrieve each row
individually using a for loop that it is to use the IN ( ..  list ... )
clause. This seems strange.

Thanks

Tim

 


Re: Perfroamnce of IN ( ... list ... ) statements

Posted by Peter Ondruška <pe...@gmail.com>.
What Derby version is it?How does it perform if you re-create index on
foo_id column?

p.

On Sun, Oct 5, 2008 at 10:07 PM, Tim Dudgeon <
tdudgeon@informaticsmatters.com> wrote:

> If I run a statement like this is Derby:
> SELECT COL1, COL2 FROM FOO WHERE FOO_ID IN ( 1, 31, ..... 4567)
> it seems unexpectedly slow. Its actually faster to retrieve each row
> individually using a for loop that it is to use the IN ( ..  list ... )
> clause. This seems strange.
>
> Thanks
>
> Tim
>
>