Posted to derby-user@db.apache.org by Simon Chatelain <sc...@gmail.com> on 2012/02/01 13:07:32 UTC

Re: Selecting a few rows is really slow when ordering by primary key

Hello Rick,

Thank you very much for your answer.
Some time ago I had already tried to regenerate the statistics of the UPDATE
table only (as both the ORDER BY and the timestamp restriction were on that
table, I thought that if there was a problem with the statistics, it would
be on that table), but without effect.
BUT this time I also regenerated the statistics of the FIELD table, and that
worked! Now the slow query is as fast as the fast one.

I also tried to create the index on both columns (TIMESTAMP_LONG and PK),
but that had no effect either.

And finally, adding an optimizer override as follows:

select MACHINE_UPDATE.PK MACHINE_UPDATE_PK,
     MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
     MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
     MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
     FIELD.NAME FIELD_NAME,
     FIELD.STR_VALUE FIELD_STR_VALUE,
     FIELD.LONG_VALUE FIELD_LONG_VALUE
     from UPDATE_63402512 MACHINE_UPDATE --DERBY-PROPERTIES index=UPDATE_63402512_TIMESTAMP_LONG_IDX
             , FIELD_63402512 FIELD
     where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
     and TIMESTAMP_LONG <= 1313590989001
     and TIMESTAMP_LONG >= 1313590969001
     ORDER BY MACHINE_UPDATE_PK

works as well; its query plan is the same as that of the fast query.

So for now my problem is solved. Thank you very much.

But one small question remains: do you think that it is enough to run
SYSCS_UPDATE_STATISTICS only once, or should I plan to run this
procedure at regular intervals (let's say once a week or once a month)
in case the statistics are not correctly updated as new rows are
inserted in the database?
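
For reference, the calls I ran look like this (schema and table names
as above; null as the third argument updates the statistics for all
indexes on the table):

CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS('CERBERUS', 'UPDATE_63402512', null);
CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS('CERBERUS', 'FIELD_63402512', null);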

Simon


On Fri, Jan 27, 2012 at 4:05 PM, Rick Hillegas <ri...@oracle.com> wrote:

> Hi Simon,
>
> I am not an expert in how the optimizer chooses query plans. Hopefully,
> other people will be able to provide more insight.
>
> From the second plan, it is clear that the optimizer understands that the
> timestamp restriction will qualify only 97 rows. But the optimizer is
> unable to use this information to predict that the join will produce a
> small number of rows. The optimizer estimates that, despite the timestamp
> restriction, the join will produce 2836394 rows. That is probably why the
> optimizer rejects putting a sort on top of the second plan when considering
> plans for the first query, the one with the ORDER BY clause.
>
> Sometimes these problems are fixed by regenerating the statistics used by
> the optimizer. You can regenerate statistics using the
> SYSCS_UTIL.SYSCS_UPDATE_STATISTICS procedure (see the section dealing
> with this procedure in the Derby Reference Guide).
>
> If that doesn't work, please let us know. Logging a bug against the
> optimizer will help build the case for improving the optimizer. The
> following other solutions may help:
>
> 1) Try adding the primary key to the end of the timestamp index signature:
>
> CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON
> CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG, PK);
>
> 2) Force the first query to use the indexes chosen by the second query by
> specifying optimizer overrides. See the section on optimizer overrides in
> the Tuning Guide.
>
> Hope this helps,
> -Rick
>
>
> On 1/27/12 12:59 AM, Simon Chatelain wrote:
>
>> Hello,
>>
>> I am facing an issue where a select query returning a few hundred rows
>> takes a really long time (about 8 seconds) when the result is ordered by a
>> primary key value, and takes almost no time (about 0.03 seconds) when
>> ordered by another column.
>>
>> I traced the query plan for both queries, but as I am no expert I am not
>> sure I understand them very well, and I cannot see any way of optimizing
>> the slow query to make it as fast as the fast one.
>>
>> So my question is: could anyone explain to me why the slow query is slow
>> and, even more important, is there a way to make it fast?
>>
>> I tried with Derby 10.5 and 10.8, and both exhibit the same behavior.
>>
>>
>> Thanks in advance for any input.
>>
>> schatela
>>
>>
>> Here are the details:
>>
>> *The database schema looks like this*
>>
>>
>> CREATE TABLE CERBERUS.UPDATE_63402512 (
>> PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY (START WITH 1,
>> INCREMENT BY 1) ,
>>    CATEGORY VARCHAR(30000) NOT NULL,
>>    TIMESTAMP TIMESTAMP NOT NULL,
>>    TIMESTAMP_LONG BIGINT NOT NULL);
>>
>> CREATE INDEX UPDATE_63402512_TIMESTAMP_IDX ON CERBERUS.UPDATE_63402512
>> (TIMESTAMP);
>>
>> CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON
>> CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG);
>>
>> CREATE TABLE CERBERUS.FIELD_63402512 (
>>            PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY
>> (START WITH 1, INCREMENT BY 1),
>>            NAME VARCHAR(30000) NOT NULL,
>>            STR_VALUE VARCHAR(30000),
>>            LONG_VALUE  INT,
>>            FK_MACHINE_UPDATE INT NOT NULL,
>>            CONSTRAINT fk_FIELD_UPDATE_63402512
>>             FOREIGN KEY (FK_MACHINE_UPDATE)
>>                REFERENCES CERBERUS.UPDATE_63402512 (PK)
>>                ON DELETE NO ACTION
>>                ON UPDATE NO ACTION);
>>
>> *The UPDATE table contains about 1.4 million rows and the FIELD table
>> contains a little less than 3 million rows.*
>>
>> *Then the slow query is as follows:*
>>
>> select MACHINE_UPDATE.PK MACHINE_UPDATE_PK,
>>     MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
>>     MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>>     MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
>>     FIELD.NAME FIELD_NAME,
>>     FIELD.STR_VALUE FIELD_STR_VALUE,
>>     FIELD.LONG_VALUE FIELD_LONG_VALUE
>>     from UPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
>>     where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
>>     and TIMESTAMP_LONG <= 1313590989001
>>     and TIMESTAMP_LONG >= 1313590969001
>>     ORDER BY MACHINE_UPDATE_PK
>>
>> *And the fast query is:*
>>
>> select MACHINE_UPDATE.PK MACHINE_UPDATE_PK,
>>     MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
>>     MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>>     MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
>>     FIELD.NAME FIELD_NAME,
>>     FIELD.STR_VALUE FIELD_STR_VALUE,
>>     FIELD.LONG_VALUE FIELD_LONG_VALUE
>>     from UPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
>>     where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
>>     and TIMESTAMP_LONG <= 1313590989001
>>     and TIMESTAMP_LONG >= 1313590969001
>>     ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG
>>
>> *The explain plan of the slow query is this:*
>>
>>
>> Statement Name:
>>    SQL_CURLH000C1
>> Statement Text:
>>    select MACHINE_UPDATE.PK MACHINE_UPDATE_PK, MACHINE_UPDATE.CATEGORY
>> MACHINE_UPDATE_CATEGORY, MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>> MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
>> FIELD.NAME FIELD_NAME, FIELD.STR_VALUE FIELD_STR_VALUE,
>> FIELD.LONG_VALUE FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512
>> MACHINE_UPDATE, CERBERUS.FIELD_63402512 FIELD where
>> FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
>> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 1313590969001
>> ORDER BY MACHINE_UPDATE_PK
>>
>> Parse Time: 0
>> Bind Time: 0
>> Optimize Time: 15
>> Generate Time: 0
>> Compile Time: 15
>> Execute Time: 8970
>> Begin Compilation Timestamp : 2012-01-26 16:51:49.838
>> End Compilation Timestamp : 2012-01-26 16:51:49.853
>> Begin Execution Timestamp : 2012-01-26 16:51:49.853
>> End Execution Timestamp : 2012-01-26 16:51:58.839
>>
>> Statement Execution Plan Text:
>> Project-Restrict ResultSet (7):
>> Number of opens = 1
>> Rows seen = 300
>> Rows filtered = 0
>> restriction = false
>> projection = true
>>    constructor time (milliseconds) = 0
>>    open time (milliseconds) = 0
>>    next time (milliseconds) = 8970
>>    close time (milliseconds) = 0
>>    restriction time (milliseconds) = 0
>>    projection time (milliseconds) = 0
>>    optimizer estimated row count:      2836394.00
>>    optimizer estimated cost:      8328596.88
>>
>> Source result set:
>>    Nested Loop Join ResultSet:
>>    Number of opens = 1
>>    Rows seen from the left = 100
>>    Rows seen from the right = 300
>>    Rows filtered = 0
>>    Rows returned = 300
>>        constructor time (milliseconds) = 0
>>        open time (milliseconds) = 0
>>        next time (milliseconds) = 8970
>>        close time (milliseconds) = 0
>>        optimizer estimated row count:      2836394.00
>>        optimizer estimated cost:      8328596.88
>>
>>    Left result set:
>>        Project-Restrict ResultSet (4):
>>        Number of opens = 1
>>        Rows seen = 1408086
>>        Rows filtered = 1407986
>>        restriction = true
>>        projection = true
>>            constructor time (milliseconds) = 0
>>            open time (milliseconds) = 0
>>            next time (milliseconds) = 8954
>>            close time (milliseconds) = 0
>>            restriction time (milliseconds) = 314
>>            projection time (milliseconds) = 0
>>            optimizer estimated row count:       153341.11
>>            optimizer estimated cost:      2826935.53
>>
>>        Source result set:
>>            Index Row to Base Row ResultSet for UPDATE_63402512:
>>            Number of opens = 1
>>            Rows seen = 1408086
>>            Columns accessed from heap = {0, 1, 2, 3}
>>                constructor time (milliseconds) = 0
>>                open time (milliseconds) = 0
>>                next time (milliseconds) = 8500
>>                close time (milliseconds) = 0
>>                optimizer estimated row count:       153341.11
>>                optimizer estimated cost:      2826935.53
>>
>>                Index Scan ResultSet for UPDATE_63402512 using constraint
>> SQL111123145514350 at read committed isolation level using share row
>> locking chosen by the optimizer
>>                Number of opens = 1
>>                Rows seen = 1408086
>>                Rows filtered = 0
>>                Fetch Size = 1
>>                    constructor time (milliseconds) = 0
>>                    open time (milliseconds) = 0
>>                    next time (milliseconds) = 4744
>>                    close time (milliseconds) = 0
>>                    next time in milliseconds/row = 0
>>
>>                scan information:
>>                    Bit set of columns fetched={1}
>>                    Number of columns fetched=1
>>                    Number of deleted rows visited=0
>>                    Number of pages visited=7728
>>                    Number of rows qualified=1408086
>>                    Number of rows visited=1408086
>>                    Scan type=btree
>>                    Tree height=3
>>                    start position:
>>    None
>>                    stop position:
>>    None
>>                    qualifiers:
>> None
>>                    optimizer estimated row count:       153341.11
>>                    optimizer estimated cost:      2826935.53
>>
>>
>>    Right result set:
>>        Index Row to Base Row ResultSet for FIELD_63402512:
>>        Number of opens = 100
>>        Rows seen = 300
>>        Columns accessed from heap = {1, 2, 3, 4}
>>            constructor time (milliseconds) = 0
>>            open time (milliseconds) = 0
>>            next time (milliseconds) = 0
>>            close time (milliseconds) = 16
>>            optimizer estimated row count:      2836394.00
>>            optimizer estimated cost:      5501661.34
>>
>>            Index Scan ResultSet for FIELD_63402512 using constraint
>> fk_FIELD_UPDATE_63402512 at read committed isolation level using share row
>> locking chosen by the optimizer
>>            Number of opens = 100
>>            Rows seen = 300
>>            Rows filtered = 0
>>            Fetch Size = 1
>>                constructor time (milliseconds) = 0
>>                open time (milliseconds) = 0
>>                next time (milliseconds) = 0
>>                close time (milliseconds) = 16
>>                next time in milliseconds/row = 0
>>
>>            scan information:
>>                Bit set of columns fetched=All
>>                Number of columns fetched=2
>>                Number of deleted rows visited=0
>>                Number of pages visited=401
>>                Number of rows qualified=300
>>                Number of rows visited=400
>>                Scan type=btree
>>                Tree height=4
>>                start position:
>> >= on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                stop position:
>> > on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                qualifiers:
>> None
>>                optimizer estimated row count:      2836394.00
>>                optimizer estimated cost:      5501661.34
>>
>> *and the explain plan of the fast query is:*
>>
>>
>> Statement Name:
>>    SQL_CURLH000C1
>> Statement Text:
>>    select MACHINE_UPDATE.PK MACHINE_UPDATE_PK, MACHINE_UPDATE.CATEGORY
>> MACHINE_UPDATE_CATEGORY, MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>> MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
>> FIELD.NAME FIELD_NAME, FIELD.STR_VALUE FIELD_STR_VALUE,
>> FIELD.LONG_VALUE FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512
>> MACHINE_UPDATE, CERBERUS.FIELD_63402512 FIELD where
>> FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
>> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 1313590969001
>> ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG
>>
>> Parse Time: 0
>> Bind Time: 15
>> Optimize Time: 78
>> Generate Time: 0
>> Compile Time: 93
>> Execute Time: 32
>> Begin Compilation Timestamp : 2012-01-26 16:50:13.508
>> End Compilation Timestamp : 2012-01-26 16:50:13.601
>> Begin Execution Timestamp : 2012-01-26 16:50:13.601
>> End Execution Timestamp : 2012-01-26 16:50:13.648
>>
>> Statement Execution Plan Text:
>> Project-Restrict ResultSet (6):
>> Number of opens = 1
>> Rows seen = 300
>> Rows filtered = 0
>> restriction = false
>> projection = true
>>    constructor time (milliseconds) = 0
>>    open time (milliseconds) = 0
>>    next time (milliseconds) = 32
>>    close time (milliseconds) = 0
>>    restriction time (milliseconds) = 0
>>    projection time (milliseconds) = 0
>>    optimizer estimated row count:      2836394.00
>>    optimizer estimated cost:      5501851.37
>>
>> Source result set:
>>    Nested Loop Join ResultSet:
>>    Number of opens = 1
>>    Rows seen from the left = 100
>>    Rows seen from the right = 300
>>    Rows filtered = 0
>>    Rows returned = 300
>>        constructor time (milliseconds) = 0
>>        open time (milliseconds) = 0
>>        next time (milliseconds) = 32
>>        close time (milliseconds) = 0
>>        optimizer estimated row count:      2836394.00
>>        optimizer estimated cost:      5501851.37
>>
>>    Left result set:
>>        Index Row to Base Row ResultSet for UPDATE_63402512:
>>        Number of opens = 1
>>        Rows seen = 100
>>        Columns accessed from heap = {0, 1, 2, 3}
>>            constructor time (milliseconds) = 0
>>            open time (milliseconds) = 0
>>            next time (milliseconds) = 0
>>            close time (milliseconds) = 0
>>            optimizer estimated row count:           97.00
>>            optimizer estimated cost:          190.03
>>
>>            Index Scan ResultSet for UPDATE_63402512 using index
>> UPDATE_63402512_TIMESTAMP_LONG_IDX at read committed isolation level
>> using share row locking chosen by the optimizer
>>            Number of opens = 1
>>            Rows seen = 100
>>            Rows filtered = 0
>>            Fetch Size = 1
>>                constructor time (milliseconds) = 0
>>                open time (milliseconds) = 0
>>                next time (milliseconds) = 0
>>                close time (milliseconds) = 0
>>                next time in milliseconds/row = 0
>>
>>            scan information:
>>                Bit set of columns fetched=All
>>                Number of columns fetched=2
>>                Number of deleted rows visited=0
>>                Number of pages visited=3
>>                Number of rows qualified=100
>>                Number of rows visited=101
>>                Scan type=btree
>>                Tree height=3
>>                start position:
>> >= on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                stop position:
>> > on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                qualifiers:
>> None
>>                optimizer estimated row count:           97.00
>>                optimizer estimated cost:          190.03
>>
>>
>>    Right result set:
>>        Index Row to Base Row ResultSet for FIELD_63402512:
>>        Number of opens = 100
>>        Rows seen = 300
>>        Columns accessed from heap = {1, 2, 3, 4}
>>            constructor time (milliseconds) = 0
>>            open time (milliseconds) = 0
>>            next time (milliseconds) = 16
>>            close time (milliseconds) = 16
>>            optimizer estimated row count:      2836394.00
>>            optimizer estimated cost:      5501661.34
>>
>>            Index Scan ResultSet for FIELD_63402512 using constraint
>> fk_FIELD_UPDATE_63402512 at read committed isolation level using share row
>> locking chosen by the optimizer
>>            Number of opens = 100
>>            Rows seen = 300
>>            Rows filtered = 0
>>            Fetch Size = 1
>>                constructor time (milliseconds) = 0
>>                open time (milliseconds) = 0
>>                next time (milliseconds) = 16
>>                close time (milliseconds) = 16
>>                next time in milliseconds/row = 0
>>
>>            scan information:
>>                Bit set of columns fetched=All
>>                Number of columns fetched=2
>>                Number of deleted rows visited=0
>>                Number of pages visited=401
>>                Number of rows qualified=300
>>                Number of rows visited=400
>>                Scan type=btree
>>                Tree height=4
>>                start position:
>> >= on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                stop position:
>> > on first 1 column(s).
>>    Ordered null semantics on the following columns:
>> 0
>>                qualifiers:
>> None
>>                optimizer estimated row count:      2836394.00
>>                optimizer estimated cost:      5501661.34
>>
>>
>>
>

Re: Selecting a few rows is really slow when ordering by primary key

Posted by Kristian Waagan <kr...@oracle.com>.
On 02.02.2012 11:14, Kristian Waagan wrote:
> On 01.02.2012 14:51, Rick Hillegas wrote: 
<snip>
>> On 2/1/12 4:07 AM, Simon Chatelain wrote:
<snip>
> You may want to set derby.storage.indexStats.log=true, and more 
> importantly, derby.storage.indexStats.debug.trace=true. Run your app 
> for a while, then grep for "{istat}" in derby.log.
> This will tell you if automatic calculation is being triggered.
>
> Although strongly discouraged for production environments, you can try 
> to set lndiffThreshold to zero, or a very small value, to schedule an 
> update on every occasion, i.e. when you compile a statement like 
> "select ... from tbl where indexed_column = ?".

If you want to test this manually, you will appreciate setting 
derby.storage.indexStats.debug.absdiffThreshold, and possibly 
derby.storage.indexStats.debug.createThreshold, to a lower value (both 
are expressed as a number of rows; the defaults are 1000 and 250 
respectively).
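
For example, to make the daemon fire much sooner while testing, the 
two properties could be lowered in derby.properties along these lines 
(the values here are only illustrative):

derby.storage.indexStats.debug.absdiffThreshold=100
derby.storage.indexStats.debug.createThreshold=50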

Just for reference, testing stuff like this in ij can be done along 
these lines:
select * from sys.sysstatistics;
  (note CREATIONTIMESTAMP / STATISTICS; you may want a join to get the 
  table name)
insert into ... (or delete from ...);
prepare q as 'select * from <table> where <indexed_column> = ?';
select * from sys.sysstatistics;
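
For instance, against the UPDATE table from this thread the session 
could look like the sketch below (the inserted values are made up, and 
a single new row will not normally trigger anything unless the 
thresholds above have been lowered):

select * from sys.sysstatistics;
insert into CERBERUS.UPDATE_63402512 (CATEGORY, TIMESTAMP, TIMESTAMP_LONG)
  values ('test', CURRENT_TIMESTAMP, 1313590989002);
prepare q as 'select * from CERBERUS.UPDATE_63402512 where TIMESTAMP_LONG = ?';
select * from sys.sysstatistics;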


-- 
Kristian

>
> Note that the istat daemon will only be triggered when the table 
> changes in size. If you do updates that change the index key 
> distribution significantly, you'll want to generate new statistics 
> manually/periodically.
>
>
> Regards,

Re: Selecting a few rows is really slow when ordering by primary key

Posted by Kristian Waagan <kr...@oracle.com>.
On 01.02.2012 14:51, Rick Hillegas wrote:
> Hi Simon,
>
> Glad to hear that regenerating the statistics fixed your problem. More 
> inline...
>
> On 2/1/12 4:07 AM, Simon Chatelain wrote:
>> ...
>>
>> But one small question remains: do you think that it is enough to run 
>> SYSCS_UPDATE_STATISTICS only once, or should I plan to run this 
>> procedure at regular intervals (let's say once a week or once a month) 
>> in case the statistics are not correctly updated as new rows are 
>> inserted in the database?
>>
> If you upgrade to 10.8, the statistics should be regenerated 
> automatically. The statistics are supposed to be recalculated when the 
> number of rows in the table is too far out of sync with the number of 
> rows estimated by the statistics. "Too far out of sync" means that the 
> ratio between the two numbers is more than e (Euler's number, roughly 
> 2.72). If statistics are not being generated often enough, you can 
> adjust the triggering ratio by 
> setting the following Derby property:
>
> derby.storage.indexStats.debug.lndiffThreshold
>
> That property defaults to 1.0. By setting it to a value between 0.0 
> and 1.0, you should speed up the rate at which Derby recalculates 
> statistics. By setting the property to a larger value, you should slow 
> down the rate at which Derby recalculates statistics. Let us know if 
> you think that statistics are not being regenerated often enough. This 
> is a new feature which we are eager to tune and improve.

You may want to set derby.storage.indexStats.log=true, and more 
importantly, derby.storage.indexStats.debug.trace=true. Run your app for 
a while, then grep for "{istat}" in derby.log.
This will tell you if automatic calculation is being triggered.
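
For instance, one way to set them is in derby.properties (they can 
also be passed as JVM system properties):

derby.storage.indexStats.log=true
derby.storage.indexStats.debug.trace=true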

Although strongly discouraged for production environments, you can try 
to set lndiffThreshold to zero, or a very small value, to schedule an 
update on every occasion, i.e. when you compile a statement like "select 
... from tbl where indexed_column = ?".

Note that the istat daemon will only be triggered when the table changes 
in size. If you do updates that change the index key distribution 
significantly, you'll want to generate new statistics manually/periodically.


Regards,
-- 
Kristian

>
> Hope this helps,
> -Rick

Re: Selecting a few rows is really slow when ordering by primary key

Posted by Rick Hillegas <ri...@oracle.com>.
Hi Simon,

Glad to hear that regenerating the statistics fixed your problem. More 
inline...

On 2/1/12 4:07 AM, Simon Chatelain wrote:
> ...
>
> But one small question remains: do you think that it is enough to run 
> SYSCS_UPDATE_STATISTICS only once, or should I plan to run this 
> procedure at regular intervals (let's say once a week or once a month) 
> in case the statistics are not correctly updated as new rows are 
> inserted in the database?
>
If you upgrade to 10.8, the statistics should be regenerated 
automatically. The statistics are supposed to be recalculated when the 
number of rows in the table is too far out of sync with the number of 
rows estimated by the statistics. "Too far out of sync" means that the 
ratio between the two numbers is more than e (Euler's number, roughly 
2.72). If statistics are not being generated often enough, you can 
adjust the triggering ratio by 
setting the following Derby property:

derby.storage.indexStats.debug.lndiffThreshold

That property defaults to 1.0. By setting it to a value between 0.0 
and 1.0, you should speed up the rate at which Derby recalculates 
statistics. By setting the property to a larger value, you should slow 
down the rate at which Derby recalculates statistics. Let us know if you 
think that statistics are not being regenerated often enough. This is a 
new feature which we are eager to tune and improve.
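
For example, to make Derby recalculate statistics more eagerly, you 
could put something like this in derby.properties (0.5 is only an 
illustrative value):

derby.storage.indexStats.debug.lndiffThreshold=0.5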

Hope this helps,
-Rick