Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2010/08/13 15:03:16 UTC

[jira] Created: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Continue investigation of automatic creation/update of index statistics
-----------------------------------------------------------------------

                 Key: DERBY-4771
                 URL: https://issues.apache.org/jira/browse/DERBY-4771
             Project: Derby
          Issue Type: Task
            Reporter: Kristian Waagan


Work was started to improve Derby's handling of index statistics. This issue tracks further discussion and work for this task.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908668#action_12908668 ] 

Kristian Waagan commented on DERBY-4771:
----------------------------------------

Hi Lily,

First, did you modify the code before you ran the second time? I still see "CHECKING: ..." in the output, but I disabled this println in the 1b patch.
Also, did you run in a new directory, or delete all existing directories before running again?
It would be interesting to know whether testOSReadOnly also fails when you run it individually in a clean test directory.

By database status, do you mean whether it is read-only or not?




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-4771:
-----------------------------------

    Attachment: derby-4771-1b-prototype_code_dump.diff

Attaching patch revision 1b.

Thanks for having a look at the patch, Lily.
It seems a last-minute change caused a lot of trouble. Early on, I think I ignored all exceptions originating from IndexStatisticsDaemon.writeUpdatedStats, but before I uploaded the patch I added checks for specific errors.
I have added another check for the "container opened in read-only mode" error. The issue I see is that Derby doesn't detect that the database is read-only until it's too late to disable the statistics update feature. I tried the isReadOnly method on store, lcc, and tx, but none of them returned true at the time the index statistics daemon is called. In this case the read-only state was caused by missing file privileges; maybe Derby handles other causes better (?).

If you run the tests again with patch 1b, hopefully all you'll see is four failures in lang.OrderByAndSortAvoidance (two distinct failures).




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lily Wei updated DERBY-4771:
----------------------------

    Attachment: rjall.out

Hi Kristian:
     Thank you for doing such great work. I am very interested in this feature and would love to keep learning about it. The check for the "container opened in read-only mode" error behaves much better. I was wondering whether it would be a good idea to set a flag on the PreparedStatement that would allow a check of the database status (through store, lcc, tx, or something else) somewhere between generating the PreparedStatement and the point where we check for the exception in GenericStatement.java. Could that help with the situation we are in?

     I did run the Suites.All test suite. This is for reference only. I got 6 failures. Four of them are in lang.OrderByAndSortAvoidance. Apart from those, testOSReadOnly has a permission problem when it tries to remove the directory c:\derby2\trunk\testallpackages\system\singleUse\readOnly or to copy the directory from c:\derby2\trunk\testallpackages\system\singleUse\oneuse4e to c:\derby2\trunk\testallpackages\system\singleUse\readOnly. Hope this helps!





[jira] Assigned: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan reassigned DERBY-4771:
--------------------------------------

    Assignee: Kristian Waagan




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lily Wei updated DERBY-4771:
----------------------------

    Attachment: derby.log
                error-stacktrace.out

Thanks, Kristian, for the prompt reply and explanation.

Good catch. I added the "CHECKING" println so that it is easier for me to see what is going on for now:
                if (tableDescriptor.getTableType() ==
                        TableDescriptor.BASE_TABLE_TYPE &&
                        tableDescriptor.getTotalNumberOfIndexes() > 0) {
                    System.out.println("CHECKING: " + tableDescriptor.getQualifiedName());
                    long rows = baseRowCount();
                    if (statisticsForTable) {
                        tableDescriptor.markForIndexStatsUpdate(rows);
                    } else if (rows > 100) {
                        // Only create statistics if there are "enough" rows.
                        tableDescriptor.markForIndexStatsUpdate(-1);
                    }
                }

I deleted all existing directories before running the test suites. I also ran testOSReadOnly individually, and it failed then as well. I am attaching error-stacktrace.out and derby.log in the hope that they are helpful to you.

By database status, I mean something that lets us know the state of the database, e.g. whether it is read-only.





[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-4771:
-----------------------------------

    Attachment: derby-4771-1a-prototype_code_dump.stat
                derby-4771-1a-prototype_code_dump.diff

Attached is a prototype of another attempt at implementing auto-update
of Derby index statistics. First I'll describe the patch briefly, then
I'll note some potential improvements and ideas.
I've omitted lots of details, feel free to ask questions and to comment
on the suggested improvements etc. They need a lot more work...

The code is nowhere near complete; its primary purpose is to spur
discussion and hopefully guide us in the right direction.


[Prototype description]

The prototype performs some checks for whether the index statistics are
stale during statement compilation, as Mamta did under DERBY-3788. If
the statistics are considered stale, an update job to update all indexes
for the base table is scheduled with a "daemon". The daemon keeps track
of scheduled update jobs, and will execute them in a separate thread.
Only one job will be taken care of at a time, and if there are too many
jobs, new jobs are discarded. When a slot frees up in the work queue,
these jobs will eventually be scheduled. If there are no statistics,
creating them will be scheduled (the daemon doesn't separate between
creating and updating stats). When a job is scheduled for a base table,
this is recorded in the associated index descriptors (transient state)
to avoid having to query the daemon too often.

As mentioned, the work is carried out in a separate thread, created as
required (there is no permanent background thread, it dies if the queue
is emptied). This seems appropriate as statistics update should be
rather infrequent compared to other operations in a database system.
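
The scheduling behaviour described above could be sketched roughly as
follows. This is only an illustration under assumptions, not the code in the
patch: the class StatsUpdateQueueSketch, its methods, and the use of a table
name string as the unit of work are all hypothetical. It shows a bounded
queue that discards new jobs when it is full, drained by a single worker
thread that is started on demand and exits when the queue is empty.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/**
 * Illustration only: a bounded job queue drained by one on-demand thread.
 * All names are hypothetical and not taken from the Derby patch.
 */
final class StatsUpdateQueueSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    private final int capacity;
    private final Consumer<String> updater;
    private Thread runner; // null while no worker thread is alive

    StatsUpdateQueueSketch(int capacity, Consumer<String> updater) {
        this.capacity = capacity;
        this.updater = updater;
    }

    /** Returns false if the job is discarded (queue full or already queued). */
    synchronized boolean schedule(String table) {
        if (queue.size() >= capacity || queue.contains(table)) {
            return false;
        }
        queue.add(table);
        if (runner == null) {
            runner = new Thread(this::drain, "index-stats-sketch");
            runner.setDaemon(true);
            runner.start();
        }
        return true;
    }

    /** Processes one job at a time; the thread ends when the queue is empty. */
    private void drain() {
        while (true) {
            String table;
            synchronized (this) {
                table = queue.peek();
                if (table == null) {
                    runner = null;
                    notifyAll();
                    return;
                }
            }
            try {
                updater.accept(table); // the expensive index scan would go here
            } catch (RuntimeException e) {
                // Sketch: a failed job is simply dropped.
            }
            synchronized (this) {
                queue.remove(); // the job occupied its slot while running
            }
        }
    }

    /** Waits until no worker thread is alive; false on timeout or interrupt. */
    synchronized boolean awaitIdle(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (runner != null) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false;
                }
                wait(left);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Keeping the running job in the queue until it finishes means it still counts
against the capacity and cannot be scheduled twice, which is one simple way
to get the "one job at a time" behaviour.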

When new statistics are computed for the indexes of a table, they are
stored in the daemon. They require little memory (table identifier, and
per index, the index identifier, two longs and one int).

As a statement is compiled, the optimizer will consider the available
indexes. At this point the index statistics are checked, and if we see
that they have been scheduled we make sure we check if they are
completed a little later in the compilation process. If we find new
statistics for the query being compiled, we also write any other
completed statistics to the data dictionary. Writing to the data
dictionary is currently done with a nested read-write user transaction
in the user transaction (during statement compilation) - mainly to avoid
keeping locks for an extended period of time.

For clarity, statement compilation/execution will not wait for new
statistics to be generated. In the case of large tables, it could take
hours to generate new stats.
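
To illustrate the non-blocking pickup during compilation, here is a minimal
sketch. It rests on assumptions: CompletedStatsSketch, IndexStats and their
fields are hypothetical names, and a plain map stands in for the data
dictionary. The point it shows is that the compiling thread only checks
whether results exist and never waits for the background job.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustration only: compilation picks up finished statistics without
 * ever waiting for the background job. Names and fields are hypothetical.
 */
final class CompletedStatsSketch {
    /** Stand-in for one index's new statistics; field meanings are assumed. */
    static final class IndexStats {
        final String indexName;
        final long rowEstimate;
        final long distinctValues;

        IndexStats(String indexName, long rowEstimate, long distinctValues) {
            this.indexName = indexName;
            this.rowEstimate = rowEstimate;
            this.distinctValues = distinctValues;
        }
    }

    private final Map<String, IndexStats> completed = new HashMap<>();
    // Stand-in for the data dictionary.
    private final Map<String, IndexStats> dictionary = new HashMap<>();

    /** Called by the background thread when a job has finished. */
    synchronized void publish(IndexStats stats) {
        completed.put(stats.indexName, stats);
    }

    /**
     * Called during statement compilation. Returns immediately: false if no
     * new statistics exist for the index, otherwise writes all completed
     * statistics to the dictionary stand-in and returns true.
     */
    synchronized boolean pickUp(String indexName) {
        if (!completed.containsKey(indexName)) {
            return false;
        }
        dictionary.putAll(completed);
        completed.clear();
        return true;
    }

    synchronized int dictionarySize() {
        return dictionary.size();
    }
}
```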

Obvious weaknesses:
 o code organization (I don't know the code well) - choices made based on
   what worked and on reducing overhead (i.e., checking indexes when we
   have already obtained handles to them)
 o the async/decoupled data dictionary update - done to avoid having to
   create a LanguageConnectionContext (lcc).
 o logic/thresholds for determining when stats are stale
 o the row estimate logic also has weaknesses (for instance when mixing
   setting absolute values and updating the estimate based on deltas)

Other notes/characteristics of the prototype:
 o stats not generated/updated for system tables (caused locking problems)
 o lower limit on the row estimate (don't generate for tables with few rows)
 o I considered exposing the NO_WAIT option in the call to add new
   descriptors to the data dictionary. Don't know if this is needed if we
   update stats with a separate transaction from the daemon, then we can
   either use TransactionControl.setNoLockWait() or maybe even just wait?
 o current staleness code is dependent on reasonable row estimates
 o the "unit of work" is currently a base table - when scheduled all
   associated index statistics will be regenerated.
 o I suspect that most tests in suites.All run with the DBO as the user,
   and I haven't done anything specific to handle missing privileges.
 

[Prototype state]

Runs suites.All and derby.all with only four failures, all in
OrderByAndSortAvoidance. The tests fail on an assert for whether a table
scan is performed. To me it looks like the new stats make the
compiler/optimizer choose a different plan (not necessarily better in
terms of pages visited though, but that's a DBA/optimizer issue).

Currently two flags control the prototype behavior:
 o derby.language.disableIndexStatsUpdate=*false*|true
 o derby.language.logIndexStatsUpdate=*false*|true

If you grep for 'istat' in derby.log, you should get all the lines
relevant to automatic index statistics update.
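
For those without grep at hand, the same filtering can be done with a few
lines of Java. This is a small sketch, not part of the patch: the class name
IstatLogFilterSketch is made up, and it assumes derby.log is readable as
UTF-8 and that the relevant lines contain the literal text "istat".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

/** Illustration only: keep the derby.log lines that mention "istat". */
final class IstatLogFilterSketch {
    static List<String> istatLines(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("istat"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument, if given, is the path to derby.log.
        String path = args.length > 0 ? args[0] : "derby.log";
        for (String line : istatLines(Files.readAllLines(Paths.get(path)))) {
            System.out.println(line);
        }
    }
}
```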


[Potential improvements]
 o update data dictionary from the daemon thread
   (must then be able to create an appropriate lcc)

 o drift in the number of unique values isn't handled.
   Some potential remedies (raw ideas):

   Mechanism                       Distinct value drift     Row count change
   =========================================================================
   (a) compilation check                    N                       Y
   (b) timed check                          N                       Y
   (c) timed unconditional update           Y                       Y
   (d) UPDATE table SET ...                 y                       N

   In short:
   (a) creates statistics when none exist and kicks off the update job
   as soon as we believe the stats are stale. (b) helps
   systems which are in a steady state (all statements compiled and
   reused) - would typically check all user tables with indexes and
   perform the staleness check from (a). (c) would help against
   "anything" - but potentially with a large delay. Only useful for
   applications where the database is up for very long periods of time
   (days, weeks, months). Intervals for (b) and (c) would have to be
   configurable. Mechanism (d) would help for updates changing a large
   percentage of the rows, but would not catch many small updates
   changing the selectivity of an index.
   It may be possible to reuse BasicDaemon for the timed checks
   (scheduling only, work would still be performed in a separate thread).
   
 o do we need to throttle (a) the index scans, or (b) the processing
   rate of the scheduled jobs?
   (I started playing with a crude utilization rate)

 o almost as above, but we should take care to avoid "infinite-loops"

 o at which point may a change in either the number of rows or the field
   values be big enough to warrant a recalculation of the stats?
   What's more costly; a sub-optimal plan or reading all the data?
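
As a rough illustration of the timed check in mechanism (b), here is a
sketch. Everything in it is an assumption made for discussion: the class
TimedStatsCheckSketch is hypothetical, the staleness rule (relative change
in row count above a threshold) is just one possible answer to the threshold
question above, and the standard ScheduledExecutorService is used in place
of Derby's BasicDaemon.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustration only: mechanism (b), a timed staleness check. */
final class TimedStatsCheckSketch {
    /**
     * Hypothetical staleness rule: stale if the row count has changed by
     * more than the given fraction since the statistics were computed.
     */
    static boolean isStale(long rowsAtLastUpdate, long currentRows,
                           double threshold) {
        if (rowsAtLastUpdate <= 0) {
            return currentRows > 0; // no usable baseline yet
        }
        double change = Math.abs(currentRows - rowsAtLastUpdate)
                / (double) rowsAtLastUpdate;
        return change > threshold;
    }

    /** Runs the check periodically; the heavy work belongs in another thread. */
    static ScheduledExecutorService startTimedCheck(Runnable check,
                                                    long intervalSeconds) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "stats-timer-sketch");
                    t.setDaemon(true);
                    return t;
                });
        timer.scheduleWithFixedDelay(check, intervalSeconds, intervalSeconds,
                TimeUnit.SECONDS);
        return timer;
    }
}
```

A rule based only on row counts would not detect drift in the number of
distinct values, which is why the table lists "N" for (a) and (b) in that
column.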


I'll be away for some weeks, but plan to return to this issue when I'm back.
My next steps depend on the feedback I get, but one way forward may be
to try to do the data dictionary update from the daemon itself. Once we get
the core framework in place, we can start working on all the various issues
that have to be addressed.




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-4771:
-----------------------------------

    Attachment: derby-4771-2b-prototype_lcc_code_dump.diff

I've taken a look at the test failures:
 o failures in OrderByAndSortAvoidance were taken care of by DERBY-4833 patch 1a.
 o the failure UpdateStatisticsTest.testUpdateStatistics will continue to happen for the time being (see DERBY-4837). The failure surfaced because the statistics are automatically generated faster now (written to the dd as soon as they are computed).
 o UpdateStatisticsTest.testNoExclusiveLockOnTable failed due to a bug in the prototype. It left the dd in write mode, which causes at least one path in the dd to take an exclusive lock instead of a shared lock when looking up stuff in the system table.
 o I suspect the failure in AutoIncrementTest.testsyslocks may have been caused by the same bug in the prototype, but I'm not sure. It no longer reproduces on my machine, but it could be timing-dependent.

I have also seen an intermittent test failure in XplainStatisticsTest, which I'm unable to explain. Seems like there are two rows in one of the XPLAIN tables where there is supposed to be only one.
Today I also saw the old harness test store/updatelocks.sql fail, but I haven't looked into it yet.

I'm attaching the latest revision (patch 2b).




[jira] Commented: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910588#action_12910588 ] 

Lily Wei commented on DERBY-4771:
---------------------------------

Thanks, Kristian, for such a detailed report. It is hard to handle all platforms at the prototype stage. Thank you so much for doing this and sharing all the details with us.




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lily Wei updated DERBY-4771:
----------------------------

    Attachment: rjone.out

With more tracking and specific tracing in JDBC.java and RuntimeStatisticsParser.java, I generated rjone.out with the following command:

    java -Dderby.tests.trace=true -Dderby.language.disableIndexStatsUpdate=false -Dderby.language.logIndexStatsUpdate=true -Dderby.language.traceIndexStatsUpdate=both junit.textui.TestRunner org.apache.derbyTesting.functionTests.tests.lang.UpdateStatisticsTest

Looking at the output, the unexpected result row comes from 'SELECT * FROM SYS.STATISTICS' for 'Index Scan ResultSet for T2 using index T2I1'. The test also gets the error 'A lock could not be obtained within the time requested', thrown while evaluating an expression. The test did ask to lock table t in share mode. I am not really familiar with locking in Derby. Might we need more concurrency control for StatementNode and RAMTransactionContext? I am including rjone.out for reference.





[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-4771:
-----------------------------------

    Attachment: derby-4771-2a-prototype_lcc_code_dump.diff

Patch 2a is another code dump, still prototyping.
I am now using an lcc to update the data dictionary directly from the daemon.
The daemon is enabled by default and will write some information to the log. More detailed tracing can be enabled (see the comments in IndexStatisticsDaemon).

If anyone has an application or a db load they can test this with, I'd be happy to know if the daemon works.
To do so, build Derby with the patch, run your app and then grep your derby.log file afterwards for "istat".
It might also crash...
You should see statistics being generated for indexes which don't have them, and potentially also updates of existing stats (depends on many factors, I'll explain more later, but some keywords: row count estimate, table growth, statement compilation).

I'll be away for a week, and will answer any comments when I'm back.
My next step will be to validate/rewrite the logic I added to the table descriptor and the other "catalog classes", potentially followed by some initial tuning of various thresholds, and writing more tests.




[jira] Commented: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910498#action_12910498 ] 

Kristian Waagan commented on DERBY-4771:
----------------------------------------

Investigation showed that the errors Lily is getting on Windows in store.OSReadOnlyTest are caused by a partly read-only database directory. The fact that it isn't fully read-only makes Derby believe the database is read-write.
I'll fix the test issue, see DERBY-4804 for details.

I expect to post a new version of the prototype soon. It will use an lcc to update the data dictionary directly.




[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lily Wei updated DERBY-4771:
----------------------------

    Attachment: rjall.out

Hi Kristian:
     Thank you so much for doing this. This will be a great plus for Derby in my opinion.

     Overall, I like the design. Having a daemon gather statistics and execute in a separate thread is a common implementation strategy for statistics gathering. I agree that the memory consumption of the implementation should be small. During compilation, new statistics for the query are written to the data dictionary, along with any other completed statistics, in a nested read-write user transaction, and we do not wait for new statistics to be generated. Should we have a more detailed priority strategy for how and when completed statistics get written to the data dictionary, relative to query completion time and the arrival of new statistics? That is, a strategy that distinguishes the cases where we write from the cases where we do not want to wait too long.
     I am not sure whether this is already covered by the weaknesses or by the NO_WAIT option in the call to add new descriptors to the data dictionary. Personally, I am not clear about the implementation at compilation time, so any elaboration on the compilation process, both as it works today and with the additional writing of new and completed statistics to the data dictionary, would be very helpful to me.

     I am including my run (rjall.out) of Suites.All on Windows 7 with 32-bit jdk1.6.0_13, with my comments. This is just for reference, in case it shows other issues that have not been mentioned already.





[jira] Updated: (DERBY-4771) Continue investigation of automatic creation/update of index statistics

Posted by "Lily Wei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lily Wei updated DERBY-4771:
----------------------------

    Attachment: rjall.rar

Before adding application tests, I thought I would try it with Suites.All, with all the tracing and the flag for the feature set to true. UpdateStatisticsTest failed when run by itself on my machine. Maybe using the lcc to update the data dictionary directly from the daemon has some concurrency issue that needs tweaking. I really enjoy seeing all the tracing output for this feature. That is so great. I will try to run it with an application next and try to understand the lcc better.

