You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2013/06/28 20:24:24 UTC
[jira] [Commented] (DERBY-3790) Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.

    [ https://issues.apache.org/jira/browse/DERBY-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695632#comment-13695632 ] 

ASF subversion and git services commented on DERBY-3790:
--------------------------------------------------------

Commit 1497868 from [~mamtas]
[ https://svn.apache.org/r1497868 ]

DERBY-5680( indexStat daemon processing tables over and over even when there are no changes in the tables )

Backporting the 3 commits that went in for DERBY-5680 to 10.8. The 3 commits were 1340549, 1341622, 1341629. The first two commits were easy to backport using svn merge command but the third commit 1341629 ran into conflicts. For that backport, hand made the changes since there were not too many changes.

The changes for this jira has added a new property derby.storage.indexStats.debug.keepDisposableStats. The intention of the property is that if the property is set to true, we do not delete the orphaned/disposable stats. If the property is set to false, the orphaned/disposable stats will get dropped by the index stats daemon. Currently known reasons for orphaned/disposable stats are
1)DERBY-5681(When a foreign key constraint on a table is dropped, the associated statistics row for the conglomerate is not removed). Fix for this has been backported all the way to 10.3
2)DERBY-3790(Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.) Fix for this is in 10.9 and higher

A junit test was added for this new property but it went in as part of DERBY-3790. The name of the junit test is store.KeepDisposableStatsPropertyTest. Had to make changes to this test to backport it to 10.8 but without the fix for DEBRY-3790 and with the absence of drop statistics procedure, the test really does not make much sense for 10.8 codeline. The test uses drop statistics procedure and it is mainly testing DERBY-3790 to make sure that the orphaned stats are being deleted or left behind based on whether the property is set to true or false. But since we do not have drop statistics procedure and we do not have DERBY-3790 fixed in 10.8, we can't really meaningfully run the KeepDisposableStatsPropertyTest in 10.8. In any case, I have changed the test so that atleast it will not fail in 10.8 but it is not able to truly test the property. May be we can test this property through upgrade suite where we will create orphaned stats because of DERBY-5681 on older releases and we will find that when the property is set to true, even after upgrade, we will have orphaned stats but when property is set to false, after upgrade, orphaned stats are deleted.
                
> Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3790
>                 URL: https://issues.apache.org/jira/browse/DERBY-3790
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 10.5.1.1
>            Reporter: Mamta A. Satoor
>            Assignee: Kristian Waagan
>             Fix For: 10.9.1.0
>
>         Attachments: derby-3790-1a-skip_stats_scui.diff, derby-3790-1b-skip_stats_scui.diff, derby-3790-1c-skip_stats_scui.diff, derby-3790-2a-minor_test_improvements.diff
>
>
> DERBY-269 provided a manual way to update the statisitcs. There was some discussion in that jira entry for possibly optimizing the cases where there is no need to update the statistics. I will enter the related comments from that jira entry here for reference.
> **************************
> Knut Anders Hatlen - 18/Jul/08 12:39 AM 
> If I have understood correctly, unique indexes always have up to date cardinality statistics because cardinality == row count. If that's the case, one possible optimization is to skip the unique indexes when SYSCS_UPDATE_STATISTICS is called. 
> **************************
> **************************
> Mike Matrigali - 18/Jul/08 09:48 AM 
> is the cardinality of a unique index 1 or is it row count? 
> It is also more complicated than just skipping unique indexes, it depends on the number of columns in the index because 
> in a multi-column index, multiple cardinalities are calculated. So for instance on an index on columns A,B,C there are 
> actually 3 cardinalities calculated: 
> A 
> A,B 
> A,B,C 
> I agree that the calculation of cardinality of A,B,C could/should be short circuited for a unique index. 
> **************************
> **************************
> Knut Anders Hatlen - 18/Jul/08 03:25 PM 
> Mike, 
> It looks to me as if the cardinality is the number of unique values, so I think the cardinality of a unique index is equal to its row count (for the full key, that is). You're right that we can't short circuit it if we have a multi-column index. I don't know if it's worth the extra complexity to short circuit the A,B,C case, since we'd have to scan the entire index anyway. For a single-column unique index it sounds like a good idea, though. 
> **************************

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira