Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2016/09/16 12:15:22 UTC

[jira] [Updated] (OAK-4816) Property index: cost estimate with path restriction is too optimistic

     [ https://issues.apache.org/jira/browse/OAK-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-4816:
--------------------------------
    Component/s: query

> Property index: cost estimate with path restriction is too optimistic
> ---------------------------------------------------------------------
>
>                 Key: OAK-4816
>                 URL: https://issues.apache.org/jira/browse/OAK-4816
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.6
>
>
> The property index cost estimation is too optimistic when a query has both a property restriction and a path restriction. The current algorithm, as documented in http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation , assumes that matching entries are evenly distributed over the whole repository. Often this assumption does not hold. In extreme cases, _all_ entries that match the property restriction are in the subtree that matches the path restriction. Example: 
> * 10'000 nodes with property color "red".
> * 1 million nodes in the repository
> * 10'000 nodes in the subtree /content
> * query {{/jcr:root/content//\*[@color = 'red']}}
> Currently, the cost estimate is about 100, because there are about 10'000 entries for "red" and "/content" contains 1% of all nodes (1% of 10'000 is 100). But in reality, there might be 10'000 entries with color "red" in that subtree (that is, all of them).
> The cost estimation should take that into account, and assume that at least 80% of the matching nodes are in that subtree (if the subtree contains that many nodes).
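
The arithmetic in the example above can be sketched as follows. This is only an illustration of the numbers in the description, not Oak's actual implementation: the function names and the `min_percent` parameter are hypothetical, and reading "if the subtree contains that many nodes" as "cap the estimate at the subtree size" is an assumption.

```python
def estimate_uniform(matching_entries, total_nodes, subtree_nodes):
    """Estimate assuming matching entries are spread evenly over the repository."""
    # Scale the number of matching entries by the subtree's share of all nodes.
    return matching_entries * subtree_nodes // total_nodes


def estimate_with_floor(matching_entries, total_nodes, subtree_nodes, min_percent=80):
    """Hypothetical adjusted estimate: assume at least min_percent of the
    matching entries are in the subtree, capped at the subtree size."""
    uniform = estimate_uniform(matching_entries, total_nodes, subtree_nodes)
    # The subtree cannot hold more matching entries than it has nodes (assumed cap).
    floor = min(matching_entries * min_percent // 100, subtree_nodes)
    return max(uniform, floor)


# Numbers from the issue description:
# 10'000 "red" entries, 1 million nodes in total, 10'000 nodes below /content.
print(estimate_uniform(10_000, 1_000_000, 10_000))     # 100
print(estimate_with_floor(10_000, 1_000_000, 10_000))  # 8000
```

With these inputs the even-distribution estimate is 100, while the adjusted estimate is 8'000, much closer to the worst case of 10'000 described above. For a small subtree, say 500 nodes, the sketch caps the adjusted estimate at 500.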



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)