Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2017/10/06 14:45:01 UTC

[jira] [Commented] (OAK-6735) Lucene Index: improved cost estimation by using document count per field

    [ https://issues.apache.org/jira/browse/OAK-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194679#comment-16194679 ] 

Thomas Mueller commented on OAK-6735:
-------------------------------------

I would be in favour of supporting customer-defined selectivity per property. See also:

* https://www.postgresql.org/docs/8.1/static/planner-stats-details.html (search for selectivity)
* https://sqlite.org/fileformat2.html#stat4tab
* http://www.h2database.com/html/grammar.html#analyze (see selectivity)

> Lucene Index: improved cost estimation by using document count per field
> ------------------------------------------------------------------------
>
>                 Key: OAK-6735
>                 URL: https://issues.apache.org/jira/browse/OAK-6735
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>    Affects Versions: 1.7.4
>            Reporter: Thomas Mueller
>             Fix For: 1.8
>
>
> The cost estimation of the Lucene index is somewhat inaccurate because it just uses the number of documents in the index (this is the default as of Oak 1.7.4, due to OAK-6333).
> Instead, it should use the number of documents for the given fields (the minimum, if there are multiple fields with restrictions).
> This number should then be divided by the number of restrictions (as we already do now).
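
The estimate described in the issue can be sketched roughly as follows. This is only an illustration, not the actual Oak implementation: the class LuceneCostSketch, the method estimateEntryCount and its parameters are hypothetical names, and falling back to the total document count when no per-field count is known is an assumption.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Oak API.
final class LuceneCostSketch {

    /**
     * Estimates the number of entries a query would read from the index.
     *
     * @param totalDocs        number of documents in the whole index
     * @param docCountPerField number of documents that contain each field
     * @param restrictedFields fields that have a restriction in the query
     */
    static long estimateEntryCount(long totalDocs,
                                   Map<String, Long> docCountPerField,
                                   List<String> restrictedFields) {
        if (restrictedFields.isEmpty()) {
            // No restrictions: nothing narrows the result down.
            return totalDocs;
        }
        // Take the minimum document count over all restricted fields.
        // Assumption: a field without a known count falls back to totalDocs.
        long min = totalDocs;
        for (String field : restrictedFields) {
            Long count = docCountPerField.get(field);
            if (count != null) {
                min = Math.min(min, count);
            }
        }
        // Divide by the number of restrictions, as described in the issue.
        return min / restrictedFields.size();
    }
}
```

For example, with 1,000,000 documents in the index, 50,000 documents containing "jcr:title" and 2,000 containing "status", a query with restrictions on both fields gets an estimate of min(50000, 2000) / 2 = 1000, rather than 1000000 / 2 = 500000.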



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)