Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/10/04 08:15:00 UTC

[jira] [Commented] (LUCENE-8990) IndexOrDocValuesQuery can take a bad decision for range queries if field has many values per document

    [ https://issues.apache.org/jira/browse/LUCENE-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944308#comment-16944308 ] 

ASF subversion and git services commented on LUCENE-8990:
---------------------------------------------------------

Commit 9942544a7fc9f1abfb70d70e7ebfe275134222f4 in lucene-solr's branch refs/heads/master from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9942544 ]

LUCENE-8990: Add estimateDocCount(visitor) method to PointValues (#905)



> IndexOrDocValuesQuery can take a bad decision for range queries if field has many values per document
> -----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8990
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8990
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> The heuristics of IndexOrDocValuesQuery are somewhat inconsistent for range queries. The leadCost that is provided is based on the number of documents, whereas the cost() of a range query is based on the number of points that potentially match the query.
> Therefore a BKD tree may hold millions of points that correspond to just a few documents. As a result, we may decide to execute the query using docValues while in fact we are scanning almost all the points.
> Maybe the cost() function for range queries needs to take into account the average number of points per document in the tree and adjust the value accordingly.
>  
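
The mismatch described above can be illustrated with a small, self-contained sketch. It is not the code from the commit and does not use Lucene classes: the class name PointCostSketch, both method names, and the example numbers are hypothetical. The proportional scaling is one simple way to turn a point-based estimate into a document-based one, and the comparison against leadCost is a simplified stand-in for the real heuristic.

```java
public class PointCostSketch {

    /**
     * Hypothetical helper: scales an estimated number of matching points down
     * to an estimated number of matching documents, assuming points are spread
     * evenly across documents (totalPoints / totalDocs points per document).
     */
    static long estimateDocCount(long estimatedPointCount, long totalPoints, long totalDocs) {
        if (estimatedPointCount <= 0 || totalPoints <= 0 || totalDocs <= 0) {
            return 0L;
        }
        // Fraction of all points that match, applied to the number of documents.
        double scaled = Math.ceil(estimatedPointCount * (double) totalDocs / totalPoints);
        // Never report more matching documents than exist in the tree.
        return Math.min(totalDocs, (long) scaled);
    }

    /**
     * Simplified stand-in for the decision (an assumption, not Lucene's rule):
     * prefer doc values when the lead iterator is cheaper than the range query.
     */
    static boolean prefersDocValues(long leadCost, long rangeCost) {
        return leadCost < rangeCost;
    }

    public static void main(String[] args) {
        long totalPoints = 10_000_000L; // points stored in the BKD tree
        long totalDocs = 100L;          // documents holding those points
        long matchingPoints = 5_000_000L;
        long leadCost = 1_000L;         // documents the lead iterator will visit

        long docBasedCost = estimateDocCount(matchingPoints, totalPoints, totalDocs);

        // Point-based cost looks expensive, so doc values would be chosen.
        System.out.println("point-based: " + prefersDocValues(leadCost, matchingPoints));
        // Document-based cost is small, so the index would be chosen instead.
        System.out.println("doc-based:   " + prefersDocValues(leadCost, docBasedCost));
    }
}
```

With these numbers the point-based cost (5,000,000) exceeds leadCost and pushes the choice toward doc values, while the document-based estimate (50) does not, which is the inconsistency the issue describes.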



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
