Posted to issues@lucene.apache.org by "Lu Xugang (Jira)" <ji...@apache.org> on 2022/03/04 03:14:00 UTC

[jira] [Commented] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values

    [ https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501122#comment-17501122 ] 

Lu Xugang commented on LUCENE-10162:
------------------------------------

Moving the part of the LUCENE-10446 conversation that concerns this issue here, so the discussion is in one place:

{quote}I think that one way we could make the situation better would be by implementing LUCENE-10162 to create fields that index both points and doc values. Then factory methods on these fields would know exactly how the field is indexed and they could make the best decision without having to hurt the API by merging what PointRangeQuery, IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery do:
 - If the points index tells us that all docs match, then return DocIdSetIterator#range(0,maxDoc).
 - If the field is the primary index sort, then use the index to figure out the min and max values and return the appropriate range.
 - Otherwise do what IndexOrDocValuesQuery is doing today.

One thought I had in mind when opening LUCENE-10162 was that we could return queries that can more easily do the right thing because they know both{quote}
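
The three-way decision described in the quote could be sketched roughly as follows. This is an illustration in plain Java, not Lucene code: FieldStats, Strategy and choose are hypothetical names, and the statistics are supplied by hand instead of being read from a real points index. It also assumes that "all docs match" means every document has a value and the queried range covers the field's minimum and maximum.

```java
// Hypothetical sketch of the decision order from the quoted comment.
// None of these types exist in Lucene; they only model the idea.
class RangeStrategySketch {

    enum Strategy {
        MATCH_ALL_DOCS,       // would correspond to DocIdSetIterator#range(0, maxDoc)
        INDEX_SORT_RANGE,     // field is the primary index sort: find the doc ID range
        INDEX_OR_DOC_VALUES   // fall back to what IndexOrDocValuesQuery does today
    }

    // Assumed summary of what a combined points + doc values field could know.
    static final class FieldStats {
        final long minValue;
        final long maxValue;
        final int docCount;   // number of documents that have a value for the field
        final int maxDoc;     // number of documents in the segment
        final boolean primaryIndexSort;

        FieldStats(long minValue, long maxValue, int docCount, int maxDoc,
                   boolean primaryIndexSort) {
            this.minValue = minValue;
            this.maxValue = maxValue;
            this.docCount = docCount;
            this.maxDoc = maxDoc;
            this.primaryIndexSort = primaryIndexSort;
        }
    }

    // Picks a strategy for the inclusive range [lower, upper].
    static Strategy choose(FieldStats stats, long lower, long upper) {
        boolean everyDocHasValue = stats.docCount == stats.maxDoc;
        boolean rangeCoversAllValues = lower <= stats.minValue && upper >= stats.maxValue;
        if (everyDocHasValue && rangeCoversAllValues) {
            return Strategy.MATCH_ALL_DOCS;
        }
        if (stats.primaryIndexSort) {
            return Strategy.INDEX_SORT_RANGE;
        }
        return Strategy.INDEX_OR_DOC_VALUES;
    }

    public static void main(String[] args) {
        FieldStats dense = new FieldStats(0, 100, 10, 10, false);
        System.out.println(choose(dense, 0, 100));
        System.out.println(choose(dense, 10, 50));
    }
}
```

The point of the ordering is that the cheapest answer is tried first, and the existing IndexOrDocValuesQuery behaviour remains the fallback when neither shortcut applies.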

> Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10162
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one hand, and NumericDocValuesField and SortedNumericDocValuesField on the other hand.
> When we introduced these classes, this distinction made sense: use the XXXPoint classes if you want your numeric fields to be searchable and the XXXDocValuesField classes if you want your numeric fields to be sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of the Points index to skip non-competitive documents. So even if you only need searching, or if you only need sorting, it's likely a good idea to index both with points *and* doc values.
> Could we make this easier on users by having XXXField classes that automatically do it as opposed to requiring users to add both an XXXPoint and an XXXDocValuesField for every numeric field to their index? This could also make consuming these fields easier, e.g. factory methods for range queries could automatically use IndexOrDocValuesQuery.


