Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2015/06/09 23:10:02 UTC
[jira] [Commented] (LUCENE-6539) Add DocValuesNumbersQuery, like
DocValuesTermsQuery but works only with long values
[ https://issues.apache.org/jira/browse/LUCENE-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579566#comment-14579566 ]
Adrien Grand commented on LUCENE-6539:
--------------------------------------
This new query looks good to me. However, instead of continuing to add such queries to core, I think we should consider moving all our doc values queries to misc, since they have complicated trade-offs and are only useful in expert use cases?
{code}
+  private static Set<Long> toSet(Long[] array) {
+    Set<Long> numbers = new HashSet<>();
+    for(Long number : array) {
+      numbers.add(number);
+    }
+    return numbers;
+  }
{code}
FYI, you don't need this helper; you could just do: {{new HashSet<Long>(Arrays.asList(array))}}.
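To illustrate the suggestion, here is a minimal standalone sketch comparing the loop-based helper with the one-liner. The class name {{ToSetExample}} and both method names are hypothetical and exist only for this illustration; they are not part of the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder class, used only to compare the two approaches.
class ToSetExample {

  // Loop-based helper, equivalent to the one in the patch.
  static Set<Long> toSetWithLoop(Long[] array) {
    Set<Long> numbers = new HashSet<>();
    for (Long number : array) {
      numbers.add(number);
    }
    return numbers;
  }

  // One-liner: wrap the array as a List and let the HashSet
  // copy constructor add every element (duplicates collapse).
  static Set<Long> toSetOneLiner(Long[] array) {
    return new HashSet<>(Arrays.asList(array));
  }
}
```

Both methods should produce equal sets for the same input, so the helper can be dropped without changing behavior.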
bq. in certain cases (many terms/numbers and fewish matching hits) it should be faster than using TermsQuery
This comment confused me: I think that in general these queries are more efficient when they match _many_ documents, i.e. when an equivalent TermsQuery would not be used as the lead iterator in a conjunction anyway? I think the only case where such a query matching few documents would be useful is a prohibited clause, since prohibited clauses can never be used to lead iteration and are only consumed in a random-access fashion?
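To make the "lead iterator" versus "random-access" distinction concrete, here is a rough sketch in plain Java. It does not use any Lucene API: {{PostFilterSketch}}, {{leadDocs}}, {{docValues}} and both methods are hypothetical names that only model the idea of another clause driving iteration while the doc-values check is applied per candidate document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Lucene code: docValues[docId] stands in for a
// single-valued numeric doc values column, and leadDocs for the sorted
// doc ids produced by some other, cheaper clause that leads iteration.
class PostFilterSketch {

  // Conjunction: the lead clause proposes candidates, and the set of
  // accepted numbers is only consulted per candidate (random access).
  static List<Integer> conjunction(int[] leadDocs, long[] docValues, Set<Long> accepted) {
    List<Integer> hits = new ArrayList<>();
    for (int doc : leadDocs) {
      if (accepted.contains(docValues[doc])) {
        hits.add(doc);
      }
    }
    return hits;
  }

  // Prohibited clause: it never leads iteration, it only vetoes
  // candidates that the lead clause has already produced.
  static List<Integer> exclusion(int[] leadDocs, long[] docValues, Set<Long> prohibited) {
    List<Integer> hits = new ArrayList<>();
    for (int doc : leadDocs) {
      if (!prohibited.contains(docValues[doc])) {
        hits.add(doc);
      }
    }
    return hits;
  }
}
```

In this model the cost of the doc-values check is proportional to the number of candidates from the lead clause, not to the number of documents the doc-values clause itself would match, which is why how many documents it matches matters less than whether it would otherwise have been chosen to lead.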
> Add DocValuesNumbersQuery, like DocValuesTermsQuery but works only with long values
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-6539
> URL: https://issues.apache.org/jira/browse/LUCENE-6539
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: Trunk, 5.3
>
> Attachments: LUCENE-6539.patch
>
>
> This query accepts any document where any of the provided set of longs
> was indexed into the specified field as a numeric DV field
> (NumericDocValuesField or SortedNumericDocValuesField). You can use
> it instead of DocValuesTermsQuery when you have field values that can
> be represented as longs.
> Like DocValuesTermsQuery, this is slowish in general, since it doesn't
> use an inverted data structure, but in certain cases (many
> terms/numbers and fewish matching hits) it should be faster than using
> TermsQuery because it's done as a "post filter" when other (faster)
> query clauses are MUST'd with it.
> In such cases it should also be faster than DocValuesTermsQuery since
> it skips having to resolve terms -> ords.