Posted to issues@lucene.apache.org by "Jason Gerlowski (Jira)" <ji...@apache.org> on 2019/10/23 15:53:00 UTC

[jira] [Created] (LUCENE-9025) Add more efficient lookupTerm() overload to SortedSetDocValues

Jason Gerlowski created LUCENE-9025:
---------------------------------------

             Summary: Add more efficient lookupTerm() overload to SortedSetDocValues
                 Key: LUCENE-9025
                 URL: https://issues.apache.org/jira/browse/LUCENE-9025
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/search
    Affects Versions: master (9.0)
            Reporter: Jason Gerlowski


{{SortedSetDocValues.lookupTerm(BytesRef)}} performs a binary search of the entire docValues range to find the ordinal of the requested BytesRef.
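
For reference, a minimal sketch of that full-range search. It is a model, not Lucene's code: a sorted {{String[]}} stands in for the term dictionary, array indices stand in for ordinals, and the class name {{FullRangeLookupSketch}} is hypothetical. It assumes ASCII terms, so that {{String.compareTo}} orders them the same way a byte-wise comparison would, and it uses the {{Arrays.binarySearch}}-style convention of returning {{-(insertionPoint) - 1}} on a miss.

```java
/** Hypothetical array-backed stand-in for a doc-values term dictionary. */
final class FullRangeLookupSketch {

    /**
     * Binary search over the whole ordinal range [0, sortedTerms.length - 1].
     * Returns the ordinal of key if present, otherwise -(insertionPoint) - 1.
     */
    static long lookupTerm(String[] sortedTerms, String key) {
        long low = 0;
        long high = sortedTerms.length - 1L;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            int cmp = sortedTerms[(int) mid].compareTo(key);
            if (cmp < 0) {
                low = mid + 1;   // key sorts after the probed term
            } else if (cmp > 0) {
                high = mid - 1;  // key sorts before the probed term
            } else {
                return mid;      // exact match
            }
        }
        return -(low + 1);       // not found: encode the insertion point
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "banana", "cherry", "grape", "mango"};
        System.out.println(lookupTerm(terms, "cherry"));  // prints 2
        System.out.println(lookupTerm(terms, "coconut")); // prints -4
    }
}
```

Every call starts with {{low = 0}}, regardless of what earlier calls returned.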

For an individual invocation, this is optimal.  Without other context, binary search needs to cover the entire space.

But there are some common uses of {{lookupTerm}} where this shouldn't be necessary.  For example: making multiple {{lookupTerm}} calls to fetch the ordinals for each value in a sorted list of terms.  {{lookupTerm}} will binary-search the whole space on each invocation, even though the caller knows that the next term cannot sort before the ordinal returned by the previous {{lookupTerm}} call, so searching anything below that ordinal is wasted work.

I propose we add a {{SortedSetDocValues.lookupTerm}} overload that takes a lower bound at which to start the binary search: {{public long lookupTerm(BytesRef key, long lowerSearchBound) throws IOException}}.  This saves each binary search a few iterations in usage scenarios like the one described above, which can conceivably add up across many lookups.
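
A rough sketch of how such an overload and its caller might look. This is an illustration under assumptions, not a patch: a sorted {{String[]}} stands in for the term dictionary (ASCII terms assumed), array indices stand in for ordinals, and {{BoundedLookupSketch}} and {{lookupSorted}} are hypothetical names. Only the {{lowerSearchBound}} parameter comes from the proposed signature. A miss is assumed to return {{-(insertionPoint) - 1}}, which lets the caller recover a usable bound even when a term is absent.

```java
import java.util.Arrays;

/** Hypothetical array-backed model of the proposed bounded lookup. */
final class BoundedLookupSketch {

    /**
     * Binary search restricted to ordinals [lowerSearchBound, sortedTerms.length - 1].
     * Returns the ordinal of key if present, otherwise -(insertionPoint) - 1.
     */
    static long lookupTerm(String[] sortedTerms, String key, long lowerSearchBound) {
        long low = lowerSearchBound;
        long high = sortedTerms.length - 1L;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            int cmp = sortedTerms[(int) mid].compareTo(key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -(low + 1);
    }

    /** Resolves ordinals for keys that the caller has already sorted ascending. */
    static long[] lookupSorted(String[] sortedTerms, String[] sortedKeys) {
        long[] ords = new long[sortedKeys.length];
        long lowerBound = 0;
        for (int i = 0; i < sortedKeys.length; i++) {
            long ord = lookupTerm(sortedTerms, sortedKeys[i], lowerBound);
            ords[i] = ord;
            // Hit: later keys sort at or after ord.
            // Miss: later keys sort at or after the insertion point, -ord - 1.
            lowerBound = ord >= 0 ? ord : -ord - 1;
        }
        return ords;
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "banana", "cherry", "grape", "mango"};
        String[] keys = {"banana", "coconut", "mango", "zebra"};
        System.out.println(Arrays.toString(lookupSorted(terms, keys))); // prints [1, -4, 4, -6]
    }
}
```

The bound is only valid if the keys really are sorted in the same order as the terms; passing an unsorted key list would silently miss terms that sort below the current bound.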



