Posted to issues@lucene.apache.org by "Michael Sokolov (Jira)" <ji...@apache.org> on 2019/11/18 11:44:00 UTC

[jira] [Created] (LUCENE-9051) Implement random access seeks in IndexedDISI (DocValues)

Michael Sokolov created LUCENE-9051:
---------------------------------------

             Summary: Implement random access seeks in IndexedDISI (DocValues)
                 Key: LUCENE-9051
                 URL: https://issues.apache.org/jira/browse/LUCENE-9051
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael Sokolov


In LUCENE-9004 we have a use case for random-access seeking in DocValues, which currently support only forward iteration (with efficient skipping). One idea there was to write an entirely new format to cover these cases. While looking into that, I noticed that our current DocValues addressing implementation, {{IndexedDISI}}, already provides a pretty good basis for random access. I worked up a patch that does that. We already have the ability to jump to a block, thanks to the jump tables added last year by [~toke]; the patch uses that, and/or rewinds the iteration within the current block as needed.
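
To illustrate the jump-or-rewind idea, here is a minimal toy sketch in plain Java. It is not the patch and does not use any Lucene classes: the class name, the flat {{int[]}} layout, the in-memory jump table, and the 65536-doc block size are all assumptions made for the example.

```java
/** Toy model (not Lucene code): sorted doc ids grouped into fixed-size blocks. */
public class RandomAccessSeekSketch {
    // Assumed block size of 65536 docs; chosen for illustration only.
    static final int BLOCK_SHIFT = 16;

    private final int[] docs;       // sorted doc ids that have a value
    private final int[] blockStart; // toy jump table: index in docs of the first doc in block b or later
    private int block = -1;         // block of the previous seek, -1 before any seek
    private int pos = 0;            // cursor into docs

    RandomAccessSeekSketch(int[] sortedDocs, int maxDoc) {
        this.docs = sortedDocs.clone();
        int numBlocks = ((maxDoc - 1) >>> BLOCK_SHIFT) + 1;
        this.blockStart = new int[numBlocks + 1];
        int i = 0;
        for (int b = 0; b <= numBlocks; b++) {
            while (i < docs.length && (docs[i] >>> BLOCK_SHIFT) < b) {
                i++;
            }
            blockStart[b] = i;
        }
    }

    /** Returns true if target (0 <= target < maxDoc) has a value; target may move backwards. */
    boolean advanceExact(int target) {
        int targetBlock = target >>> BLOCK_SHIFT;
        int end = blockStart[targetBlock + 1];
        if (targetBlock != block) {
            // Different block: jump straight to its start via the jump table.
            block = targetBlock;
            pos = blockStart[targetBlock];
        } else if (pos >= end || docs[pos] > target) {
            // Same block, but the cursor may be past the target: rewind to the block start.
            pos = blockStart[targetBlock];
        }
        // Forward scan within the block, as in ordinary forward-only iteration.
        while (pos < end && docs[pos] < target) {
            pos++;
        }
        return pos < end && docs[pos] == target;
    }

    /** Ordinal of the current doc among all docs with a value; only meaningful after advanceExact returned true. */
    int index() {
        return pos;
    }
}
```

In this toy version the rewind is conservative: it restarts from the block start whenever the cursor sits on a doc greater than the target, even if the target lies between the previous target and the cursor. A real implementation could track the previous target to avoid that.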

I did a very simple performance test comparing forward-only iteration with random seeks, and saw no difference. That can't be right, so I wonder whether we have a more thorough performance test of DocValues somewhere that I could repurpose. I'll probably go back and dig into the issue where we added the jump tables; I seem to recall some testing was done then.

Aside from performance testing the implementation, there is the question of whether we should alter our API guarantees in this way. This might be controversial; I don't know the history or all the reasoning behind the way it is today. We provide {{advanceExact}}, and some implementations support docids going backwards while others don't. {{AssertingNumericDocValues.advanceExact}} does enforce forward iteration (in tests); what would the consequence be of relaxing that? We'd then open ourselves up to requiring all DV impls to support random access. Are there other impls to worry about, though? I'm not sure. I'd appreciate y'all's input on this one.
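
To make the API question concrete, here is a small hypothetical sketch of a forward-only check and what relaxing it would mean. It is not the actual {{AssertingNumericDocValues}} code: the wrapper class, the {{IntPredicate}} stand-in for a DocValues {{advanceExact}}, and the {{allowBackward}} flag are all invented for illustration.

```java
import java.util.function.IntPredicate;

/** Toy wrapper (not Lucene code) that optionally rejects backward seeks. */
public class ForwardOnlyCheckSketch {
    private final IntPredicate delegate; // hypothetical stand-in for an underlying advanceExact
    private final boolean allowBackward; // false models today's forward-only contract
    private int lastTarget = -1;

    ForwardOnlyCheckSketch(IntPredicate delegate, boolean allowBackward) {
        this.delegate = delegate;
        this.allowBackward = allowBackward;
    }

    boolean advanceExact(int target) {
        if (!allowBackward && target < lastTarget) {
            throw new IllegalStateException(
                "target " + target + " is before previous target " + lastTarget);
        }
        lastTarget = target;
        return delegate.test(target);
    }
}
```

With {{allowBackward}} set to false, a caller that seeks to doc 4 and then doc 2 fails fast. With it set to true, the same sequence succeeds, but only if every underlying implementation can actually serve a backward seek, which is the concern raised above.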



--
This message was sent by Atlassian Jira
(v8.3.4#803005)