Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/05/05 01:16:05 UTC
[jira] Created: (LUCENE-2445) Perf improvements for the DocsEnum
bulk read API
Perf improvements for the DocsEnum bulk read API
------------------------------------------------
Key: LUCENE-2445
URL: https://issues.apache.org/jira/browse/LUCENE-2445
Project: Lucene - Java
Issue Type: Bug
Components: Index
Reporter: Michael McCandless
Fix For: 4.0
I started to work on LUCENE-2443, to create a test showing the
problems, but it turns out that none of the core codecs (not even
sep/intblock) ever sets a non-zero offset.
So I set out to fix sep to do so, but ran into some issues with the
current bulk-read API that we should fix to get better
performance:
* Filtering of deleted docs should be the caller's job (saves an
  extra pass through the docs)
* Docs should probably arrive as deltas, with the caller summing
  them to get the actual docID
* Whether to load freqs or not should be separately controllable
* We may want to require that the int[] for docs and freqs are
  "aligned", i.e. the offset into each is the same
* Maybe we should separate out a BulkDocsEnum from DocsEnum. We can
  make it optional for codecs (i.e., we can emulate BulkDocsEnum from
  the DocsEnum)
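
To make the first two points concrete, here is a rough sketch of what the
caller-side loop might look like if docs arrived as deltas and deleted-doc
filtering moved to the caller. None of these names come from Lucene:
BulkReadSketch, liveDocIDs and its parameters are hypothetical, and
java.util.BitSet is only a stand-in for however deleted docs are exposed.

```java
import java.util.Arrays;
import java.util.BitSet;

public class BulkReadSketch {

    /**
     * Hypothetical caller-side consumer of one bulk-read block.
     * Assumes the codec filled docDeltas[offset .. offset + count) with
     * deltas relative to the previous docID, and did NOT filter deletions.
     */
    static int[] liveDocIDs(int[] docDeltas, int offset, int count,
                            int lastDocID, BitSet deletedDocs) {
        int[] live = new int[count];
        int upto = 0;
        int docID = lastDocID;
        for (int i = offset; i < offset + count; i++) {
            // The caller sums the deltas to recover the absolute docID.
            docID += docDeltas[i];
            // The caller skips deleted docs in the same pass. Under the
            // "aligned" proposal, a loaded freqs array would be read at
            // the same index i here.
            if (deletedDocs == null || !deletedDocs.get(docID)) {
                live[upto++] = docID;
            }
        }
        return Arrays.copyOf(live, upto);
    }

    public static void main(String[] args) {
        int[] docDeltas = {0, 0, 3, 2, 5, 1}; // block starts at offset 2
        BitSet deleted = new BitSet();
        deleted.set(5);
        // Deltas 3, 2, 5, 1 give docIDs 3, 5, 10, 11; doc 5 is deleted.
        System.out.println(Arrays.toString(
            liveDocIDs(docDeltas, 2, 4, 0, deleted))); // prints [3, 10, 11]
    }
}
```

The point of the sketch is only that summing and filtering happen in one
loop on the caller's side, so the codec never needs a second pass over the
block.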