Posted to dev@lucene.apache.org by "Paul Elschot (JIRA)" <ji...@apache.org> on 2013/10/17 22:02:44 UTC

[jira] [Comment Edited] (LUCENE-5293) Also use EliasFanoDocIdSet in CachingWrapperFilter

    [ https://issues.apache.org/jira/browse/LUCENE-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798317#comment-13798317 ] 

Paul Elschot edited comment on LUCENE-5293 at 10/17/13 8:01 PM:
----------------------------------------------------------------

First patch, 17 Oct 2013, quite rough, one nocommit.
The latest benchmark results for doc id sets are here: http://people.apache.org/~jpountz/doc_id_sets.html

The patch uses EliasFanoDocIdSet for caching when EliasFanoDocIdSet.sufficientlySmallerThanBitSet returns true.
The threshold for that is currently 1/7, which is about -0.85 on the log10 scale of the benchmark results.
Otherwise it uses WAH8DocIdSet, which is the current behaviour.
Does this choice make good use of the benchmark results?
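
As a rough illustration of the selection rule described above, here is a minimal sketch. It is not the patch itself: the class CacheImplChoice, the enum Impl and the method choose are hypothetical names, and the body of sufficientlySmallerThanBitSet is an assumption that keeps only the 1/7 ratio mentioned in this comment. It does not depend on Lucene.

```java
// Hypothetical sketch, not the code from LUCENE-5293.patch.
final class CacheImplChoice {
    public enum Impl { ELIAS_FANO, WAH8 }

    // Assumed stand-in for EliasFanoDocIdSet.sufficientlySmallerThanBitSet:
    // true when the number of doc ids is below 1/7 of the upper bound.
    // The int arguments follow the correction described in this comment.
    public static boolean sufficientlySmallerThanBitSet(int numValues, int upperBound) {
        return numValues < upperBound / 7;
    }

    // Use Elias-Fano for sufficiently sparse sets, otherwise keep WAH8.
    public static Impl choose(int numValues, int upperBound) {
        return sufficientlySmallerThanBitSet(numValues, upperBound)
                ? Impl.ELIAS_FANO
                : Impl.WAH8;
    }

    public static void main(String[] args) {
        System.out.println(choose(100, 1000)); // sparse: below 1/7
        System.out.println(choose(200, 1000)); // denser: above 1/7
        // Position of the 1/7 threshold on a log10 axis.
        System.out.println(Math.log10(1.0 / 7.0));
    }
}
```

With an upper bound of 1000, the cutoff in this sketch is 142 doc ids (integer division of 1000 by 7).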

To get the number of doc ids to be put in the cache, the patch checks the type of the DocIdSet that is given and, for FixedBitSet and OpenBitSet, uses their cardinality. (Perhaps a similar method should be added to EliasFanoDocIdSet.)
In other cases, the patch falls back to WAH8DocIdSet.
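
The type check described above could look roughly like the following sketch. To stay self-contained it uses java.util.BitSet in place of FixedBitSet and OpenBitSet; the names CardinalitySketch, DocIdSetLike, BitSetBacked, Opaque and knownCardinality are all hypothetical and do not come from the patch.

```java
import java.util.BitSet;
import java.util.OptionalInt;

// Hypothetical sketch of "use the cardinality when the type provides it".
final class CardinalitySketch {
    // Stand-in for DocIdSet.
    public interface DocIdSetLike {}

    // Stand-in for a bit set based DocIdSet that can report its cardinality.
    public static final class BitSetBacked implements DocIdSetLike {
        private final BitSet bits;
        public BitSetBacked(BitSet bits) { this.bits = bits; }
        public int cardinality() { return bits.cardinality(); }
    }

    // Stand-in for any other DocIdSet whose size is not cheaply known.
    public static final class Opaque implements DocIdSetLike {}

    // Returns the number of doc ids when the concrete type exposes it,
    // or empty when the caller should fall back (to WAH8DocIdSet in the patch).
    public static OptionalInt knownCardinality(DocIdSetLike set) {
        if (set instanceof BitSetBacked) {
            return OptionalInt.of(((BitSetBacked) set).cardinality());
        }
        return OptionalInt.empty();
    }
}
```

An empty result corresponds to the "other cases" in which the patch falls back to WAH8DocIdSet.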

I added a DocIdSet argument to cacheImpl(); there is a nocommit for that.

The patch also corrects a mistake in EliasFanoDocIdSet.sufficientlySmallerThanBitSet: the arguments should be int instead of long, just like those of the EliasFanoDocIdSet constructor.






> Also use EliasFanoDocIdSet in CachingWrapperFilter
> --------------------------------------------------
>
>                 Key: LUCENE-5293
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5293
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-5293.patch
>
>



