You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by "Julie Tibshirani (Jira)" <ji...@apache.org> on 2021/09/30 01:04:00 UTC

[jira] [Comment Edited] (LUCENE-10130) HnswGraph could make use of a SparseFixedBitSet.getAndSet

    [ https://issues.apache.org/jira/browse/LUCENE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422467#comment-17422467 ] 

Julie Tibshirani edited comment on LUCENE-10130 at 9/30/21, 1:03 AM:
---------------------------------------------------------------------

+1 to switch to a better data structure that could help perf. Some context that might be helpful: this set tracks the graph nodes that have already been visited during a kNN search. In https://issues.apache.org/jira/browse/LUCENE-9937 on dataset of 1M points, I saw the number of visited nodes range from around 1000 to 20,000, depending on k. It's supposed to be roughly logarithmic. When there are deleted docs the number could climb much higher, since we might visit a bunch of deleted docs until we find good vectors to return.


was (Author: julietibs):
+1 to switch to a better data structure that could help perf. Some context that might be helpful: this set tracks the graph nodes that have already been visited during a kNN search. In https://issues.apache.org/jira/browse/LUCENE-9937 on dataset of 1M points, I saw the number of visited nodes range from around 1000 to 20,000, depending on k. When there are deleted docs this number could climb much higher, since we might visit a bunch of deleted docs until we find good vectors to return.

> HnswGraph could make use of a SparseFixedBitSet.getAndSet
> ---------------------------------------------------------
>
>                 Key: LUCENE-10130
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10130
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-10130.patch
>
>
> Currently HnswGraph uses SparseFixedBitSet "visited" to track where it has already been. The logic currently looks like this:
> {code}
> if (visited.get(entryPoint) == false) {
>   visited.set(entryPoint);
>   ... logic ...
> }
> {code}
> If SparseFixedBitSet had a {{getAndSet}} (like FixedBitSet), the code could be:
> {code}
> if (visited.getAndSet(entrypoint) == false) {
>   ... logic ...
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org