Posted to issues@lucene.apache.org by "Michael Sokolov (Jira)" <ji...@apache.org> on 2021/07/10 17:04:00 UTC
[jira] [Comment Edited] (LUCENE-9614) Implement KNN Query

    [ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378497#comment-17378497 ] 

Michael Sokolov edited comment on LUCENE-9614 at 7/10/21, 5:03 PM:
-------------------------------------------------------------------

Doing the nearest-neighbor vector search during rewrite has one significant drawback: {{rewrite()}} cannot use IndexSearcher's executor to search across segments (or slices) concurrently, whereas an implementation that does the search in {{createWeight}} will naturally be executed concurrently when IndexSearcher is configured for that.
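
To make the concurrency point concrete, here is a minimal sketch of per-segment search on an executor followed by a merge. It is not Lucene code: {{ConcurrentKnnSketch}}, {{Hit}}, {{searchSegment}} and {{search}} are hypothetical names, each float[][] merely stands in for one segment's vectors, and scoring is a brute-force dot product rather than a real KNN index lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical stand-in, not Lucene code: each float[][] plays the role of
// one segment's vectors, and scoring is a brute-force dot product.
class ConcurrentKnnSketch {

  /** One hit: which segment, which doc within it, and its score (higher is closer). */
  static final class Hit {
    final int segment;
    final int doc;
    final float score;

    Hit(int segment, int doc, float score) {
      this.segment = segment;
      this.doc = doc;
      this.score = score;
    }
  }

  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Keeps the k best hits, best first. */
  static List<Hit> topK(List<Hit> hits, int k) {
    List<Hit> sorted = new ArrayList<>(hits);
    sorted.sort((x, y) -> Float.compare(y.score, x.score));
    return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
  }

  /** Scores every vector in one segment and returns that segment's top k. */
  static List<Hit> searchSegment(int segment, float[][] vectors, float[] target, int k) {
    List<Hit> hits = new ArrayList<>();
    for (int doc = 0; doc < vectors.length; doc++) {
      hits.add(new Hit(segment, doc, dot(vectors[doc], target)));
    }
    return topK(hits, k);
  }

  /** Submits one task per segment to the executor, then merges into a global top k. */
  static List<Hit> search(
      List<float[][]> segments, float[] target, int k, ExecutorService executor) {
    List<Future<List<Hit>>> futures = new ArrayList<>();
    for (int i = 0; i < segments.size(); i++) {
      final int segment = i;
      futures.add(executor.submit(() -> searchSegment(segment, segments.get(segment), target, k)));
    }
    List<Hit> merged = new ArrayList<>();
    try {
      for (Future<List<Hit>> future : futures) {
        merged.addAll(future.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
    return topK(merged, k);
  }
}
```

The only point of the sketch is that the per-segment work needs access to an executor. A code path that is never handed one, as is the case for rewrite here, has to process the segments one after another.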

Fixing that would require a substantial change to pass an executor to {{Query.rewrite}}, which seems like overkill at this point. Instead, perhaps we can implement the {{createWeight}} version, which supports concurrency, and define {{equals(Object)}} and {{hashCode()}} to use object identity in order to prevent spurious caching.
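
As a rough illustration of the identity-equality idea, the sketch below contrasts value-based and identity-based {{equals}}/{{hashCode}}. The class names are hypothetical and neither class extends Lucene's Query; a plain HashMap stands in for any cache keyed on the query object, which is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical classes for illustration only; neither is a Lucene Query.
class IdentityEqualitySketch {

  /** Value-based equality: instances with the same target and k act as the same cache key. */
  static final class ValueKeyedQuery {
    final float[] target;
    final int k;

    ValueKeyedQuery(float[] target, int k) {
      this.target = target;
      this.k = k;
    }

    @Override
    public boolean equals(Object other) {
      if (this == other) {
        return true;
      }
      if (!(other instanceof ValueKeyedQuery)) {
        return false;
      }
      ValueKeyedQuery that = (ValueKeyedQuery) other;
      return k == that.k && Arrays.equals(target, that.target);
    }

    @Override
    public int hashCode() {
      return 31 * Arrays.hashCode(target) + k;
    }
  }

  /** Identity-based equality: an instance is equal only to itself. */
  static final class IdentityKeyedQuery {
    final float[] target;
    final int k;

    IdentityKeyedQuery(float[] target, int k) {
      this.target = target;
      this.k = k;
    }

    @Override
    public boolean equals(Object other) {
      return this == other;
    }

    @Override
    public int hashCode() {
      return System.identityHashCode(this);
    }
  }

  /** Returns true if looking up the second key finds the entry stored under the first. */
  static boolean sharesCacheEntry(Object first, Object second) {
    Map<Object, String> cache = new HashMap<>();
    cache.put(first, "cached result");
    return cache.containsKey(second);
  }
}
```

With identity equality, two separately constructed queries never share a cache entry even when their target vector and k match, while reusing the same instance still hits the entry.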


was (Author: sokolov):
Doing nn vector search during rewrite has one significant drawback, which is that rewrite() cannot make use of IndexSearcher's executor to perform concurrent searches across segments (or slices), whereas an implementation that does the search in createWeight will naturally get executed concurrently when IndexSearcher is configured for that.

To fix that would require some substantial change to pass an executor to Query.rewrite, which seems kind of overkill at this point. Instead, perhaps we can implement the `createWeight` version that supports concurrency and define `equals` and `hashCode` to use object identity in order to prevent spurious caching.

> Implement KNN Query
> -------------------
>
>                 Key: LUCENE-9614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9614
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Now we have a vector index format, and one vector indexing/KNN search implementation, but the interface is low-level: you can search across a single segment only. We would like to expose a Query implementation. Initially, we want to support a usage where the KnnVectorQuery selects the k-nearest neighbors without regard to any other constraints, and these can then be filtered as part of an enclosing Boolean or other query.
> Later we will want to explore some kind of filtering *while* performing vector search, or a re-entrant search process that can yield further results. Because of the nature of knn search (all documents having any vector value match), it is more like a ranking than a filtering operation, and it doesn't really make sense to provide an iterator interface that can be merged in the usual way, in docid order, skipping ahead. It's not yet clear how to satisfy a query that is "k nearest neighbors satisfying some arbitrary Query", at least not without realizing a complete bitset for the Query. But this is for a later issue; *this* issue is just about performing the knn search in isolation, computing a set of (some given) K nearest neighbors, and providing an iterator over those.


