Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2020/11/19 12:58:00 UTC

[jira] [Commented] (LUCENE-9614) Implement KNN Query

    [ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235437#comment-17235437 ] 

Adrien Grand commented on LUCENE-9614:
--------------------------------------

I wonder if we should use the Query API at all for nearest-neighbor search. Today the Query API assumes that you can figure out whether a document matches in isolation, regardless of other matches in the index/segment. Maybe we should have a new top-level API on IndexSearcher, something like `IndexSearcher#nearestNeighbors(String field, float[] target)`, which we could later expand into `IndexSearcher#nearestNeighbors(String field, float[] target, Query filter)` to add support for filtering?
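To make the concern concrete, here is a minimal brute-force sketch of what such an entry point might compute. It is not Lucene code: the class name, the `k` parameter, the `float[][]` stand-in for an indexed vector field, and the `IntPredicate` stand-in for a `Query filter` are all assumptions made for illustration, and a real implementation would search the vector index rather than scan every document.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; none of these names exist in Lucene. */
class NearestNeighborsSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Returns the ids of the k documents closest to target, nearest first.
     * vectors[doc] stands in for the indexed vector of document doc (assumption);
     * filter stands in for a Query filter and may be null to accept every document.
     */
    static int[] nearestNeighbors(float[][] vectors, float[] target, int k, IntPredicate filter) {
        List<Integer> candidates = new ArrayList<>();
        for (int doc = 0; doc < vectors.length; doc++) {
            if (filter == null || filter.test(doc)) {
                candidates.add(doc);
            }
        }
        // Stable sort by distance, so ties keep the lower document id first.
        candidates.sort(Comparator.comparingDouble(doc -> squaredDistance(vectors[doc], target)));
        int n = Math.max(0, Math.min(k, candidates.size()));
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = candidates.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        float[][] vectors = {{0.1f, 0f}, {1f, 1f}, {0.2f, 0f}, {3f, 3f}};
        float[] target = {0f, 0f};
        // Unfiltered: documents 0 and 2 are the two nearest.
        System.out.println(java.util.Arrays.toString(nearestNeighbors(vectors, target, 2, null)));
        // Excluding document 0 lets document 1 into the result.
        System.out.println(java.util.Arrays.toString(nearestNeighbors(vectors, target, 2, doc -> doc != 0)));
    }
}
```

In this sketch, document 1 is a hit only when document 0 is filtered out, so whether it "matches" cannot be decided by looking at document 1 alone. That is the mismatch with the Query API described above.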

> Implement KNN Query
> -------------------
>
>                 Key: LUCENE-9614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9614
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Now we have a vector index format, and one vector indexing/KNN search implementation, but the interface is low-level: you can search across a single segment only. We would like to expose a Query implementation. Initially, we want to support a usage where the KnnVectorQuery selects the k-nearest neighbors without regard to any other constraints, and these can then be filtered as part of an enclosing Boolean or other query.
> Later we will want to explore some kind of filtering *while* performing vector search, or a re-entrant search process that can yield further results. Because of the nature of knn search (all documents having any vector value match), it is more like a ranking than a filtering operation, and it doesn't really make sense to provide an iterator interface that can be merged in the usual way, in docid order, skipping ahead. It's not yet clear how to satisfy a query that is "k nearest neighbors satisfying some arbitrary Query", at least not without realizing a complete bitset for the Query. But this is for a later issue; *this* issue is just about performing the knn search in isolation, computing a set of (some given) K nearest neighbors, and providing an iterator over those.
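
As a rough illustration of the last step in the description (compute the K nearest neighbors first, then expose an iterator over them), the sketch below keeps the top K documents in a bounded heap and then returns their ids in increasing docid order, which is the order an enclosing query would consume. The class and method names are hypothetical, the `scores` array is an assumed stand-in for per-document similarity to the target (higher means closer), and nothing here is taken from the Lucene code base.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Hypothetical illustration only; not part of Lucene. */
class TopKDocIdSketch {

    /**
     * Returns the ids of the k highest-scoring documents, sorted by increasing
     * document id. scores[doc] is an assumed similarity score for document doc.
     */
    static int[] topKInDocIdOrder(float[] scores, int k) {
        if (k <= 0) {
            return new int[0];
        }
        // Min-heap on score: the head is the weakest of the current top k.
        PriorityQueue<Integer> heap =
            new PriorityQueue<>((a, b) -> Float.compare(scores[a], scores[b]));
        for (int doc = 0; doc < scores.length; doc++) {
            if (heap.size() < k) {
                heap.add(doc);
            } else if (scores[doc] > scores[heap.peek()]) {
                // The new document beats the weakest retained one; swap it in.
                heap.poll();
                heap.add(doc);
            }
        }
        int[] docs = new int[heap.size()];
        int i = 0;
        for (int doc : heap) {
            docs[i++] = doc;
        }
        // The full result set is known only now, so docid order can be imposed afterwards.
        Arrays.sort(docs);
        return docs;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.1f, 0.7f, 0.8f};
        System.out.println(Arrays.toString(topKInDocIdOrder(scores, 3)));
    }
}
```

The sort by docid can only happen once the whole top-K set has been collected, which is why such a result can be iterated and filtered by an enclosing query afterwards but cannot be produced lazily by skipping ahead through the index.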


