Posted to issues@solr.apache.org by "Alessandro Benedetti (Jira)" <ji...@apache.org> on 2023/12/18 12:02:00 UTC

[jira] [Commented] (SOLR-16857) Efficiently rerank collapsed queries with vector queries

    [ https://issues.apache.org/jira/browse/SOLR-16857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798161#comment-17798161 ] 

Alessandro Benedetti commented on SOLR-16857:
---------------------------------------------

Currently, using the knn query parser as a rerank query is not ideal:

It does not compute the vector distance between the query vector and every one of the top N documents from the original query. Instead, it only affects the scores of those top N documents that also appear in the approximate K-NN result list.
Because that list is itself approximate, reranking this way doesn't make much sense.
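
To make the difference concrete, here is a toy simulation in plain Python (not Solr code). It assumes the rerank step adds weight * rerank score to the original score and leaves documents that do not match the rerank query untouched. The document ids, vectors, and function names are made up for illustration, and an exact nearest-neighbour search stands in for the approximate one.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rerank_via_knn_intersection(first_pass, vectors, query_vec, k, weight=1.0):
    """Boost only the first-pass docs that also appear in the index-wide top K."""
    ranked = sorted(vectors, key=lambda d: cosine(vectors[d], query_vec), reverse=True)
    knn_scores = {d: cosine(vectors[d], query_vec) for d in ranked[:k]}
    rescored = [(d, s + weight * knn_scores.get(d, 0.0)) for d, s in first_pass]
    return sorted(rescored, key=lambda p: p[1], reverse=True)

def rerank_via_similarity_function(first_pass, vectors, query_vec, weight=1.0):
    """Compute the similarity for every first-pass doc, as a function query would."""
    rescored = [(d, s + weight * cosine(vectors[d], query_vec)) for d, s in first_pass]
    return sorted(rescored, key=lambda p: p[1], reverse=True)

# Hypothetical index: docs "d" and "e" are closest to the query vector,
# but neither of them is in the first-pass top N.
vectors = {
    "a": [0.0, 1.0],
    "b": [1.0, 1.0],
    "c": [1.0, 0.1],
    "d": [1.0, 0.0],
    "e": [1.0, 0.05],
}
query_vec = [1.0, 0.0]
first_pass = [("a", 1.0), ("b", 0.9), ("c", 0.8)]  # top N from the original query

knn_order = [d for d, _ in rerank_via_knn_intersection(first_pass, vectors, query_vec, k=2)]
func_order = [d for d, _ in rerank_via_similarity_function(first_pass, vectors, query_vec)]
```

In this made-up data the top K (d, e) does not intersect the top N at all, so the knn-based rerank leaves the order unchanged, while the per-document similarity reorders all three documents.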

This is unlikely to be what a user wants.
I am wondering whether we should disable that possibility altogether and leave it to the user to use vectorSimilarity as a function query (https://sease.io/2023/12/hybrid-search-with-apache-solr.html), which does what you want (I guess).
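
As a rough sketch of that alternative, the Python snippet below builds request parameters that rerank with a vectorSimilarity function query. The field name "vector", the FLOAT32/COSINE choices, the query text, and the helper name are assumptions for illustration; the exact function syntax should be checked against the Solr Reference Guide for the version in use.

```python
from urllib.parse import urlencode

def build_rerank_params(user_query, query_vector, vector_field="vector",
                        rerank_docs=100, rerank_weight=1.0):
    """Build Solr params that rerank the top docs by vector similarity.

    Hypothetical helper: 'vector_field' must be a dense vector field in
    the schema, and its encoding/similarity must match the arguments below.
    """
    vec = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    return {
        "q": user_query,
        # Standard rerank query parser; the actual rerank query is in $rqq.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} reRankWeight={rerank_weight}}}",
        # Function query: similarity is computed for every reranked document.
        "rqq": f"{{!func}}vectorSimilarity(FLOAT32, COSINE, {vector_field}, {vec})",
    }

params = build_rerank_params("couch", [0.1, 0.2, 0.3], rerank_docs=50)
query_string = urlencode(params)  # ready to append to a /select URL
```

Because the function query produces a value for every document it is applied to, each of the reRankDocs documents gets a similarity score, regardless of whether it would appear in an approximate top K.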

If the vector similarity is not the only ranking factor, you can use Learning to Rank and use the vector similarity as one of the features.


> Efficiently rerank collapsed queries with vector queries
> --------------------------------------------------------
>
>                 Key: SOLR-16857
>                 URL: https://issues.apache.org/jira/browse/SOLR-16857
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>              Labels: hybrid-search
>
> E-commerce catalogs often use collapse to collapse product records within a group. For example, a particular couch might come in different colors or fabrics, so a single couch might have a large number of slightly different records within the group.
> When reranking a collapsed query with a vector query, the vector query will select the top K matches based on the vector. The top K could include multiple records from the same product group, although only one group head was selected from that group. This will pollute the top K results with lots of duplicate records of no value.
> The solution is to devise a filter that limits the vector query to searching only the selected group heads from the collapse.
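
The problem and the proposed filter can be illustrated with a small plain-Python simulation. This is only a toy model, not the implementation proposed in the issue: the records, field names, and helper functions are hypothetical, the group head is taken to be the highest-scoring record per group, and an exact search stands in for the approximate K-NN search.

```python
def collapse(docs, group_field="product_id"):
    """Keep one group head per group: the highest-scoring record."""
    heads = {}
    for doc in docs:
        group = doc[group_field]
        if group not in heads or doc["score"] > heads[group]["score"]:
            heads[group] = doc
    return list(heads.values())

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def knn(docs, query_vec, k, allowed_ids=None):
    """Exact top-K by dot product, optionally restricted to allowed ids."""
    candidates = [d for d in docs if allowed_ids is None or d["id"] in allowed_ids]
    ranked = sorted(candidates, key=lambda d: dot(d["vec"], query_vec), reverse=True)
    return ranked[:k]

# Hypothetical catalog: couch-A has three near-identical variants.
docs = [
    {"id": 1, "product_id": "couch-A", "score": 3.0, "vec": [1.0, 0.0]},
    {"id": 2, "product_id": "couch-A", "score": 2.5, "vec": [0.99, 0.05]},
    {"id": 3, "product_id": "couch-A", "score": 2.0, "vec": [0.98, 0.1]},
    {"id": 4, "product_id": "couch-B", "score": 2.8, "vec": [0.7, 0.7]},
    {"id": 5, "product_id": "chair-C", "score": 1.5, "vec": [0.2, 0.9]},
]
query_vec = [1.0, 0.0]

head_ids = {d["id"] for d in collapse(docs)}
unfiltered = [d["id"] for d in knn(docs, query_vec, k=3)]
filtered = [d["id"] for d in knn(docs, query_vec, k=3, allowed_ids=head_ids)]
useful_unfiltered = [i for i in unfiltered if i in head_ids]
```

With this data the unfiltered top 3 is made up entirely of couch-A variants, only one of which is a group head, whereas restricting the search to group heads returns one record per product.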



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
