Posted to issues@solr.apache.org by "Christine Poerschke (Jira)" <ji...@apache.org> on 2023/03/03 12:00:06 UTC

[jira] [Commented] (SOLR-16289) [interleaving] transformer does not work in SolrCloud

    [ https://issues.apache.org/jira/browse/SOLR-16289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696125#comment-17696125 ] 

Christine Poerschke commented on SOLR-16289:
--------------------------------------------

Picking up here from the [https://github.com/apache/solr/pull/937#issuecomment-1448240364] comment, so that the pull request can stay focused on the NullPointerException and documentation while the discussion here focuses on interleaving in SolrCloud.

It seems to me that there are (at least) two aspects:
 * the _meaning_ of interleaving in a distributed setup, i.e. how results can and should be combined, including what is correct (or good enough)
 * the _mechanics_ of how interleaving in a distributed setup could be done

As a starting point for thinking about the mechanics (and only the mechanics) I'd like to refresh my understanding and outline how non-distributed interleaving works:
 * QueryComponent.prepare
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L384-L391]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L179-L197]

 * QueryComponent.process
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L420-L427]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L327]

 * QueryComponent calling SolrIndexSearcher
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L1637]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L710]

 * SolrIndexSearcher to interleaving code
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1759]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/search/AbstractReRankQuery.java#L60-L72]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/search/ReRankCollector.java#L118]
 ** [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/modules/ltr/src/java/org/apache/solr/ltr/interleaving/LTRInterleavingRescorer.java#L63]

The outline above is for non-distributed searches, {{prepare-then-process}} in short.
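
As a rough illustration of {{prepare-then-process}}, here is a minimal sketch of the ordering only. The {{Component}} interface and the {{PrepareThenProcessSketch}} class are hypothetical stand-ins, not Solr's actual SearchComponent or SearchHandler code. The point it tries to show is that every component is prepared before any component is processed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a search component; not Solr's real API.
interface Component {
  String name();

  default void prepare(List<String> trace) {
    trace.add("prepare:" + name());
  }

  default void process(List<String> trace) {
    trace.add("process:" + name());
  }
}

class PrepareThenProcessSketch {
  // First loop: prepare every component. Second loop: process every component.
  static String trace(List<Component> components) {
    List<String> trace = new ArrayList<>();
    for (Component c : components) {
      c.prepare(trace);
    }
    for (Component c : components) {
      c.process(trace);
    }
    return String.join(",", trace);
  }

  // Two illustrative components; the names are made up for this sketch.
  static String demo() {
    Component query = () -> "query";
    Component facet = () -> "facet";
    return trace(List.of(query, facet));
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this simplified model the interleaving rescorer would be reached somewhere inside the query component's process step, which is the chain the links above walk through.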

For distributed searches we have {{distributedProcess-handleResponses-finishStage}} in the search handler on the "coordinator" node that receives the request, plus {{prepare-then-process}} run multiple times in the search handlers on the nodes that receive shard requests from the coordinator.
 * [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L477]
 * [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L560]
 * [https://github.com/apache/solr/blob/releases/solr/9.1.1/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L566]
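
The distributed flow could be sketched roughly as follows. This is a simplified model under stated assumptions: the {{DistributedFlowSketch}} class and its methods are hypothetical, shard requests are run sequentially here although real shard requests are sent concurrently, and the two stage names are borrowed from the ResponseBuilder constants mentioned in the issue description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the coordinator/shard interplay; not Solr's real code.
class DistributedFlowSketch {
  // Shard side: an ordinary prepare-then-process run for one shard request.
  static void shardRequest(String shard, String stage, List<String> trace) {
    trace.add(shard + ":prepare(" + stage + ")");
    trace.add(shard + ":process(" + stage + ")");
  }

  // Coordinator side: distributedProcess, then handleResponses per shard
  // response, then finishStage, repeated for each stage.
  static List<String> coordinate(List<String> stages, List<String> shards) {
    List<String> trace = new ArrayList<>();
    for (String stage : stages) {
      trace.add("coordinator:distributedProcess(" + stage + ")");
      for (String shard : shards) {
        shardRequest(shard, stage, trace);
        trace.add("coordinator:handleResponses(" + stage + "," + shard + ")");
      }
      trace.add("coordinator:finishStage(" + stage + ")");
    }
    return trace;
  }

  // Two stages and two shards; the shard names are made up for this sketch.
  static List<String> demo() {
    return coordinate(List.of("EXECUTE_QUERY", "GET_FIELDS"), List.of("shard1", "shard2"));
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

The sketch makes it visible that each shard sees {{prepare-then-process}} once per stage, so anything computed on a shard in the first stage is not automatically available to the same shard in the second stage.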

> [interleaving] transformer does not work in SolrCloud
> -----------------------------------------------------
>
>                 Key: SOLR-16289
>                 URL: https://issues.apache.org/jira/browse/SOLR-16289
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LTR
>    Affects Versions: 9.0
>            Reporter: Naoto Minami
>            Priority: Major
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In SolrCloud, shard requests are processed in two stages. The first stage executes the query and returns the unique keys and scores of documents. The second stage then collects the field values of the merged top documents. LTRInterleavingTransformerFactory should run in the first stage (ResponseBuilder.STAGE_EXECUTE_QUERY), because the LTRInterleavingScoringQuery knows which model is used in scoring. However, it runs in the second stage (ResponseBuilder.STAGE_GET_FIELDS), and LTRInterleavingRescorer#rescore is skipped in the second stage. LTRInterleavingTransformerFactory cannot handle this case, so a NullPointerException is thrown when fl=[interleaving] is specified in SolrCloud. The same problem exists in LTRFeatureLoggerTransformerFactory. However, if interleaving is not used, LTRFeatureLoggerTransformerFactory falls back when the feature vector cache is not hit (i.e. in the second stage).
> I will fix the NullPointerException problem, but the underlying solution should be discussed. One solution to this problem is to disable the two-stage request with the distrib.singlePass=true parameter.
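
For reference, a request using the distrib.singlePass=true workaround from the description might be assembled as below. This is only a sketch: the base URL, the collection name, and the model names (modelA, modelB) are hypothetical, and the {{rq}} value is an assumption about how two LTR models would be passed for interleaving rather than something stated in this issue.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Builds a select URL by hand so the sketch needs no Solr client library.
class SinglePassRequestSketch {
  static String buildUrl(String baseUrl, String collection) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("q", "*:*");
    // Assumed rerank query with two hypothetical model names.
    params.put("rq", "{!ltr model=modelA model=modelB reRankDocs=100}");
    // The transformer from the issue title.
    params.put("fl", "id,score,[interleaving]");
    // The workaround from the description: skip the separate GET_FIELDS stage.
    params.put("distrib.singlePass", "true");

    String query = params.entrySet().stream()
        .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
            + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
        .collect(Collectors.joining("&"));
    return baseUrl + "/" + collection + "/select?" + query;
  }

  public static void main(String[] args) {
    // Hypothetical host and collection.
    System.out.println(buildUrl("http://localhost:8983/solr", "techproducts"));
  }
}
```

Whether a single-pass request is an acceptable long-term answer is part of what this issue is meant to discuss; the sketch only shows where the parameter would go.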


