Posted to issues@solr.apache.org by "Matt Hov (Jira)" <ji...@apache.org> on 2021/04/15 21:54:00 UTC

[jira] [Comment Edited] (SOLR-15308) ScoreJoinQParserPlugin chooses wrong Index when nesting more than 1 cross-index join query

    [ https://issues.apache.org/jira/browse/SOLR-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322493#comment-17322493 ] 

Matt Hov edited comment on SOLR-15308 at 4/15/21, 9:53 PM:
-----------------------------------------------------------

Hi [~gerlowskija] 

The effect of choosing the wrong core to join against was incorrect results. The core chosen for the nested join did not match "fromIndex", so no matching results were returned. As for using "TESTenforceSameCoreAsAnotherOne=true" to short-circuit the join from SameCoreJoinQuery into an OtherCoreJoinQuery (giving correct results), it's definitely fast enough. It is even 100x faster than JoinQParserPlugin (which is what I would get if I omitted "score=none"), although JoinQParserPlugin does handle nested fromIndex correctly. However, I imagine that having ScoreJoinQParserPlugin continue to treat nested cross-index joins as SameCoreJoinQuery, but choosing the correct core, would be even faster than OtherCoreJoinQuery.
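
To make the SameCoreJoinQuery/OtherCoreJoinQuery distinction concrete, here is a rough sketch in plain Java. It is not copied from ScoreJoinQParserPlugin: the class name, the method joinKind and the exact condition are assumptions made for illustration, and forceOtherCore merely stands in for "TESTenforceSameCoreAsAnotherOne=true". It also assumes that a nested join is parsed with the "from" core ("edges") as its current core, which is what the parsed_filter_queries output suggests.

```java
final class JoinKindSketch {

    /**
     * Toy decision rule (an assumption, not the plugin's actual code):
     * a join is treated as cross-core when fromIndex is set and either
     * differs from the core it is compared against or the override is on.
     */
    static String joinKind(String fromIndex, String myCore, boolean forceOtherCore) {
        boolean crossCore = fromIndex != null
                && (forceOtherCore || !fromIndex.equals(myCore));
        return crossCore ? "OtherCoreJoinQuery" : "SameCoreJoinQuery";
    }

    public static void main(String[] args) {
        // Top-level join: the request is on "nodes", fromIndex is "edges".
        System.out.println(joinKind("edges", "nodes", false));
        // Nested join compared against "edges": falls back to the same-core path.
        System.out.println(joinKind("edges", "edges", false));
        // Same nested join with the test override switched on.
        System.out.println(joinKind("edges", "edges", true));
    }
}
```

Under these assumptions only the nested case takes the same-core path, which is the case where the choice of searcher matters.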

I think if you were to take this code from SameCoreJoinQuery
{code:java}
@Override
public Weight createWeight(IndexSearcher searcher, org.apache.lucene.search.ScoreMode scoreMode, float boost) throws IOException {
  SolrRequestInfo info = SolrRequestInfo.getRequestInfo();
  final Query jq = JoinUtil.createJoinQuery(fromField, true,
      toField, fromQuery, info.getReq().getSearcher(), // <---- the root searcher of the main query/index
      this.scoreMode);
  return jq.rewrite(searcher.getIndexReader()).createWeight(searcher, scoreMode, boost);
}
{code}
 

and change it to 
{code:java}
@Override
public Weight createWeight(IndexSearcher searcher, org.apache.lucene.search.ScoreMode scoreMode, float boost) throws IOException {
  SolrRequestInfo info = SolrRequestInfo.getRequestInfo();
  final Query jq = JoinUtil.createJoinQuery(fromField, true,
      toField, fromQuery, searcher, // <---- the parent searcher which has the SameCore as the child
      this.scoreMode);
  return jq.rewrite(searcher.getIndexReader()).createWeight(searcher, scoreMode, boost);
}
{code}
 

then SameCoreJoinQuery would just act on the parent searcher instead of the root searcher of the query (which may belong to a different core).
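
The effect of the searcher choice can be reproduced with a toy model in plain Java. This is not Solr or Lucene code: ToySearcher, join and countNodes are made-up names, documents are simple field-to-value maps, and the join is reduced to "collect the fromField values of the matching documents, then match them against toField". It only illustrates why collecting the "from" side on the root core ("nodes") finds nothing for a nested join that belongs to "edges".

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class NestedJoinSketch {

    /** Hypothetical stand-in for a searcher bound to a single core. */
    static final class ToySearcher {
        final String core;
        final List<Map<String, String>> docs;

        ToySearcher(String core, List<Map<String, String>> docs) {
            this.core = core;
            this.docs = docs;
        }

        List<Map<String, String>> search(Predicate<Map<String, String>> query) {
            return docs.stream().filter(query).collect(Collectors.toList());
        }
    }

    /** Runs fromQuery on fromSearcher, collects fromField values, and matches them on toField. */
    static Predicate<Map<String, String>> join(String fromField, String toField,
            Predicate<Map<String, String>> fromQuery, ToySearcher fromSearcher) {
        Set<String> keys = fromSearcher.search(fromQuery).stream()
                .map(doc -> doc.get(fromField))
                .filter(Objects::nonNull)
                .collect(Collectors.toSet());
        return doc -> keys.contains(doc.get(toField));
    }

    /** Counts matching "nodes" documents for a join nested one level deep. */
    static int countNodes(boolean useParentSearcher) {
        ToySearcher nodes = new ToySearcher("nodes", List.of(
                Map.of("id", "1"), Map.of("id", "2"), Map.of("id", "3")));
        ToySearcher edges = new ToySearcher("edges", List.of(
                Map.of("parentid", "1", "childid", "2"),
                Map.of("parentid", "2", "childid", "3")));

        Predicate<Map<String, String>> q0 = doc -> "1".equals(doc.get("parentid"));

        // Nested join inside "edges": either its parent searcher (edges)
        // or the root searcher of the request (nodes) supplies the "from" side.
        ToySearcher innerFrom = useParentSearcher ? edges : nodes;
        Predicate<Map<String, String>> q1 = join("childid", "parentid", q0, innerFrom);

        // Outer cross-core join from "edges" back to "nodes".
        Predicate<Map<String, String>> fq = join("childid", "id", q1, edges);
        return nodes.search(fq).size();
    }

    public static void main(String[] args) {
        System.out.println("parent searcher: " + countNodes(true));
        System.out.println("root searcher:   " + countNodes(false));
    }
}
```

In this model the root-searcher variant returns no documents because "nodes" has no parentid field to match, which mirrors the empty results I saw.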

This _might_ be a breaking change to someone else doing crazy nested joins, but that might be considered a rare edge case. I think this would be the expected behavior. 

Thanks for looking into this.

 



> ScoreJoinQParserPlugin chooses wrong Index when nesting more than 1 cross-index join query
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15308
>                 URL: https://issues.apache.org/jira/browse/SOLR-15308
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers, SearchComponents - other
>    Affects Versions: 8.8.1
>         Environment: ubuntu 20.04, SOLR 8.8.1
>            Reporter: Matt Hov
>            Priority: Minor
>              Labels: JOIN, join, parser, scorer
>
> In this situation I have 2 cores, "nodes" and "edges", and I wish to join across them with the performance enhancements of the "score=none" join param.
> If I debug the following query (get me the child nodes of the child nodes of parentid:1, joined back to the nodes core)
> {code:java}
> /solr/nodes/select?q=*:*&debugQuery=true
> &fq={!join from=childid to=id score=none fromIndex=edges v=$q2}
> &q0=parentid:1
> &q1={!join from=childid to=parentid score=none fromIndex=edges v=$q0}
> &q2={!join from=childid to=parentid score=none fromIndex=edges v=$q1}
> &rows=0
> {code}
> parsed_filter_queries shows the following 
> {code:java}
> OtherCoreJoinQuery(OtherCoreJoinQuery [fromIndex=edges, fromCoreOpenTime=608579757538032 extends SameCoreJoinQuery [fromQuery=SameCoreJoinQuery [fromQuery=SameCoreJoinQuery [fromQuery=parentid:1, fromField=childid, toField=parentid, scoreMode=None], fromField=childid, toField=parentid, scoreMode=None], fromField=childid, toField=id, scoreMode=None]])
> {code}
> Here all the nested joins are parsed as SameCoreJoinQuery. I might expect that, since the first join is against the "edges" core (the query is on "nodes"), the child joins would use the same core as the parent OtherCoreJoinQuery. However, if you look at ScoreJoinQParserPlugin.java:159 (under SameCoreJoinQuery):
>  
> {code:java}
> @Override
> public Weight createWeight(IndexSearcher searcher, org.apache.lucene.search.ScoreMode scoreMode, float boost) throws IOException {
>   SolrRequestInfo info = SolrRequestInfo.getRequestInfo();
>   final Query jq = JoinUtil.createJoinQuery(fromField, true,
>       toField, fromQuery, info.getReq().getSearcher(), this.scoreMode);
>   return jq.rewrite(searcher.getIndexReader()).createWeight(searcher, scoreMode, boost);
> }
> {code}
> "info.getReq().getSearcher()" will always be the searcher for the main index/core "nodes", not for the parent OtherCoreJoinQuery index "edges".
>  
> I noticed an undocumented test param, "TESTenforceSameCoreAsAnotherOne", and if I add it to each query
> {code:java}
> /solr/nodes/select?q=*:*&debugQuery=true
> &fq={!join from=childid to=id score=none fromIndex=edges TESTenforceSameCoreAsAnotherOne=true v=$q2}
> &q0=parentid:1
> &q1={!join from=childid to=parentid score=none fromIndex=edges TESTenforceSameCoreAsAnotherOne=true v=$q0}
> &q2={!join from=childid to=parentid score=none fromIndex=edges TESTenforceSameCoreAsAnotherOne=true v=$q1}
> &rows=0
> {code}
> I'll receive this parsed_filter_queries
> {code:java}
> OtherCoreJoinQuery(OtherCoreJoinQuery [fromIndex=edges, fromCoreOpenTime=608579757538032 extends SameCoreJoinQuery [fromQuery=OtherCoreJoinQuery [fromIndex=edges, fromCoreOpenTime=608579757538032 extends SameCoreJoinQuery [fromQuery=OtherCoreJoinQuery [fromIndex=edges, fromCoreOpenTime=608579757538032 extends SameCoreJoinQuery [fromQuery=parentid:1, fromField=childid, toField=parentid, scoreMode=None]], fromField=childid, toField=parentid, scoreMode=None]], fromField=childid, toField=id, scoreMode=None]])
> {code}
> which gives me what I'd expect and the correct results. So I have this as a workaround in the meantime.
> So I guess the solution depends on what you meant to happen: should a cross-index join under a cross-index join (of the same index) be a SameCoreJoinQuery?
> If that's the case, then replace "info.getReq().getSearcher()" with "searcher" in SameCoreJoinQuery.createWeight(...).
> If it should be an OtherCoreJoinQuery like its parent join query, then ScoreJoinQParserPlugin.java:228
> {code:java}
> final String myCore = req.getCore().getCoreDescriptor().getName();
> {code}
> should be getting the top level coreName from "SolrRequestInfo.getRequestInfo().getReq().getCore()"
>  
> Thanks
>  
>  


