Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2018/10/19 16:59:00 UTC

[jira] [Comment Edited] (LUCENE-8531) QueryBuilder hard-codes inOrder=true for generated sloppy span near queries

    [ https://issues.apache.org/jira/browse/LUCENE-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657089#comment-16657089 ] 

Uwe Schindler edited comment on LUCENE-8531 at 10/19/18 4:58 PM:
-----------------------------------------------------------------

+1, please do this. I will then take care of the Solr issue. It is not fully related, but the Solr code depends on the structure of the Lucene queries produced and then reorders them with lots of instanceof checks. That is bad spaghetti code, but that's how it is.

I'd like to get a Lucene class that generates edismax-like queries: it parses some text and creates bigram and trigram shingles out of it, so that a "match" query can assign a higher score to hits where the terms are in order and close to each other (that is, give higher precedence when bigrams or trigrams from your query string are close together in the document). A lot of people use this, but currently it only works with Solr's edismax; whenever you want to use it in another custom Solr query parser or a custom Elasticsearch query parser, you have to reimplement the shingling.
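
To make the shingling step concrete, here is a minimal plain-Java sketch of how bigram and trigram shingles could be derived from an already-tokenized query string. It is only an illustration: the class and method names (ShingleSketch, shingles) are made up for this example, it does not use any Lucene or Solr API, and it is not how edismax implements pf2/pf3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene or Solr class.
public class ShingleSketch {

    /** Returns every run of `size` adjacent tokens, joined by a single space. */
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        // Slide a window of `size` tokens over the input, one position at a time.
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("quick", "brown", "fox", "jumps");
        System.out.println(shingles(tokens, 2)); // bigram shingles
        System.out.println(shingles(tokens, 3)); // trigram shingles
    }
}
```

Each shingle would then become one proximity clause that boosts documents in which those terms occur close together; how such clauses are built and weighted is left out of this sketch.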


> QueryBuilder hard-codes inOrder=true for generated sloppy span near queries
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-8531
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8531
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>         Attachments: LUCENE-8531.patch
>
>
> QueryBuilder.analyzeGraphPhrase() generates SpanNearQuery-s with passed-in phraseSlop, but hard-codes inOrder ctor param as true.
> Before multi-term synonym support and graph token streams introduced the possibility of generating SpanNearQuery-s, QueryBuilder generated (Multi)PhraseQuery-s, which always interpret slop as allowing reordering edits.  Solr's eDismax query parser generates phrase queries when its pf/pf2/pf3 params are specified, and when multi-term synonyms are used with a graph-aware synonym filter, SpanNearQuery-s are generated that require clauses to be in order; unlike with (Multi)PhraseQuery-s, reordering edits are not allowed, so this is a kind of regression.  See SOLR-12243 for edismax pf/pf2/pf3 context.  (Note that the patch on SOLR-12243 also addresses another problem that blocks eDismax from generating queries *at all* under the above-described circumstances.)
> I propose adding a new analyzeGraphPhrase() method that allows configuration of inOrder, which would allow eDismax to specify inOrder=false.  The existing analyzeGraphPhrase() method would remain with its hard-coded inOrder=true, so existing client behavior would remain unchanged.
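
The difference between inOrder=true and inOrder=false described above can be sketched with a toy model of two single-term clauses at known token positions. This is a simplification for illustration, not Lucene's matching code: the class and method names are hypothetical, and slop is treated simply as the number of positions allowed between the two terms.

```java
// Toy model for illustration only; not Lucene's SpanNearQuery implementation.
public class ProximitySketch {

    /** Ordered: the second term must follow the first, with at most `slop` positions in between. */
    static boolean matchesInOrder(int posFirst, int posSecond, int slop) {
        return posSecond > posFirst && posSecond - posFirst - 1 <= slop;
    }

    /** Unordered: either order is accepted, with at most `slop` positions in between. */
    static boolean matchesUnordered(int posFirst, int posSecond, int slop) {
        return posFirst != posSecond && Math.abs(posSecond - posFirst) - 1 <= slop;
    }

    public static void main(String[] args) {
        // Query terms "brown" then "fox"; the document text is "fox brown",
        // so "brown" sits at position 1 and "fox" at position 0.
        int brown = 1;
        int fox = 0;
        int slop = 2;
        System.out.println("inOrder=true:  " + matchesInOrder(brown, fox, slop));
        System.out.println("inOrder=false: " + matchesUnordered(brown, fox, slop));
    }
}
```

In this model the reversed document matches only when order is not required, which is the behavior change the issue describes for callers that previously relied on reordering being allowed.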


