Posted to dev@lucene.apache.org by "Michael Dodsworth (JIRA)" <ji...@apache.org> on 2014/05/29 20:27:02 UTC

[jira] [Comment Edited] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)

    [ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012659#comment-14012659 ] 

Michael Dodsworth edited comment on SOLR-6062 at 5/29/14 6:25 PM:
------------------------------------------------------------------

Thanks for looking at this, [~ramayer],

Here's an example that shows both the grouping within a single pf, pf2 or pf3 query (where the supplied fields share the same slop) and the splitting of queries into separate layers when different slops are used for the same field(s). Hold on to your hats...

{code}
{"q", "aaaa bbbb cccc",
 "qf", "phrase_sw phrase1_sw",
 "pf", "phrase_sw~1^10 phrase_sw~2^20 phrase_sw^30",
 "pf2", "phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55",
 "pf3", "phrase_sw~2^222 phrase_sw^333 phrase1_sw~2^444 phrase1_sw~4^555"}

# pf -- phrase_sw with 3 different slop values results in 3 independent dismax queries
DisjunctionMaxQuery((phrase_sw:"aaaa bbbb cccc"~1^10.0)) 
DisjunctionMaxQuery((phrase_sw:"aaaa bbbb cccc"~2^20.0)) 
DisjunctionMaxQuery((phrase_sw:"aaaa bbbb cccc"^30.0)) 

# pf2 -- phrase_sw and phrase1_sw were both supplied with a slop of 2, so those queries are grouped
(
  DisjunctionMaxQuery((phrase_sw:"aaaa bbbb"~2^22.0 | phrase1_sw:"aaaa bbbb"~2^44.0)) 
  DisjunctionMaxQuery((phrase_sw:"bbbb cccc"~2^22.0 | phrase1_sw:"bbbb cccc"~2^44.0))
) 

(
  DisjunctionMaxQuery((phrase_sw:"aaaa bbbb"^33.0)) 
  DisjunctionMaxQuery((phrase_sw:"bbbb cccc"^33.0))
)

(
  DisjunctionMaxQuery((phrase1_sw:"aaaa bbbb"~4^55.0)) 
  DisjunctionMaxQuery((phrase1_sw:"bbbb cccc"~4^55.0))
)

# pf3
DisjunctionMaxQuery((phrase_sw:"aaaa bbbb cccc"~2^222.0 | phrase1_sw:"aaaa bbbb cccc"~2^444.0)) 
DisjunctionMaxQuery((phrase_sw:"aaaa bbbb cccc"^333.0)) 
DisjunctionMaxQuery((phrase1_sw:"aaaa bbbb cccc"~4^555.0))

{code}
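The grouping rule in the example above can be sketched in a few lines. This is only an illustrative Python model of the behaviour described in this comment, not the code from the attached patch or from Solr: the function names (parse_field_specs, phrase_clauses) are hypothetical, and the assumption is that field specs sharing a slop end up in one dismax query per word shingle, in order of first appearance.

```python
import re


def parse_field_specs(spec):
    """Parse space-separated 'field~slop^boost' tokens (slop and boost optional)."""
    parsed = []
    for token in spec.split():
        match = re.fullmatch(r"([^~^]+)(?:~(\d+))?(?:\^([\d.]+))?", token)
        if match is None:
            raise ValueError(f"cannot parse field spec: {token!r}")
        field, slop, boost = match.groups()
        parsed.append((
            field,
            int(slop) if slop is not None else None,   # None = no explicit slop
            float(boost) if boost is not None else 1.0,
        ))
    return parsed


def phrase_clauses(query, spec, shingle_size=None):
    """Return one group of dismax strings per distinct slop.

    shingle_size=None models pf (the whole phrase); 2 and 3 model pf2 and pf3.
    """
    terms = query.split()
    size = shingle_size or len(terms)

    # Assumption: fields with the same slop are combined into one dismax query.
    by_slop = {}
    for field, slop, boost in parse_field_specs(spec):
        by_slop.setdefault(slop, []).append((field, boost))

    groups = []
    for slop, fields in by_slop.items():
        slop_part = f"~{slop}" if slop is not None else ""
        group = []
        for start in range(len(terms) - size + 1):
            phrase = " ".join(terms[start:start + size])
            disjuncts = " | ".join(
                f'{field}:"{phrase}"{slop_part}^{boost}' for field, boost in fields
            )
            group.append(f"DisjunctionMaxQuery(({disjuncts}))")
        groups.append(group)
    return groups


if __name__ == "__main__":
    pf2 = "phrase_sw~2^22 phrase_sw^33 phrase1_sw~2^44 phrase1_sw~4^55"
    for group in phrase_clauses("aaaa bbbb cccc", pf2, shingle_size=2):
        print("(")
        for clause in group:
            print("  " + clause)
        print(")")
```

Run as a script, it prints three parenthesised groups for the pf2 value used above: the slop-2 group containing both fields, then one group each for the no-slop and slop-4 specs.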


was (Author: mdodsworth@salesforce.com):
(Same text as the edited comment above, without the {code} markers.)

> Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than them being combined in a single dismax query)
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6062
>                 URL: https://issues.apache.org/jira/browse/SOLR-6062
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 4.0
>            Reporter: Michael Dodsworth
>            Priority: Minor
>         Attachments: combined-phrased-dismax.patch
>
>
> https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query.
> For the query: 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non phrase query section for clarity):
> {code:java}
> <main query>
> DisjunctionMaxQuery((field1:"term1 term2"^1.0)~0.1)
> DisjunctionMaxQuery((field2:"term1 term2"^1.0)~0.1)
> DisjunctionMaxQuery((field3:"term1 term2"^1.0)~0.1)
> {code}
> Prior to this change, we had:
> {code:java}
> <main query> 
> DisjunctionMaxQuery((field1:"term1 term2"^1.0 | field2:"term1 term2"^1.0 | field3:"term1 term2"^1.0)~0.1)
> {code}
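> The upshot is that if the phrase "term1 term2" appears in multiple fields, the document now gets a significantly larger boost than under the previous implementation: the scores of the separate dismax queries are added together, whereas a single dismax query takes the best field's score plus only the tie-breaker share (0.1 here) of the others.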
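To put rough numbers on that difference, here is a small Python sketch. It is a simplified model rather than Lucene's actual scoring code: it assumes a dismax query scores as the best disjunct plus the tie-breaker times the sum of the others, that separate optional clauses simply add up, and it ignores normalization and any other scoring factors. The per-field scores are made-up values for illustration.

```python
def dismax_score(scores, tie=0.0):
    """Best disjunct score plus tie * (sum of the remaining disjunct scores)."""
    best = max(scores)
    return best + tie * (sum(scores) - best)


def separate_queries_score(field_scores, tie=0.1):
    """One single-field dismax per field; the optional clauses are summed."""
    return sum(dismax_score([score], tie) for score in field_scores)


def combined_query_score(field_scores, tie=0.1):
    """All fields as disjuncts of a single dismax query."""
    return dismax_score(field_scores, tie)


if __name__ == "__main__":
    # Hypothetical scores for "term1 term2" matching in field1, field2, field3.
    field_scores = [1.0, 1.0, 1.0]
    print("separate dismax queries:", separate_queries_score(field_scores))
    print("single combined dismax: ", combined_query_score(field_scores))
```

Under these assumptions, a phrase matching equally in all three fields contributes about 3.0 when each field has its own dismax query, but only about 1.2 when the fields are combined with a 0.1 tie-breaker.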


