Posted to solr-user@lucene.apache.org by Girish Redekar <gi...@aplopio.com> on 2009/12/05 05:42:16 UTC

Query time boosting with dismax

Hi,

Is it possible to weigh specific query terms with a Dismax query parser? Is
it possible to write queries of the sort ...
field1:(term1)^2.0 + (term2^3.0)
with dismax?

Thanks,
Girish Redekar
http://girishredekar.net

Re: Query time boosting with dismax

Posted by Uri Boness <ub...@gmail.com>.
Checking it further by looking at the code, it seems that in most cases 
it indeed adds the boost queries as SHOULD. But if you define *one* bq 
parameter which contains a boolean query, then each clause in this 
boolean query will be added to the query as is. Therefore:

This setup will filter the results:
<str name="bq">
        +category:Audio +name:black
</str>

This setup will *not* filter the results:
<str name="bq">
        +category:Audio
</str>
<str name="bq">
        +name:black
</str>

So, in the first setup, the clauses are added exactly as parsed, which 
is where the default operator defined in the schema plays a role.
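
The difference between the two setups can be modelled with a small sketch. This is only a toy model of the behaviour described above, not Solr's actual code: the names parse_clauses, add_boost_queries and filters_results are made up for illustration, the parser just splits on whitespace, and clauses without a '+' are treated as optional (as with a default OR operator).

```python
from typing import List, Tuple

# Stand-ins for Lucene's BooleanClause.Occur values; this is not the real API.
MUST, SHOULD = "MUST", "SHOULD"

Clause = Tuple[str, str]  # (occur, query text)


def parse_clauses(bq: str) -> List[Clause]:
    """Toy parser: whitespace-separated clauses, a leading '+' means MUST."""
    clauses = []
    for token in bq.split():
        if token.startswith("+"):
            clauses.append((MUST, token[1:]))
        else:
            clauses.append((SHOULD, token))
    return clauses


def add_boost_queries(bqs: List[str]) -> List[Clause]:
    """Return the clauses that the bq parameters contribute to the main query."""
    if len(bqs) == 1:
        # Single bq: its clauses are added as is, so a MUST stays a MUST.
        return parse_clauses(bqs[0])
    # Several bq parameters: each one is added whole, as an optional clause.
    return [(SHOULD, "(" + bq.strip() + ")") for bq in bqs]


def filters_results(clauses: List[Clause]) -> bool:
    """A boost query filters as soon as it contributes a MUST clause."""
    return any(occur == MUST for occur, _ in clauses)


single = add_boost_queries(["+category:Audio +name:black"])
separate = add_boost_queries(["+category:Audio", "+name:black"])
print(filters_results(single))    # True
print(filters_results(separate))  # False
```

In this model the single bq contributes two MUST clauses, while the two separate bq parameters each become one optional clause.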

Cheers,
Uri


Re: Query time boosting with dismax

Posted by Uri Boness <ub...@gmail.com>.
Well.. this is mainly based on some experiments I did (not on the 
code base). It appeared that the boost queries were appended to the 
generated dismax query, and if the default operator (in the schema) was 
set to AND they actually filtered the results. For example, here's a 
dismax config:

<requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="qf">
        text^0.5 name^1.0 category^1.2
     </str>
     <str name="bq">
        category:Audio name:black
     </str>
     <str name="fl">
        *,score
     </str>
     ...
    </lst>
  </requestHandler>

When searching with a default OR operator, you receive more results than 
with an AND operator. Checking the generated query using 
debugQuery=true reveals the following (the bq clauses are marked with 
asterisks):

Generated query with default OR operator:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 | 
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 | 
text:black^0.5 | name:black^1.2)~0.01) *category:Audio name:black* 
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)

Generated query with default AND operator:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 | 
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 | 
text:black^0.5 | name:black^1.2)~0.01) *+category:Audio +name:black* 
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)

So when it's an AND, both clauses are marked as MUST in the overall 
query, which in turn filters the results. Indeed, I would expect it to 
add these queries as SHOULD, in which case the generated query would 
look like:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 | 
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 | 
text:black^0.5 | name:black^1.2)~0.01) (*+category:Audio +name:black*) 
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)
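
As a rough illustration of the three bq renderings above, here is a toy sketch of how the default operator changes the parsed boost query. It is not Solr's parser: parse_bq and as_optional are hypothetical helper names, and the parsing is just a whitespace split.

```python
def parse_bq(bq: str, default_operator: str = "OR") -> str:
    """Toy rendering of how a bq string is parsed under a default operator.

    Clauses without an explicit '+' or '-' become required when the
    default operator is AND and stay optional when it is OR.
    """
    rendered = []
    for clause in bq.split():
        explicit = clause[0] in "+-"
        if default_operator == "AND" and not explicit:
            clause = "+" + clause
        rendered.append(clause)
    return " ".join(rendered)


def as_optional(parsed: str) -> str:
    """Wrap the parsed bq in parentheses so it is added as one optional clause."""
    return "(" + parsed + ")"


bq = "category:Audio name:black"
print(parse_bq(bq, "OR"))                # category:Audio name:black
print(parse_bq(bq, "AND"))               # +category:Audio +name:black
print(as_optional(parse_bq(bq, "AND")))  # (+category:Audio +name:black)
```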

Cheers,
Uri


Re: Query time boosting with dismax

Posted by Erik Hatcher <er...@gmail.com>.
Are you sure about the default operator and bq?  I assume we're  
talking about the setting in schema.xml.

I think boosting queries are OR'd in automatically to the main query:

 From DismaxQParser#addBoostQuery()
   ... query.add(f, BooleanClause.Occur.SHOULD);...

There is one case where query.add((BooleanClause) c); is used though.

	Erik




Re: Query time boosting with dismax

Posted by Uri Boness <ub...@gmail.com>.
You can actually define boost queries to do that (bq parameter). Boost 
queries accept the standard Lucene query syntax and are eventually 
appended to the user query. Just make sure that the default operator is 
set to OR, otherwise these boost queries will not only influence the 
boosts but also filter out some of the results.
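
A minimal sketch of what such a request could look like, built in Python. The endpoint URL, the field2 field and the query terms are assumptions for illustration only; defType, qf, bq and debugQuery are the parameters discussed in this thread. Each boost goes into its own bq parameter.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port and core for your own setup.
SOLR_SELECT = "http://localhost:8983/solr/select"

# A list of pairs (rather than a dict) so that bq can be repeated.
params = [
    ("defType", "dismax"),
    ("q", "term1 term2"),
    ("qf", "field1 field2"),        # field2 is an assumed second field
    ("bq", "field1:term1^2.0"),     # boost documents matching term1 in field1
    ("bq", "field2:term2^3.0"),     # boost documents matching term2 in field2
    ("debugQuery", "true"),         # shows the generated query in the response
]

query_string = urlencode(params)
url = SOLR_SELECT + "?" + query_string
print(url)
```

With debugQuery=true the response shows how the boost clauses were added to the generated query, which is the easiest way to check whether they boost or filter.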


Re: Query time boosting with dismax

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Terms no, but fields (with terms) and phrases, yes.


Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch


