Posted to solr-user@lucene.apache.org by Lukas Kahwe Smith <ml...@pooteeweet.org> on 2010/06/29 09:41:04 UTC

optional vs. prohibited aka standard vs. dismax handler

Hi,

I am a bit confused about the +/- syntax. Am I understanding it properly that with the normal query handler + means required and - means prohibited, whereas with the dismax handler + means required and - means optional?

http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single document.
The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.

http://wiki.apache.org/solr/DisMaxRequestHandler
Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses ... but all other Lucene query parser special characters are escaped to simplify the user experience.

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: optional vs. prohibited aka standard vs. dismax handler

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.
On 29.06.2010, at 15:01, Jan Høydahl / Cominvent wrote:

> When you mix query handlers like this you will need to add a "+" or an "AND" in front of the _query_: part as well, in order for it to be required.

> You will see the difference when you try the above query directly on your Solr instance and add &debugQuery=true. The parsedquery entry in the debug output will show your real query.

ok .. i quickly tried this out before i sent my last email. probably did something wrong.

> BTW:
> It is a better practice to submit your fielded terms as filters as they don't need to contribute to the score. I.e. you could rewrite your query from:
> q=+tag_ids:(23)  +document_code_prefix:(A/RES/58)  (_query_:"{!dismax qf='content document_title' pf='content document_title' v=$qq}")&qq=decade -domestic
> to:
> q=decade -domestic&defType=dismax&qf=content document_title&pf=content document_title&fq=tag_ids:(23)&fq=document_code_prefix:(A/RES/58)

i am already doing the facet filters via fq. however since i do want to allow optional fielded filters it seemed to make more sense to leave all of that in the q part.

> Also, you may want to apply patch SOLR-1553 and start using the eDisMax handler which allows fielded search and boolean operators, if you need more advanced user-facing query syntax.


yeah .. i am keeping an eye on that already.

thx!

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: optional vs. prohibited aka standard vs. dismax handler

Posted by Jan Høydahl / Cominvent <ja...@cominvent.com>.
When you mix query handlers like this you will need to add a "+" or an "AND" in front of the _query_: part as well, in order for it to be required.
I.e.:
hl.fragsize=0&facet=true&sort=score+desc&hl.simple.pre=<strong>&hl.fl=*&hl=true&rows=21&fl=*,score&start=0&q=%2Btag_ids:(23)++%2Bdocument_code_prefix:(A/RES/58)+AND+(_query_:"{!dismax+qf%3D'content+document_title'+pf%3D'content+document_title'+v%3D$qq}")&hl.simple.post=</strong>&facet.field={!ex%3Ddt+key%3Dorig_legal_value}legal_value&facet.field={!ex%3Ddt+key%3Dorig_adoption_year}adoption_year&facet.field={!ex%3Ddt+key%3Dorig_organisation_id}organisation_id&facet.field={!ex%3Ddt+key%3Dorig_addressee_ids}addressee_ids&facet.field={!ex%3Ddt+key%3Dorig_documenttype_id}documenttype_id&facet.field={!ex%3Ddt+key%3Dorig_information_type_id}information_type_id&facet.field={!ex%3Ddt+key%3Dorig_operative_phrase_id}operative_phrase_id&facet.field={!ex%3Ddt+key%3Dorig_tag_ids}tag_ids&qq=decade+-domestic

You will see the difference when you try the above query directly on your Solr instance and add &debugQuery=true. The parsedquery entry in the debug output will show your real query.
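As a rough illustration of the point about the "+" in front of the _query_: part, here is a minimal Python sketch that assembles such a request. The helper names (nested_dismax_clause, build_params) are made up for this example; the field names and the qq parameter are taken from the query in this thread. It only builds parameters and does not talk to Solr.

```python
from urllib.parse import urlencode


def nested_dismax_clause(param_name="qq", required=True):
    """Return the nested dismax clause, optionally marked as required with '+'."""
    local_params = (
        "{!dismax qf='content document_title' "
        "pf='content document_title' v=$%s}" % param_name
    )
    clause = '_query_:"%s"' % local_params
    # Without the leading '+', the standard parser treats this clause as optional.
    return ("+" if required else "") + clause


def build_params(fielded_clauses, user_query):
    """Combine required fielded clauses with a required nested dismax clause."""
    parts = ["+" + clause for clause in fielded_clauses]
    parts.append(nested_dismax_clause())
    return [
        ("q", " ".join(parts)),
        ("qq", user_query),        # the free-text part handled by dismax
        ("debugQuery", "true"),    # ask Solr to report how it parsed the query
    ]


params = build_params(
    ["tag_ids:(23)", "document_code_prefix:(A/RES/58)"],
    "decade -domestic",
)
query_string = urlencode(params)  # '+' is encoded as %2B, spaces as '+'
```

Appending query_string to the /select URL of your own instance should give a request equivalent to the one above, with the nested clause required.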

BTW:
It is a better practice to submit your fielded terms as filters as they don't need to contribute to the score. I.e. you could rewrite your query from:
q=+tag_ids:(23)  +document_code_prefix:(A/RES/58)  (_query_:"{!dismax qf='content document_title' pf='content document_title' v=$qq}")&qq=decade -domestic
to:
q=decade -domestic&defType=dismax&qf=content document_title&pf=content document_title&fq=tag_ids:(23)&fq=document_code_prefix:(A/RES/58)
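A small Python sketch of the rewritten form, in case it helps to see it assembled programmatically. SOLR_SELECT and the function names are assumptions for illustration only; the parameter names (defType, qf, pf, fq) are the ones used in the query above.

```python
from urllib.parse import urlencode

# Assumed endpoint; replace with the /select URL of your own core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def filtered_dismax_params(user_query, filters):
    """Free text goes to dismax via q; fielded restrictions go to fq."""
    params = [
        ("q", user_query),
        ("defType", "dismax"),
        ("qf", "content document_title"),
        ("pf", "content document_title"),
    ]
    # One fq parameter per filter; filters restrict results without scoring.
    params.extend(("fq", f) for f in filters)
    return params


def build_url(params):
    """Percent-encode the parameters and append them to the endpoint."""
    return SOLR_SELECT + "?" + urlencode(params)


url = build_url(
    filtered_dismax_params(
        "decade -domestic",
        ["tag_ids:(23)", "document_code_prefix:(A/RES/58)"],
    )
)
```

Keeping each restriction in its own fq parameter also lets Solr cache the filters independently of the scored query.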

Also, you may want to apply patch SOLR-1553 and start using the eDisMax handler which allows fielded search and boolean operators, if you need more advanced user-facing query syntax.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 29 June 2010, at 14:02, Lukas Kahwe Smith wrote:

> 
> On 29.06.2010, at 13:38, Lukas Kahwe Smith wrote:
> 
>> 
>> On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote:
>> 
>>> Hi,
>>> 
>>> In DisMax the "mm" parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add "+". You can change to mm=0 and you will get the same behaviour as standard parser, i.e. an "OR" behaviour, where the "+" would say that a term is required.
>> 
>> ok thx
>> 
>>> Using the "-" in DisMax still means prohibited.
>> 
>> 
>> what do you mean with "still"? .. with the default it behaves like "optional" in my tests. i will test with mm=0 if it behaves like prohibited.
> 
> 
> maybe to illustrate the issue:
> http://resolutionfinder.org/search?q=decade+-domestic++%2Btag%3Amalaria+%2Bcode%3AA%2FRES%2F58*&tm=any&s=Search
> 
> generates the following query. note that i have the standard query handler, with no hardcoded parameters, set up as the default handler
> 
> INFO: [Clause_en] webapp=/solr path=/select params={hl.fragsize=0&facet=true&sort=score+desc&hl.simple.pre=<strong>&json.nl=map&hl.fl=*&wt=json&hl=true&rows=21&fl=*,score&start=0&q=%2Btag_ids:(23)++%2Bdocument_code_prefix:(A/RES/58)++(_query_:"{!dismax+qf%3D'content+document_title'+pf%3D'content+document_title'+v%3D$qq}")&hl.simple.post=</strong>&facet.field={!ex%3Ddt+key%3Dorig_legal_value}legal_value&facet.field={!ex%3Ddt+key%3Dorig_adoption_year}adoption_year&facet.field={!ex%3Ddt+key%3Dorig_organisation_id}organisation_id&facet.field={!ex%3Ddt+key%3Dorig_addressee_ids}addressee_ids&facet.field={!ex%3Ddt+key%3Dorig_documenttype_id}documenttype_id&facet.field={!ex%3Ddt+key%3Dorig_information_type_id}information_type_id&facet.field={!ex%3Ddt+key%3Dorig_operative_phrase_id}operative_phrase_id&facet.field={!ex%3Ddt+key%3Dorig_tag_ids}tag_ids&qq=decade+-domestic} hits=19 status=0 QTime=23 
> 
> I am using the dismax handler for the things that are not prefixed with a field name:
> decade -domestic
> 
> I am using the default mm setting.
> 
> But if you search for "domestic" in the result of the above url you can find "domestic" included in one of the results.
> 
> regards,
> Lukas Kahwe Smith
> mls@pooteeweet.org
> 
> 
> 


Re: optional vs. prohibited aka standard vs. dismax handler

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.
On 29.06.2010, at 13:38, Lukas Kahwe Smith wrote:

> 
> On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote:
> 
>> Hi,
>> 
>> In DisMax the "mm" parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add "+". You can change to mm=0 and you will get the same behaviour as standard parser, i.e. an "OR" behaviour, where the "+" would say that a term is required.
> 
> ok thx
> 
>> Using the "-" in DisMax still means prohibited.
> 
> 
> what do you mean with "still"? .. with the default it behaves like "optional" in my tests. i will test with mm=0 if it behaves like prohibited.


maybe to illustrate the issue:
http://resolutionfinder.org/search?q=decade+-domestic++%2Btag%3Amalaria+%2Bcode%3AA%2FRES%2F58*&tm=any&s=Search

generates the following query. note that i have the standard query handler, with no hardcoded parameters, set up as the default handler

INFO: [Clause_en] webapp=/solr path=/select params={hl.fragsize=0&facet=true&sort=score+desc&hl.simple.pre=<strong>&json.nl=map&hl.fl=*&wt=json&hl=true&rows=21&fl=*,score&start=0&q=%2Btag_ids:(23)++%2Bdocument_code_prefix:(A/RES/58)++(_query_:"{!dismax+qf%3D'content+document_title'+pf%3D'content+document_title'+v%3D$qq}")&hl.simple.post=</strong>&facet.field={!ex%3Ddt+key%3Dorig_legal_value}legal_value&facet.field={!ex%3Ddt+key%3Dorig_adoption_year}adoption_year&facet.field={!ex%3Ddt+key%3Dorig_organisation_id}organisation_id&facet.field={!ex%3Ddt+key%3Dorig_addressee_ids}addressee_ids&facet.field={!ex%3Ddt+key%3Dorig_documenttype_id}documenttype_id&facet.field={!ex%3Ddt+key%3Dorig_information_type_id}information_type_id&facet.field={!ex%3Ddt+key%3Dorig_operative_phrase_id}operative_phrase_id&facet.field={!ex%3Ddt+key%3Dorig_tag_ids}tag_ids&qq=decade+-domestic} hits=19 status=0 QTime=23 

I am using the dismax handler for the things that are not prefixed with a field name:
decade -domestic

I am using the default mm setting.

But if you search for "domestic" in the result of the above url you can find "domestic" included in one of the results.

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: optional vs. prohibited aka standard vs. dismax handler

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.
On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote:

> Hi,
> 
> In DisMax the "mm" parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add "+". You can change to mm=0 and you will get the same behaviour as standard parser, i.e. an "OR" behaviour, where the "+" would say that a term is required.

ok thx

> Using the "-" in DisMax still means prohibited.


what do you mean with "still"? .. with the default it behaves like "optional" in my tests. i will test with mm=0 if it behaves like prohibited.

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: optional vs. prohibited aka standard vs. dismax handler

Posted by Jan Høydahl / Cominvent <ja...@cominvent.com>.
Hi,

In DisMax the "mm" parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add "+". You can change to mm=0 and you will get the same behaviour as standard parser, i.e. an "OR" behaviour, where the "+" would say that a term is required.

Using the "-" in DisMax still means prohibited.
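To make the semantics described above concrete, here is a toy model in Python. It is not Solr code and skips analysis, phrases, scoring and pure-negative queries; it only assumes that mm applies to the unprefixed (optional) terms as a percentage rounded down, and that a query made up solely of optional terms needs at least one of them to match.

```python
def split_clauses(query):
    """Split a whitespace-separated query into required, prohibited, optional terms."""
    required, prohibited, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            prohibited.add(token[1:])
        else:
            optional.add(token)
    return required, prohibited, optional


def matches(doc_terms, query, mm_percent):
    """Toy check of whether a document (a set of terms) matches the query."""
    required, prohibited, optional = split_clauses(query)
    doc = {term.lower() for term in doc_terms}
    if prohibited & doc:       # any "-" term present: excluded, whatever mm is
        return False
    if not required <= doc:    # every "+" term must be present
        return False
    # mm only governs how many of the optional terms must match.
    needed = (len(optional) * mm_percent) // 100
    if optional and not required:
        needed = max(needed, 1)  # a purely optional query still needs one hit
    return len(optional & doc) >= needed
```

In this model "decade -domestic" never matches a document containing "domestic", with mm at 100 or at 0; mm only changes whether all or just some of the unprefixed terms have to be present.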

The best way to be sure of what is really going on is to enable &debugQuery=true and inspect what your query is parsed to.
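A minimal Python sketch of pulling the parsed query out of a debug response. It assumes the request was made with wt=json and debugQuery=true, and that the response carries a top-level "debug" object with "parsedquery" and "parsedquery_toString" entries; please check this layout against your own Solr version. The function name is made up for this example.

```python
import json


def parsed_query(response_text):
    """Return (parsedquery, parsedquery_toString) from a JSON debug response.

    Either value is None if the response has no such entry, e.g. when
    debugQuery=true was not set on the request.
    """
    body = json.loads(response_text)
    debug = body.get("debug", {})  # assumed location of the debug section
    return debug.get("parsedquery"), debug.get("parsedquery_toString")
```

Comparing the returned strings for the nested query with and without the leading "+" should show whether the dismax part ends up as a required or an optional clause.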

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 29 June 2010, at 09:41, Lukas Kahwe Smith wrote:

> Hi,
> 
> I am a bit confused about the +/- syntax. Am I understanding it properly that with the normal query handler + means required and - means prohibited, whereas with the dismax handler + means required and - means optional?
> 
> http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
> The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single document.
> The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.
> 
> http://wiki.apache.org/solr/DisMaxRequestHandler
> Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses ... but all other Lucene query parser special characters are escaped to simplify the user experience.
> 
> regards,
> Lukas Kahwe Smith
> mls@pooteeweet.org
> 
> 
>