Posted to solr-user@lucene.apache.org by "S.L" <si...@gmail.com> on 2014/04/01 07:04:08 UTC

Re: eDismax parser and the mm parameter

Jack,

Thanks a lot. I am now using pf, pf2, and pf3 and have removed the mm
parameter from my queries. For the fuzzy phrase queries, however, I am not
sure how to leverage the complex phrase query parser; I have found
absolutely nothing that gives me any idea of how to do that.

Why is fuzzy phrase search not provided by Solr out of the box? I am surprised.

Thanks.
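
For readers following along, here is a minimal sketch of the kind of request parameters the pf/pf2/pf3 advice in this thread points to. The field name `text` and the boost values 10, 5, and 2 are assumptions chosen only for illustration; the thread itself only says to give pf the highest boost and to boost pf3 higher than pf2.

```python
from urllib.parse import urlencode


def edismax_params(user_query, field="text"):
    """Build eDisMax request parameters that rank by phrase matches.

    The field name and boost values are hypothetical placeholders.
    """
    return {
        "defType": "edismax",
        "q": user_query,
        "q.op": "OR",          # match any term; ranking is left to the boosts
        "qf": field,
        "pf": f"{field}^10",   # whole phrase: highest boost
        "pf3": f"{field}^5",   # three-word shingles: boosted above pf2
        "pf2": f"{field}^2",   # two-word shingles: lowest phrase boost
    }


def edismax_query_string(user_query, field="text"):
    """URL-encode the parameters for use in a select request."""
    return urlencode(edismax_params(user_query, field))
```

The resulting query string can be appended to a select URL; the boosts would need tuning against real data.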


On Mon, Mar 31, 2014 at 5:39 AM, Jack Krupansky <ja...@basetechnology.com> wrote:

> The pf, pf2, and pf3 parameters should cover cases 1 and 2. Use q.op=OR
> (the default) and ignore the mm parameter. Give pf the highest boost, and
> boost pf3 higher than pf2.
>
> You could try using the complex phrase query parser for the third case.
>
> -- Jack Krupansky
>
> -----Original Message----- From: S.L
> Sent: Monday, March 31, 2014 12:08 AM
> To: solr-user@lucene.apache.org
> Subject: Re: eDismax parser and the mm parameter
>
> Thanks Jack, my use cases are as follows.
>
>   1. Search for "Ginseng": everything related to ginseng should show up.
>   2. Search for "White Siberian Ginseng": results with the whole phrase
>      show up first, followed by results with two words from the phrase,
>      followed by results with a single word from the phrase.
>   3. Fuzzy search for "Whte Sberia Ginsng" (please note the typos here):
>      documents with "White Siberian Ginseng" should show up. This looks
>      like the most complicated case of all, as Solr does not support fuzzy
>      phrase searches. (I have no solution for this yet.)
>
> Thanks again!
>
>
> On Sun, Mar 30, 2014 at 11:21 PM, Jack Krupansky <ja...@basetechnology.com>
> wrote:
>
>> The mm parameter is really only relevant when the default operator is OR
>> or explicit OR operators are used.
>>
>> Again: Please provide your use case examples and your expectations for
>> each use case. It really doesn't make a lot of sense to prematurely focus
>> on a solution when you haven't clearly defined your use cases.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: S.L
>> Sent: Sunday, March 30, 2014 9:13 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: eDismax parser and the mm parameter
>>
>> Jack,
>>
>> I misstated the problem. I am not using OR as the default operator now
>> (now that I think about it, it does not make sense to use the default
>> operator OR along with the mm parameter). The reason I want to use pf and
>> mm in conjunction is my understanding of the edismax parser; I have not
>> looked into the pf2 and pf3 parameters yet.
>>
>> I will state my understanding below.
>>
>> pf: used to boost the result score if the complete phrase matches.
>> mm: a value less than the number of search terms would help limit the
>> query results to a certain number of better matches.
>>
>> With that being said, would it make sense to have a dynamic mm (set to the
>> number of search terms - 1)?
>>
>> I also have a question about using fuzzy search with the eDismax
>> parser, but I will ask that in a separate post once I have gone through
>> that aspect of the eDismax parser.
>>
>> Thanks again!
>>
>>
>>
>>
>>
>> On Sun, Mar 30, 2014 at 6:44 PM, Jack Krupansky <ja...@basetechnology.com>
>> wrote:
>>
>>> If you use pf, pf2, and pf3 and boost appropriately, the effects of mm
>>> will be dwarfed.
>>>
>>> The general goal is to assure that the top documents really are the best,
>>> not to necessarily limit the total document count. Focusing on the latter
>>> could be a real waste of time.
>>>
>>> It's still not clear why or how you need or want to use OR as the default
>>> operator - you still haven't given us a use case for that.
>>>
>>> To repeat: Give us a full set of use cases before taking this XY Problem
>>> approach of pursuing a solution before the problem is understood.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: S.L
>>> Sent: Sunday, March 30, 2014 6:14 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: eDismax parser and the mm parameter
>>>
>>> Jack, thanks again.
>>>
>>> I am searching Chinese medicine documents. As in the example I gave
>>> earlier, a user can search for "Ginseng", "Siberian Ginseng", or "Red
>>> Siberian Ginseng". I certainly want to use the pf parameter (which is not
>>> driven by the mm parameter). However, to give a higher score to documents
>>> that contain more of the terms, I want to use edismax. Now, if I give an
>>> mm of 3 and the search term is only of length 1 (like "Ginseng"), what
>>> does edismax do?
>>>
>>>
>>> On Sun, Mar 30, 2014 at 1:21 PM, Jack Krupansky <jack@basetechnology.com>
>>> wrote:
>>>
>>>> It still depends on your objective - which you haven't told us yet. Show
>>>> us some use cases and detail what your expectations are for each use
>>>> case.
>>>>
>>>> The edismax phrase boosting is probably a lot more useful than messing
>>>> around with mm. Take a look at pf, pf2, and pf3.
>>>>
>>>> See:
>>>> http://wiki.apache.org/solr/ExtendedDisMax
>>>> https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
>>>>
>>>> The focus on mm may indeed be a classic "XY Problem" - a premature focus
>>>> on a solution without detailing the problem.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> -----Original Message----- From: S.L
>>>> Sent: Sunday, March 30, 2014 11:18 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: eDismax parser and the mm parameter
>>>>
>>>> Thanks Jack! I understand the intent of the mm parameter. My question is
>>>> this: since the queries are not of fixed length, I do not know what mm
>>>> should look like. For example, "Ginseng" and "Siberian Ginseng" are my
>>>> search terms. The first can have an mm of up to 1 and the second can
>>>> have an mm of up to 2.
>>>>
>>>> Should I dynamically set mm based on the number of search terms in my
>>>> query?
>>>>
>>>> Thanks again.
>>>>
>>>>
>>>> On Sun, Mar 30, 2014 at 5:20 AM, Jack Krupansky <jack@basetechnology.com>
>>>> wrote:
>>>>
>>>>> 1. Yes, the default for mm is 1.
>>>>>
>>>>> 2. It depends on what you are really trying to do - you haven't told
>>>>> us.
>>>>>
>>>>> Generally, mm=1 is equivalent to q.op=OR, and mm=100% is equivalent to
>>>>> q.op=AND.
>>>>>
>>>>> Generally, use q.op unless you really know what you are doing.
>>>>>
>>>>> Generally, the intent of mm is to set the minimum number of OR/SHOULD
>>>>> clauses that must match on the top level of a query.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> -----Original Message----- From: S.L
>>>>> Sent: Sunday, March 30, 2014 2:25 AM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: eDismax parser and the mm parameter
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I am planning to use the eDismax query parser in Solr to boost
>>>>> documents that contain a phrase in their fields. There is an mm
>>>>> parameter in the edismax parser. Since the query typed by the user
>>>>> could be of any length (i.e. >=1), I would like to set the mm value to 1.
>>>>> I have the following questions regarding this parameter.
>>>>>
>>>>>   1. Is it set to 1 by default?
>>>>>   2. In my schema.xml the defaultOperator is set to "AND". Should I set
>>>>>      it to "OR" in order for the edismax parser to be effective with an
>>>>>      mm of 1?
>>>>>
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
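
The quoted thread asks what happens when mm is larger than the number of query terms (for example mm=3 with the single term "Ginseng"). The sketch below illustrates the rule as the Solr reference documentation describes it: the effective value is never greater than the number of optional clauses and never less than 1. The function name `required_matches` is hypothetical, and the sketch handles only plain integers and percentages, not conditional mm expressions; it is an illustration, not Solr's own code.

```python
def required_matches(mm, optional_clauses):
    """Resolve a simple mm value to a number of clauses that must match.

    Handles only integers ("3", "-1") and percentages ("75%", "-25%").
    """
    mm = mm.strip()
    if mm.endswith("%"):
        percent = int(mm[:-1])
        # Round toward zero: 75% of 5 clauses is 3; -25% of 5 lets 1 be missing.
        amount = abs(percent) * optional_clauses // 100
        if percent < 0:
            amount = -amount
    else:
        amount = int(mm)
    # A negative amount means "this many clauses may be missing".
    required = amount if amount >= 0 else optional_clauses + amount
    # Clamp to the documented range: at least 1, at most the clause count.
    return max(1, min(required, optional_clauses))
```

Under these assumptions, mm=3 with a one-term query resolves to 1, so a fixed mm does not by itself make short queries fail. That is consistent with the advice in the thread not to spend effort on a dynamic mm.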

Re: The word "no" in a query

Posted by Ahmet Arslan <io...@yahoo.com>.
Hi Bob,

Your field type would be useful here. Can you copy-paste it?

Ahmet



On Wednesday, April 2, 2014 2:01 PM, François Schiettecatte <fs...@gmail.com> wrote:
Have you looked at the debugging output?

    http://wiki.apache.org/solr/CommonQueryParameters#Debugging

François


On Apr 2, 2014, at 1:37 AM, Bob Laferriere <sp...@icloud.com> wrote:

> 
> I have built an e-commerce search engine. I am struggling with the word “no” in queries. We have products named “No Smoking Sign.” When the query is “Smoking AND Sign” the product is found. If I query as “No AND Sign” I get no results. I do not have “no” as a stop word. Any ideas why I would get zero results back?
> 
> Regards,
> 
> Bob


Re: The word "no" in a query

Posted by François Schiettecatte <fs...@gmail.com>.
Have you looked at the debugging output?

	http://wiki.apache.org/solr/CommonQueryParameters#Debugging

François

On Apr 2, 2014, at 1:37 AM, Bob Laferriere <sp...@icloud.com> wrote:

> 
> I have built an e-commerce search engine. I am struggling with the word “no” in queries. We have products named “No Smoking Sign.” When the query is “Smoking AND Sign” the product is found. If I query as “No AND Sign” I get no results. I do not have “no” as a stop word. Any ideas why I would get zero results back?
> 
> Regards,
> 
> Bob
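
To make the debugging suggestion concrete, here is a small sketch that builds a select URL with the `debugQuery=true` parameter, which asks Solr to include the parsed query and scoring explanation in the response. The base URL and core name below are assumptions for illustration only.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query):
    """Return a select URL that requests Solr's debug output for a query."""
    params = {
        "q": query,
        "debugQuery": "true",  # include parsed query and score explanations
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical local core; adjust host, port, and core name as needed.
url = debug_query_url("http://localhost:8983/solr/collection1/select",
                      "No AND Sign")
```

Comparing the parsed query in the debug section for “No AND Sign” against the one for “Smoking AND Sign” should show whether the term “no” survives analysis.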


The word "no" in a query

Posted by Bob Laferriere <sp...@icloud.com>.
I have built an e-commerce search engine. I am struggling with the word “no” in queries. We have products named “No Smoking Sign.” When the query is “Smoking AND Sign” the product is found. If I query as “No AND Sign” I get no results. I do not have “no” as a stop word. Any ideas why I would get zero results back?

Regards,

Bob

Re: eDismax parser and the mm parameter

Posted by William Bell <bi...@gmail.com>.
Fuzzy search is provided; use ~.
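
As a rough illustration of the two options mentioned in this thread, the sketch below appends the fuzzy operator `~` to each term and, separately, wraps the fuzzy terms in a phrase for the complex phrase query parser. The field name `text` is an assumption, and whether the `complexphrase` parser is available depends on the Solr version in use.

```python
def fuzzy_terms(text, max_edits=None):
    """Append the fuzzy operator (~) to every whitespace-separated term."""
    suffix = "~" if max_edits is None else f"~{max_edits}"
    return " ".join(term + suffix for term in text.split())


def fuzzy_phrase(field, text):
    """Build a fuzzy phrase query for the complex phrase query parser.

    The field name passed in is a placeholder chosen by the caller.
    """
    return f'{{!complexphrase inOrder=true}}{field}:"{fuzzy_terms(text)}"'


per_term = fuzzy_terms("Whte Sberia Ginsng")
phrase = fuzzy_phrase("text", "Whte Sberia Ginsng")
```

The per-term form matches documents containing fuzzy matches of the individual words; the phrase form additionally requires them to appear together in order.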




-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076