Posted to solr-user@lucene.apache.org by Russell Bahr <rb...@diligent.com> on 2020/05/06 15:57:45 UTC

Minimum Match Query

Hi SOLR team,
I have been asked whether there is a way to return a document only if a term or phrase from the query occurs in it at least a minimum number of times.
(That is, queries that require a minimum number of mentions of a particular term/phrase: it must be mentioned 'x' times for the document to be returned.)
Is this possible with Solr 6.5.1, or would it require a newer version of Solr?
Any help on this would be appreciated.
Thank you,
Russ

Re: Minimum Match Query

Posted by ART GALLERY <al...@goretoy.com>.

On Thu, May 7, 2020 at 2:11 PM Russell Bahr <ru...@manzama.com> wrote:
>
> Thank you Emir, we will give this a try.
>
> Russ
>

Re: Minimum Match Query

Posted by Russell Bahr <ru...@manzama.com>.
Thank you Emir, we will give this a try.

Russ


On Thu, May 7, 2020 at 12:55 AM Emir Arnautović <
emir.arnautovic@sematext.com> wrote:

> Hi Russell,
> You are right about mm: it sets the minimum number of query clauses that
> must match, not how often a term occurs. Term frequencies are normally
> used only for scoring, but you can also filter on the number of
> occurrences with a function range query:
> fq={!frange l=3}sum(termfreq(field, 'barker'), termfreq(field, 'jones'), termfreq(field, 'baker'))
>
> It is not perfect: termfreq counts single indexed terms, so you will need
> to handle phrases at index time to be able to count them. Alternatively,
> you can combine it with another query that filters out unwanted results
> and use this approach only to make sure the frequencies match.
>
> HTH,
> Emir

Re: Minimum Match Query

Posted by Emir Arnautović <em...@sematext.com>.
Hi Russell,
You are right about mm: it sets the minimum number of query clauses that must match, not how often a term occurs. Term frequencies are normally used only for scoring, but you can also filter on the number of occurrences with a function range query:
fq={!frange l=3}sum(termfreq(field, 'barker'), termfreq(field, 'jones'), termfreq(field, 'baker'))

It is not perfect: termfreq counts single indexed terms, so you will need to handle phrases at index time to be able to count them. Alternatively, you can combine it with another query that filters out unwanted results and use this approach only to make sure the frequencies match.
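
The filter above can be assembled programmatically. The following is only a minimal sketch: the helper name min_occurrence_filter and the field name "body" are made up for illustration, and the terms are written in lowercase on the assumption that the field's analyzer lowercases tokens at index time.

```python
from urllib.parse import urlencode


def min_occurrence_filter(field, terms, minimum):
    """Build an frange filter that keeps documents in which the summed
    term frequency of `terms` in `field` is at least `minimum`."""
    # termfreq() takes one indexed term, so each term gets its own call.
    freqs = [f"termfreq({field},'{term}')" for term in terms]
    func = freqs[0] if len(freqs) == 1 else f"sum({','.join(freqs)})"
    # l= is the lower bound of the function range (inclusive by default).
    return f"{{!frange l={minimum}}}{func}"


# "body" is a hypothetical field name, not one taken from the thread.
fq = min_occurrence_filter("body", ["barker"], 3)
print(fq)

# The filter has to be URL-encoded when it is sent as a request parameter.
print(urlencode({"q": "*:*", "fq": fq}))
```

Note that passing several terms filters on their combined count, as in the sum(...) example above. To require each term at least N times, one option is to send a separate fq per term, since multiple fq parameters are intersected. The sketch does not escape quotes inside terms.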

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 7 May 2020, at 03:12, Russell Bahr <ru...@manzama.com> wrote:
> 
> Hi Atita,
> We actually looked into that. It does not appear to count occurrences of
> a single phrase; instead it requires that a certain number or percentage
> of the listed phrases match. What we need is a way to match on a single
> phrase appearing a minimum number of times, e.g. "Barker" with a minimum
> of 3 matches would return only documents in which "Barker" appears 3 or
> more times.
>
> Am I missing something there, or am I reading this wrong? The
> documentation says:
>
>   The mm (Minimum Should Match) Parameter
>
>   When processing queries, Lucene/Solr recognizes three types of clauses:
>   mandatory, prohibited, and "optional" (also known as "should" clauses).
>   By default, all words or phrases specified in the q parameter are
>   treated as "optional" clauses unless they are preceded by a "+" or a
>   "-". When dealing with these "optional" clauses, the mm parameter makes
>   it possible to say that a certain minimum number of those clauses must
>   match. The DisMax query parser offers great flexibility in how the
>   minimum number can be specified.
>
> We did try a query, and the results reflected only the minimum number of
> phrases matching, not a phrase being mentioned a minimum number of times.
>
> For example, if I query for "Google" with mm=100, it does not find
> articles with 100 mentions of Google. mm applies to queries with multiple
> phrases. Example against our servers:
>
> query = "Barker" OR "Jones" OR "Baker"  mm=1  103,896 results
> query = "Barker" OR "Jones" OR "Baker"  mm=2    1,200 results
> query = "Barker" OR "Jones" OR "Baker"  mm=3       16 results
> 
> Please let me know.
> Thank you,
> Russ


Re: Minimum Match Query

Posted by Russell Bahr <ru...@manzama.com>.
Hi Atita,
We actually looked into that. It does not appear to count occurrences of a
single phrase; instead it requires that a certain number or percentage of
the listed phrases match. What we need is a way to match on a single phrase
appearing a minimum number of times, e.g. "Barker" with a minimum of 3
matches would return only documents in which "Barker" appears 3 or more times.

Am I missing something there, or am I reading this wrong? The documentation
says:

  The mm (Minimum Should Match) Parameter

  When processing queries, Lucene/Solr recognizes three types of clauses:
  mandatory, prohibited, and "optional" (also known as "should" clauses).
  By default, all words or phrases specified in the q parameter are treated
  as "optional" clauses unless they are preceded by a "+" or a "-". When
  dealing with these "optional" clauses, the mm parameter makes it possible
  to say that a certain minimum number of those clauses must match. The
  DisMax query parser offers great flexibility in how the minimum number
  can be specified.

We did try a query, and the results reflected only the minimum number of
phrases matching, not a phrase being mentioned a minimum number of times.

For example, if I query for "Google" with mm=100, it does not find
articles with 100 mentions of Google. mm applies to queries with multiple
phrases. Example against our servers:

query = "Barker" OR "Jones" OR "Baker"  mm=1  103,896 results
query = "Barker" OR "Jones" OR "Baker"  mm=2    1,200 results
query = "Barker" OR "Jones" OR "Baker"  mm=3       16 results
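
The difference between the two behaviours can be illustrated with a toy model. This is only a sketch and not how Solr evaluates queries: it ignores analysis, phrases and scoring, and the function names and sample documents are invented for the example.

```python
def clauses_matched(tokens, terms):
    """Number of distinct query terms that occur at least once."""
    present = set(tokens)
    return sum(1 for term in terms if term in present)


def passes_mm(tokens, terms, mm):
    # mm counts how many optional clauses match, not how often they occur.
    return clauses_matched(tokens, terms) >= mm


def passes_min_freq(tokens, term, minimum):
    # The behaviour asked for here: occurrences of one term.
    return tokens.count(term) >= minimum


terms = ["barker", "jones", "baker"]
doc_a = "barker said barker met barker".split()  # one term, three times
doc_b = "barker and jones met baker".split()     # three terms, once each

print(passes_mm(doc_a, terms, 3), passes_min_freq(doc_a, "barker", 3))  # False True
print(passes_mm(doc_b, terms, 3), passes_min_freq(doc_b, "barker", 3))  # True False
```

In this model a document that repeats one term is rejected by mm=3 but accepted by a frequency threshold of 3, and the reverse holds for a document that mentions each term once. That matches the shrinking result counts above as mm increases.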

Please let me know.
Thank you,
Russ



On Wed, May 6, 2020 at 10:13 AM Atita Arora <at...@gmail.com> wrote:

> Hi,
>
> Did you happen to look into this:
>
>
> https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter
>
> I believe 6.5.1 has it too.
>
> I hope it helps.

Re: Minimum Match Query

Posted by Atita Arora <at...@gmail.com>.
Hi,

Did you happen to look into this:

https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter

I believe 6.5.1 has it too.

I hope it helps.
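
For reference, a request using the mm parameter could be put together roughly as follows. This is a sketch under assumptions: the helper name dismax_query and the field name "body" are hypothetical, and only the query string is built, without contacting any server.

```python
from urllib.parse import urlencode


def dismax_query(terms, field, mm):
    """Return URL-encoded parameters for a DisMax query in which at
    least `mm` of the optional clauses must match."""
    params = {
        "defType": "dismax",
        "q": " ".join(terms),  # each word is an optional clause
        "qf": field,           # hypothetical field to search
        "mm": mm,              # e.g. "2" or "75%"
    }
    return urlencode(params)


print(dismax_query(["barker", "jones", "baker"], "body", "2"))
```

Here mm=2 requires any two of the three words to appear in a document; it places no constraint on how many times each word appears.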

