Posted to solr-user@lucene.apache.org by Pierre-Luc Thibeault <pi...@wantedtech.com> on 2011/04/11 20:55:04 UTC

Exact match on a field with stemming

Hi all,
 
Is there a way to perform an exact match query on a field that has stemming enabled by using the standard /select handler?
 
I thought that putting a word inside double quotes would enable this behaviour, but if I query my field with a single word like “manager”,
I receive results containing the word “management”.
 
I know I can use a CopyField with different types but that would double the size of my index… Is there an alternative?
 
Thanks
 

RE: Exact match on a field with stemming

Posted by Jean-Sebastien Vachon <js...@videotron.ca>.
Thanks for the clarification. This makes sense.


FW: Exact match on a field with stemming

Posted by Jonathan Rochkind <ro...@jhu.edu>.
> I'm curious to know why Solr is not respecting the phrase.
> If it considers "manager" as a phrase... shouldn't it return only documents containing that phrase?

To Solr (or rather to the Lucene and dismax query parsers, which are what understand double-quoted phrases), a phrase means "these tokens in exactly this order".

So a phrase of one token "manager", is exactly the same as if you didn't use the double quotes. It's only one token, so "all the tokens in this phrase in exactly the order specified" is, well, just the same as one token without phrase quotes. 
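
To make that concrete, here is a rough toy model in Python of "these tokens in exactly this order". It is only a sketch of the idea, not how Lucene implements phrase queries; `phrase_match` and `term_match` are made-up helper names.

```python
def phrase_match(doc_tokens, phrase_tokens):
    """True if phrase_tokens occur consecutively and in order in doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def term_match(doc_tokens, term):
    """True if the single token appears anywhere in doc_tokens."""
    return term in doc_tokens


doc = ["senior", "project", "manager"]

# Order matters for a multi-token phrase.
print(phrase_match(doc, ["project", "manager"]))
print(phrase_match(doc, ["manager", "project"]))

# A one-token phrase behaves exactly like a plain term query.
print(phrase_match(doc, ["manager"]) == term_match(doc, "manager"))
```

With a single token there is no ordering left to enforce, which is why the quotes change nothing here.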

If you've set up a stemmed field at indexing time, then "manager" and "management" are stemmed IN THE INDEX, probably to something like "manag". There is no longer any information in the index (at least in that field) on what the original literal was; it's been stemmed in the index. So there's no way for it to match only certain un-stemmed versions, at least using that field. And when you enter either 'manager' or 'management' at query time, it is analyzed and stemmed to match that stemmed something-like "manag" in the index either way. If it didn't analyze and stem at query time, then the query would instead just match NOTHING, because neither 'manager' nor 'management' is in the index at all, only the stemmed versions.

So, yes, double quotes are interpreted as a phrase, and only documents containing that phrase are returned, you got it. 
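
The same point as a small Python sketch. Everything in it is an assumption made for illustration: `toy_stem` is a crude suffix stripper, not the stemmer Solr would actually use, and the "index" is just a dict from stemmed term to document ids.

```python
from collections import defaultdict

# Toy suffix list, chosen only so that "manager" and "management" collide.
SUFFIXES = ("ement", "er", "ing", "s")


def toy_stem(token):
    """Strip the first matching suffix; a stand-in for a real stemmer."""
    token = token.lower()
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token


def build_index(docs):
    """Index-time analysis: only the stemmed form is stored."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index, word, stem_query=True):
    """Query-time analysis: stem the query term unless told not to."""
    term = toy_stem(word) if stem_query else word.lower()
    return sorted(index.get(term, set()))


docs = {1: "project manager", 2: "project management"}
index = build_index(docs)

print(search(index, "manager"))                    # both documents
print(search(index, "management"))                 # both documents
print(search(index, "manager", stem_query=False))  # nothing: "manager" is not in the index
```

Skipping the stemmer at query time does not give an exact match; it gives no match, because the literal word was never stored in that field.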




RE: Exact match on a field with stemming

Posted by Jean-Sebastien Vachon <js...@videotron.ca>.
I'm curious to know why Solr is not respecting the phrase.
If it considers "manager" as a phrase... shouldn't it return only documents containing that phrase?



Re: Exact match on a field with stemming

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Using quotes means "use this as a phrase", not "use this as a literal". :)
I think copying to an unstemmed field is the only/common work-around.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
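
A rough sketch of the unstemmed-copy workaround, again as a toy Python model rather than real Solr configuration. In Solr this would be a second field with a non-stemming analyzer, populated through a copyField rule in the schema; the names `stemmed`, `exact` and `toy_stem` below are hypothetical.

```python
from collections import defaultdict


def toy_stem(token):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ement", "er", "ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token


def build_indexes(docs):
    """Index every token twice: once stemmed, once as the literal word."""
    stemmed = defaultdict(set)
    exact = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            stemmed[toy_stem(token)].add(doc_id)
            exact[token].add(doc_id)
    return stemmed, exact


docs = {1: "project manager", 2: "project management"}
stemmed, exact = build_indexes(docs)

# Recall-oriented search goes to the stemmed field.
print(sorted(stemmed.get(toy_stem("manager"), set())))

# Exact-word search goes to the unstemmed copy.
print(sorted(exact.get("manager", set())))
```

The cost is the one raised in the original question: every token is indexed a second time, so the index grows accordingly.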


