Posted to solr-user@lucene.apache.org by "Vishal A." <ab...@gmail.com> on 2010/06/23 21:58:32 UTC

Stemmed and/or unStemmed field

Hello all,

 

One quick question: I am trying to find out which approach would work best.

We have a huge free-text dataset containing product titles and descriptions.
Unfortunately, the data is not categorized, so we rely heavily on 'search
relevancy + synonyms' to categorize it.

Here is what I am trying to do: when someone clicks on 'Comforters & Pillows',
we want the results filtered to titles containing the keyword 'Comforter' or
'Pillows', but we have been getting results with the word 'comfort' in the
title. I assume this is because of stemming. What is the right way to handle
this?

I am thinking of creating another field, 'title_unstemmed', which stores the
data unstemmed. With dismax, I could then boost the score on the unstemmed
field. I can think of other scenarios where stemming would be needed, so the
stemmed field would still be matched.
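
As a rough illustration of that idea, here is a minimal Python sketch that
builds a dismax request weighting the unstemmed field above the stemmed one.
The base URL, the stemmed field name 'title', and the boost value 3.0 are
assumptions made for the example; only 'title_unstemmed' comes from the
description above. 'defType' and 'qf' are standard Solr dismax parameters.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_dismax_url(base_url, user_query,
                     stemmed_field="title",
                     unstemmed_field="title_unstemmed",
                     unstemmed_boost=3.0):
    """Build a Solr select URL that searches both fields via dismax.

    The field names and the boost are illustrative assumptions; tune the
    boost against real relevancy judgments.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        # qf lists the fields to search; "^" attaches a per-field boost.
        "qf": f"{unstemmed_field}^{unstemmed_boost} {stemmed_field}",
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host and handler path, for illustration only.
url = build_dismax_url("http://localhost:8983/solr/select",
                       "comforter pillows")
print(url)
```

Note that boosting only reorders results: titles containing 'comfort' would
still match through the stemmed field, just with a lower score. If the goal
is strict filtering, querying the unstemmed field alone is probably closer to
what is wanted.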

 

Does that sound like something that will work? Any suggestions, please?

 

Much appreciated 


RE: Stemmed and/or unStemmed field

Posted by caman <ab...@gmail.com>.
Ahh, perfect.

Will take a look. Thanks.

 



Re: Stemmed and/or unStemmed field

Posted by Robert Muir <rc...@gmail.com>.
On Wed, Jun 23, 2010 at 3:58 PM, Vishal A.
<ab...@gmail.com> wrote:

>
> Here is what I am trying to do :  Someone clicks on  'Comforters & Pillows'
> , we would want the results to be filtered where title has keyword
> 'Comforter' or  'Pillows' but we have been getting results with word
> 'comfort' in the title. I assume it is because of stemming. What is the
> right way to handle this?
>

From your examples, it seems a more lightweight stemmer might be an easy
option: https://issues.apache.org/jira/browse/LUCENE-2503
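
To show what "lightweight" can mean in practice, here is a toy Python sketch
of a plural-only stemmer, loosely modeled on the classic S-stemmer rules. It
is an illustration under that assumption, not the code attached to the issue
above: it strips plural endings only, so 'comforters' reduces to 'comforter'
rather than 'comfort'.

```python
def light_stem(word):
    """Strip common English plural endings and nothing else.

    A toy sketch: derivational suffixes such as "-er" are left alone,
    which is what keeps "comforter" distinct from "comfort".
    """
    w = word.lower()
    if len(w) <= 3:
        return w  # too short to stem safely
    if w.endswith("ies") and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"
    if w.endswith("es") and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]
    if w.endswith("s") and not w.endswith(("us", "ss")):
        return w[:-1]
    return w


for term in ["Comforters", "Pillows", "comfort"]:
    print(term, "->", light_stem(term))
# Comforters -> comforter
# Pillows -> pillow
# comfort -> comfort
```

With a stemmer of this kind at both index and query time, a query for
'comforters' would no longer collide with titles that only contain 'comfort'.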

-- 
Robert Muir
rcmuir@gmail.com