Posted to solr-user@lucene.apache.org by Susheel Kumar <su...@thedigitalgroup.net> on 2013/11/01 03:59:29 UTC

RE: dropping noise words and maintaining the relevancy

Thanks, Kranti. Nice suggestion. I'll try it out. 

-----Original Message-----
From: Kranti Parisa [mailto:kranti.parisa@gmail.com] 
Sent: Thursday, October 31, 2013 3:18 PM
To: solr-user@lucene.apache.org
Subject: Re: dropping noise words and maintaining the relevancy

One possible approach is to populate the titles in a separate field (say
exactMatch) and point your search query at both fields:
exactMatch:"160 Associates LP" OR text:"160 Associates LP"
This assumes that all of the text is populated into a field called "text".

You can also use field-level boosting with the above query, for example:
exactMatch:"160 Associates LP"^10 OR text:"160 Associates LP"^5
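
For illustration only, here is a minimal Python sketch of how a client might assemble the boosted query string above. The function names (solr_phrase, build_query) are hypothetical; the field names and boost values are the ones suggested in this message, and the escaping covers only backslashes and double quotes inside the phrase.

```python
def solr_phrase(field: str, phrase: str, boost: int) -> str:
    """Return a boosted phrase clause such as exactMatch:"..."^10."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"^{boost}'


def build_query(phrase: str,
                exact_field: str = "exactMatch",
                text_field: str = "text",
                exact_boost: int = 10,
                text_boost: int = 5) -> str:
    """OR the exact-match clause with the general text clause."""
    clauses = [
        solr_phrase(exact_field, phrase, exact_boost),
        solr_phrase(text_field, phrase, text_boost),
    ]
    return " OR ".join(clauses)


print(build_query("160 Associates LP"))
# prints: exactMatch:"160 Associates LP"^10 OR text:"160 Associates LP"^5
```

The resulting string would be passed as the q parameter. Giving exactMatch the higher boost is what should push the exact title above the looser matches from the text field.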


Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa



On Thu, Oct 31, 2013 at 4:00 PM, Susheel Kumar < susheel.kumar@thedigitalgroup.net> wrote:

> Hello,
>
> We have a very particular requirement: drop noise words (LP,
> LLP, LLC, Corp, Corporation, Inc, Incorporation, PA, Professional
> Association, Attorney at Law, GP, General Partnership, etc.) at the end
> of the search key while maintaining relevancy. For example:
>
> If a user searches for "160 Associates LP", we want the search to return
> results in the relevancy order below. Basically, if an exact or similar
> match is present, it comes first, followed by the other results.
>
> 160 Associates LP
> 160 Associates
> 160 Associates LLC
> 160 Associates LLLP
> 160 Hilton Associates
>
> If I handle this through stop words, then LP will get dropped from the
> search key and all of the results will come back, but the exact match
> will be shown somewhere lower down or deep in the list.
>
> Regards and appreciate your help.
> Susheel
>

RE: dropping noise words and maintaining the relevancy

Posted by Susheel Kumar <su...@thedigitalgroup.net>.
Hello,

On dropping noise words, we have a scenario where we must drop only the ending noise words. For example, in "160 Associates LP", the noise words are "Associates" and "LP", but we only want to drop "LP", which is an ending noise word.

If we use stop words, both words will be dropped and the search key becomes "160".

Any suggestions?

Thanks in advance. 
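
One way to picture the requirement is to trim the search key on the client side before it reaches Solr, instead of relying on a stop word filter. The Python sketch below is only an illustration under stated assumptions: the function name strip_trailing_noise is hypothetical, the noise list is the one from this thread plus "Associates", and exactly one trailing noise phrase is removed so that interior noise words survive.

```python
# Noise phrases taken from this thread; adjust the list to your own data.
NOISE_PHRASES = [
    "LP", "LLP", "LLC", "Corp", "Corporation", "Inc", "Incorporation",
    "PA", "Professional Association", "Attorney at Law", "GP",
    "General Partnership", "Associates",
]

# Lowercased token tuples, longest first, so multi-word phrases win.
_NOISE = sorted(
    (tuple(p.lower().split()) for p in NOISE_PHRASES),
    key=len,
    reverse=True,
)


def strip_trailing_noise(search_key: str) -> str:
    """Drop a single noise phrase from the end of the key, if present."""
    tokens = search_key.split()
    # Compare case-insensitively and ignore trailing "." or "," on tokens.
    lowered = [t.lower().rstrip(".,") for t in tokens]
    for phrase in _NOISE:
        n = len(phrase)
        # Require len(tokens) > n so the key is never emptied completely.
        if len(tokens) > n and tuple(lowered[-n:]) == phrase:
            return " ".join(tokens[:-n])
    return " ".join(tokens)


print(strip_trailing_noise("160 Associates LP"))
# prints: 160 Associates
```

Because only the last phrase is checked, "Associates" stays in "160 Associates LP". Note that with this list a key that itself ends in "Associates" would lose that word, so remove it from the list if that is not wanted.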

