Posted to solr-user@lucene.apache.org by prerna07 <pk...@sapient.com> on 2009/07/24 14:28:36 UTC

How does search work with phonetic filter factory ?

Hi,

I am using the phonetic filter factory. I am searching for the string
product_12345, which is present in only one record.

Issue: Solr returns every record that contains product_, i.e. it ignores
the part of the string after the underscore.

Also, the debugQuery output looks like this:
 <lst name="debug">
  <str name="rawquerystring">product_4844308</str> 
  <str name="querystring">product_4844308</str> 
  <str name="parsedquery">PhraseQuery(all:"PRTK ")</str> 
  <str name="parsedquery_toString">all:"PRTK "</str> 

Why is the query showing PRTK?

Also, when I use the DisMax request handler, it shows:

<arr name="parsed_boost_queries">
  <str>all:ANKL^90.0 all:HNT^123.0 all:KLRS^2000.0 all:HLT^1.0E7
all:M0^100.0 all:KLR^100000.0</str> 
  </arr>

Why is this showing ANKL, HNT, KLRS, ...?

Why does Solr convert the search strings into ANKL, HNT, KLRS, ...?

~prerna



Re: How does search work with phonetic filter factory ?

Posted by Mats Lindh <ma...@gmail.com>.
On Fri, Jul 24, 2009 at 2:28 PM, prerna07<pk...@sapient.com> wrote:
> I am using phonetic filter factory. I am searching for product_12345 string,
> which is present in only one record.
..
> Why is the query showing PRTK ?

This is the phonetic filter at work (the double metaphone filter, if
I'm guessing correctly). To make phonetic searches possible, the search
string is run through the phonetic filter, which returns the search
string converted to its phonetic form.

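PRODUCT gets converted to PRTK because those are the four letters the
encoder considers important, and C and K are treated as the same sound
in this case (produkt vs. product). Digits have no phonetic code, so
the numeric part after the underscore encodes to nothing (note the
empty second term in all:"PRTK "), which is why every record containing
product_ matches. You can read more about metaphone and double
metaphone on Wikipedia.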
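To make the idea concrete, here is a minimal sketch of a toy phonetic encoder. It is not the real Double Metaphone algorithm; the function name, the consonant table, and the four-letter limit are assumptions chosen only to show why product, produkt, and product_4844308 can all collapse to the same key.

```python
import re

# Toy consonant mapping: letters that sound alike share one code letter.
# This is an illustrative simplification, NOT the real Double Metaphone rules.
SOUND_ALIKE = {"C": "K", "Q": "K", "D": "T", "Z": "S", "B": "P", "V": "F"}
VOWELS = set("AEIOU")


def toy_phonetic_code(text, max_len=4):
    """Return a rough phonetic key for text (hypothetical helper)."""
    # Digits, underscores, and other non-letters carry no sound and are dropped.
    letters = re.sub(r"[^A-Za-z]", "", text).upper()
    code = []
    for i, ch in enumerate(letters):
        if ch in VOWELS and i > 0:
            continue  # vowels after the first letter are ignored
        ch = SOUND_ALIKE.get(ch, ch)
        if code and code[-1] == ch:
            continue  # collapse immediately repeated sounds
        code.append(ch)
    # Keep only the first max_len code letters.
    return "".join(code)[:max_len]


for term in ("product", "produkt", "product_4844308", "4844308"):
    print(repr(term), "->", repr(toy_phonetic_code(term)))
```

With this toy encoder the first three terms share one key and the purely numeric term produces an empty key, which matches the behaviour seen in the debug output above.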

The phonetic filter should be run against a field by itself, and not
on the complete query string (or the regular query fields). That would
also allow you to weight the phonetic hit less than an exact hit. You
could also consider changing the algorithm used or write your own
phonetic factory. I wrote a small blog post when I did this a year
ago.

http://e-mats.org/2008/06/writing-a-solr-analysis-filter-plugin/
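The separate-field advice can be sketched outside Solr as well. The following is a small in-memory illustration, not Solr code: the encoder is a crude hypothetical stand-in for a phonetic filter, and the boost values and document ids are made up. It keeps exact tokens and phonetic codes in separate "fields" and scores an exact hit higher than a phonetic one.

```python
import re

EXACT_BOOST = 10.0     # assumed weights, chosen only for illustration
PHONETIC_BOOST = 1.0


def toy_phonetic_code(text, max_len=4):
    """Crude stand-in for a phonetic encoder (hypothetical, not Double Metaphone)."""
    letters = re.sub(r"[^A-Za-z]", "", text).upper()
    swapped = letters.translate(str.maketrans("CQDZ", "KKTS"))
    # Keep the first letter, drop vowels from the rest, then truncate.
    code = swapped[:1] + re.sub(r"[AEIOU]", "", swapped[1:])
    return code[:max_len]


def build_index(docs):
    """Store each document's exact tokens and phonetic codes as separate fields."""
    index = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        index.append({
            "id": doc_id,
            "exact": set(tokens),
            "phonetic": {toy_phonetic_code(t) for t in tokens} - {""},
        })
    return index


def search(index, query):
    """Return (id, score) pairs, with exact-field hits outweighing phonetic hits."""
    q_exact = query.lower()
    q_code = toy_phonetic_code(query)
    results = []
    for entry in index:
        score = 0.0
        if q_exact in entry["exact"]:
            score += EXACT_BOOST
        if q_code and q_code in entry["phonetic"]:
            score += PHONETIC_BOOST
        if score > 0:
            results.append((entry["id"], score))
    return sorted(results, key=lambda r: r[1], reverse=True)


docs = {
    "doc1": "product_4844308",
    "doc2": "product_12345",
    "doc3": "produkt_999",
}
index = build_index(docs)
print(search(index, "product_12345"))
```

Here the record with the exact string ranks first, while the records that only sound alike still match but with a much lower score.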

Hope that helps.

--mats