Posted to solr-user@lucene.apache.org by Navaa <na...@xtremumsolutions.com> on 2014/02/12 13:26:57 UTC

Phonetic search on multiple fields

Hi,
I am a beginner with Solr.
I am trying to implement phonetic search in my application.
Here are the fieldType definitions from my schema.xml:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1" splitOnNumerics="0"
            generateWordParts="1" stemEnglishPossessive="0"
            generateNumberParts="0"
            catenateWords="1" catenateNumbers="0" catenateAll="0"
            preserveOriginal="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

And the phonetic variant:

<fieldType name="text_general_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1" splitOnNumerics="0"
            generateWordParts="1" stemEnglishPossessive="0"
            generateNumberParts="0"
            catenateWords="1" catenateNumbers="0" catenateAll="0"
            preserveOriginal="1"/>
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
            ruleType="APPROX" concat="true" languageSet="auto"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>



And the field definitions:

<field name="fname" type="text_general" indexed="true" stored="true" 
required="false" multiValued="false"/> 
<field name="fname_copy" type="text_general_phonetic" indexed="true" 
stored="true" required="false" /> 
<copyfield source="fname" dest="fname_copy"/> 


When I search for "stifn", I expect it to return "stephen", but it doesn't work.
Also, how can I use the phonetic filter with the DoubleMetaphone encoder?
Please help me.
Thanks in advance.



--
View this message in context: http://lucene.472066.n3.nabble.com/Phonetic-search-on-multiple-fields-tp4116876.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Phonetic search on multiple fields

Posted by Erick Erickson <er...@gmail.com>.
First, why are you talking about DoubleMetaphone when
your fieldType uses BeiderMorseFilterFactory? That points
to a basic issue you need to wrap your head around, or you'll
be endlessly confused. At least I was...

Your analysis chains _must_ do compatible things at index and
query time. The fieldType you're using for phonetic searching does
not since it doesn't use the BeiderMorseFilterFactory at query time.

So the actual values in your index are whatever the Beider... factory
produces but the terms searched are NOT transformed by that factory.

Say you index the term "Erick". Your index may have (and I
don't remember what the actual output of Beider is) something
totally transformed like MNUA. But your query does NOT do the
transformation, so the query is looking for "Erick". Obviously it
isn't found.
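
As a rough, untested sketch of what "compatible" could look like here: the
query analyzer below is the one from the original post with the same
BeiderMorseFilterFactory settings appended as its last filter, so query terms
get encoded the same way the indexed terms were. Only the query-side analyzer
is shown; it is assumed to replace the existing query analyzer inside the
text_general_phonetic fieldType, with the index analyzer left as posted.

```xml
<!-- Assumed replacement for the query analyzer of text_general_phonetic. -->
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Same phonetic encoding as at index time, so encoded terms can match. -->
  <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
          ruleType="APPROX" concat="true" languageSet="auto"/>
</analyzer>
```

The search also has to target the phonetic field (fname_copy:stifn), not
fname. Note that Solr's schema spells the copy directive copyField, with a
capital F, so the lowercase copyfield in the post may not be picked up, in
which case fname_copy would never be populated.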

I _strongly_ advise you to take some time to get familiar with the
admin/analysis page; it will shed light on a _lot_ of analysis issues.
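
On the DoubleMetaphone part of the question, a minimal sketch, assuming you
would rather use that encoder than Beider-Morse. The fieldType name
text_general_dm is made up for illustration. A single analyzer element with
no type attribute is applied at both index and query time, which avoids the
mismatch described above.

```xml
<!-- Hypothetical fieldType name; the same chain runs at index and query time. -->
<fieldType name="text_general_dm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" replaces each token with its phonetic code;
         inject="true" would keep the original token alongside the code. -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>
```

Whether "stifn" and "stephen" actually end up with the same code is something
to check on the analysis page rather than assume.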

Best,
Erick


