Posted to solr-user@lucene.apache.org by prerna07 <pk...@sapient.com> on 2009/07/22 11:09:08 UTC

US/UK/CA/AU English support

Hi,

1) Out of US/UK/CA/AU, which English does Solr support?

2) PhoneticFilterFactory matches similar-sounding words.
For example: a search on carat returns results for carat, caret and carrat.
I also observed that PhoneticFilterFactory appears to handle spelling
variations across US/UK/CA/AU English.
For example: a search on optimize returns results for both optimise and optimize.

Question: Does PhoneticFilterFactory cover all characters/words of the
linguistic variations for US/UK/CA/AU, or in other words, is linguistic
search for US/UK/CA/AU a subset of phonetic search?

Please suggest.

Thanks,
Prerna







Re: US/UK/CA/AU English support

Posted by Grant Ingersoll <gs...@apache.org>.
On Jul 22, 2009, at 5:09 AM, prerna07 wrote:

>
> Hi,
>
> 1) Out of US/UK/CA/AU, which English does Solr support?

Please clarify what you mean by "support".  The only things in Solr
that are potentially language dependent are the Tokenizers and
TokenFilters, and those are completely pluggable.  For tokenization,
I'd say all are supported, since all of those variants of English are
whitespace delimited.  For things like stemming and synonyms, I'm not
sure, but I suspect many of the existing capabilities will work in
most cases, which is all one can ever expect no matter the language.


>
> 2) PhoneticFilterFactory matches similar-sounding words.
> For example: a search on carat returns results for carat, caret and
> carrat.
> I also observed that PhoneticFilterFactory appears to handle spelling
> variations across US/UK/CA/AU English.
> For example: a search on optimize returns results for both optimise
> and optimize.
>
> Question: Does PhoneticFilterFactory cover all characters/words of the
> linguistic variations for US/UK/CA/AU, or in other words, is linguistic
> search for US/UK/CA/AU a subset of phonetic search?
>

I would think so, but I'd suggest either running some tests with the
various FieldTypes on the Admin analysis page, or automating some more
tests with the AnalysisRequestHandler (or whatever it is called these
days).
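
To get a feel for what such a test might look like before wiring it up
against Solr, here is a rough, standalone sketch. It assumes a plain
American Soundex encoder written from scratch in Python; it is not
Solr's code, and PhoneticFilterFactory can be configured with other
encoders (DoubleMetaphone, for instance) that may group words
differently. The helper names soundex and same_code are made up for
this illustration.

```python
# Illustrative only: a hand-written American Soundex, NOT Solr's implementation.
CODES = {}
for digit, letters in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                       ("4", "L"), ("5", "MN"), ("6", "R")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code for word ("" if it has no A-Z letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        # Vowels carry no code, so they reset prev and allow a repeat.
        prev = code
    return "".join(out)[:4].ljust(4, "0")


def same_code(a, b):
    """True if both words map to the same Soundex code."""
    return soundex(a) == soundex(b)


if __name__ == "__main__":
    pairs = [("carat", "caret"), ("carat", "carrat"),
             ("optimize", "optimise"),
             ("draught", "draft"), ("plough", "plow")]
    for a, b in pairs:
        print(a, soundex(a), b, soundex(b), same_code(a, b))
```

With this particular encoder, carat/caret/carrat and optimize/optimise
end up with the same code, while draught/draft and plough/plow do not.
So at least for Soundex, regional spelling variants are not guaranteed
to be covered by phonetic matching, which is why testing your own word
list against your actual FieldType is worthwhile.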


-Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
