Posted to solr-user@lucene.apache.org by Rainer Gnan <Ra...@bsb-muenchen.de> on 2016/08/10 14:21:22 UTC
Solr 6.1 :: language specific analysis
Hello,
I wonder whether Solr offers a feature (class) for handling different orthographic variants of a word.
In German, for example, a search for "Foto" or "Photo" should find the same documents.
I appreciate any help!
Rainer
--------------------------------------------
Rainer Gnan
Bayerische Staatsbibliothek
BibliotheksVerbund Bayern
Verbundnahe Dienste
80539 München
Tel.: +49(0)89/28638-2445
Fax: +49(0)89/28638-2665
E-Mail: rainer.gnan@bsb-muenchen.de
--------------------------------------------
Re: Solr 6.1 :: language specific analysis
Posted by Susheel Kumar <su...@gmail.com>.
BeiderMorse supports phonetic variations such as Foto / Photo and has
support for many languages, including German. Please see
https://cwiki.apache.org/confluence/display/solr/Phonetic+Matching
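A minimal sketch of what such a field type might look like in the schema, assuming the stock BeiderMorseFilterFactory described on that page. The field type name text_de_bm is hypothetical, and the attribute values are only one plausible starting point. Whether "Foto" and "Photo" actually end up with a shared encoding should be checked on the Analysis screen of the Solr admin UI.

```xml
<!-- Hypothetical field type name; attribute values are an assumed starting point -->
<fieldType name="text_de_bm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- languageSet="auto" lets the filter guess the language per token;
         ruleType="APPROX" selects approximate rather than exact phonetic rules -->
    <filter class="solr.BeiderMorseFilterFactory"
            nameType="GENERIC" ruleType="APPROX"
            concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>
```

The same analyzer is applied at index and query time here, so both sides are reduced to phonetic codes before matching.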
Thanks,
Susheel
RE: Solr 6.1 :: language specific analysis
Posted by Alexandre Drouin <al...@orckestra.com>.
Can you use Solr's synonym feature? You can find a German synonym file here: https://sites.google.com/site/kevinbouge/synonyms-lists
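A rough sketch of how a synonym list could be wired in, assuming a file named synonyms_de.txt (a hypothetical name) placed in the core's conf directory. The field type name text_de_syn and the two synonym groups are illustrative only and are not taken from the linked list.

```xml
<!-- synonyms_de.txt (hypothetical file), one comma-separated group per line:
       Foto, Photo
       Fotografie, Photographie
-->
<fieldType name="text_de_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- expand="true" rewrites a query term into every member of its group -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_de.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

Applying the synonyms only at query time, as assumed here, means the list can be edited without reindexing. Applying them at index time instead is the other common choice.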
Alexandre Drouin
RE: Solr 6.1 :: language specific analysis
Posted by "Allison, Timothy B." <ta...@mitre.org>.
ICU normalization (ICUFoldingFilterFactory) will at least handle "ß" -> "ss" (IIRC) and some other language-general variants that might get you close. There are, of course, language-specific analyzers (https://wiki.apache.org/solr/LanguageAnalysis#German), but I don't think they'll get you Foto -> Photo.
You might experiment with DoubleMetaphone encoding (DoubleMetaphoneFilterFactory) or, worst case, fall back to synonym lists (SynonymFilterFactory) for your domain.
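As a starting point for such an experiment, the two filters could be chained roughly as follows. This is a sketch under assumptions: the field type name text_de_dm is hypothetical, the attribute values are a guess at reasonable settings, and ICUFoldingFilterFactory needs the analysis-extras contrib jars on the classpath.

```xml
<!-- Hypothetical field type name; requires the analysis-extras contrib for ICU -->
<fieldType name="text_de_dm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Folds case, accents, and variants such as "ß" -> "ss" -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <!-- inject="true" keeps the original token alongside its phonetic code -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
  </analyzer>
</fieldType>
```

DoubleMetaphone was designed around English pronunciation, so its behaviour on German terms is worth verifying on the Analysis screen before relying on it.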