Posted to solr-user@lucene.apache.org by Rainer Gnan <Ra...@bsb-muenchen.de> on 2016/08/10 14:21:22 UTC

Solr 6.1 :: language specific analysis

Hello,

I wonder whether Solr offers a feature (class) to handle different orthographic variants of a word?
For German, for example, so that searching for "Foto" or "Photo" finds the same documents.

I appreciate any help!

Rainer


--------------------------------------------
Rainer Gnan
Bayerische Staatsbibliothek 
BibliotheksVerbund Bayern
Verbundnahe Dienste
80539 München
Tel.: +49(0)89/28638-2445
Fax: +49(0)89/28638-2665
E-Mail: rainer.gnan@bsb-muenchen.de
--------------------------------------------




Re: Solr 6.1 :: language specific analysis

Posted by Susheel Kumar <su...@gmail.com>.
BeiderMorse supports phonetic variations such as Foto / Photo and has
support for many languages, including German.  Please see
https://cwiki.apache.org/confluence/display/solr/Phonetic+Matching
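
A minimal sketch of how the Beider-Morse filter could be wired into a field type, assuming the stock schema syntax of Solr 6.x. The field type name text_phonetic_bm is hypothetical, and the attribute values are just one plausible starting point. Whether "Foto" and "Photo" really end up with overlapping codes is best checked on the Analysis screen of the admin UI.

```xml
<!-- Hypothetical field type name; adjust to your schema. -->
<fieldType name="text_phonetic_bm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- languageSet="auto" lets the filter guess the language per token;
         a fixed value such as "german" is an assumption to test, not a recommendation. -->
    <filter class="solr.BeiderMorseFilterFactory"
            nameType="GENERIC" ruleType="APPROX"
            concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>
```

The same analyzer has to run at index and query time, so existing documents need to be reindexed after the change.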

Thanks,
Susheel

On Wed, Aug 10, 2016 at 2:47 PM, Alexandre Drouin <
alexandre.drouin@orckestra.com> wrote:

> Can you use Solr's synonym feature?  You can find a German synonym file
> here: https://sites.google.com/site/kevinbouge/synonyms-lists
>
> Alexandre Drouin

RE: Solr 6.1 :: language specific analysis

Posted by Alexandre Drouin <al...@orckestra.com>.
Can you use Solr's synonym feature?  You can find a German synonym file here: https://sites.google.com/site/kevinbouge/synonyms-lists
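
A rough sketch of the synonym approach, assuming Solr 6.1 (where the class is SynonymFilterFactory). The field type name text_de_syn and the file name synonyms_de.txt are hypothetical. The file would sit next to the schema and contain one comma-separated group per line, for example: Foto, Photo

```xml
<!-- Hypothetical field type and synonym file names. -->
<fieldType name="text_de_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- expand="true" maps every term of a group to all terms of that group,
         so a search for either spelling matches documents with the other. -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms_de.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

This only covers the word pairs that are actually listed, so it suits a known, limited set of spelling variants better than open-ended ones.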

Alexandre Drouin




RE: Solr 6.1 :: language specific analysis

Posted by "Allison, Timothy B." <ta...@mitre.org>.
ICU normalization (ICUFoldingFilterFactory) will at least handle "ß" -> "ss" (IIRC) and some other language-general variants that might get you close.  There are, of course, language-specific analyzers (https://wiki.apache.org/solr/LanguageAnalysis#German), but I don't think they'll get you Foto -> Photo.
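
For reference, a minimal sketch of a field type using ICU folding. The name text_icu_folded is hypothetical, and the sketch assumes the analysis-extras contrib jars (the ICU analyzers and ICU4J) are on Solr's classpath, since the filter is not part of the core distribution.

```xml
<!-- Hypothetical field type name; requires the analysis-extras contrib jars. -->
<fieldType name="text_icu_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Applies Unicode normalization, case folding and accent removal;
         it does not, by itself, treat "F" and "Ph" as equivalent. -->
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```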

You might experiment with DoubleMetaphone encoding (DoubleMetaphoneFilterFactory) or, worst case, back off to synonym lists (SynonymFilterFactory) for your domain.
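
A small sketch of the DoubleMetaphone variant to experiment with. The field type name text_phonetic_dm is hypothetical and the attribute values are only a starting point. Double Metaphone is designed around English pronunciation, so results for German text should be verified on the Analysis screen before relying on them.

```xml
<!-- Hypothetical field type name. -->
<fieldType name="text_phonetic_dm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="true" keeps the original token and adds the phonetic code next to it,
         so exact matches still work; maxCodeLength caps the length of the code. -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
  </analyzer>
</fieldType>
```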