Posted to solr-user@lucene.apache.org by Jürgen Tiedemann <ju...@yahoo.de> on 2011/06/30 11:58:13 UTC
Adding german phonetic to solr
Hi all,
does Solr support German phonetic matching? Searching for "how to add german phonetic to
solr" on Google does not deliver good results, just lots of JIRA issues. I
searched for "cologne phonetic" too. The wiki pages
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?highlight=%28phonetic%29#solr.PhoneticFilterFactory
and http://wiki.apache.org/solr/LanguageAnalysis#German have not answered
my question either. Can someone please tell me how to do it or where to look for
the relevant information?
Kind regards
Jürgen
Re: AW: Adding german phonetic to solr
Posted by Paul Libbrecht <pa...@hoplahup.net>.
Jürgen,
Cologne phonetic is clearly not supported yet; please read:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/java/org/apache/solr/analysis/PhoneticFilterFactory.java
One would need to add a line for Cologne phonetic there and recompile.
It would make sense to open a JIRA issue for this.
paul
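For readers following along: the "Unknown encoder" error quoted in this thread lists a fixed set of names, which suggests the factory resolves the encoder attribute through a hard-coded table, so swapping in a newer commons-codec jar alone does not help. Below is a minimal, self-contained sketch of that pattern. The class EncoderRegistrySketch, its methods, and the placeholder encoders are hypothetical illustrations, not Solr's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a factory that maps encoder names to encoders.
class EncoderRegistrySketch {
    private final Map<String, UnaryOperator<String>> registry = new LinkedHashMap<>();

    EncoderRegistrySketch() {
        // Placeholder encoders only; real ones would come from commons-codec.
        register("Soundex", s -> "soundex:" + s);
        register("Metaphone", s -> "metaphone:" + s);
    }

    // "Adding the line": one more entry in the table, keyed case-insensitively.
    void register(String name, UnaryOperator<String> encoder) {
        registry.put(name.toUpperCase(Locale.ENGLISH), encoder);
    }

    // Lookup fails for any name that was never registered, whatever is on the classpath.
    UnaryOperator<String> lookup(String name) {
        UnaryOperator<String> encoder = registry.get(name.toUpperCase(Locale.ENGLISH));
        if (encoder == null) {
            throw new IllegalArgumentException(
                "Unknown encoder: " + name + " [" + registry.keySet() + "]");
        }
        return encoder;
    }

    public static void main(String[] args) {
        EncoderRegistrySketch factory = new EncoderRegistrySketch();
        try {
            factory.lookup("ColognePhonetic");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // After registering the name, the same lookup succeeds.
        factory.register("ColognePhonetic", s -> "cologne:" + s);
        System.out.println(factory.lookup("ColognePhonetic").apply("Meyer"));
    }
}
```

In the real factory the new entry would presumably point at the ColognePhonetic class shipped with commons-codec 1.5, which is why a recompile is needed.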
On 30 June 2011 at 14:24, Jürgen Tiedemann wrote:
> Hi Paul,
>
> thanks for the quick reply. I replaced commons-codec-1.4.jar with
> commons-codec-1.5.jar to get the ColognePhonetic. In schema.xml I added
>
> <filter class="solr.PhoneticFilterFactory" encoder="ColognePhonetic"
> inject="true"/>
>
> but then I get
>
> org.apache.solr.common.SolrException: Unknown encoder: ColognePhonetic
> [[CAVERPHONE, SOUNDEX, METAPHONE, DOUBLEMETAPHONE, REFINEDSOUNDEX]].
>
> How do I get PhoneticFilterFactory to know ColognePhonetic? Or is my approach
> completely wrong?
>
> Jürgen
AW: Adding german phonetic to solr
Posted by Jürgen Tiedemann <ju...@yahoo.de>.
Hi Paul,
thanks for the quick reply. I replaced commons-codec-1.4.jar with
commons-codec-1.5.jar to get the ColognePhonetic. In schema.xml I added
<filter class="solr.PhoneticFilterFactory" encoder="ColognePhonetic"
inject="true"/>
but then I get
org.apache.solr.common.SolrException: Unknown encoder: ColognePhonetic
[[CAVERPHONE, SOUNDEX, METAPHONE, DOUBLEMETAPHONE, REFINEDSOUNDEX]].
How do I get PhoneticFilterFactory to know ColognePhonetic? Or is my approach
completely wrong?
Jürgen
________________________________
From: Paul Libbrecht <pa...@hoplahup.net>
To: solr-user@lucene.apache.org
Sent: Thursday, 30 June 2011, 12:09:18
Subject: Re: Adding german phonetic to solr
Jürgen,
I haven't had the time to deploy it, but I heard about "Kölner Phonetik", which was
to be contributed as part of apache-commons-codec.
It is probably still just a patch in a JIRA issue:
https://issues.apache.org/jira/browse/CODEC-106
The contribution was posted to commons-dev on September 15th, 2010.
Making this available in Solr would be interesting, but it's a bit of work.
We have used the Double-Metaphone indexer with Lucene with reasonable success in
ActiveMath, but it was not as fine-grained as the Kölner analyzer, and fine granularity is
really a desirable feature of a phonetic environment.
You might also want to take care of all the "proper nouns", for which
traditional phonetics is doomed to fail, at least if your texts contain
international names!
paul
Re: Adding german phonetic to solr
Posted by Paul Libbrecht <pa...@hoplahup.net>.
Jürgen,
I haven't had the time to deploy it, but I heard about "Kölner Phonetik", which was to be contributed as part of apache-commons-codec.
It is probably still just a patch in a JIRA issue:
https://issues.apache.org/jira/browse/CODEC-106
The contribution was posted to commons-dev on September 15th, 2010.
Making this available in Solr would be interesting, but it's a bit of work.
We have used the Double-Metaphone indexer with Lucene with reasonable success in ActiveMath, but it was not as fine-grained as the Kölner analyzer, and fine granularity is really a desirable feature of a phonetic environment.
You might also want to take care of all the "proper nouns", for which traditional phonetics is doomed to fail, at least if your texts contain international names!
paul
On 30 June 2011 at 11:58, Jürgen Tiedemann wrote:
> Hi all,
>
> does Solr support German phonetic matching? Searching for "how to add german phonetic to
> solr" on Google does not deliver good results, just lots of JIRA issues. I
> searched for "cologne phonetic" too. The wiki pages
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?highlight=%28phonetic%29#solr.PhoneticFilterFactory
> and http://wiki.apache.org/solr/LanguageAnalysis#German have not answered
> my question either. Can someone please tell me how to do it or where to look for
> the relevant information?
>
> Kind regards
>
> Jürgen
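For reference, the Kölner Phonetik rules mentioned in this thread can be sketched in a few lines of plain Java. This follows the commonly published rule table and is only an illustration: ColognePhoneticSketch and its methods are hypothetical names, and the ColognePhonetic class in commons-codec may treat edge cases differently.

```java
import java.util.Locale;

// Simplified sketch of the Cologne phonetic rule table; not the commons-codec implementation.
class ColognePhoneticSketch {

    static String encode(String input) {
        // Upper-case (this also expands sharp s to SS), fold umlauts, keep only A-Z.
        String text = input.toUpperCase(Locale.GERMAN)
                .replace('\u00C4', 'A').replace('\u00D6', 'O').replace('\u00DC', 'U')
                .replaceAll("[^A-Z]", "");

        // Step 1: map every letter to its digit(s), looking at its neighbours.
        StringBuilder raw = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char prev = i > 0 ? text.charAt(i - 1) : ' ';
            char next = i + 1 < text.length() ? text.charAt(i + 1) : ' ';
            raw.append(code(text.charAt(i), prev, next, i == 0));
        }

        // Step 2: collapse adjacent duplicates; step 3: drop every 0 except a leading one.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char d = raw.charAt(i);
            if (i > 0 && d == raw.charAt(i - 1)) continue;
            if (d == '0' && out.length() > 0) continue;
            out.append(d);
        }
        return out.toString();
    }

    private static String code(char c, char prev, char next, boolean first) {
        switch (c) {
            case 'A': case 'E': case 'I': case 'J': case 'O': case 'U': case 'Y':
                return "0";
            case 'H': return "";                       // H is ignored
            case 'B': return "1";
            case 'P': return next == 'H' ? "3" : "1";
            case 'D': case 'T': return "CSZ".indexOf(next) >= 0 ? "8" : "2";
            case 'F': case 'V': case 'W': return "3";
            case 'G': case 'K': case 'Q': return "4";
            case 'C':
                if (first) return "AHKLOQRUX".indexOf(next) >= 0 ? "4" : "8";
                if (prev == 'S' || prev == 'Z') return "8";
                return "AHKOQUX".indexOf(next) >= 0 ? "4" : "8";
            case 'X': return "CKQ".indexOf(prev) >= 0 ? "8" : "48";
            case 'L': return "5";
            case 'M': case 'N': return "6";
            case 'R': return "7";
            case 'S': case 'Z': return "8";
            default: return "";
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("M\u00FCller-L\u00FCdenscheidt")); // prints 65752682
        System.out.println(encode("Meyer"));                         // prints 67
        System.out.println(encode("Maier"));                         // prints 67
    }
}
```

Meyer and Maier collapse to the same code, which is the kind of match a German phonetic filter is meant to produce at index and query time.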