Posted to solr-user@lucene.apache.org by Andrey Klochkov <ak...@griddynamics.com> on 2009/11/26 15:44:35 UTC
WordDelimiterFilter and acronyms normalization
Hi all!
Is there a ready-to-use filter that performs acronym normalization, such
as "I.N.C."->"INC"?
I see that Lucene's StandardFilter can do this, but we can't use it because
we're using WhitespaceTokenizer instead of StandardTokenizer.
--
Andrew Klochkov
Senior Software Engineer,
Grid Dynamics
Re: WordDelimiterFilter and acronyms normalization
Posted by AHMET ARSLAN <io...@yahoo.com>.
> Is there any ready-for-use filter which performs acronyms
> normalization such
> as "I.N.C."->"INC"?
>
> I see that Lucene's StandardFilter can do this but we can't
> use it as we're
> using WhitespaceTokenizer instead of StandardTokenizer.
>
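I am bad at regular expressions, but if you can write a regex for that replacement, solr.PatternReplaceFilterFactory can apply it. Note that the pattern in this example removes every character that is not a lowercase a-z letter, so it only produces the expected result if the tokens have already been lowercased earlier in the analyzer chain: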
<filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" />
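To check a candidate pattern before putting it into the schema, it can be tried with plain java.util.regex, which uses the same regex syntax as the filter. The following is only a rough sketch: the class name AcronymRegexDemo and the helper replaceAll are made up for illustration, and the two alternative patterns are suggestions that do not come from this thread. It mimics replace="all" by replacing every match within a single token.

```java
import java.util.regex.Pattern;

public class AcronymRegexDemo {

    // Hypothetical helper: replaces every match of regex inside one token,
    // which is roughly what replace="all" asks the filter to do per token.
    public static String replaceAll(String token, String regex, String replacement) {
        return Pattern.compile(regex).matcher(token).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Pattern from the reply above: removes everything except lowercase a-z,
        // so an uppercase acronym is wiped out completely.
        System.out.println("[" + replaceAll("I.N.C.", "([^a-z])", "") + "]"); // prints "[]"
        System.out.println("[" + replaceAll("i.n.c.", "([^a-z])", "") + "]"); // prints "[inc]"

        // Assumed alternative 1: remove only the dots, keep the letter case.
        // This also strips dots from tokens that are not acronyms.
        System.out.println(replaceAll("I.N.C.", "\\.", ""));                   // prints "INC"

        // Assumed alternative 2: remove a dot only when it follows a single
        // standalone letter, so tokens such as "example.com" are left alone.
        String singleLetterDot = "(?<=\\b\\p{L})\\.";
        System.out.println(replaceAll("I.N.C.", singleLetterDot, ""));         // prints "INC"
        System.out.println(replaceAll("example.com", singleLetterDot, ""));    // prints "example.com"
    }
}
```

In the schema the Java string escaping goes away, so the first alternative would be written as pattern="\." in the filter element. Whether either alternative is appropriate depends on what other dotted tokens WhitespaceTokenizer produces for your data.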