Posted to solr-user@lucene.apache.org by Andrey Klochkov <ak...@griddynamics.com> on 2009/11/26 15:44:35 UTC

WordDelimiterFilter and acronyms normalization

Hi all!

Is there any ready-for-use filter which performs acronyms normalization such
as "I.N.C."->"INC"?

I see that Lucene's StandardFilter can do this but we can't use it as we're
using WhitespaceTokenizer instead of StandardTokenizer.

-- 
Andrew Klochkov
Senior Software Engineer,
Grid Dynamics

Re: WordDelimiterFilter and acronyms normalization

Posted by AHMET ARSLAN <io...@yahoo.com>.
> Is there any ready-for-use filter which performs acronyms normalization such
> as "I.N.C."->"INC"?
>
> I see that Lucene's StandardFilter can do this but we can't use it as we're
> using WhitespaceTokenizer instead of StandardTokenizer.

I am bad at regular expressions, but if you can write a regex for that replacement, solr.PatternReplaceFilterFactory can do it.

<filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" />
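Note that the pattern "([^a-z])" removes every character outside a-z, so an uppercase token such as "I.N.C." would come out empty unless it has been lowercased earlier in the analysis chain. Since PatternReplaceFilterFactory takes a Java regular expression, a candidate pattern can be tried out in plain Java first. The following is only a rough sketch, not a tested Solr configuration: the class name AcronymNormalizer and the regex itself are assumptions for illustration, and the regex handles ASCII letters only.

```java
import java.util.regex.Pattern;

// Hypothetical helper for trying out an acronym-normalizing regex
// before putting it into a Solr filter definition.
public class AcronymNormalizer {

    // Assumed pattern: a single ASCII letter that is NOT preceded by another
    // letter, followed by a dot. The letter is captured so it can be kept.
    private static final Pattern ACRONYM_DOT =
            Pattern.compile("(?<![A-Za-z])([A-Za-z])\\.");

    // Replaces every "letter + dot" match with just the letter.
    public static String normalize(String token) {
        return ACRONYM_DOT.matcher(token).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("I.N.C."));      // prints INC
        System.out.println(normalize("example.com")); // prints example.com
    }
}
```

Tokens like "example.com" are left alone because the dot follows a multi-letter run. If this behaviour is what you want, the same expression could presumably be used in the filter with replacement="$1" and replace="all"; keep in mind that the "<" in the lookbehind has to be written as "&lt;" inside an XML attribute value.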