Posted to solr-user@lucene.apache.org by joeMcElroy <ph...@gmail.com> on 2008/11/10 11:14:30 UTC

Filters: acute accent characters replaced with their english counterpart

I need to add a custom filter to a field that will replace special
foreign characters with their English counterparts.

For example: ø => o
Grave À È Ì Ò Ù à è ì ò ù => A E I O U a e i o u
Circumflex Â Ê Î Ô Û â ê î ô û => A E I O U a e i o u

Is this possible?

joe
-- 
View this message in context: http://www.nabble.com/Filters%3A-acute-accent-characters-replaced-with-their-english-counterpart-tp20416888p20416888.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Filters: acute accent characters replaced with their english counterpart

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
joe,

This hasn't been committed yet, but SOLR-822 may be your answer.

https://issues.apache.org/jira/browse/SOLR-822

Koji



RE: Filters: acute accent characters replaced with their english counterpart

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Jarek,

On 11/10/2008 at 6:08 AM, Jarek Zgoda wrote:
> Message written on 2008-11-10 at 11:14 by joeMcElroy:
> > I need a custom filter to be added to a field which will replace
> > special foreign characters with their english counterpart.
> > 
> > for example ø => o
> > Grave À È Ì Ò Ù à è ì ò ù => A E I O U a e i o u
> > Circumflex Â Ê Î Ô Û â ê î ô û  => A E I O U a e i o u
> > 
> > is this possible?
> 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-4ebf7aea23b3d6d34a1f8314f9de17334a3e2fac
> 
> I wish such a filter existed for Latin2...

The following Lucene patch hasn't been committed yet, and there is no Solr factory counterpart for it yet, but its ASCIIFoldingFilter folds all accented letters to their ASCII equivalents (stripping the accent where necessary):

<https://issues.apache.org/jira/browse/LUCENE-1390>

Steve
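
To make the idea of folding concrete, here is a minimal Python sketch of the general technique. It is not the LUCENE-1390 code and not a Solr filter: it assumes that Unicode canonical decomposition (NFD) plus dropping combining marks is an acceptable approximation, and the names fold_to_ascii and EXTRA_FOLDS are hypothetical. Letters such as ø or the Latin2 letter ł have no decomposition, so the sketch handles them with a small hand-written table that is illustrative only and far from complete.

```python
import unicodedata

# Hypothetical, deliberately incomplete table for letters that have no
# canonical decomposition and so cannot be folded by stripping marks.
EXTRA_FOLDS = {
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
    "đ": "d", "Đ": "D",
    "æ": "ae", "Æ": "AE",
    "ß": "ss",
}


def fold_to_ascii(text: str) -> str:
    """Replace accented letters with unaccented ASCII letters where possible.

    Characters that neither decompose nor appear in EXTRA_FOLDS are
    passed through unchanged.
    """
    out = []
    for ch in text:
        if ch in EXTRA_FOLDS:
            out.append(EXTRA_FOLDS[ch])
            continue
        # NFD splits e.g. "è" into "e" followed by U+0300 (combining grave).
        for part in unicodedata.normalize("NFD", ch):
            # combining() returns 0 for base characters, non-zero for marks.
            if unicodedata.combining(part) == 0:
                out.append(part)
    return "".join(out)


if __name__ == "__main__":
    print(fold_to_ascii("À È Ì Ò Ù à è ì ò ù"))  # A E I O U a e i o u
    print(fold_to_ascii("Â Ê Î Ô Û â ê î ô û"))  # A E I O U a e i o u
    print(fold_to_ascii("ø"))                    # o
```

In Solr itself this kind of folding would normally run inside the field's analyzer at both index and query time, so that "è" and "e" match in either direction; the sketch above only shows the character-level mapping.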

Re: Filters: acute accent characters replaced with their english counterpart

Posted by Jarek Zgoda <ja...@redefine.pl>.
Message written on 2008-11-10 at 11:14 by joeMcElroy:

> I need a custom filter to be added to a field which will replace special
> foreign characters with their english counterpart.
>
> for example ø => o
> Grave À È Ì Ò Ù à è ì ò ù => A E I O U a e i o u
> Circumflex Â Ê Î Ô Û â ê î ô û  => A E I O U a e i o u
>
> is this possible?

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-4ebf7aea23b3d6d34a1f8314f9de17334a3e2fac

I wish such a filter existed for Latin2...

-- 
We read Knuth so you don't have to. - Tim Peters

Jarek Zgoda, R&D, Redefine
jarek.zgoda@redefine.pl


Re: Filters: acute accent characters replaced with their english counterpart

Posted by joeMcElroy <ph...@gmail.com>.
Cheers for the quick response!

joe

