Posted to issues@lucene.apache.org by "Jan Høydahl (Jira)" <ji...@apache.org> on 2021/04/14 14:56:00 UTC

[jira] [Updated] (LUCENE-9929) Make ScandinavianNormalizationFilter configurable wrt foldings

     [ https://issues.apache.org/jira/browse/LUCENE-9929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated LUCENE-9929:
--------------------------------
    Description: 
The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and oo, but not all five make sense for each of Norwegian, Swedish and Danish. Implement an optional configuration parameter that lets users select which foldings to apply. For example, for Norwegian, a user would configure (in Solr):
{code:java}
<filter class="solr.ScandinavianNormalizationFilterFactory" foldings="ae,oe,aa"/>
{code}
This would activate the foldings ae->æ, oe->ø and aa->å, but not oo->ø and ao->å.

The default will be to activate all five as before, so the change is backward compatible.

  was:
The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and oo, but not all five make sense for each of Norwegian, Swedish and Danish. Implement a configuration parameter that lets users select which foldings to apply.

For example, for Norwegian, a user would configure (in Solr):
{code:java}
<filter class="solr.ScandinavianNormalizationFilterFactory" foldings="ae,oe,aa"/>
{code}
This would activate the foldings ae->æ, oe->ø and aa->å, but not oo->ø and ao->å.


> Make ScandinavianNormalizationFilter configurable wrt foldings
> --------------------------------------------------------------
>
>                 Key: LUCENE-9929
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9929
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>            Priority: Major
>
> The ScandinavianNormalizationFilter applies foldings for aa, ao, ae, oe and oo, but not all five make sense for each of Norwegian, Swedish and Danish. Implement an optional configuration parameter that lets users select which foldings to apply. For example, for Norwegian, a user would configure (in Solr):
> {code:java}
> <filter class="solr.ScandinavianNormalizationFilterFactory" foldings="ae,oe,aa"/>
> {code}
> This would activate the foldings ae->æ, oe->ø and aa->å, but not oo->ø and ao->å.
> The default will be to activate all five as before, so the change is backward compatible.
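
As a rough illustration of the proposal above (not the Lucene implementation), the following self-contained Java sketch shows how a comma-separated foldings value could select which two-letter sequences are folded, with an empty value enabling all five. The class SelectiveFoldingSketch, the Folding enum and the parse/normalize helpers are hypothetical names invented for this example. The sketch handles lowercase input only and omits everything else the real filter does.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names are part of the Lucene or Solr API.
class SelectiveFoldingSketch {

    /** The five two-letter foldings named in the issue. */
    enum Folding { AA, AO, AE, OE, OO }

    // Target characters written as Unicode escapes to keep the source ASCII-only.
    private static final char A_RING = '\u00E5';   // a with ring above
    private static final char AE_LIG = '\u00E6';   // ae ligature
    private static final char O_SLASH = '\u00F8';  // o with stroke

    /** Parses a list such as "ae,oe,aa"; null or blank enables all five (the default). */
    static Set<Folding> parse(String csv) {
        if (csv == null || csv.trim().isEmpty()) {
            return EnumSet.allOf(Folding.class);
        }
        Set<Folding> enabled = EnumSet.noneOf(Folding.class);
        for (String name : csv.split(",")) {
            enabled.add(Folding.valueOf(name.trim().toUpperCase(Locale.ROOT)));
        }
        return enabled;
    }

    /** Folds only the enabled two-letter sequences, scanning left to right. */
    static String normalize(String input, String csv) {
        Set<Folding> enabled = parse(csv);
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            char next = i + 1 < input.length() ? input.charAt(i + 1) : '\0';
            char folded = fold(c, next, enabled);
            if (folded != '\0') {
                out.append(folded);
                i += 2; // both letters of the pair are consumed
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Returns the folded character for an enabled pair, or '\0' if no folding applies. */
    private static char fold(char c, char next, Set<Folding> enabled) {
        if (c == 'a' && next == 'a' && enabled.contains(Folding.AA)) return A_RING;
        if (c == 'a' && next == 'o' && enabled.contains(Folding.AO)) return A_RING;
        if (c == 'a' && next == 'e' && enabled.contains(Folding.AE)) return AE_LIG;
        if (c == 'o' && next == 'e' && enabled.contains(Folding.OE)) return O_SLASH;
        if (c == 'o' && next == 'o' && enabled.contains(Folding.OO)) return O_SLASH;
        return '\0';
    }

    public static void main(String[] args) {
        // Norwegian-style selection: "oo" and "ao" are left untouched.
        System.out.println(normalize("raeksmoergaas", "ae,oe,aa"));
        System.out.println(normalize("boot", "ae,oe,aa"));
        // Default: all five foldings are active.
        System.out.println(normalize("boot", null));
    }
}
```

With the Norwegian selection, a word containing "oo" or "ao" passes through unchanged, while the default (no foldings value) keeps the existing behaviour of folding all five pairs.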


