Posted to solr-user@lucene.apache.org by Dominique Bejean <do...@eolya.fr> on 2017/02/10 08:26:41 UTC

Stemming and accents

Hi,

Is the SnowballPorterFilter sensitive to accents, in French for
instance?

If I use both SnowballPorterFilter and ASCIIFoldingFilter, do I have to
configure ASCIIFoldingFilter after SnowballPorterFilter?

Regards.

Dominique
-- 
Dominique Béjean
06 08 46 12 43

Re: Stemming and accents

Posted by Dominique Bejean <do...@eolya.fr>.
Thank you both for your answers.

I tried to find French words that differ only by accents (tache / tâche,
bouche / bouché, ...) and that produce different stems (with the Snowball,
minimal and light stemmers), but without success. So putting the
ASCIIFoldingFilter before the stemmer is not a big issue for precision
(in French).

Dominique


On Fri, Feb 10, 2017 at 23:06, Ahmet Arslan <io...@yahoo.com> wrote:

> Hi,
>
> I have experimented before, and found that Snowball is sensitive to
> accents/diacritics.
> For more details, please see:
> http://www.sciencedirect.com/science/article/pii/S0306457315001053
>
> Ahmet

Re: Stemming and accents

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

I have experimented before, and found that Snowball is sensitive to accents/diacritics.
For more details, please see:
http://www.sciencedirect.com/science/article/pii/S0306457315001053

Ahmet




Re: Stemming and accents

Posted by Erick Erickson <er...@gmail.com>.
The easiest way to answer that is to define two different fieldTypes,
one with Snowball first and one with ASCIIFolding first, fire up the
admin/analysis page and give it some input. That'll show you _exactly_
what transformations take place at each step.
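
As a minimal sketch of that setup, the two fieldTypes might look like the
following. The fieldType names, the choice of StandardTokenizerFactory, and
the lowercase filter are assumptions made for illustration and do not come
from this thread; only the relative order of the Snowball and ASCIIFolding
filters differs between the two.

```xml
<!-- Hypothetical fieldType: stem first, then fold accents -->
<fieldType name="text_fr_stem_then_fold" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The stemmer sees the accented token (e.g. "tâche") -->
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical fieldType: fold accents first, then stem -->
<fieldType name="text_fr_fold_then_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The stemmer sees the folded token (e.g. "tache") -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  </analyzer>
</fieldType>
```

Entering the same accented and unaccented words against each fieldType on the
analysis page should then show, filter by filter, whether the two orderings
end up with the same token.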

Best,
Erick
