Posted to solr-user@lucene.apache.org by cs...@orange.com on 2013/09/09 16:29:57 UTC

Stemming and protwords configuration

Hi,

We have a Solr server using stemming:

<filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords.txt" />

I would like to query the French words "frais" and "fraise" separately. I put the word "fraise" in the protwords.txt file.

- When I query the word "fraise", no documents indexed with the word "frais" are found.
- When I query the word "frais", I get documents indexed with the word "fraise".

Is there a way to avoid matching "fraises" documents in the second situation?

I hope this is clear. Thanks for your reply.

Christophe



Re: Stemming and protwords configuration

Posted by Erick Erickson <er...@gmail.com>.
Did you try putting them _all_ in protwords.txt, i.e.
frais, fraise, and fraises?

Don't forget to re-index.
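
For illustration, a protwords.txt covering all three forms might look like the sketch below. It assumes the usual layout of one protected word per line, with lines starting with # treated as comments; the comment text itself is only illustrative.

```
# Words listed here are passed through the stemmer unchanged.
frais
fraise
fraises
```

Because the stemmer usually runs at both index and query time, the change only takes full effect once the documents have been re-indexed.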

An alternative is to index in a second field that doesn't have the
stemmer and when you want exact matches, search against that
field.
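
A minimal sketch of that second-field approach in schema.xml, under some assumptions: the names text_fr_exact, content, and content_exact are hypothetical, and the tokenizer and lowercase filter are only a plausible unstemmed analysis chain, not the poster's actual configuration.

```xml
<!-- Hypothetical field type: same idea as the stemmed type, minus the stemmer -->
<fieldType name="text_fr_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- No SnowballPorterFilterFactory here, so tokens are indexed unstemmed -->
  </analyzer>
</fieldType>

<!-- Hypothetical field names: "content" stands in for the existing stemmed field -->
<field name="content_exact" type="text_fr_exact" indexed="true" stored="false"/>
<copyField source="content" dest="content_exact"/>
```

With something like this in place, a query such as content_exact:frais would only match documents containing that exact form, while the original field keeps its stemmed behaviour. Adding the field also requires a re-index.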

Best
Erick

