Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/29 15:20:58 UTC

[GitHub] [lucene] xaviersanchez commented on pull request #461: LUCENE-10248: Spanish Plural Stemmer

xaviersanchez commented on pull request #461:
URL: https://github.com/apache/lucene/pull/461#issuecomment-981735988


   > Hi @xaviersanchez, this contribution looks great.
   > 
   > I'll do another pass on review and give some time for others to review as well.
   > 
   > I did a little investigation at a glance, and I think it is confusing that the current `SpanishMinimalStemmer` is doing aggressive conversions such as `ñ -> n`. I think, as a followup issue, we should `@deprecate` the `SpanishMinimalStemmer` and point users to this one instead?
   > 
   > `SpanishMinimalStemmer` is not a typical "upstream" algorithm, with academic papers/study from snowball or savoy, and there doesn't seem to be any reason to keep it anymore, except for a legacy index. So we could keep it around for another major release or so but not forever, IMO.
   
   Thanks @rmuir for the comment! 
   
   Yes, I agree we could deprecate `SpanishMinimalStemmer` and point users to this implementation, since it covers the same use cases. We implemented this a while ago, so before contributing our code we analyzed how the different Spanish stemmers behave, to check that we could provide some added value. Our analysis showed that `SpanishMinimalStemmer` has some issues and performs quite aggressive text normalization.
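
To illustrate why a conversion such as `ñ -> n` counts as aggressive, here is a minimal sketch, assuming the normalization is a plain character replacement. The class `EnyeFoldingSketch` and its methods are hypothetical names used only for illustration; this is not the `SpanishMinimalStemmer` source, and the real stemmer may apply the conversion differently.

```java
// Hypothetical illustration only; NOT taken from SpanishMinimalStemmer.
final class EnyeFoldingSketch {

    // Assumed normalization step: replace every 'ñ' (U+00F1) with 'n'.
    static String foldEnye(String term) {
        return term.replace('\u00F1', 'n');
    }

    // True when two distinct terms become the same term after folding,
    // i.e. the index can no longer tell them apart.
    static boolean conflates(String a, String b) {
        return !a.equals(b) && foldEnye(a).equals(foldEnye(b));
    }
}
```

Under this assumption, `conflates("ca\u00F1a", "cana")` is true: "caña" (cane) and "cana" (gray hair) are different Spanish words, yet both fold to the same term, so a query for one would match documents containing the other.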
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


