Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/18 18:11:08 UTC

[GitHub] [lucene] rmuir commented on pull request #24: LUCENE-9852: Make Hunspell thread-safe

rmuir commented on pull request #24:
URL: https://github.com/apache/lucene/pull/24#issuecomment-802174939


   I guess my question is why does the stemmer need to be threadsafe?
   
   Stemmers in Lucene aren't thread-safe; we use a thread-local model for the analysis chain. So tokenizers, stemmers, etc. are cached per thread, and each maintains some buffers to avoid creating tons of garbage.
   
   For example, the way the Analyzer class works, if you are indexing with 8 threads, you have 8 HunspellStemFilters, each with its own HunspellStemmer, so there are no thread-safety issues. Previously the idea was that only the "large" thing (Dictionary) needed to be thread-safe, as we don't want to instantiate it all the time anyway.
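
To illustrate the model described above, here is a minimal sketch using only the JDK. It is not Lucene's actual code: `SharedDictionary`, `PerThreadStemmer`, `StemmerCache`, and `countDistinctStemmers` are hypothetical stand-ins for the roles of Dictionary, HunspellStemmer, and the per-thread caching done by the analysis chain. The assumption is simply that the large object is immutable and shared, while each thread lazily gets its own cheap, buffer-holding stemmer.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadLocalStemmingSketch {

    /** Hypothetical stand-in for the "large" object: immutable, so safe to share across threads. */
    static final class SharedDictionary {
        private final Map<String, String> stems;

        SharedDictionary(Map<String, String> stems) {
            this.stems = Map.copyOf(stems); // defensive, unmodifiable copy
        }

        String lookup(String word) {
            return stems.getOrDefault(word, word);
        }
    }

    /** Hypothetical stand-in for a stemmer: reuses a mutable buffer, so it is NOT thread-safe. */
    static final class PerThreadStemmer {
        private final SharedDictionary dictionary;
        private final StringBuilder scratch = new StringBuilder(); // reused to limit garbage

        PerThreadStemmer(SharedDictionary dictionary) {
            this.dictionary = dictionary;
        }

        String stem(String word) {
            scratch.setLength(0);
            scratch.append(word.toLowerCase(Locale.ROOT));
            return dictionary.lookup(scratch.toString());
        }
    }

    /** Hands each thread its own stemmer; all of them point at the one shared dictionary. */
    static final class StemmerCache {
        private final ThreadLocal<PerThreadStemmer> perThread;

        StemmerCache(SharedDictionary dictionary) {
            this.perThread = ThreadLocal.withInitial(() -> new PerThreadStemmer(dictionary));
        }

        PerThreadStemmer get() {
            return perThread.get();
        }
    }

    /** Runs the given number of worker threads and counts how many distinct stemmers they used. */
    static int countDistinctStemmers(StemmerCache cache, int threadCount) {
        Set<PerThreadStemmer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                PerThreadStemmer stemmer = cache.get(); // created on first use in this thread
                stemmer.stem("Cats");
                seen.add(stemmer);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        SharedDictionary dictionary = new SharedDictionary(Map.of("cats", "cat", "running", "run"));
        StemmerCache cache = new StemmerCache(dictionary);
        System.out.println("distinct stemmers: " + countDistinctStemmers(cache, 8));
    }
}
```

With this arrangement only the shared dictionary has to be safe for concurrent reads; the stemmer can keep unsynchronized scratch state because no two threads ever see the same instance.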
   

