Posted to dev@opennlp.apache.org by GitBox <gi...@apache.org> on 2022/11/03 19:44:55 UTC

[GitHub] [opennlp] rzo1 commented on a diff in pull request #385: OPENNLP-1320: Makes lemmatize of MorfologikLemmatizer thread-safe

rzo1 commented on code in PR #385:
URL: https://github.com/apache/opennlp/pull/385#discussion_r1013330656


##########
opennlp-morfologik-addon/src/main/java/opennlp/morfologik/lemmatizer/MorfologikLemmatizer.java:
##########
@@ -47,7 +47,7 @@ public MorfologikLemmatizer(Dictionary dictionary) throws IllegalArgumentExcepti
     dictLookup = new DictionaryLookup(dictionary);
   }
 
-  private List<String> lemmatize(String word, String postag) {
+  private synchronized List<String> lemmatize(String word, String postag) {

Review Comment:
   An alternative to `synchronized` would be to create and dispose of a `DictionaryLookup` on every call. According to https://github.com/morfologik/morfologik-stemming/issues/69, it "is cheap to create and dispose (so it makes no sense to cache)", i.e. moving to:
   
   ```java
   List<WordData> dictMap = new DictionaryLookup(dictionary).lookup(word.toLowerCase());
   ```
   
   and just storing the (thread-safe) dictionary in the `MorfologikLemmatizer` instead of the non-thread-safe `DictionaryLookup`.
   
   That would avoid the synchronization cost. Wdyt?
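
To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern. It does not use the real Morfologik types: `SketchDictionary`, `SketchLookup` and `SketchLemmatizer` are hypothetical stand-ins, assumed to mirror an immutable dictionary and a stateful, non-thread-safe lookup. The POS tag handling of the real `lemmatize` is left out.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an immutable dictionary that is safe to share across threads.
final class SketchDictionary {
    private final Map<String, List<String>> entries;

    SketchDictionary(Map<String, List<String>> entries) {
        this.entries = Map.copyOf(entries); // unmodifiable snapshot
    }

    List<String> get(String key) {
        return entries.getOrDefault(key, List.of());
    }
}

// Hypothetical stand-in for a lookup that keeps a mutable scratch buffer,
// so a single instance must not be used by several threads at once.
final class SketchLookup {
    private final SketchDictionary dictionary;
    private final StringBuilder buffer = new StringBuilder(); // per-instance mutable state

    SketchLookup(SketchDictionary dictionary) {
        this.dictionary = dictionary;
    }

    List<String> lookup(String word) {
        buffer.setLength(0);
        buffer.append(word);
        return dictionary.get(buffer.toString());
    }
}

// The lemmatizer holds only the shared dictionary and builds a lookup per call.
final class SketchLemmatizer {
    private final SketchDictionary dictionary;

    SketchLemmatizer(SketchDictionary dictionary) {
        this.dictionary = dictionary;
    }

    // No `synchronized` needed: the lookup instance never leaves this call.
    List<String> lemmatize(String word) {
        return new SketchLookup(dictionary).lookup(word.toLowerCase());
    }
}
```

The trade-off is one short-lived allocation per call in exchange for no lock contention between threads. Whether that is a net win for `MorfologikLemmatizer` would need to be measured.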


