
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/02/19 19:10:12 UTC

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2405: LUCENE-9790: Hunspell: avoid slow dictionary lookup if the word's hash isn't there

dweiss commented on a change in pull request #2405:
URL: https://github.com/apache/lucene-solr/pull/2405#discussion_r579415213



##########
File path: lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java
##########
@@ -160,7 +160,8 @@ public void testDictionariesLoadSuccessfully() throws Exception {
           try {
             Dictionary dic = loadDictionary(aff);
             totalMemory.addAndGet(RamUsageTester.sizeOf(dic));
-            totalWords.addAndGet(RamUsageTester.sizeOf(dic.words));
+            totalWords.addAndGet(
+                RamUsageTester.sizeOf(dic.words) + RamUsageTester.sizeOf(dic.wordHashes));

Review comment:
       These are actual numbers from the dictionaries you checked, correct? The code doesn't seem to have an upper limit (only highestOneBit(words*10)). I hope the dictionaries out there are reasonable (the lexicon of any spoken human language is certainly bounded, so I think it's a sane assumption).
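
To make the sizing concern concrete, here is a minimal sketch of how a `highestOneBit(words*10)` size grows with the word count, and what an explicit cap could look like. The class name, method names, and the `MAX_TABLE_SIZE` value are hypothetical and are not taken from the pull request; only the `highestOneBit(words*10)` expression comes from the comment above.

```java
final class HashSizing {
    // Hypothetical upper limit, chosen only for illustration.
    static final int MAX_TABLE_SIZE = 1 << 24;

    // Largest power of two that is <= words * 10.
    // The multiplication is done in long arithmetic so it cannot overflow int.
    static long uncappedSize(int words) {
        return Long.highestOneBit((long) words * 10L);
    }

    // Same sizing, clamped to the hypothetical cap.
    static int cappedSize(int words) {
        return (int) Math.min(uncappedSize(words), (long) MAX_TABLE_SIZE);
    }

    public static void main(String[] args) {
        for (int words : new int[] {1_000, 100_000, 5_000_000}) {
            System.out.println(
                words + " words -> uncapped " + uncappedSize(words)
                    + ", capped " + cappedSize(words));
        }
    }
}
```

Without a cap the size stays within a factor of ten of the word count, so it is only as bounded as the dictionaries themselves. Whether a hard limit is worth adding depends on how large real-world dictionaries get.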



