Posted to dev@lucene.apache.org by "Gus Heck (JIRA)" <ji...@apache.org> on 2015/12/11 16:12:11 UTC

[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores

    [ https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052878#comment-15052878 ] 

Gus Heck commented on SOLR-3443:
--------------------------------

I was working on something else and thinking about memory consistency when it occurred to me that this patch might need a couple of tweaks to the Dictionary class to ensure that its loading *happens before* any lookups... unless there is some point in the overall Solr initialization phase that ensures that the request handling threads and the core initialization threads all lock and unlock the same monitor before requests are handled? Does that exist somewhere? Memory consistency seems like something that must have already been thought about... I will think more and look at it tonight.
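
To make the concern concrete, here is a minimal sketch of one way a shared dictionary could be published safely to other threads. It is not the code in the attached patch, and the names LoadedDictionary and SharedDictionaryCache are hypothetical stand-ins, not Lucene or Solr classes. The sketch assumes the dictionary is immutable after loading and is handed out through a ConcurrentHashMap, whose documented memory consistency effects give the happens-before edge between the load and later lookups.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a fully loaded dictionary (not Lucene's Dictionary class).
final class LoadedDictionary {
    // A final field referencing state that is never modified after construction
    // is covered by the Java Memory Model's final-field guarantee.
    private final Map<String, String> stems;

    LoadedDictionary(Map<String, String> stems) {
        // Defensive copy so nothing can mutate the map after construction.
        this.stems = Collections.unmodifiableMap(new HashMap<>(stems));
    }

    // Returns the stem for a word, or the word itself if it is unknown.
    String stem(String word) {
        return stems.getOrDefault(word, word);
    }
}

// Hypothetical cache shared by all cores, keyed by dictionary path.
final class SharedDictionaryCache {
    private final ConcurrentHashMap<String, LoadedDictionary> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    // computeIfAbsent runs the loader at most once per key. Actions taken before
    // the dictionary is placed in the map happen-before any later read of that
    // entry from another thread.
    LoadedDictionary get(String path, Function<String, LoadedDictionary> loader) {
        return cache.computeIfAbsent(path, p -> {
            loads.incrementAndGet();
            return loader.apply(p);
        });
    }

    // Number of times a loader actually ran; only used to illustrate sharing.
    int loadCount() {
        return loads.get();
    }
}
```

With something along these lines, a core initialization thread that loads the dictionary and a request handling thread that later obtains it through get() would not need to share a monitor explicitly. Whether the real Dictionary class is effectively immutable after loading is the open question.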

In any case, this should not affect the general resource sharing patch in SOLR-8349 unless I decide to add further _caveat emptor_ warnings to the javadoc :).

> Optimize hunspell dictionary loading with multiple cores
> --------------------------------------------------------
>
>                 Key: SOLR-3443
>                 URL: https://issues.apache.org/jira/browse/SOLR-3443
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Luca Cavanna
>         Attachments: SOLR-3443.patch, Screen Shot 2015-11-29 at 9.52.06 AM.png
>
>
> The Hunspell dictionary is actually loaded into memory. Each core using hunspell loads its own dictionary, no matter if all the cores are using the same dictionary files. As a result, the same dictionary is loaded into memory multiple times, once for each core. I think we should share those dictionaries between all cores in order to optimize the memory usage. In fact, let's say a dictionary takes 20MB into memory (this is what I detected), if you have 20 cores you are going to use 400MB only for dictionaries, which doesn't seem a good idea to me.


