Posted to solr-user@lucene.apache.org by Mark Schoy <he...@gmx.de> on 2011/12/07 09:31:00 UTC

Reducing heap space consumption for large dictionaries?

Hi,

in my index schema I have defined a
DictionaryCompoundWordTokenFilterFactory and a
HunspellStemFilterFactory. Each FilterFactory has a dictionary with
about 100k entries.
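
For reference, here is a minimal sketch of the kind of analyzer chain I
mean (the dictionary and affix file names are placeholders, not my
actual files):

  <fieldType name="text_compound" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <!-- splits compound words against a flat word list (~100k entries) -->
      <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
              dictionary="compound-words.txt" minWordSize="5"/>
      <!-- stems tokens using Hunspell .dic/.aff files (~100k entries) -->
      <filter class="solr.HunspellStemFilterFactory"
              dictionary="de_DE.dic" affix="de_DE.aff" ignoreCase="true"/>
    </analyzer>
  </fieldType>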

To avoid an out-of-memory error I have to set the heap space to 128m
for a single index.
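
For context, I set the heap with the JVM's -Xmx option when starting
each instance, e.g. with the stock example Jetty launcher:

  java -Xmx128m -jar start.jar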

Is there a way to reduce the memory consumption when parsing the dictionary?
I need to create several indexes and 128m for each index is too much.

mark

Re: Reducing heap space consumption for large dictionaries?

Posted by Maciej Lisiewski <c2...@poczta.fm>.
On 2011-12-13 05:48, Chris Male wrote:
> Hi,
>
> It's good to hear some feedback on using the Hunspell dictionaries.
> Lucene's support is pretty new, so we're obviously looking to improve it.
> Could you open a JIRA issue so we can explore whether there are ways
> to reduce memory consumption?

Done:
https://issues.apache.org/jira/browse/SOLR-2968


-- 
Maciej Lisiewski

Re: Reducing heap space consumption for large dictionaries?

Posted by Chris Male <ge...@gmail.com>.
Hi,

It's good to hear some feedback on using the Hunspell dictionaries.
Lucene's support is pretty new, so we're obviously looking to improve it.
Could you open a JIRA issue so we can explore whether there are ways
to reduce memory consumption?

On Tue, Dec 13, 2011 at 5:37 PM, Maciej Lisiewski <c2...@poczta.fm> wrote:

>> Hi,
>>
>> in my index schema I have defined a
>> DictionaryCompoundWordTokenFilterFactory and a
>> HunspellStemFilterFactory. Each FilterFactory has a dictionary with
>> about 100k entries.
>>
>> To avoid an out-of-memory error I have to set the heap space to 128m
>> for a single index.
>>
>> Is there a way to reduce the memory consumption when parsing the
>> dictionary?
>> I need to create several indexes and 128m for each index is too much.
>>
>
> Same problem here - even with an empty index (no data yet) and two fields
> using Hunspell (pl_PL), I had to increase the heap size to over 2GB for
> Solr to start at all.
>
> Stempel, using the very same dictionary, works fine with 128M.
>
> --
> Maciej Lisiewski
>



-- 
Chris Male | Software Developer | DutchWorks | www.dutchworks.nl

Re: Reducing heap space consumption for large dictionaries?

Posted by Maciej Lisiewski <c2...@poczta.fm>.
> Hi,
>
> in my index schema I have defined a
> DictionaryCompoundWordTokenFilterFactory and a
> HunspellStemFilterFactory. Each FilterFactory has a dictionary with
> about 100k entries.
>
> To avoid an out-of-memory error I have to set the heap space to 128m
> for a single index.
>
> Is there a way to reduce the memory consumption when parsing the dictionary?
> I need to create several indexes and 128m for each index is too much.

Same problem here - even with an empty index (no data yet) and two
fields using Hunspell (pl_PL), I had to increase the heap size to over
2GB for Solr to start at all.

Stempel, using the very same dictionary, works fine with 128M.
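
For comparison, here is a minimal sketch of the Stempel-based setup
(the field type name and surrounding filters are illustrative; the
factory ships in the analysis-extras contrib):

  <fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- algorithmic Polish stemmer; keeps a compact stemmer table in memory -->
      <filter class="solr.StempelPolishStemFilterFactory"/>
    </analyzer>
  </fieldType>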

-- 
Maciej Lisiewski