Posted to java-user@lucene.apache.org by Leandro <le...@gmail.com> on 2008/04/10 20:42:40 UTC

Problem when try to make a bench of indexing (a dictionary with 120.000 words)

Hello,

*Sample code:*
SpellChecker spell;
RAMDirectory dram = new RAMDirectory();
Dicionario dic = new Dicionario(); // my implementation of spell.Dictionary
spell = new SpellChecker(dram);
spell.indexDictionary(dic); // indexing...

*Then I got these results:*
machine1: Windows XP SP2, Celeron 2.66GHz and 256MB
words: 60.000 (40~53 characters each)
memory alloc: 16 (MB)
time to index: 55108 (ms)

*So I tried with 120.000 words* ... and when I ran the program ...

Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.store.RAMFile.newBuffer(RAMFile.java:88)
        at org.apache.lucene.store.RAMFile.addBuffer(RAMFile.java:61)
        at org.apache.lucene.store.RAMOutputStream.switchCurrentBuffer(RAMOutputStream.java:128)
        at org.apache.lucene.store.RAMOutputStream.writeByte(RAMOutputStream.java:105)
...

*Why does this occur?*

Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2008-04-10 at 15:42 -0300, Leandro wrote:
> machine1: Windows XP SP2, Celeron 2.66GHz and 256MB

If that is a physical machine (as opposed to virtual), then the amount
of RAM is not at all well balanced against the processor speed.

> [...] java.lang.OutOfMemoryError: Java heap space

How much memory do you allocate for the whole JVM? If you're not sure,
then you're probably using the default, which is probably 64MB for your
machine. If so, you can try allocating more memory for the JVM:
java -Xmx128m -jar LeandroApplication.jar
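If you are unsure which heap limit the JVM actually picked up, one way
to check is to ask the runtime itself. The following is only a minimal
sketch using the standard Runtime API; the class name HeapCheck and its
helper methods are made up for illustration and are not part of Lucene
or of your application.

```java
// Sketch: report the heap ceiling the running JVM actually received.
// HeapCheck and its method names are hypothetical, chosen for this example.
final class HeapCheck {

    /** Converts a byte count to whole megabytes, rounding down. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Maximum heap the JVM will attempt to use, in MB (reflects the -Xmx setting). */
    static long maxHeapMegabytes() {
        return toMegabytes(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it once with and once without the -Xmx option should show whether
the setting took effect. The reported value is often somewhat below the
requested figure, because the JVM may not count all of its internal
spaces in it.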



Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

Posted by Leandro <le...@gmail.com>.
>
> If the 16M means you're only giving the process that much memory, it
> surprises me that it runs at all. Especially since you're putting it
> all in a RAMdir.
>

Sorry, that 16M is dictonarySizeInBytes(). I imagined the index would be
about the same size...
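As a rough check on that assumption, here is a back-of-the-envelope
sketch. It assumes (my reading of the contrib SpellChecker, not something
confirmed in this thread) that each long word is indexed not only as
itself but also as its 3-grams and 4-grams. The class NGramEstimate and
the 50-character word length are hypothetical, picked only to match the
40~53 range above. It counts characters of gram text, not real index
bytes, since Lucene stores repeated terms only once.

```java
// Sketch: compare raw dictionary characters with the n-gram characters
// generated per word. Assumes 3- and 4-grams are indexed for long words.
final class NGramEstimate {

    /** Number of contiguous n-grams in a word of the given length. */
    static int countGrams(int wordLength, int n) {
        return Math.max(0, wordLength - n + 1);
    }

    /** Total characters across all n-grams of sizes minN..maxN for one word. */
    static long gramChars(int wordLength, int minN, int maxN) {
        long total = 0;
        for (int n = minN; n <= maxN; n++) {
            total += (long) countGrams(wordLength, n) * n;
        }
        return total;
    }

    public static void main(String[] args) {
        int words = 120000;
        int wordLength = 50; // assumed typical length (the thread says 40~53)
        long rawChars = (long) words * wordLength;
        long grams = (long) words * gramChars(wordLength, 3, 4);
        System.out.println("raw characters:    " + rawChars);
        System.out.println("n-gram characters: " + grams);
    }
}
```

Under these assumptions a 50-character word produces 48 3-grams and 47
4-grams, or 332 gram characters against 50 raw ones, so the text handed
to the indexer is several times larger than the dictionary. That would
suggest the index is not simply the same size as the word list, although
the real figure depends on how many grams repeat across words.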

So, once I use a dictionary with more than 60.000 words, do I need to
use FSDirectory?



>
> Or is that 16M referring to something else?


Just the dictionary size...
:(


>
> Best
> Erick
>

Re: Problem when try to make a bench of indexing (a dictionary with 120.000 words)

Posted by Erick Erickson <er...@gmail.com>.
If the 16M means you're only giving the process that much memory, it
surprises me that it runs at all. Especially since you're putting it all
in a RAMdir.

Or is that 16M referring to something else?

Best
Erick
