Posted to dev@lucene.apache.org by "Christian Ziech (JIRA)" <ji...@apache.org> on 2014/08/07 17:54:12 UTC

[jira] [Commented] (LUCENE-5875) Default page/block sizes in the FST package can cause OOMs

    [ https://issues.apache.org/jira/browse/LUCENE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089366#comment-14089366 ] 

Christian Ziech commented on LUCENE-5875:
-----------------------------------------

There is another OOM we get. By the time the exception was thrown we had been indexing for 5-6 hours and had already closed the IndexWriter. At that stage we only want to store the special terms we gathered during indexing in a custom FST. When the exception was thrown, effectively only one thread was active in the VM, and the last GC attempt printed the following:
Eden: 0B(4021M)->0B(4021M) Survivors: 75M->75M Heap: 9615M(30720M)->9615M(30720M)
Those values are also pretty much in line with the numbers we get from the runtime if we add custom debug statements.
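
For reference, a debug statement of the kind mentioned above could look roughly like the following. This is only a sketch: the class and method names (HeapDebug, describeHeap) are made up for illustration and are not taken from our code base; it uses nothing beyond java.lang.Runtime.

```java
// Hypothetical helper, not part of Lucene or of the reporter's code.
final class HeapDebug {
    private static final long MIB = 1024L * 1024L;

    /** Formats the used, committed and maximum heap sizes in MiB. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();       // unused part of the committed heap
        long max = rt.maxMemory();         // upper bound (-Xmx); Long.MAX_VALUE if unlimited
        long used = committed - free;
        return String.format("heap used=%dM committed=%dM max=%dM",
                used / MIB, committed / MIB, max / MIB);
    }

    public static void main(String[] args) {
        // Print one line, e.g. right before building the FST.
        System.out.println(describeHeap());
    }
}
```

Note that these numbers only describe the heap as a whole; they do not say whether a single large array can actually be allocated.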

java.lang.OutOfMemoryError: Java heap space
	at org.apache.lucene.util.packed.Packed64.<init>(Packed64.java:73)
	at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1034)
	at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1001)
	at org.apache.lucene.util.packed.GrowableWriter.<init>(GrowableWriter.java:46)
	at org.apache.lucene.util.packed.GrowableWriter.resize(GrowableWriter.java:98)
	at org.apache.lucene.util.fst.FST.addNode(FST.java:845)
	at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:200)
	at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:289)
	at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
	at com.nokia.search.candgen.spelling.AtomicFSTBuilder$FSTWriter.put(AtomicFSTBuilder.java:358)
	at com.nokia.search.candgen.spelling.AtomicFSTBuilder$WriteTask.run(AtomicFSTBuilder.java:156)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)


> Default page/block sizes in the FST package can cause OOMs
> ----------------------------------------------------------
>
>                 Key: LUCENE-5875
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5875
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 4.9
>            Reporter: Christian Ziech
>            Priority: Minor
>
> We are building some fairly big FSTs (the biggest one having about 500M terms with an average of 20 characters per term) and that works very well so far.
> The problem is just that we can use neither the "doShareSuffix" nor the "doPackFST" option of the builder, since both cause exceptions: one an OOM, the other an IllegalArgumentException for a negative array size in ArrayUtil.
> The thing here is that in theory we still have far more than enough memory available, but it seems that Java for some reason cannot allocate byte or long arrays of the size the NodeHash needs (maybe fragmentation?).
> Reducing the constant in the NodeHash from 1<<30 to e.g. 1<<27 seems to mostly fix the issue. Could the Builder, for example, pass its bytesPageBits through to the NodeHash, or could we get a custom parameter for that?
> The other problem we ran into was a NegativeArraySizeException when we tried to pack the FST. It seems that we overflowed to 0x80000000. Unfortunately I accidentally overwrote that exception, but I remember it was triggered by the GrowableWriter for the inCounts in line 728 of the FST. If it helps I can try to reproduce it.
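
To illustrate the two effects described in the issue, here is a small standalone sketch. It does not reproduce the Lucene code path; the class and method names are hypothetical. It assumes that a size is grown by doubling, which is only one way an int can wrap around to 0x80000000, and it assumes the worst case of 64 bits per value when estimating how large a single page of 1<<30 or 1<<27 values can get.

```java
// Hypothetical demo class, not part of Lucene.
final class ArraySizeOverflowDemo {
    /** Doubles an int size; the result wraps to a negative value for size >= 1 << 30. */
    static int doubledSize(int size) {
        return size << 1;
    }

    /** Bytes for one page of 2^pageBits values at 8 bytes each (array header ignored). */
    static long pageBytes(int pageBits) {
        return (1L << pageBits) * 8L;
    }

    public static void main(String[] args) {
        int doubled = doubledSize(1 << 30);
        System.out.println(Integer.toHexString(doubled)); // prints 80000000

        try {
            long[] values = new long[doubled]; // negative length
            System.out.println(values.length);
        } catch (NegativeArraySizeException e) {
            System.out.println("NegativeArraySizeException");
        }

        System.out.println(pageBytes(30)); // prints 8589934592 (8 GiB)
        System.out.println(pageBytes(27)); // prints 1073741824 (1 GiB)
    }
}
```

Under these assumptions a single page of 1<<30 values can require one contiguous array of up to 8 GiB, whereas 1<<27 caps each page at 1 GiB, which would be consistent with the smaller constant mostly avoiding the OOM.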


