Posted to solr-user@lucene.apache.org by Pulkit Singhal <pu...@gmail.com> on 2011/09/20 23:45:45 UTC

Troubleshooting OOM in DIH w/ FileListEntityProcessor and XPathEntityProcessor

Hello Everyone,

I need help with:
(a) figuring out the causes of the OutOfMemoryError (OOM) I get when I run
the Data Import Handler (DIH),
(b) finding workarounds and fixes for each cause of the OOM.

The stacktrace is at the very bottom to avoid having your eyes glaze
over and to prevent you from skipping this thread ;)

1) Based on the documentation I have read so far, I would say that
"batchSize"-based control does not exist for FileListEntityProcessor or
XPathEntityProcessor. Please correct me if I'm wrong about this.

2) The files being processed by FileListEntityProcessor range from
90.9 to 2.8 MB in size.
2.1) Is there some way to let FileListEntityProcessor bring in only
one file at a time? Or is that the default already?
2.2) Is there some way to let FileListEntityProcessor stream the file
to its nested XPathEntityProcessor?
2.3) If streaming a file is something that should be configured
directly on XPathEntityProcessor, then please let me know how to do
that as well.

3) Where are the default -Xms and -Xmx values for Solr configured? Please
let me know so I can try tweaking them at startup.

====
STACKTRACE:
====
SEVERE: Exception while processing: bbyopenProductsArchive document : null:
org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:718)
...
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2734)
        at java.util.ArrayList.toArray(ArrayList.java:275)
        at java.util.ArrayList.<init>(ArrayList.java:131)
        at org.apache.solr.handler.dataimport.XPathRecordReader$Node.getDeepCopy(XPathRecordReader.java:586)
...
INFO: start rollback
Sep 20, 2011 4:22:26 PM org.apache.solr.handler.dataimport.SolrWriter rollback
SEVERE: Exception while solr rollback.
java.lang.NullPointerException
        at org.apache.solr.update.DefaultSolrCoreState.rollbackIndexWriter(DefaultSolrCoreState.java:73)

Re: Troubleshooting OOM in DIH w/ FileListEntityProcessor and XPathEntityProcessor

Posted by Erick Erickson <er...@gmail.com>.
The first thing I'd try is raising the -Xmx (maximum heap) parameter on the
invocation, e.g.:
java -Xmx2048M -jar start.jar

Second option: play with your <autoCommit> options in solrconfig.xml
and lower them substantially, although I'm not quite sure how DIH interacts
with that.
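
A sketch of the autocommit suggestion above, assuming a stock solrconfig.xml.
The maxDocs and maxTime values are illustrative assumptions, not tuned
recommendations. Lower values make Solr commit more often; as noted above,
how this interacts with DIH is not certain.

```xml
<!-- solrconfig.xml: merge into the existing <updateHandler> element -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit once this many documents are pending (illustrative value) -->
    <maxDocs>1000</maxDocs>
    <!-- or after this many milliseconds, whichever comes first (illustrative value) -->
    <maxTime>15000</maxTime>
  </autoCommit>
</updateHandler>
```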

Gotta rush, so sorry this is so terse.

Best
Erick
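
On questions 2.2 and 2.3 from the original post: the DIH documentation
describes a stream attribute on XPathEntityProcessor, intended for very large
XML files. Below is a minimal data-config.xml sketch, assuming that attribute
is available in the Solr version in use. The baseDir, fileName pattern,
forEach path, entity names, and field names are hypothetical placeholders.
Whether streaming avoids this particular OOM has not been verified.

```xml
<dataConfig>
  <!-- File-based data source used by the nested XPath entity -->
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity only lists files; rootEntity="false" means a file row
         is not itself indexed as a document -->
    <entity name="files"
            processor="FileListEntityProcessor"
            baseDir="/path/to/xml/files"
            fileName=".*\.xml"
            recursive="false"
            rootEntity="false"
            dataSource="null">
      <!-- Inner entity parses the file named by the current outer row;
           stream="true" asks the processor to stream records rather than
           collect them all first -->
      <entity name="product"
              processor="XPathEntityProcessor"
              url="${files.fileAbsolutePath}"
              forEach="/products/product"
              stream="true">
        <field column="id" xpath="/products/product/id"/>
        <field column="name" xpath="/products/product/name"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With this layout the outer entity emits one row per matching file and the
nested entity runs once for each row, so files are handled one after another.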
