Posted to solr-user@lucene.apache.org by Sandra Scott <sc...@gmail.com> on 2013/12/12 18:10:27 UTC

solr OOM Crash

Hello,

We are experiencing unexplained OOM crashes. We have already seen them a few
times across our different Solr instances. Each crash affects only a
single shard of the collection.

Environment details:
1. Solr 4.3, running on Tomcat.
2. 24 shards.
3. Indexing rate of ~800 docs per minute.

solrconfig.xml:
1. Merge factor: 4
2. Soft commit every 10 min
3. Hard commit every 30 min

Main findings:
1. Solr logs: No query failures prior to the OOM, but DOUBLE the number of
soft and hard commits compared to the other shards.
2. Analyzing the heap dump (VisualVM): class byte[] takes 4 GB of the 5 GB
allocated to the JVM, mainly referenced from the
CompressingStoredFieldsReader GC root (from looking at the code, we suspect
these arrays were created by CompressingStoredFieldsWriter.merge).
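
As a back-of-the-envelope check on finding 1, here is a small sketch of how many documents would fall into each commit window at the rate and intervals listed above, and how that changes if commits fire twice as often. It assumes the ~800 docs per minute is a steady rate for the index being committed (the post does not say whether the figure is per shard or for the whole collection). The names `docs_per_commit` and `commit_rate_multiplier` are illustrative only, not Solr settings.

```python
def docs_per_commit(docs_per_minute, commit_interval_minutes,
                    commit_rate_multiplier=1.0):
    """Average number of documents indexed between two consecutive commits.

    commit_rate_multiplier=2.0 models commits firing twice as often as
    configured, i.e. the effective interval is halved.
    """
    effective_interval = commit_interval_minutes / commit_rate_multiplier
    return docs_per_minute * effective_interval


INDEXING_RATE = 800    # docs per minute, from the environment details
SOFT_COMMIT_MIN = 10   # configured soft commit interval
HARD_COMMIT_MIN = 30   # configured hard commit interval

for label, multiplier in (("configured", 1.0), ("doubled", 2.0)):
    soft = docs_per_commit(INDEXING_RATE, SOFT_COMMIT_MIN, multiplier)
    hard = docs_per_commit(INDEXING_RATE, HARD_COMMIT_MIN, multiplier)
    print(f"{label}: {soft:.0f} docs per soft commit, "
          f"{hard:.0f} docs per hard commit")
```

Under these assumptions, doubling the commit rate halves the batch per commit (4000 instead of 8000 docs per soft commit), so the affected shard would likely be producing more, smaller segments for the merger to deal with.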

Sub findings:
1. GC logs: Showed 108 GC failures prior to the crash.
2. CPU: Overall usage seems fine, but the % of CPU time spent in GC stays
high for the 6 min before the OOM.
3. Memory: Starting half an hour before the OOM, usage slowly rises until it
reaches 5.4 GB.
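
One way to quantify sub finding 2 is to turn cumulative GC time samples into a per-interval GC share and look for a sustained breach. The sketch below assumes you already have (wall-clock seconds, cumulative GC seconds) pairs, for example from the GCT column of `jstat -gcutil`. The sample values, the 50% threshold and the function names are made up for illustration and are not taken from the logs discussed here.

```python
def gc_overhead(samples):
    """Fraction of each sampling interval spent in GC.

    samples: list of (wall_clock_seconds, cumulative_gc_seconds) pairs,
    sorted by time.
    """
    fractions = []
    for (t0, gc0), (t1, gc1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        fractions.append((gc1 - gc0) / elapsed)
    return fractions


def first_sustained_breach(fractions, threshold, min_intervals):
    """Index of the first interval that starts a run of `min_intervals`
    consecutive fractions at or above `threshold`, or None if there is none."""
    run = 0
    for i, fraction in enumerate(fractions):
        run = run + 1 if fraction >= threshold else 0
        if run == min_intervals:
            return i - min_intervals + 1
    return None


# Hypothetical samples taken every 60 seconds.
samples = [(0, 10), (60, 13), (120, 16), (180, 52), (240, 94), (300, 139)]
fractions = gc_overhead(samples)
print(fractions)
print(first_sustained_breach(fractions, threshold=0.5, min_intervals=3))
```

With the made-up samples above, GC takes 5% of the first two intervals and 60% or more of the last three, so the sustained breach starts at interval index 2. An alert on such a run could give a few minutes of warning before the heap is exhausted.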

Has anyone encountered a higher-than-normal commit rate that seems to
increase the merge rate and cause what I described?

Re: solr OOM Crash

Posted by Sébastien Michel <se...@atos.net>.
Hi Sandra,

Excuse me for the late reply.
We use the LotsOfCores Solr feature (http://wiki.apache.org/solr/LotsOfCores),
with around 100 cores loaded simultaneously, but the issue is reproducible
with fewer cores.
We also have a high rate of indexing and reindexing (atomic updates).

We index metadata for media files, and both metadata and contents for
PDFs; the content is stored in a "text" field (stored="true").
Up to release 4.3, Solr uses a growing buffer to decompress stored fields
(I assume one buffer per Solr core or per shard).
The issue arises when Solr reads some big docs: the buffer of
CompressingStoredFieldsReader grows but never shrinks. The more such big
docs are read in different threads, the more heap usage grows, until the
heap has no free memory left and GC runs continuously.
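
The effect described above can be illustrated with a toy model. This is not Lucene's code and is not meant to describe what LUCENE-4995 changes; `GrowOnlyBuffer`, `CappedBuffer` and `retained_bytes` are hypothetical names, and the capped variant is just one possible way to bound what a reusable buffer keeps. The model assumes one reusable buffer per reading thread and tracks only capacities, without allocating any memory.

```python
class GrowOnlyBuffer:
    """Reusable buffer that grows to the largest doc seen and never shrinks."""

    def __init__(self):
        self.capacity = 0

    def read(self, doc_size):
        if doc_size > self.capacity:
            self.capacity = doc_size
        return self.capacity


class CappedBuffer:
    """Reusable buffer that retains at most `max_retained` bytes.

    Docs above the cap are assumed to use a one-off buffer that is
    released right after the read, so they do not change `capacity`.
    """

    def __init__(self, max_retained):
        self.max_retained = max_retained
        self.capacity = 0

    def read(self, doc_size):
        if doc_size <= self.max_retained:
            self.capacity = max(self.capacity, doc_size)
        return self.capacity


def retained_bytes(make_buffer, reads_per_thread):
    """Total bytes still held by all per-thread buffers after the reads."""
    total = 0
    for doc_sizes in reads_per_thread:
        buf = make_buffer()
        for size in doc_sizes:
            buf.read(size)
        total += buf.capacity
    return total


MB = 1024 * 1024
# Hypothetical workload: 20 threads, each reads one 50 MB doc between small ones.
reads = [[4 * 1024, 50 * MB, 8 * 1024] for _ in range(20)]

print(retained_bytes(GrowOnlyBuffer, reads) // MB, "MB retained (grow-only)")
print(retained_bytes(lambda: CappedBuffer(1 * MB), reads) // 1024,
      "KB retained (capped at 1 MB)")
```

In this model the grow-only buffers still hold 1000 MB after the big docs are long gone, while the capped ones hold 160 KB. A handful of huge documents read on many threads is enough to pin a large part of the heap.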

Analyzing the heap dump: class byte[] takes 3 GB of the 4 GB allocated to the
JVM, mainly referenced by CompressingStoredFieldsReader.

I hope this helps.

Sébastien


2013/12/30 Sandra Scott <sc...@gmail.com>

> Hello Sébastien,
>
> Can you give some information about your environment so I can make sure we
> are having the same problem you had?
> Also, did you find out what caused the GC to go crazy or what caused the
> increased commit rate?
>
> Thanks,
> Sandra

Re: solr OOM Crash

Posted by Sandra Scott <sc...@gmail.com>.
Hello Sébastien,

Can you give some information about your environment so I can make sure we
are having the same problem you had?
Also, did you find out what caused the GC to go crazy or what caused the
increased commit rate?

Thanks,
Sandra


On Thu, Dec 19, 2013 at 12:34 PM, Sébastien Michel <
sebastien.michel@atos.net> wrote:

> Hi Sandra,
>
> I'm not sure if your problem is the same as ours, but we encountered the
> same issue on our Solr 4.2: most of the memory usage was due to
> CompressingStoredFieldsReader, and the GC went crazy.
> In our context, we have some stored fields, and for some documents the
> content of the text field can be huge.
>
> We resolved our issue by backporting this fix:
> https://issues.apache.org/jira/browse/LUCENE-4995
>
> You should also upgrade to Solr 4.4 or later.
>
> Regards,
> Sébastien

Re: solr OOM Crash

Posted by Sébastien Michel <se...@atos.net>.
Hi Sandra,

I'm not sure if your problem is the same as ours, but we encountered the
same issue on our Solr 4.2: most of the memory usage was due to
CompressingStoredFieldsReader, and the GC went crazy.
In our context, we have some stored fields, and for some documents the
content of the text field can be huge.

We resolved our issue by backporting this fix:
https://issues.apache.org/jira/browse/LUCENE-4995

You should also upgrade to Solr 4.4 or later.

Regards,
Sébastien


Re: solr OOM Crash

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Sandra,

Not a direct answer, but if you are seeing this around merges, have you
tried relaxing the merge factor to, say, 10?
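
To give a feel for why a larger merge factor means fewer merges, here is a deliberately simplified level-based model: every `merge_factor` segments of the same level are merged into one segment of the next level. This is an assumption made for illustration and not how Solr's actual merge policy decides what to merge; `simulate_merges` is a hypothetical helper, and one "unit" stands for the contents of a single flushed segment.

```python
def simulate_merges(num_flushes, merge_factor):
    """Count merges and flushed-segment units rewritten in a toy level model.

    Each flush adds one level-0 segment. Whenever a level holds
    `merge_factor` segments they are merged into one segment of the
    next level, which may cascade further up.
    """
    levels = [0]          # levels[i] = number of segments currently at level i
    merges = 0
    units_rewritten = 0
    for _ in range(num_flushes):
        level = 0
        levels[0] += 1
        while levels[level] == merge_factor:
            levels[level] = 0
            merges += 1
            # A level-i segment holds merge_factor**i flushed units.
            units_rewritten += merge_factor ** (level + 1)
            level += 1
            if level == len(levels):
                levels.append(0)
            levels[level] += 1
    return merges, units_rewritten


for factor in (4, 10):
    merges, units = simulate_merges(100, factor)
    print(f"merge factor {factor}: {merges} merges, {units} units rewritten")
```

In this toy model, 100 flushes trigger 32 merges (260 units rewritten) at factor 4 versus 11 merges (200 units rewritten) at factor 10. The trade-off is that a higher factor lets more segments accumulate before each merge.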

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

