Posted to users@solr.apache.org by Rahul Goswami <ra...@gmail.com> on 2022/09/28 22:47:19 UTC

Solr 7.7.2 going OOM

Hi,
I am running Solr 7.7.2 in standalone mode with a 32 GB heap. The process is
throwing an OOM exception. Heap dump analysis shows ~14 GB of
FrozenBufferedUpdates and ~9 GB held across ~3000 DWPTs. There is a log replay
in progress and another indexing thread.

My understanding is that Solr flushes when accumulated updates hit 100 MB
*across threads*.

Hence I am puzzled by the humongous buffer buildup. Any clues as to
where I should look?

Thanks,
Rahul

Re: Solr 7.7.2 going OOM

Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 9/28/22 16:47, Rahul Goswami wrote:
> I am running Solr 7.7.2 in standalone mode with a 32 GB heap. The process is
> throwing an OOM exception. Heap dump analysis shows ~14 GB of
> FrozenBufferedUpdates and ~9 GB held across ~3000 DWPTs. There is a log replay
> in progress and another indexing thread.

FrozenBufferedUpdates is deep within Lucene internals and I know nothing about it.

I had to google to find out what a DWPT was (DocumentsWriterPerThread).  And even knowing which class it is, I still have no idea what it actually does.



Do you have the actual exception with a stack trace?  The first thing to determine is which resource ran out to trigger the OOME.  It is not always memory.  If it is something other than heap memory, then heap analysis probably will not help.
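
For context, the message on the OutOfMemoryError itself usually says which resource was exhausted.  A few of the common variants you might see (not an exhaustive list):

    java.lang.OutOfMemoryError: Java heap space
    java.lang.OutOfMemoryError: GC overhead limit exceeded
    java.lang.OutOfMemoryError: Metaspace
    java.lang.OutOfMemoryError: unable to create new native thread
    java.lang.OutOfMemoryError: Direct buffer memory

Only the first two point squarely at the heap.  The "unable to create new native thread" one, for example, means the OS refused to create a thread, not that the heap is full.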

If your heap size is 32GB, then you can actually have more memory available by setting the heap size to 31GB.  This sounds all wrong, but it is a consequence of compressed ordinary object pointers (compressed oops): at 32GB and above, Java must switch to ordinary 64-bit pointers in order to address that memory, so every object reference gets larger and the effective capacity of the heap shrinks.
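
If you want to double-check that on your own JVM, something like this should show whether compressed oops are actually in effect for a given heap size (just an illustration; exact output differs between JVM versions, and on Windows you would use findstr instead of grep):

    java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
    java -Xmx32g -XX:+PrintFlagsFinal -version | grep UseCompressedOops

With 31g the UseCompressedOops flag should come back true, and with 32g it should come back false.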

Are those objects you mentioned from the heap dump actually reachable objects?  Wondering if it's possible that some of them might be garbage that hasn't been collected yet.  I haven't done a lot of heap analysis, so I am unsure whether the dump would show garbage or only reachable objects.  I would hope that garbage doesn't show up in any reports you get from the analysis.
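
One way to take the garbage question out of the picture, assuming you can reproduce the buildup and attach to the running process (a dump produced at the moment of the OOME is what it is), is to ask for a dump of live objects only.  The :live option makes the JVM run a full GC first, so only reachable objects end up in the file:

    jmap -dump:live,format=b,file=/path/to/solr-live.hprof <solr-pid>

The file path and pid there are placeholders, of course.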

> My understanding is that Solr flushes when accumulated updates hit 100 MB
> *across threads*.

By default, Solr sets the ramBufferSizeMB value to 100, which in turn 
sets a corresponding Lucene value.  When that buffer gets full, Lucene 
flushes the segment it is building in memory to the particular directory 
implementation that it is using, which in most cases means it will be 
written to disk.  There are most likely multiple memory structures that 
Lucene creates which do not count against that buffer.
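
For reference, that setting lives in the <indexConfig> section of solrconfig.xml.  If you have overridden it, it would look something like this (the value shown is just the default; 100 is what Solr uses when the element is absent):

    <indexConfig>
      <ramBufferSizeMB>100</ramBufferSizeMB>
      <!-- maxBufferedDocs can also force a flush if set; it is disabled by default -->
    </indexConfig>

Worth checking whether your solrconfig.xml sets this to something much larger than 100.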

We need to see the full OOME exception so we know what questions to 
answer next.

Are you running any third-party jars in your Solr install?  Examples of 
this would be a JDBC driver for your database software, or a custom 
component.

Thanks,
Shawn