Posted to users@solr.apache.org by Vincenzo D'Amore <v....@gmail.com> on 2022/05/03 23:01:30 UTC

Solr - frequent OOM

Hi all,

I'm tuning a SolrCloud 5.4.1 deployment (3 nodes, 12 cores each, 18 GB RAM)
that is experiencing frequent OutOfMemoryError exceptions (20 a day in
total) during the execution of a group query.

The query uses group.limit=1, but rows ranges between 1000 and 10000.
I'm analyzing the Solr query, and I've added a few JVM parameters to dump
the active threads and the allocated memory to better analyze the OOM.
But I was curious to ask, in your experience, how worried I should be about
the OOM(s).
In other words, I'm working to remove them ASAP, but when an OOM happens,
is Solr's behaviour completely compromised, or does Solr seamlessly return
to working normally?
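
One way to shrink a request like the one described above is to ask for the
groups in smaller pages instead of 1000 to 10000 rows at once. The sketch
below only builds the request parameters; it does not contact Solr. The
parameter names (group, group.field, group.limit, start, rows) are Solr's
result-grouping parameters, where rows and start count groups. The field
name category_id, the page size, and the helper name are hypothetical.
Whether paging actually lowers peak heap for this workload is an
assumption: in a distributed query each shard still has to collect
start + rows groups, so deep pages get more expensive.

```python
from urllib.parse import urlencode


def grouped_query_pages(base_params, group_field, total_groups, page_size=200):
    """Yield one Solr parameter dict per page of groups (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, total_groups, page_size):
        params = dict(base_params)  # copy so the caller's dict is untouched
        params.update({
            "group": "true",
            "group.field": group_field,
            "group.limit": 1,  # one document per group, as in the original query
            "start": start,    # offset into the list of groups
            "rows": min(page_size, total_groups - start),  # groups on this page
        })
        yield params


# Example: 1000 groups fetched 400 at a time instead of in a single request.
pages = list(grouped_query_pages({"q": "*:*", "wt": "json"},
                                 "category_id", total_groups=1000, page_size=400))
query_strings = [urlencode(p) for p in pages]
```

Each entry in query_strings can be appended to a collection's /select URL
by whatever HTTP client is already in use.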

Best regards,
Vincenzo



-- 
Vincenzo D'Amore

Re: Solr - frequent OOM

Posted by Ritvik Sharma <ri...@gmail.com>.
Try updating to the latest version of SolrCloud.

Note: there have been massive changes since 5.4.1.


Re: Solr - frequent OOM

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/3/2022 5:01 PM, Vincenzo D'Amore wrote:
> I'm tuning a SolrCloud 5.4.1 deployment (3 nodes, 12 cores each, 18 GB RAM)
> that is experiencing frequent OutOfMemoryError exceptions (20 a day in
> total) during the execution of a group query.
>
> The query uses group.limit=1, but rows ranges between 1000 and 10000.
> I'm analyzing the Solr query, and I've added a few JVM parameters to dump
> the active threads and the allocated memory to better analyze the OOM.
> But I was curious to ask, in your experience, how worried I should be about
> the OOM(s).
> In other words, I'm working to remove them ASAP, but when an OOM happens,
> is Solr's behaviour completely compromised, or does Solr seamlessly return
> to working normally?

As others have said, Java program state when OOME occurs is completely 
unpredictable.  For Solr, anything could happen, including index corruption.

This is why when Solr is started via the bin/solr shell script, it is 
started with a java parameter that will cause it to commit suicide 
whenever OOME occurs.  This functionality has not yet been implemented 
on Windows.  Starting in 9.0, because the minimum Java version will be 
11, I think we can alter the way that works so equivalent functionality 
will exist on Windows.

Solr does NOT come with anything that will restart after OOME ... 
because chances are that if you encounter OOME once, it will continue to 
happen until you fix the problem.  Anything that anyone has which 
restarts Solr automatically is something they implemented -- Solr will 
not do this out of the box.  I don't recommend implementing anything 
like that.  Solr normally does NOT crash.  If it does crash, there is 
usually something VERY wrong that needs to be fixed.

There are precisely two ways to deal with OOME.  One is to increase the 
available amount of the resource that has been depleted, which might not 
actually be memory.  The other is to change things so less of that 
resource is required -- reduce the index size, modify queries, etc.
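
Since the depleted resource might not be heap, a first step can be to read
the text that follows "java.lang.OutOfMemoryError:" in the logs. The sketch
below maps some well-known HotSpot messages to the resource they usually
point at. The mapping is a rough guide, not an exhaustive or authoritative
list, and the sample log lines and function names are made up for
illustration.

```python
import re
from collections import Counter

# Text that follows "java.lang.OutOfMemoryError:" -> resource usually depleted.
OOME_HINTS = [
    ("Java heap space", "heap"),
    ("GC overhead limit exceeded", "heap"),
    ("Metaspace", "metaspace"),
    ("Compressed class space", "metaspace"),
    ("PermGen space", "permgen"),
    ("unable to create new native thread", "native threads"),
    ("unable to create native thread", "native threads"),
    ("Direct buffer memory", "direct memory"),
    ("Map failed", "mmap / address space"),
]

OOME_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError:\s*(.+)")


def classify_oome(line):
    """Return the likely depleted resource, or None if the line has no OOME."""
    match = OOME_PATTERN.search(line)
    if match is None:
        return None
    message = match.group(1).lower()
    for hint, resource in OOME_HINTS:
        if hint.lower() in message:
            return resource
    return "unknown"


def summarize(lines):
    """Count OOMEs per resource across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        resource = classify_oome(line)
        if resource is not None:
            counts[resource] += 1
    return dict(counts)


# Made-up log lines, for illustration only.
sample_log = [
    "ERROR (qtp-1) java.lang.OutOfMemoryError: Java heap space",
    "INFO  (commitScheduler) start commit",
    "java.lang.OutOfMemoryError: unable to create new native thread",
    "Caused by: java.lang.OutOfMemoryError: Map failed",
    "ERROR (qtp-7) java.lang.OutOfMemoryError: Java heap space",
]
summary = summarize(sample_log)
```

A summary dominated by "native threads" or "mmap / address space" would
suggest looking at process and OS limits before raising the heap size.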

Thanks,
Shawn


Re: Solr - frequent OOM

Posted by Rahul Goswami <ra...@gmail.com>.
Unfortunately, in my experience, Solr doesn’t handle OOMs well and needs to
be restarted.
For example, if you have an indexing job going on or an expensive group-by
or collapse query, it will close the IndexWriter or IndexSearcher, and the
core is just defunct thereafter unless Solr is restarted. I wish there were
a restartability option in Solr, or at least the ability to auto-reload the
core, unless the JVM is completely shut down by the oom-killer.

Rahul


Re: Solr - frequent OOM

Posted by Brian Lininger <br...@veeva.com.INVALID>.
You need to restart your JVM any time you hit an OOM exception; the state of
the JVM is nondeterministic once you hit this.  There is a JVM flag to
automatically restart on OOM for this exact reason.
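
For reference, the HotSpot options usually meant here do not restart the
JVM by themselves: -XX:OnOutOfMemoryError runs a command of your choice,
-XX:+ExitOnOutOfMemoryError (JDK 8u92 and later) makes the process exit,
and -XX:+HeapDumpOnOutOfMemoryError with -XX:HeapDumpPath writes a heap
dump for later analysis. Any restart would have to come from that command
or from an external supervisor. The sketch below just assembles such an
option list; the helper name, the paths, and the script arguments are
hypothetical and need adapting to the actual installation.

```python
import shlex


def oom_jvm_flags(dump_dir, exit_on_oom=False, on_oom_command=None):
    """Build a list of HotSpot options for diagnosing and reacting to OOME."""
    flags = [
        "-XX:+HeapDumpOnOutOfMemoryError",  # write a heap dump when OOME is thrown
        f"-XX:HeapDumpPath={dump_dir}",     # where the dump file is written
    ]
    if exit_on_oom:
        # Needs JDK 8u92 or later; older JVMs do not recognize this option.
        flags.append("-XX:+ExitOnOutOfMemoryError")
    if on_oom_command:
        # The JVM runs this command on the first OOME; it is not a restart by itself.
        flags.append(f"-XX:OnOutOfMemoryError={on_oom_command}")
    return flags


# Hypothetical paths and script arguments, for illustration only.
flags = oom_jvm_flags(
    "/var/solr/dumps",
    on_oom_command="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs",
)
command_line = shlex.join(["java"] + flags)  # quoted form suitable for a shell
```

The value of command_line shows the quoting needed when the
OnOutOfMemoryError command contains spaces.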


Re: Solr - frequent OOM

Posted by matthew sporleder <ms...@gmail.com>.
In my experience, Solr handles that stuff pretty well, but I do
occasionally remember seeing lost file handles and weirdness after an OOM.
