Posted to solr-user@lucene.apache.org by Raji N <ra...@gmail.com> on 2020/05/01 00:54:59 UTC

Re: off-heap OOM

It used to occur every 3 days; we reduced the heap and it started
occurring every 5 days. We can't get much from the logs. Sometimes we
see "unable to create new native thread" in the logs, and many times there
are no exceptions at all.
When the "unable to create new native thread" error was logged, we got the
exception below, since we use CDCR. To rule out CDCR as the cause, we
disabled it as well, but we still get the OOM.

 WARN  (cdcr-update-log-synchronizer-93-thread-1) [   ] o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
        at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
        at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
        at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
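
In the trace above, the failing Thread.start call comes from IdleConnectionEvictor.start, reached through HttpSolrClient$Builder.build, so every client built on that path tries to start a thread of its own. One way to check whether some family of threads keeps growing is to group live threads by name. The following is only a minimal sketch using the plain JDK; the class name ThreadFamilies and the digit-masking rule are illustrative assumptions, and it only sees threads of the JVM it runs in (for a running Solr, a thread dump carries the same information).

```java
import java.util.Map;
import java.util.TreeMap;

/** Groups live JVM threads by name "family" so that a growing family stands out. */
class ThreadFamilies {

    /** Masks every run of digits with '#', e.g. "pool-3-thread-12" becomes "pool-#-thread-#". */
    static String family(String threadName) {
        return threadName.replaceAll("\\d+", "#");
    }

    /** Counts the live threads of this JVM per masked name. */
    static Map<String, Integer> countLiveThreads() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(family(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // One line per family: count, then masked name.
        countLiveThreads().forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

Comparing two such listings taken some hours apart would show which family, if any, is accumulating.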

Thanks,
Raji
On Thu, Apr 30, 2020 at 12:24 AM Mikhail Khludnev <mk...@apache.org> wrote:

> Raji, what exactly does that "OOM for solr occur in every 5 days." look like?
> What is the error message? Where exactly is it occurring?
>
> On Thu, Apr 30, 2020 at 1:30 AM Raji N <ra...@gmail.com> wrote:
>
> > Thanks so much, Jan. We will try your suggestions. Yes, we are also running
> > Solr inside Docker.
> >
> > Thanks,
> > Raji
> >
> > On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl <ja...@cominvent.com> wrote:
> >
> > > I have seen the same, but only in Docker.
> > > I think it does not relate to Solr’s off-heap usage for filters and other
> > > data structures, but rather how Docker treats memory-mapped files as
> > > virtual memory.
> > > As you know, when using MMapDirectoryFactory, you actually let Linux
> > > handle the loading and unloading of the index files, and Solr will access
> > > them as if they were in a huge virtual memory pool. Naturally the index
> > > files grow large, and there is something strange going on in the way Docker
> > > handles this, leading to OOM, not for Java heap but for the process.
> > >
> > > I have no definitive answer, but so far my research has found a few
> > > possible settings
> > >
> > > Set env.var MALLOC_ARENA_MAX=2
> > > Try to limit -XX:MaxDirectMemorySize
> > > Lower mem swappiness in Docker (--memory-swappiness 0)
> > > More generic insight into java mem allocation in Docker:
> > > https://dzone.com/articles/native-memory-allocation-in-examples
> > >
> > > Have not yet found a silver bullet, so very interested in this thread.
> > >
> > > Jan
> > >
> > > > > On 29 Apr 2020, at 19:26, Raji N <ra...@gmail.com> wrote:
> > > > >
> > > > > Thank you for your reply. When the OOM happens, somehow it doesn't
> > > > > generate a dump file, so we take hourly heap dumps to diagnose this
> > > > > issue. The heap is around 700MB and there are around 150 threads. But
> > > > > 29GB of native memory is used up; it is consumed by
> > > > > java.nio.DirectByteBufferR (27GB, the major consumer) and
> > > > > java.nio.DirectByteBuffer objects.
> > > > >
> > > > > We use Solr 7.6.0 in SolrCloud mode and the OS is Alpine. Java version:
> > > > >
> > > > > java -version
> > > > > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> > > > > java version "1.8.0_211"
> > > > > Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> > > > > Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> > > > >
> > > >
> > > > Thanks much for taking a look at it.
> > > >
> > > > Raji
> > > >
> > > >
> > > >
> > > > > On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey <ap...@elyograg.org> wrote:
> > > > >
> > > > >> On 4/29/2020 2:07 AM, Raji N wrote:
> > > > >>> Has anyone encountered an off-heap OOM? We are thinking of reducing
> > > > >>> the heap further and increasing the hard commit interval. Any other
> > > > >>> suggestions? Please share your thoughts.
> > > > >>
> > > > >> It sounds like it's not heap memory that's running out.
> > > > >>
> > > > >> When the OutOfMemoryError is logged, it will also contain a message
> > > > >> mentioning which resource ran out.
> > > > >>
> > > > >> A common message that might be logged with the OOME is "unable to
> > > > >> create new native thread".  This type of error, if that's what's
> > > > >> happening, actually has nothing at all to do with memory; OOME is
> > > > >> just how Java happens to report it.
> > > > >>
> > > > >> You will need to know exactly which resource is running out before we
> > > > >> can offer any assistance.
> > > > >>
> > > > >> If the OOME is logged, the message you're looking for will be in the
> > > > >> solr log, not the tiny special log that is created when Solr is killed
> > > > >> by an OOME.  What version of Solr are you running, and what OS is it
> > > > >> running on?
> > > >>
> > > >> Thanks,
> > > >> Shawn
> > > >>
> > >
> > >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
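
The quoted discussion above turns on two kinds of off-heap memory: direct buffers (the DirectByteBuffer objects seen in the heap dumps, which -XX:MaxDirectMemorySize is meant to cap) and memory-mapped index files. The JDK exposes both through BufferPoolMXBean, under the pool names "direct" and "mapped". What follows is a minimal sketch rather than a diagnosis; the class name BufferPools and the helper usedBytes are illustrative assumptions, and the numbers describe only the JVM the code runs in.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Reports the JVM's "direct" and "mapped" buffer pools. */
class BufferPools {

    /** Returns the bytes used by the named pool, or -1 if no such pool is reported. */
    static long usedBytes(String poolName) {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Allocate 1 MiB off-heap so the "direct" pool has something to show.
        ByteBuffer held = ByteBuffer.allocateDirect(1 << 20);
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: buffers=%d used=%d capacity=%d%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
        }
        System.out.println("held buffer capacity: " + held.capacity());
    }
}
```

Logging these values periodically would show whether the "direct" pool or the "mapped" pool is the one that grows between restarts.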

Re: off-heap OOM

Posted by Mikhail Khludnev <mk...@apache.org>.
I don't know exactly, but couldn't it be hitting a host-wide thread limit?
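
For reference, on Linux the limits in question can be read from a few well-known files. This is a minimal sketch that assumes a Linux host; the class name ThreadLimits is illustrative, and which cgroup file exists depends on whether the host uses cgroup v1 or v2, so a missing file is reported rather than treated as an error.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Prints thread and PID limits visible from inside the current host or container. */
class ThreadLimits {

    /** Returns the trimmed content of a limit file, or empty if it cannot be read. */
    static Optional<String> readLimit(Path file) {
        try {
            return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
        } catch (IOException | SecurityException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "/proc/sys/kernel/threads-max",   // system-wide thread limit
            "/proc/sys/kernel/pid_max",       // highest PID/TID the kernel will hand out
            "/sys/fs/cgroup/pids/pids.max",   // cgroup v1 task limit of this container
            "/sys/fs/cgroup/pids.max"         // cgroup v2 task limit of this container
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " = "
                    + readLimit(Paths.get(candidate)).orElse("(not readable here)"));
        }
    }
}
```

The per-user process limit (ulimit -u) also counts threads and is worth checking alongside these values.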

On Fri, May 1, 2020 at 11:02 AM Raji N <ra...@gmail.com> wrote:

> Thanks for your reply. Sure, we will take a look at the Docker host log. But
> even when we got the "unable to create new native thread" error, the heap dump
> taken within the hour before the OOM (we generate heap dumps hourly) did not
> have more than 150 to 160 threads. So it doesn't look like it happens due
> to running out of threads. We rather suspect it happens because there is no
> native memory left?
>
> Thanks,
> Raji
>
> On Fri, May 1, 2020 at 12:13 AM Mikhail Khludnev <mk...@apache.org> wrote:
>
> > > java.lang.OutOfMemoryError: unable to create new native thread
> > This usually means a code flaw, but there is a workaround: trigger heap GC.
> > It happens when an app creates threads instead of pooling them properly and
> > no GC occurs, so Java Thread objects hang around in the heap in a stopped
> > state, each of them still holding a native thread handle, and sooner or
> > later the system runs out of native threads. In that case, reducing the
> > heap size makes GC run more often, which frees the native threads so the
> > app is able to recycle them. But you are right, it's better to disable it.
> > Also, check the Docker host log; there's a specific error message for Java
> > under Docker.
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>


-- 
Sincerely yours
Mikhail Khludnev

Re: off-heap OOM

Posted by Raji N <ra...@gmail.com>.
Thanks for your reply. Sure, we will take a look at the Docker host log. But
even when we got the "unable to create new native thread" error, the heap dump
taken within the hour before the OOM (we generate heap dumps hourly) did not
have more than 150 to 160 threads. So it doesn't look like it happens due
to running out of threads. We rather suspect it happens because there is no
native memory left?
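
A heap dump shows only the threads alive at the moment it was taken. To see whether threads are being created and discarded between dumps, the live count can be compared with the number of threads started since JVM start, both of which ThreadMXBean reports. This is a minimal sketch using the plain JDK; the class name ThreadChurn and its helpers are illustrative assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Contrasts the live thread count with the number of threads ever started. */
class ThreadChurn {

    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Threads created and started since the JVM came up. */
    static long totalStarted() {
        return THREADS.getTotalStartedThreadCount();
    }

    /** One-line summary: live now, peak so far, started in total. */
    static String snapshot() {
        return String.format("live=%d peak=%d started=%d",
                THREADS.getThreadCount(), THREADS.getPeakThreadCount(), totalStarted());
    }

    /** Starts a thread that exits immediately and waits for it to finish. */
    static void runShortLivedThread() {
        Thread t = new Thread(() -> { });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
        for (int i = 0; i < 100; i++) {
            runShortLivedThread();
        }
        // "live" stays roughly flat while "started" has grown by about 100.
        System.out.println(snapshot());
    }
}
```

A steadily climbing started count next to a flat live count would point to thread churn even when every heap dump shows only 150 to 160 threads.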

Thanks,
Raji

On Fri, May 1, 2020 at 12:13 AM Mikhail Khludnev <mk...@apache.org> wrote:

> > java.lang.OutOfMemoryError: unable to create new native thread
> This usually means a code flaw, but there is a workaround: trigger heap GC.
> It happens when an app creates threads instead of pooling them properly and
> no GC occurs, so Java Thread objects hang around in the heap in a stopped
> state, each of them still holding a native thread handle, and sooner or
> later the system runs out of native threads. In that case, reducing the
> heap size makes GC run more often, which frees the native threads so the
> app is able to recycle them. But you are right, it's better to disable it.
> Also, check the Docker host log; there's a specific error message for Java
> under Docker.
>
> --
> Sincerely yours
> Mikhail Khludnev
>

Re: off-heap OOM

Posted by Mikhail Khludnev <mk...@apache.org>.
> java.lang.OutOfMemoryError: unable to create new native thread
This usually means a code flaw, but there is a workaround: trigger heap GC.
It happens when an app creates threads instead of pooling them properly and
no GC occurs, so Java Thread objects hang around in the heap in a stopped
state, each of them still holding a native thread handle, and sooner or
later the system runs out of native threads. In that case, reducing the
heap size makes GC run more often, which frees the native threads so the
app is able to recycle them. But you are right, it's better to disable it.
Also, check the Docker host log; there's a specific error message for Java
under Docker.
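
To illustrate what "proper pooling" means here, the following minimal sketch runs many small tasks on a fixed-size pool, so the number of native threads is bounded by the pool size rather than by the number of tasks. It uses only the JDK; the class name PooledTasks and the helper runOnPool are illustrative assumptions, not code from Solr.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Runs many tasks on a bounded pool instead of starting one thread per task. */
class PooledTasks {

    /** Executes the given number of tasks and returns the names of the worker threads used. */
    static Set<String> runOnPool(int tasks, int poolSize) {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < tasks; i++) {
                // Each task only records which worker thread ran it.
                pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown(); // stop accepting work; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return workerNames;
    }

    public static void main(String[] args) {
        Set<String> workers = runOnPool(1000, 4);
        System.out.println("1000 tasks ran on " + workers.size() + " worker thread(s)");
    }
}
```

The same idea applies to objects that start threads internally: creating one and reusing it, then closing it when done, keeps the thread count bounded.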

-- 
Sincerely yours
Mikhail Khludnev