Posted to users@solr.apache.org by Dominic Humphries <do...@adzuna.com.INVALID> on 2021/10/12 15:14:16 UTC

Solr keeps creating new threads until it dies

We run 8.3.1 in prod without any problems, but we're having issues with
trying to upgrade.

I've created an 8.9.0 leader & follower, imported our live data into it,
and am testing it via replaying requests made to prod. We're seeing a big
problem where fairly moderate request rates are causing the instance to
become so slow it fails healthcheck. The logs showed a lot of errors around
creating threads:

solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
0k, detached.

WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
java.lang.OutOfMemoryError: unable to create native thread: possibly out of
memory or process/resource limits reached

So I monitored thread count for the process whilst running the test suite
and saw a persistent pattern: Threads increased until maxed out, the logs
flooded with errors as it tried to create still more threads, and the
instance slowed down until terminated as unhealthy.

The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
regardless of value the result is the same: it gets maxed and everything
slows down.
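
An EAGAIN from pthread_create generally means a resource limit was hit rather than heap exhaustion, so it can help to see which ceiling is closest. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function name is made up for illustration, and the unit name "solr" in the usage comment is an assumption about how the service is installed.

```shell
# Print the limits that commonly cap thread creation for a process.
# $1 = PID to inspect (defaults to the calling process via /proc/self).
show_proc_limits() {
  pid="${1:-self}"
  # Per-user limit on processes/threads (RLIMIT_NPROC) as seen by that process.
  if [ -r "/proc/$pid/limits" ]; then
    grep 'Max processes' "/proc/$pid/limits"
  fi
  # System-wide ceilings on threads and PIDs.
  for f in /proc/sys/kernel/threads-max /proc/sys/kernel/pid_max; do
    if [ -r "$f" ]; then
      echo "$f: $(cat "$f")"
    fi
  done
}

# Usage (the pgrep pattern and the unit name are assumptions):
#   show_proc_limits "$(pgrep -f start.jar | head -n 1)"
#   systemctl show solr -p TasksMax   # effective systemd task limit for the unit
```

If the systemd value is the smallest of these, it is the one the thread count will run into first.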

Is there anything I can do to stop solr spinning up so many threads it
ceases to function? There have been a few test passes where it
spontaneously dropped threadcount from thousands to hundreds and stayed up
longer, but there seems no pattern to when this happens. Running the same
tests on 8.3.1 results in a much slower increase in threads and it never
quite maxes them so things continue to function.

See below for the thread count and healthcheck times seen on a (fairly
harsh) test run of 100 requests/sec

Thanks

Dominic


Threadcount:

ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar'
| wc -l; sleep 10s; done
Tue Oct 12 14:27:33 UTC 2021
52
Tue Oct 12 14:27:43 UTC 2021
52
Tue Oct 12 14:27:54 UTC 2021
52
Tue Oct 12 14:28:04 UTC 2021
52
Tue Oct 12 14:28:14 UTC 2021
569
Tue Oct 12 14:28:24 UTC 2021
899
Tue Oct 12 14:28:34 UTC 2021
1198
Tue Oct 12 14:28:44 UTC 2021
1589
Tue Oct 12 14:28:54 UTC 2021
2016
Tue Oct 12 14:29:05 UTC 2021
2451
Tue Oct 12 14:29:15 UTC 2021
2851
Tue Oct 12 14:29:26 UTC 2021
2934
Tue Oct 12 14:29:36 UTC 2021
3249
Tue Oct 12 14:29:46 UTC 2021
3501
Tue Oct 12 14:29:57 UTC 2021
3734
Tue Oct 12 14:30:07 UTC 2021
4128
Tue Oct 12 14:30:18 UTC 2021
4374
Tue Oct 12 14:30:29 UTC 2021
4637
Tue Oct 12 14:30:39 UTC 2021
4693
Tue Oct 12 14:30:50 UTC 2021
4807
Tue Oct 12 14:31:01 UTC 2021
4916
Tue Oct 12 14:31:11 UTC 2021
4916
Tue Oct 12 14:31:22 UTC 2021
Connection to 10.40.22.166 closed by remote host.
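
The ps | grep pipeline above counts every output line that mentions start.jar, which can include the grep process itself; that may be why the count peaks at 4916, one above the configured 4915, though that reading is a guess. A sketch of an alternative that reads the kernel's own per-process thread count, assuming Linux with /proc available (the function name and the pgrep pattern are illustrative, not from the original post):

```shell
# Print the number of threads of one process, taken from the
# "Threads:" field of /proc/<pid>/status.
# $1 = PID of the process to inspect.
count_threads() {
  awk '/^Threads:/ {print $2}' "/proc/$1/status"
}

# Usage (pattern assumed to match the Solr/Jetty command line):
#   pid=$(pgrep -f start.jar | head -n 1)
#   while true; do date; count_threads "$pid"; sleep 10; done
```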


Healthcheck:

ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep
10s; done
Tue Oct 12 14:27:34 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:27:34 UTC 2021
----
Tue Oct 12 14:27:44 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:27:44 UTC 2021
----
Tue Oct 12 14:27:54 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:27:54 UTC 2021
----
Tue Oct 12 14:28:04 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:28:04 UTC 2021
----
Tue Oct 12 14:28:14 UTC 2021
> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:28:16 UTC 2021
----
Tue Oct 12 14:28:26 UTC 2021
> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:12 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:28:39 UTC 2021
----
Tue Oct 12 14:28:49 UTC 2021
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--
  0> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:23 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:29:13 UTC 2021
----
Tue Oct 12 14:29:23 UTC 2021
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--
  0> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:29:25 UTC 2021
----
Tue Oct 12 14:29:35 UTC 2021
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--
  0> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:29:44 UTC 2021
----
Tue Oct 12 14:29:54 UTC 2021
> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:11 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:30:06 UTC 2021
----
Tue Oct 12 14:30:16 UTC 2021
> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:30:20 UTC 2021
----
Tue Oct 12 14:30:30 UTC 2021
> GET /solr/ HTTP/1.1
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--
  0< HTTP/1.1 200 OK
Tue Oct 12 14:30:33 UTC 2021
----
Tue Oct 12 14:30:43 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:30:43 UTC 2021
----
Tue Oct 12 14:30:53 UTC 2021
> GET /solr/ HTTP/1.1
Tue Oct 12 14:30:55 UTC 2021
----
Tue Oct 12 14:31:05 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:31:05 UTC 2021
----
Tue Oct 12 14:31:15 UTC 2021
> GET /solr/ HTTP/1.1
< HTTP/1.1 200 OK
Tue Oct 12 14:31:15 UTC 2021
----
Connection to 10.40.22.166 closed by remote host.

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
This looks more like a memory leak than a thread issue.

On Wed, 13 Oct 2021, 04:33 Joel Bernstein, <jo...@gmail.com> wrote:

> There is a thread dump on the Solr admin. You can use that to determine
> what all those threads are doing and where they are getting stuck. You can
> post parts of the thread dump back to this email thread as well.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
>
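
Besides the thread dump screen in the Solr admin UI mentioned above, a dump can be summarized from the command line. This is a sketch only: it assumes the JDK's jstack tool is available and that the dump uses the usual "java.lang.Thread.State: X" lines; the function name is hypothetical.

```shell
# Read a JVM thread dump on stdin and print a count per thread state,
# most common state first.
summarize_thread_states() {
  grep 'java.lang.Thread.State:' \
    | sed -e 's/^.*java\.lang\.Thread\.State: *//' -e 's/ (.*$//' \
    | sort | uniq -c | sort -rn
}

# Usage, assuming $pid holds the Solr process ID:
#   jstack "$pid" | summarize_thread_states
```

A large and growing TIMED_WAITING count would point at idle pooled threads rather than threads stuck doing work.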

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Okay, I think we have a suspect!

I confirmed that the replicas are read-only and no updates are expected
from queries. Then I took a look at replication status and noticed the
following:
      "replicationFailedAtList":["Mon Oct 18 12:35:44 UTC 2021",
        "Mon Oct 18 12:01:22 UTC 2021",
        "Mon Oct 18 11:52:31 UTC 2021",
        "Mon Oct 18 11:05:11 UTC 2021",
        "Mon Oct 18 10:40:59 UTC 2021",
        "Mon Oct 18 10:11:42 UTC 2021",
        "Wed Oct 13 15:07:10 UTC 2021"],

Multiple failures yesterday (when I was testing all day), and before that
nothing since Wednesday last week (when I was last running tests).

So I disabled polling for updates on the replica and ran the hour-long test
cycle a couple times. Success rate jumped straight up to the high 90s at
last!

I'm unsure why replication should cause an issue for the upgrade but it
does seem like a candidate for the problem.
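
For anyone wanting to repeat this check, one way to inspect replication status and toggle polling is the ReplicationHandler HTTP API on the follower. This is a sketch, not necessarily how it was done here: the helper name is made up, and the host and core name "mycore" are placeholders.

```shell
# Build a ReplicationHandler URL.
# $1 = host:port, $2 = core name, $3 = command (details, disablepoll, enablepoll)
replication_url() {
  printf 'http://%s/solr/%s/replication?command=%s\n' "$1" "$2" "$3"
}

# Usage (core name "mycore" is hypothetical):
#   curl -s "$(replication_url localhost:8983 mycore details)"      # status, incl. failure times
#   curl -s "$(replication_url localhost:8983 mycore disablepoll)"  # stop polling the leader
#   curl -s "$(replication_url localhost:8983 mycore enablepoll)"   # resume polling
```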

On Mon, 18 Oct 2021 at 18:00, Deepak Goel <de...@gmail.com> wrote:

> This (java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb)
> clearly indicates data contention. We just need to find out what.
>
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Mon, Oct 18, 2021 at 10:07 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > The test is essentially a dump of requests made to our live service,
> > replayed on our test instances. They're purely read-only, so there should
> > be no contention in terms of writing updates - we update solr via a
> > completely separate mechanism.
> >
> > On Mon, 18 Oct 2021 at 17:30, Deepak Goel <de...@gmail.com> wrote:
> >
> > > One thing that concerns me is the input of the test. Is the data/docs
> > > limited to a particular number? There could be a possibility that all
> > > threads are getting locked doing an update of a particular document
> > > (rather than updating a large number of documents). This would create a
> > > bottleneck and performance would not show any improvement even if you
> > > increase the capacity of servers or add more threads or tune the GC-cycle.
> > >
> > >
> > > Deepak
> > >
> > > On Mon, Oct 18, 2021 at 9:04 PM Dominic Humphries
> > > <do...@adzuna.com.invalid> wrote:
> > >
> > > > Apologies for not replying earlier, illness struck last week :(
> > > >
> > > > So, I did some testing with increased Xss and Xmx values for 8.9.0 as
> > > > suggested.
> > > >
> > > > Boosting Xss from 256k to 1024k made no impact on any tests.
> > > >
> > > > Boosting from -Xmx7913m to -Xmx15826m did improve some things: solr was
> > > > less likely to become unresponsive when it maxed out on creating threads
> > > > during high-intensity testing, and increased test scores significantly
> > > > when running low-intensity short-lived tests - from ~70% to 100% success
> > > > testing at 10 requests/sec. So that's all good.
> > > >
> > > > Sadly, it hasn't fixed everything: Running the low-rate 10/sec on
> > > > one-hour testing cycles has shown no improvement: It's still in the
> > > > high-80s, low-90s compared to the high-90s of 8.3.1 running the same
> > > > tests with the same settings. The failures are caused by a higher
> > > > incidence of "no response" rather than receiving any error. Also at this
> > > > long&slow test cycle, there was a significant difference in threads being
> > > > run by the two versions: 8.3.1 hovering at around 350 whilst 8.9.0 had
> > > > nearly a thousand. In both cases, admin interface showed that almost all
> > > > threads were the
> > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject in
> > > > TIMED_WAITING status.
> > > >
> > > > Since the "long & low intensity" tests are the closest reflection of the
> > > > demands that would be placed on the instances in production use, this
> > > > remains a problem for us: On current showing, even though we're running
> > > > 8.9.0 on instances with faster CPUs and the same heap settings as 8.3.1,
> > > > we see more latency and more failures. Neither RAM nor CPU use is
> > > > concerningly high on either type; load is at around 1 on both types. I'm
> > > > at a loss as to why 8.9.0 is failing to respond to so many more requests
> > > > than 8.3.1 unless we fall back to the "threads aren't terminating when
> > > > they should" possibility. I haven't found a way to set the threshold for
> > > > the timed threads, so if there is a way I'd be happy to try it?
> > > >
> > > > Thanks
> > > >
> > > > Dominic
> > > >
> > > > On Thu, 14 Oct 2021 at 06:28, Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > This might not be exactly what you are looking for, but it might give
> > > > > some clues:
> > > > > https://www.eclipse.org/lists/jetty-users/msg07971.html
> > > > >
> > > > > I think you definitely need to increase your heap size. There might be
> > > > > no memory-leaks, it might be just that the load which you are pushing
> > > > > into Solr always requires a particular amount of memory.
> > > > >
> > > > > Deepak
> > > > >
> > > > > On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:
> > > > >
> > > > > > I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> > > > > > limit (256Kb) and need to manually increase it.
> > > > > > I seem to recall that it should be at the very least 1Mb (or so it says
> > > > > > here
> > > > > > <https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34>)
> > > > > >
> > > > > > Maybe removing that parameter or setting it to 1024Kb may change your
> > > > > > results
> > > > > > On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
> > > > > >
> > > > > > > There could be parameters which specify:
> > > > > > >
> > > > > > > 1. How long do old threads live? And when they can be reused.
> > > > > > >
> > > > > > > 2. You might also be reaching memory restrictions for creating new
> > > > > > > threads.
> > > > > > >
> > > > > > > 3. This is a bit offbeat thinking. But is there a lot of disk
> > > > > > > activities happening as threads are nothing but files.
> > > > > > >
> > > > > > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <dominic@adzuna.com.invalid>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > > > > > > doesn't just re-use what it has, it keeps trying to create new ones.
> > > > > > > > The logs *flood* with these errors:
> > > > > > > >
> > > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > >
> > > > > > > >
> > > > > > > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > > > >         at java.base/java.lang.Thread.start0(Native Method)
> > > > > > > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > > > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > > > > > > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I'm sorry, did you post the thread error? I might have missed it.
> > > > > > > > > Can you please post it again?
> > > > > > > > >
> > > > > > > > > Deepak
> > > > > > > > >
> > > > > > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Keeping threads around for re-use would certainly make sense to
> > > > > > > > > > me, but when the code starts erroring because it's trying to
> > > > > > > > > > create ever-more new ones rather than re-use old ones, it seems
> > > > > > > > > > there must be a bug
> > > > > > > > > >
> > > > > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > We will have to check the solr logic. Threads waiting might get
> > > > > > > > > > > reused in the future, so might be good for all you know.
> > > > > > > > > > >
> > > > > > > > > > > Deepak
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Yep - I edited the two instances to remove those differences and
> > > > > > > > > > > > re-ran a fairly torturous testing cycle - 100 requests/sec.
> > > > > > > > > > > >
> > > > > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this
> > > > > > > > > > > > point and neither hit the threads limit nor failed to create new
> > > > > > > > > > > > threads. But both scaled up to well above 2k threads and even
> > > > > > > > > > > > after stopping the test, the threads remained in a timed wait
> > > > > > > > > > > > condition. So I'm still thinking there's a problem with threads
> > > > > > > > > > > > not getting terminated somewhere. Test output below if it's of
> > > > > > > > > > > > use:
> > > > > > > > > > > >
> > > > > > > > > > > > 8.3.1
> > > > > > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > > > > > >
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > > > > > > Requests      [total, rate, throughput]  6000,
> 100.02,
> > > > 25.05
> > > > > > > > > > > > Duration      [total, attack, wait]
> > 1m29.97089157s,
> > > > > > > > > 59.990075461s,
> > > > > > > > > > > > 29.980816109s
> > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]
> 11.011900024s,
> > > > > > > > 5.175300857s,
> > > > > > > > > > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > > > > > > Bytes In      [total, mean]              42385846,
> > > 7064.31
> > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > Success       [ratio]                    37.57%
> > > > > > > > > > > > Status Codes  [code:count]               0:3741
> > 200:2254
> > > > > > 500:5
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs, 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    16.45%
> > > > > > > > > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs, 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    17.43%
> > > > > > > > > > > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > > > > > > > > > > Error Set:
> > > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs, 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > > > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs, 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > > > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > > > > > > > > > Error Set:
> > > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs, 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > > > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 8.9.0
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > unexpected EOF
> > > > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > unexpected EOF
> > > > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > > > > > > > > Error Set:
> > > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > > > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > > > > > > > Error Set:
> > > > > > > > > > > > 500 Server Error
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello
> > > > > > > > > > > > >
> > > > > > > > > > > > > I can as of now see two changes:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1.  -Xmx
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > > > > > > > >
> > > > > > > > > > > > > Deepak
> > > > > > > > > > > > > > "The greatness of a nation can be judged by the way its animals are treated
> > > > > > > > > > > > > > - Mahatma Gandhi"
> > > > > > > > > > > > >
> > > > > > > > > > > > > +91 73500 12833
> > > > > > > > > > > > > deicool@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > > > > > >
> > > > > > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > > > > > >
> > > > > > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > > > > > I believe the key items are:
> > > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > > -Xmx15826m
> > > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And for 8.9.0 is
> > > > > > > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > > > > > Key:
> > > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > >
> > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh
> > > > > 8983
> > > > > > > > > > > > > /srv/solr/logs
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > > -Xmx7913m
> > > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xmx values are based on the instance RAM. Currently they're running on
> > > > > > > > > > > > > > > two different instance types, but we see the same behaviour when they're
> > > > > > > > > > > > > > > on identical types too.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Many thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Dominic
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Can you please tell us the JVM heap settings for both versions,
> > > > > > > > > > > > > > > > 8.3.1 and 8.9.0?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I will also have to look into the following code:
> > > > > > > > > > > > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Deepak
> > > > > > > > > > > > > > > > "The greatness of a nation can be judged by the way its animals are treated
> > > > > > > > > > > > > > > > - Mahatma Gandhi"
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +91 73500 12833
> > > > > > > > > > > > > > > deicool@gmail.com
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING,
> > > > > > > > > > > > > > > > > all waiting on the same thing:
> > > > > > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping
> > > > > > > > > > > > > > > > > all queries to the instance and waiting didn't result in any
> > > > > > > > > > > > > > > > > noticeable decrease in the number of threads so it looks like despite
> > > > > > > > > > > > > > > > > being timed, they're simply not getting terminated.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Restarting the service takes me back down to just 53 threads;
> > > > > > > > > > > > > > > > > re-running a test results in many new threads immediately coming into
> > > > > > > > > > > > > > > > > being, this time with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > > > > > > > > > > screen so here's the output:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    -
> > > > > > > > org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown
> > > > > > > > > > > > > > > >    Source)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > >
> > > > > > > > >
> > > > org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > >
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >    org.eclipse.jetty.io
> > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > > > > > .FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > > > > > > >
> .ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > > > > > >    -
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > > > > > >    - java.base@11.0.5
> > > > > > > > /java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > > > > > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > > > > > > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> Joel Bernstein
> > > > > > > > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > > > > > > > > > > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> > We run 8.3.1 in prod without any problems, but we're having issues with
> > > > > > > > > > > > > > > > > >> > trying to upgrade.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> > > > > > > > > > > > > > > > > >> > and am testing it via replaying requests made to prod. We're seeing a big
> > > > > > > > > > > > > > > > > >> > problem where fairly moderate request rates are causing the instance to
> > > > > > > > > > > > > > > > > >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> > > > > > > > > > > > > > > > > >> > creating threads:
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
> > > > > > > > > > > > > > > > > >> > pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
> > > > > > > > > > > > > > > > > >> > 0k, detached.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> > > > > > > > > > > > > > > > > >> > java.lang.OutOfMemoryError: unable to create native thread: possibly out of
> > > > > > > > > > > > > > > > > >> > memory or process/resource limits reached
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > So I monitored thread count for the process whilst running the test suite
> > > > > > > > > > > > > > > > > >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> > > > > > > > > > > > > > > > > >> > flooded with errors as it tried to create still more threads, and the
> > > > > > > > > > > > > > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> > > > > > > > > > > > > > > > > >> > regardless of value the result is the same: it gets maxed and everything
> > > > > > > > > > > > > > > > > >> > slows down.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > Is there anything I can do to stop solr spinning up so many threads it
> > > > > > > > > > > > > > > > > >> > ceases to function? There have been a few test passes where it
> > > > > > > > > > > > > > > > > >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> > > > > > > > > > > > > > > > > >> > longer, but there seems no pattern to when this happens. Running the same
> > > > > > > > > > > > > > > > > >> > tests on 8.3.1 results in a much slower increase in threads and it never
> > > > > > > > > > > > > > > > > >> > quite maxes them so things continue to function.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > See below for the thread count and healthcheck times seen on a (fairly
> > > > > > > > > > > > > > > > > >> > harsh) test run of 100 requests/sec
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > Thanks
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > Dominic
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > Threadcount:
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar' | wc -l; sleep 10s; done
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > > > > > > >> > 569
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > > > > > > > > > > > >> > 899
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > > > > > > > > > > > >> > 1198
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > > > > > > > > > > > >> > 1589
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > > > > > > > > > > > >> > 2016
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > > > > > > > > > > > >> > 2451
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > > > > > > > > > > > >> > 2851
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > > > > > > > > > > > >> > 2934
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > > > > > > > > > > > >> > 3249
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > > > > > > > > > > > >> > 3501
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > > > > > > > > > > > >> > 3734
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > > > > > > > > > > > >> > 4128
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > > > > > > > > > > > >> > 4374
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > > > > > > > > > > > >> > 4637
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > > > > > > > > > > > >> > 4693
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > > > > > > > > > > > >> > 4807
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > > > > > > > > > > > >> > 4916
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > > > > > > > > > > > >> > 4916
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > > > > > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > Healthcheck:
> > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep 10s; done
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:12 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:23 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:11 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >>

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
This
(java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb)
clearly indicates data contention. We just need to find out what.
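
One way to check this, as a rough sketch only: tally the thread states in a thread dump. Threads parked on a ConditionObject in TIMED_WAITING can also be idle pool threads waiting for work, whereas a large number of BLOCKED threads would be stronger evidence of lock contention. The snippet below assumes a jstack-style text dump (the Solr admin UI may present it differently), and the sample dump and the function name tally_thread_states are made up for illustration.

```python
import re
from collections import Counter

# Matches the state line that jstack-style dumps print under each thread header.
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def tally_thread_states(dump_text):
    """Count how many threads in a thread dump are in each JVM thread state."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical excerpt, invented for illustration (not taken from this thread).
SAMPLE_DUMP = '''
"qtp-example-1" #31 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-2" #32 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-3" #33 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"qtp-example-4" #34 prio=5
   java.lang.Thread.State: RUNNABLE
'''

if __name__ == "__main__":
    # Print states from most to least common.
    for state, count in tally_thread_states(SAMPLE_DUMP).most_common():
        print(f"{state}: {count}")
```

Comparing the tallies from 8.3.1 and 8.9.0 under the same load would show whether the extra threads are blocked on something or simply idle.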


Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Mon, Oct 18, 2021 at 10:07 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> The test is essentially a dump of requests made to our live service,
> replayed on our test instances. They're purely read-only, so there should
> be no contention in terms of writing updates - we update solr via a
> completely separate mechanism.
>
> On Mon, 18 Oct 2021 at 17:30, Deepak Goel <de...@gmail.com> wrote:
>
> > One thing that concerns me is the input to the test. Is the data/docs limited
> > to a particular number? There could be a possibility that all threads are
> > getting locked doing an update of a particular document (rather than
> > updating a large number of documents). This would create a bottleneck, and
> > performance would not show any improvement even if you increase the
> > capacity of servers or add more threads or tune the GC cycle.
> >
> > On Mon, Oct 18, 2021 at 9:04 PM Dominic Humphries
> > <do...@adzuna.com.invalid> wrote:
> >
> > > Apologies for not replying earlier, illness struck last week :(
> > >
> > > So, I did some testing with increased Xss and Xmx values for 8.9.0 as
> > > suggested.
> > >
> > > Boosting Xss from 256k to 1024k made no impact on any tests.
> > >
> > > Boosting from -Xmx7913m to -Xmx15826m did improve some things: solr was
> > > less likely to become unresponsive when it maxed out on creating threads
> > > during high-intensity testing, and test scores increased significantly
> > > when running low-intensity short-lived tests - from ~70% to 100% success
> > > testing at 10 requests/sec. So that's all good.
> > >
> > > Sadly, it hasn't fixed everything: running the low-rate 10/sec on
> > > one-hour testing cycles has shown no improvement. It's still in the
> > > high-80s to low-90s, compared to the high-90s of 8.3.1 running the same
> > > tests with the same settings. The failures are caused by a higher
> > > incidence of "no response" rather than receiving any error. Also, at this
> > > long & slow test cycle, there was a significant difference in the number
> > > of threads being run by the two versions: 8.3.1 hovered at around 350
> > > whilst 8.9.0 had nearly a thousand. In both cases, the admin interface
> > > showed that almost all threads were the
> > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject in
> > > TIMED_WAITING status.
> > >
> > > Since the "long & low intensity" tests are the closest reflection of the
> > > demands that would be placed on the instances in production use, this
> > > remains a problem for us: on current showing, even though we're running
> > > 8.9.0 on instances with faster CPUs and the same heap settings as 8.3.1,
> > > we see more latency and more failures. Neither RAM nor CPU use is
> > > concerningly high on either type; load is at around 1 on both types. I'm
> > > at a loss as to why 8.9.0 is failing to respond to so many more requests
> > > than 8.3.1 unless we fall back to the "threads aren't terminating when
> > > they should" possibility. I haven't found a way to set the threshold for
> > > the timed threads, so if there is a way I'd be happy to try it?
> > >
> > > Thanks
> > >
> > > Dominic
> > >
> > > On Thu, 14 Oct 2021 at 06:28, Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > Hello
> > > >
> > > > This might not be exactly what you are looking for, but it might give
> > > > some clues:
> > > > https://www.eclipse.org/lists/jetty-users/msg07971.html
> > > >
> > > > I think you definitely need to increase your heap size. There might be
> > > > no memory leaks; it might just be that the load you are pushing into
> > > > Solr always requires a particular amount of memory.
> > > >
> > > >
> > > > On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:
> > > >
> > > > > I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> > > > > limit (256 KB) and need to manually increase it.
> > > > > I seem to recall that it should be at the very least 1 MB (or so it
> > > > > says here:
> > > > > <https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34>
> > > > > )
> > > > >
> > > > > Maybe removing that parameter or setting it to 1024 KB may change your
> > > > > results
> > > > >
> > > > > On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
> > > > >
> > > > > > There could be parameters which specify:
> > > > > >
> > > > > > 1. How long do old threads live? And when they can be reused.
> > > > > >
> > > > > > 2. You might also be reaching memory restrictions for creating new
> > > > > > threads.
> > > > > >
> > > > > > 3. This is a bit offbeat thinking. But is there a lot of disk
> > > > > > activity happening, as threads are nothing but files.
> > > > > >
> > > > > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries,
> > > > > > <dominic@adzuna.com.invalid> wrote:
> > > > > >
> > > > > > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > > > > > doesn't just re-use what it has, it keeps trying to create new ones.
> > > > > > > The logs *flood* with these errors:
> > > > > > >
> > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > >
> > > > > > >
> > > > > > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > > >         at java.base/java.lang.Thread.start0(Native Method)
> > > > > > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > > > > > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > I'm sorry, did you post the thread error? I might have missed it.
> > > > > > > > Can you please post it again?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Keeping threads around for re-use would certainly make sense to
> > > > > > > > > me, but when the code starts erroring because it's trying to
> > > > > > > > > create ever-more new ones rather than re-use old ones, it seems
> > > > > > > > > there must be a bug
> > > > > > > > >
> > > > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > We will have to check the solr logic. Waiting threads might get
> > > > > > > > > > reused in the future, so they might be good for all you know.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Yep - I edited the two instances to remove those differences
> > > > > > > > > > > and re-ran a fairly tortuous testing cycle - 100 requests/sec.
> > > > > > > > > > >
> > > > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this
> > > > > > > > > > > point, and neither hit the threads limit nor failed to create
> > > > > > > > > > > new threads. But both scaled up to well above 2k threads, and
> > > > > > > > > > > even after stopping the test, the threads remained in a timed
> > > > > > > > > > > wait condition. So I'm still thinking there's a problem with
> > > > > > > > > > > threads not getting terminated somewhere. Test output below if
> > > > > > > > > > > it's of use:
> > > > > > > > > > >
> > > > > > > > > > > 8.3.1
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s, 29.980816109s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s, 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    37.57%
> > > > > > > > > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs, 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    16.45%
> > > > > > > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs, 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    17.43%
> > > > > > > > > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs, 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs, 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs, 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 8.9.0
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > unexpected EOF
> > > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > unexpected EOF
> > > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hello
> > > > > > > > > > > >
> > > > > > > > > > > > I can as of now see two changes:
> > > > > > > > > > > >
> > > > > > > > > > > > 1.  -Xmx
> > > > > > > > > > > >
> > > > > > > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > > > > I believe the key items are:
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Xmx15826m
> > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > > >
> > > > > > > > > > > > > And for 8.9.0 is
> > > > > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > > > > Key:
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Xmx7913m
> > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Xmx values are based on the instance RAM. Currently they're running on two
> > > > > > > > > > > > > > different instance types, but we see the same behaviour when they're on
> > > > > > > > > > > > > > identical types too.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Many thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > > Dominic
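
One quick way to see exactly how the two sets of key flags differ is a set difference. The sketch below is only an illustration in Python and is not something posted in the thread: the variable names are made up, and the flag strings are copied from the two key lists in the message above.

```python
# Key JVM flags as listed for each version in the message above.
FLAGS_8_3_1 = {
    "-XX:+AlwaysPreTouch",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:MaxGCPauseMillis=250",
    "-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M",
    "-Xmx15826m",
    "-Xss256k",
}

FLAGS_8_9_0 = {
    "-XX:+AlwaysPreTouch",
    "-XX:+ExplicitGCInvokesConcurrent",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:-OmitStackTraceInFastThrow",
    "-XX:MaxGCPauseMillis=250",
    "-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs",
    "-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M",
    "-Xmx7913m",
    "-Xss256k",
}

# Flags that appear on one side only.
only_in_8_3_1 = FLAGS_8_3_1 - FLAGS_8_9_0
only_in_8_9_0 = FLAGS_8_9_0 - FLAGS_8_3_1

for flag in sorted(only_in_8_3_1):
    print("8.3.1 only:", flag)
for flag in sorted(only_in_8_9_0):
    print("8.9.0 only:", flag)
```

Both lists carry the same -Xss256k thread stack size, so on this evidence the stack size does not explain the difference in behaviour between the two versions.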
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hello
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Can you please tell us the JVM heap settings for both versions, 8.3.1 and
> > > > > > > > > > > > > > > 8.9.0?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I will also have to look into the following code: FileFloatSource.java:210.
> > > > > > > > > > > > > > > (I will do it tonight, IST, and update.)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Deepak
> > > > > > > > > > > > > > > "The greatness of a nation can be judged by the way its animals are treated
> > > > > > > > > > > > > > > - Mahatma Gandhi"
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +91 73500 12833
> > > > > > > > > > > > > > deicool@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > > > > > > > > waiting on the same thing:
> > > > > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > > > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > > > > > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > > > > > > > > > > > they're simply not getting terminated.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > > > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > > > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > > > > > > > > > screen so here's the output:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > > > > > >
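
The BLOCKED trace above stops in FileFloatSource$Cache.get, waiting on a CreationPlaceholder object. That would be consistent with a cache in which the first caller for a key builds the value while later callers for the same key wait on a per-entry placeholder. The following is a generic Python sketch of that pattern, assuming that reading is right. It is not Solr's code, and every name in it (Placeholder, PlaceholderCache, slow_loader, the "external_field" key) is hypothetical.

```python
import threading
import time


class Placeholder:
    """Per-key slot; callers hold its lock while the value is created."""

    def __init__(self):
        self.lock = threading.Lock()
        self.ready = False
        self.value = None


class PlaceholderCache:
    def __init__(self, loader):
        self._loader = loader
        self._entries = {}
        self._map_lock = threading.Lock()
        self.load_count = 0

    def get(self, key):
        # Short critical section: find or register the placeholder.
        with self._map_lock:
            entry = self._entries.get(key)
            if entry is None:
                entry = Placeholder()
                self._entries[key] = entry
        # Long critical section: one thread loads, the rest block here.
        with entry.lock:
            if not entry.ready:
                entry.value = self._loader(key)
                entry.ready = True
                self.load_count += 1
        return entry.value


def slow_loader(key):
    time.sleep(0.1)  # stands in for an expensive load
    return len(key)


cache = PlaceholderCache(slow_loader)
results = []
results_lock = threading.Lock()


def worker():
    value = cache.get("external_field")
    with results_lock:
        results.append(value)


workers = [threading.Thread(target=worker) for _ in range(8)]
for t in workers:
    t.start()
for t in workers:
    t.join()

print("loads:", cache.load_count, "results:", results)
```

While the loader runs, the other seven workers sit waiting on the same placeholder lock. If the value is slow to build, or has to be rebuilt often, request threads could pile up behind it in much the same way as in the dump.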
> > > > > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > > > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > > > > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > > > > > > > > > >>
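
Once a dump has been captured, tallying threads by state and by the object they are waiting on makes the dominant pattern easy to spot. A minimal Python sketch, assuming the dump has already been parsed into simple records: the record layout, thread names and counts are invented for illustration, and only the two lock names are taken from the reply above.

```python
from collections import Counter

CONDITION = ("java.util.concurrent.locks."
             "AbstractQueuedSynchronizer$ConditionObject@3b315cbb")
PLACEHOLDER = ("org.apache.solr.search.function."
               "FileFloatSource$CreationPlaceholder@37b782de")

# Hypothetical, pre-parsed thread records (the counts are made up).
threads = (
    [{"name": f"qtp-{i}", "state": "TIMED_WAITING", "lock": CONDITION}
     for i in range(40)]
    + [{"name": f"qtp-{i}", "state": "BLOCKED", "lock": PLACEHOLDER}
       for i in range(40, 52)]
    + [{"name": "main", "state": "RUNNABLE", "lock": None}]
)


def summarize(records):
    """Count threads per state and per (state, lock) pair."""
    by_state = Counter(r["state"] for r in records)
    by_lock = Counter(
        (r["state"], r["lock"]) for r in records if r["lock"]
    )
    return by_state, by_lock


by_state, by_lock = summarize(threads)

# Largest groups first: these are the places threads are stuck.
for (state, lock), count in by_lock.most_common():
    print(f"{count:4d}  {state:<14}  {lock}")
```

Comparing such a tally between an idle instance and one under load shows which lock the extra threads are collecting on.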
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Joel Bernstein
> > > > > > > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > > > > > > > > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> > We run 8.3.1 in prod without any problems, but we're having issues with
> > > > > > > > > > > > > > > >> > trying to upgrade.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> > > > > > > > > > > > > > > >> > and am testing it via replaying requests made to prod. We're seeing a big
> > > > > > > > > > > > > > > >> > problem where fairly moderate request rates are causing the instance to
> > > > > > > > > > > > > > > >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> > > > > > > > > > > > > > > >> > creating threads:
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > So I monitored thread count for the process whilst running the test suite
> > > > > > > > > > > > > > > >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> > > > > > > > > > > > > > > >> > flooded with errors as it tried to create still more threads, and the
> > > > > > > > > > > > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> > > > > > > > > > > > > > > >> > regardless of value the result is the same: it gets maxed and everything
> > > > > > > > > > > > > > > >> > slows down.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > Is there anything I can do to stop solr spinning up so many threads it
> > > > > > > > > > > > > > > >> > ceases to function? There have been a few test passes where it
> > > > > > > > > > > > > > > >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> > > > > > > > > > > > > > > >> > longer, but there seems no pattern to when this happens. Running the same
> > > > > > > > > > > > > > > >> > tests on 8.3.1 results in a much slower increase in threads and it never
> > > > > > > > > > > > > > > >> > quite maxes them so things continue to function.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > See below for the thread count and healthcheck times seen on a (fairly
> > > > > > > > > > > > > > > >> > harsh) test run of 100 requests/sec
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > Thanks
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > Dominic
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > Threadcount:
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar' | wc -l; sleep 10s; done
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > >> > 52
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > > > > > >> > 569
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > > > > > > > > > > >> > 899
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > > > > > > > > > > >> > 1198
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > > > > > > > > > > >> > 1589
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > > > > > > > > > > >> > 2016
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > > > > > > > > > > >> > 2451
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > > > > > > > > > > >> > 2851
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > > > > > > > > > > >> > 2934
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > > > > > > > > > > >> > 3249
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > > > > > > > > > > >> > 3501
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > > > > > > > > > > >> > 3734
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > > > > > > > > > > >> > 4128
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > > > > > > > > > > >> > 4374
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > > > > > > > > > > >> > 4637
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > > > > > > > > > > >> > 4693
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > > > > > > > > > > >> > 4807
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > > > > > > > > > > >> > 4916
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > > > > > > > > > > >> > 4916
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > Healthcheck:
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> > > > > > > > > > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep
> > > > > > > > > > > > > > >> > 10s; done
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:12 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:23 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0> GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:11 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0< HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > > > > > >> > ----
> > > > > > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > > > > > >> >

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
The test is essentially a dump of requests made to our live service,
replayed on our test instances. They're purely read-only, so there should
be no contention in terms of writing updates - we update solr via a
completely separate mechanism.
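
For anyone repeating the measurements quoted in this thread: the `ps -eLF | grep 'start.jar' | wc -l` loop reports how many threads exist, but not what state they are in. The following is a minimal sketch, not something taken from the thread itself. It assumes a thread dump in the usual HotSpot `jstack` text format arrives on stdin, with one `java.lang.Thread.State:` line per thread. The function name `count_thread_states` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally JVM thread states from a jstack-style dump on stdin.
# Assumption: each thread's state is printed on a line such as
#   "   java.lang.Thread.State: TIMED_WAITING (parking)"
count_thread_states() {
  # $2 is the state name, because awk ignores the leading indentation.
  awk '/java\.lang\.Thread\.State:/ { counts[$2]++ }
       END { for (s in counts) print counts[s], s }' |
    sort -k1,1nr -k2,2   # largest count first, ties ordered by state name
}
```

Typical use would be something like `jstack "$SOLR_PID" | count_thread_states`, where `SOLR_PID` is an assumed variable holding the Solr process id. Running it on both versions under the same load would show whether the extra threads are idle pool threads in TIMED_WAITING or threads doing work.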

On Mon, 18 Oct 2021 at 17:30, Deepak Goel <de...@gmail.com> wrote:

> One thing that concerns me is the input of the test. Is the data/docs
> limited to a particular number? There could be a possibility that all
> threads are getting locked doing an update of a particular document (rather
> than updating a large number of documents). This would create a bottleneck
> and performance would not show any improvement even if you increase the
> capacity of servers or add more threads or tune the GC-cycle.
>
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Mon, Oct 18, 2021 at 9:04 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > Apologies for not replying earlier, illness struck last week :(
> >
> > So, I did some testing with increased Xss and Xmx values for 8.9.0 as
> > suggested.
> >
> > Boosting Xss from 256k to 1024k made no impact on any tests.
> >
> > Boosting from -Xmx7913m to -Xmx15826m did improve some things: solr was
> > less likely to become unresponsive when it maxed out on creating threads
> > during high-intensity testing, and increased test scores significantly
> > when running low-intensity short-lived tests - from ~70% to 100% success
> > testing at 10 requests/sec. So that's all good.
> >
> > Sadly, it hasn't fixed everything: Running the low-rate 10/sec on one-hour
> > testing cycles has shown no improvement: It's still in the high-80s,
> > low-90s compared to the high-90s of 8.3.1 running the same tests with the
> > same settings. The failures are caused by a higher incidence of "no
> > response" rather than receiving any error. Also at this long & slow test
> > cycle, there was a significant difference in threads being run by the two
> > versions: 8.3.1 hovering at around 350 whilst 8.9.0 had nearly a thousand.
> > In both cases, admin interface showed that almost all threads were the
> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject in
> > TIMED_WAITING status.
> >
> > Since the "long & low intensity" tests are the closest reflection of the
> > demands that would be placed on the instances in production use, this
> > remains a problem for us: On current showing, even though we're running
> > 8.9.0 on instances with faster CPUs and the same heap settings as 8.3.1, we
> > see more latency and more failures. Neither RAM nor CPU use is concerningly
> > high on either type; load is at around 1 on both types. I'm at a loss as to
> > why 8.9.0 is failing to respond to so many more requests than 8.3.1 unless
> > we fall back to the "threads aren't terminating when they should"
> > possibility. I haven't found a way to set the threshold for the timed
> > threads, so if there is a way I'd be happy to try it?
> >
> > Thanks
> >
> > Dominic
> >
> > On Thu, 14 Oct 2021 at 06:28, Deepak Goel <de...@gmail.com> wrote:
> >
> > > Hello
> > >
> > > This might not be exactly what you are looking for, but it might give
> > > some clues:
> > > https://www.eclipse.org/lists/jetty-users/msg07971.html
> > >
> > > I think you definitely need to increase your heap size. There might be no
> > > memory-leaks, it might be just that the load which you are pushing into
> > > Solr always requires a particular amount of memory.
> > >
> > > Deepak
> > >
> > >
> > > On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:
> > >
> > > > I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> > > > limit (256 KB) and need to manually increase it.
> > > > I seem to recall that it should be at the very least 1 MB (or so it says
> > > > here
> > > > <https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34>
> > > > )
> > > >
> > > > Maybe removing that parameter or setting it to 1024 KB may change your
> > > > results
> > > >
> > > > On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > There could be parameters which specify:
> > > > >
> > > > > 1. How long do old threads live? And when they can be reused.
> > > > >
> > > > > 2. You might also be reaching memory restrictions for creating new
> > > > > threads.
> > > > >
> > > > > 3. This is a bit offbeat thinking. But is there a lot of disk
> > > > > activities happening as threads are nothing but files.
> > > > >
> > > > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries,
> > > > > <dominic@adzuna.com.invalid> wrote:
> > > > >
> > > > > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > > > > doesn't just re-use what it has, it keeps trying to create new ones. The
> > > > > > logs *flood* with these errors:
> > > > > >
> > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > >
> > > > > >
> > > > > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > >         at java.base/java.lang.Thread.start0(Native Method)
> > > > > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > > > > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > > > > >
> > > > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > I'm sorry, did you post the thread error? I might have missed it. Can
> > > > > > > you please post it again?
> > > > > > >
> > > > > > > Deepak
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > >
> > > > > > > > Keeping threads around for re-use would certainly make sense to me,
> > > > > > > > but when the code starts erroring because it's trying to create
> > > > > > > > ever-more new ones rather than re-use old ones, it seems there must
> > > > > > > > be a bug
> > > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > We will have to check the solr logic. Threads waiting might get
> > > > > > > > > reused in the future, so might be good for all you know.
> > > > > > > > >
> > > > > > > > > Deepak
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Yep - I edited the two instances to remove those differences and
> > > > > > > > > > re-ran a fairly torturous testing cycle - 100 requests/sec.
> > > > > > > > > >
> > > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point
> > > > > > > > > > and neither hit the threads limit nor failed to create new threads.
> > > > > > > > > > But both scaled up to well above 2k threads and even after stopping
> > > > > > > > > > the test, the threads remained in a timed wait condition. So I'm
> > > > > > > > > > still thinking there's a problem with threads not getting terminated
> > > > > > > > > > somewhere. Test output below if it's of use:
> > > > > > > > > > 8.3.1
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > > > > > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s, 29.980816109s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s, 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    37.57%
> > > > > > > > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > > > > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs, 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    16.45%
> > > > > > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > > > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs, 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    17.43%
> > > > > > > > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > > > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs, 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > > > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs, 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs, 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 8.9.0
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > unexpected EOF
> > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > unexpected EOF
> > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > > > > > > Error Set:
> > > > > > > > > > > 500 Server Error
> > > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello
> > > > > > > > > > >
> > > > > > > > > > > I can as of now see two changes:
> > > > > > > > > > >
> > > > > > > > > > > 1.  -Xmx
> > > > > > > > > > >
> > > > > > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > > > > > >
> > > > > > > > > > > Deepak
> > > > > > > > > > > > "The greatness of a nation can be judged by the way its animals are treated - Mahatma Gandhi"
> > > > > > > > > > >
> > > > > > > > > > > +91 73500 12833
> > > > > > > > > > > deicool@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > > > >
> > > > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > > > >
> > > > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > CLI invocation for 8.3.1 is:
> > > > > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > > > >
> > > > > > > > > > > > > I believe the key items are:
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Xmx15826m
> > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > >
> > > > > > > > > > > > > And for 8.9.0 is:
> > > > > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > > > >
> > > > > > > > > > > > > Key:
> > > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > > -Xmx7913m
> > > > > > > > > > > > > -Xss256k
> > > > > > > > > > > >
> > > > > > > > > > > > > Xmx values are based on the instance RAM. Currently they're running on two
> > > > > > > > > > > > > different instance types, but we see the same behaviour when they're on
> > > > > > > > > > > > > identical types too.
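
The two "key items" lists quoted above can be compared mechanically rather than by eye. The following is a minimal Python sketch, assuming the lists are pasted in as plain strings, one option per line; the names `KEY_831`, `KEY_890` and `diff_flags` are hypothetical helpers for illustration and are not part of Solr or the JVM.

```python
# "Key items" lists copied from the two invocations quoted above,
# one JVM option per line.
KEY_831 = """\
-XX:+AlwaysPreTouch
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:MaxGCPauseMillis=250
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx15826m
-Xss256k
""".splitlines()

KEY_890 = """\
-XX:+AlwaysPreTouch
-XX:+ExplicitGCInvokesConcurrent
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:-OmitStackTraceInFastThrow
-XX:MaxGCPauseMillis=250
-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx7913m
-Xss256k
""".splitlines()


def diff_flags(old, new):
    """Return (options only in old, options only in new), each sorted."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


only_831, only_890 = diff_flags(KEY_831, KEY_890)

print("Only in 8.3.1:")
for flag in only_831:
    print("  " + flag)
print("Only in 8.9.0:")
for flag in only_890:
    print("  " + flag)
```

This is an exact string comparison, so an option whose value changed (such as -Xmx) shows up on both sides.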
> > > > > > > > > > > >
> > > > > > > > > > > > Many thanks
> > > > > > > > > > > >
> > > > > > > > > > > > Dominic
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Can you please tell us the JVM heap settings for both versions, 8.3.1 and
> > > > > > > > > > > > > > 8.9.0?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I will also have to look into the following code: FileFloatSource.java:210
> > > > > > > > > > > > > > (I will do it tonight, IST, and update).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Deepak
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > > > > > > > waiting on the same thing:
> > > > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > > > > > > > > decrease in the number of threads, so it looks like, despite being timed,
> > > > > > > > > > > > > > > they're simply not getting terminated.
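
One way to put numbers on an observation like this is to tally thread states from a text thread dump. The following is a minimal Python sketch, assuming a dump in the layout printed by the JDK's `jstack` tool, where each Java thread has a line such as `java.lang.Thread.State: TIMED_WAITING (parking)`. The sample dump and the names `count_thread_states` and `SAMPLE_DUMP` are made up for illustration and are not taken from the thread.

```python
import re
from collections import Counter

# jstack prints one "java.lang.Thread.State: <STATE>" line per Java thread.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text: str) -> Counter:
    """Tally the Java thread states found in a jstack-style text dump."""
    return Counter(STATE_RE.findall(dump_text))


# Made-up, heavily shortened sample in jstack's layout (hypothetical threads).
SAMPLE_DUMP = '''
"example-thread-1" #11
   java.lang.Thread.State: TIMED_WAITING (parking)
"example-thread-2" #12
   java.lang.Thread.State: TIMED_WAITING (parking)
"example-thread-3" #13
   java.lang.Thread.State: BLOCKED (on object monitor)
"example-thread-4" #14
   java.lang.Thread.State: RUNNABLE
'''

state_counts = count_thread_states(SAMPLE_DUMP)
for state, n in state_counts.most_common():
    print(f"{state:15} {n}")
```

Taking a dump before a test run, during it, and a few minutes after queries stop, then comparing the three tallies, would show whether the TIMED_WAITING threads ever drain.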
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > > (see second screenshot). The stack trace for those is too big for one
> > > > > > > > > > > > > > > screen, so here's the output:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > > > >    -
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > > > >    -
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > > > >    -
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > > > >    - java.base@11.0.5
> > > > > > /java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > > > > >

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
One thing that concerns me is the input of the test. Is the data set limited
to a particular number of docs? It is possible that all the threads are
getting locked while updating one particular document (rather than
updating a large number of documents). That would create a bottleneck, and
performance would not improve even if you increased the capacity of the
servers, added more threads, or tuned the GC cycle.
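
One way to check this idea is to tally thread states in a thread dump. The
sketch below is only an illustration and was not posted by anyone in this
thread: it assumes a dump in the text format that jstack prints, where each
thread has a "java.lang.Thread.State: ..." line. The function name and the
sample dump are made up for the example. Threads stuck behind a synchronized
monitor show up as BLOCKED, while idle pool threads and threads parked on
java.util.concurrent locks show up as WAITING or TIMED_WAITING.

```python
import re
from collections import Counter

# Matches the state line that jstack prints under each thread header, e.g.
#    java.lang.Thread.State: TIMED_WAITING (parking)
STATE_LINE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def tally_thread_states(dump_text):
    """Return a Counter mapping each thread state name to its thread count."""
    return Counter(STATE_LINE.findall(dump_text))


# Hypothetical, hand-written sample; NOT taken from the dumps in this thread.
SAMPLE_DUMP = '''
"qtp-example-1" #21 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-2" #22 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"qtp-example-3" #23 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
'''

if __name__ == "__main__":
    # Print the states from most to least common.
    for state, count in tally_thread_states(SAMPLE_DUMP).most_common():
        print(state, count)
```

If BLOCKED dominates a real dump, contention on a single lock becomes more
plausible; if nearly everything is TIMED_WAITING, the threads are more likely
to be idle than fighting over one document.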


Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Mon, Oct 18, 2021 at 9:04 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> Apologies for not replying earlier, illness struck last week :(
>
> So, I did some testing with increased Xss and Xmx values for 8.9.0 as
> suggested.
>
> Boosting Xss from 256k to 1024k made no impact on any tests.
>
> Boosting from -Xmx7913m to -Xmx15826m did improve some things: solr was
> less likely to become unresponsive when it maxed out on creating threads
> during high-intensity testing, and increased test scores significantly when
> running low-intensity short-lived tests - from ~70% to 100% success testing
> at 10 requests/sec. So that's all good.
>
> Sadly, it hasn't fixed everything: Running the low-rate 10/sec on one-hour
> testing cycles has shown no improvement: It's still in the high-80s,
> low-90s compared to the high-90s of 8.3.1 running the same tests with the
> same settings. The failures are caused by a higher incidence of "no
> response" rather than receiving any error. Also at this long&slow test
> cycle, there was a significant difference in threads being run by the two
> versions: 8.3.1 hovering at around 350 whilst 8.9.0 had nearly a thousand.
> In both cases, admin interface showed that almost all threads were the
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject in
> TIMED_WAITING status.
>
> Since the "long & low intensity" tests are the closest reflection of the
> demands that would be placed on the instances in production use, this
> remains a problem for us: On current showing, even though we're running
> 8.9.0 on instances with faster CPUs and the same heap settings as 8.3.1, we
> see more latency and more failures. Neither RAM nor CPU use is concerningly
> high on either type; load is at around 1 on both types. I'm at a loss as to
> why 8.9.0 is failing to respond to so many more requests than 8.3.1 unless
> we fall back to the "threads aren't terminating when they should"
> possibility. I haven't found a way to set the threshold for the timed
> threads, so if there is a way I'd be happy to try it?
>
> Thanks
>
> Dominic
>
> On Thu, 14 Oct 2021 at 06:28, Deepak Goel <de...@gmail.com> wrote:
>
> > Hello
> >
> > This might not be exactly what you are looking for, but it might give
> > some clues:
> > https://www.eclipse.org/lists/jetty-users/msg07971.html
> >
> > I think you definitely need to increase your heap size. There might be no
> > memory-leaks, it might be just that the load which you are pushing into
> > Solr always requires a particular amount of memory.
> >
> > Deepak
> >
> > On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:
> >
> > > I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> > > limit (256 KB) and need to manually increase it.
> > > I seem to recall that it should be at the very least 1 MB (or so it says
> > > here:
> > > https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34
> > > )
> > >
> > > Removing that parameter or setting it to 1024 KB may change your
> > > results.
> > >
> > > On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > There could be parameters which specify:
> > > >
> > > > 1. How long do old threads live? And when they can be reused.
> > > >
> > > > 2. You might also be reaching memory restrictions for creating new
> > > > threads.
> > > >
> > > > 3. This is a bit offbeat thinking. But is there a lot of disk
> > > > activities happening as threads are nothing but files.
> > > >
> > > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries,
> > > > <dominic@adzuna.com.invalid> wrote:
> > > >
> > > > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > > > doesn't just re-use what it has, it keeps trying to create new ones.
> > > > > The logs *flood* with these errors:
> > > > >
> > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > >
> > > > >
> > > > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > >         at java.base/java.lang.Thread.start0(Native Method)
> > > > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > > > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > > > >
> > > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
> > > > >
> > > > > > I'm sorry, did you post the thread error? I might have missed it.
> > > > > > Can you please post it again?
> > > > > >
> > > > > > Deepak
> > > > > >
> > > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > >
> > > > > > > Keeping threads around for re-use would certainly make sense to me,
> > > > > > > but when the code starts erroring because it's trying to create
> > > > > > > ever-more new ones rather than re-use old ones, it seems there must
> > > > > > > be a bug
> > > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > We will have to check the solr logic. Threads waiting might get
> > > > > > > > reused in the future, so might be good for all you know.
> > > > > > > >
> > > > > > > > Deepak
> > > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Yep - I edited the two instances to remove those differences and
> > > > > > > > > re-ran a fairly torturous testing cycle - 100 requests/sec.
> > > > > > > > >
> > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this
> > > > > > > > > point, and neither hit the thread limit nor failed to create new
> > > > > > > > > threads. But both scaled up to well above 2k threads and even
> > > > > > > > > after stopping the test, the threads remained in a timed wait
> > > > > > > > > condition. So I'm still thinking there's a problem with threads
> > > > > > > > > not getting terminated somewhere. Test output below if it's of
> > > > > > > > > use:
> > > > > > > > >
> > > > > > > > > 8.3.1
> > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > > > > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s, 29.980816109s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s, 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    37.57%
> > > > > > > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > > > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs, 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    16.45%
> > > > > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs, 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    17.43%
> > > > > > > > > Status Codes  [code:count]               0:4874  200:1046
> > > 500:80
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > > context deadline exceeded (Client.Timeout or context
> > > cancellation
> > > > > > while
> > > > > > > > > reading body)
> > > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > > >
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > > Duration      [total, attack, wait]      1m29.541198374s,
> > > > > > > 59.990833778s,
> > > > > > > > > 29.550364596s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s,
> > > 51.073µs,
> > > > > > > > > 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > > Status Codes  [code:count]               0:5810  200:152
> > > 500:38
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > > >
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > > Duration      [total, attack, wait]      1m29.451307081s,
> > > > > > > 59.990601937s,
> > > > > > > > > 29.460705144s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s,
> > > 50.866µs,
> > > > > > > > > 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > > Status Codes  [code:count]               0:5907  200:65
> > 500:28
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > > context deadline exceeded (Client.Timeout or context
> > > cancellation
> > > > > > while
> > > > > > > > > reading body)
> > > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > > >
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > > Duration      [total, attack, wait]      1m29.760671622s,
> > > > > > > 59.990482467s,
> > > > > > > > > 29.770189155s
> > > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s,
> > > 50.424µs,
> > > > > > > > > 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > > Status Codes  [code:count]               0:5821  200:148
> > > 500:31
> > > > > > > > > Error Set:
> > > > > > > > > 500 Server Error
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 8.9.0
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > unexpected EOF
> > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > unexpected EOF
> > > > > > > > > > 503 Service Unavailable
> > > > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > > > > > Error Set:
> > > > > > > > > > 500 Server Error
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
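The run summaries quoted above can be cross-checked with a little arithmetic. The sketch below is a minimal Python example that recomputes the success ratio and throughput for two of the quoted runs from their status-code counts and total duration. It assumes that status code 0 means no HTTP response was received, and that the reported throughput is successful responses divided by total duration; both are inferences from the numbers, not something the thread states. The function names are hypothetical.

```python
from typing import Dict


def successes(status_counts: Dict[int, int]) -> int:
    """Number of requests that came back with a 2xx status."""
    return sum(n for code, n in status_counts.items() if 200 <= code < 300)


def success_ratio(status_counts: Dict[int, int]) -> float:
    """Percentage of all requests that came back with a 2xx status."""
    total = sum(status_counts.values())
    return 100.0 * successes(status_counts) / total


def throughput(status_counts: Dict[int, int], total_seconds: float) -> float:
    """Successful responses per second over the whole run (assumed definition)."""
    return successes(status_counts) / total_seconds


# Status-code counts and total durations copied from two of the quoted reports.
runs = {
    "8.3.1, first run": ({0: 3741, 200: 2254, 500: 5}, 89.97089157),
    "8.9.0, last run": ({0: 2372, 200: 3623, 500: 5}, 89.540373624),
}

for label, (counts, seconds) in runs.items():
    print(
        f"{label}: {success_ratio(counts):.2f}% success, "
        f"{throughput(counts, seconds):.2f} successful responses/sec"
    )
```

Under those assumptions the printed values line up with the Success and throughput figures in the corresponding reports, which makes it easier to compare runs by successful responses per second rather than by the offered rate of 100 requests/sec.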
> > > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello
> > > > > > > > > >
> > > > > > > > > > I can as of now see two changes:
> > > > > > > > > >
> > > > > > > > > > 1.  -Xmx
> > > > > > > > > >
> > > > > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > > > > >
> > > > > > > > > > Deepak
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > > >
> > > > > > > > > > > > I believe the key items are:
> > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > -Xmx15826m
> > > > > > > > > > > > -Xss256k
> > > > > > > > > > >
> > > > > > > > > > > > And for 8.9.0 is
> > > > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > > >
> > > > > > > > > > > > Key:
> > > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > > -Xmx7913m
> > > > > > > > > > > > -Xss256k
> > > > > > > > > > >
> > > > > > > > > > > > Xmx values are based on the instance RAM, currently they're running on
> > > > > > > > > > > > two different instance types but we see the same behaviour when they're
> > > > > > > > > > > > on identical types too.
> > > > > > > > > > >
> > > > > > > > > > > Many thanks
> > > > > > > > > > >
> > > > > > > > > > > Dominic
> > > > > > > > > > >
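One way to avoid missing a difference between the two invocations is to compare the flag lists mechanically. The following Python sketch takes the two "key items" lists as quoted in this message and prints what is unique to each side. It only covers those lists, not the -D system properties, and the variable names are illustrative.

```python
# Key JVM flags for each version, copied from the lists in the message above.
COMMON_LOG_FLAG = (
    "-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M"
)

FLAGS_831 = [
    "-XX:+AlwaysPreTouch",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:MaxGCPauseMillis=250",
    COMMON_LOG_FLAG,
    "-Xmx15826m",
    "-Xss256k",
]

FLAGS_890 = [
    "-XX:+AlwaysPreTouch",
    "-XX:+ExplicitGCInvokesConcurrent",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:-OmitStackTraceInFastThrow",
    "-XX:MaxGCPauseMillis=250",
    "-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs",
    COMMON_LOG_FLAG,
    "-Xmx7913m",
    "-Xss256k",
]

# Plain set differences: a flag whose value changed shows up on both sides.
only_in_831 = set(FLAGS_831) - set(FLAGS_890)
only_in_890 = set(FLAGS_890) - set(FLAGS_831)

print("Only in 8.3.1:")
for flag in sorted(only_in_831):
    print("  " + flag)

print("Only in 8.9.0:")
for flag in sorted(only_in_890):
    print("  " + flag)
```

On these lists the comparison reports the differing heap size on both sides, plus three flags that appear only in the 8.9.0 invocation: ExplicitGCInvokesConcurrent, the disabled OmitStackTraceInFastThrow, and the OnOutOfMemoryError hook.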
> > > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hello
> > > > > > > > > > > >
> > > > > > > > > > > > > Can you please tell us the JVM heap settings for both versions,
> > > > > > > > > > > > > 8.3.1 and 8.9.0?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I will also have to look into the following code:
> > > > > > > > > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > > > > > > > > >
> > > > > > > > > > > > Deepak
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > > > > >
> > > > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING,
> > > > > > > > > > > > > > all waiting on the same thing:
> > > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping
> > > > > > > > > > > > > > all queries to the instance and waiting didn't result in any
> > > > > > > > > > > > > > noticeable decrease in the number of threads so it looks like
> > > > > > > > > > > > > > despite being timed, they're simply not getting terminated.
> > > > > > > > > > > > >
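A quick tally of thread states makes this kind of observation easier to track between test runs. The sketch below counts states in a text thread dump, assuming the format written by `jstack <pid>`, where each thread entry carries a `java.lang.Thread.State:` line. The sample dump is invented for illustration (only the qtp178604517-861 thread name comes from the trace quoted in this message), and `count_states` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the state line that jstack prints under each thread header.
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def count_states(dump_text: str) -> Counter:
    """Return how many threads are in each state in a jstack-style dump."""
    return Counter(STATE_RE.findall(dump_text))


# Invented sample in jstack's layout; tid/nid values are placeholders.
SAMPLE_DUMP = """
"qtp178604517-861" #861 prio=5 os_prio=0 tid=0x0000000000000001 nid=0x1 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)

"qtp178604517-862" #862 prio=5 os_prio=0 tid=0x0000000000000002 nid=0x2 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)

"qtp178604517-863" #863 prio=5 os_prio=0 tid=0x0000000000000003 nid=0x3 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
"""

counts = count_states(SAMPLE_DUMP)
for state, n in counts.most_common():
    print(f"{state}: {n}")
```

Running the same tally before a test, at peak load, and a few minutes after the load stops would show whether the TIMED_WAITING population actually shrinks once the threads' idle timeout should have expired.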
> > > > > > > > > > > > > > Restarting the service takes me back down to just 53 threads;
> > > > > > > > > > > > > > re-running a test results in many new threads immediately coming
> > > > > > > > > > > > > > into being, this time with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > > - See second screenshot. The stack trace for those is too big for
> > > > > > > > > > > > > > one screen so here's the output:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > >    -
> > > > > org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown
> > > > > > > > > > > > >    Source)
> > > > > > > > > > > > >    -
> > > > > > > > > > >
> > > > > >
> org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > > >    -
> > > > > > > > > >
> > > > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >    org.eclipse.jetty.io
> > > > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > > .FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > > > > .ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > > >    -
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > > >    - java.base@11.0.5
> > > > > /java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Joel Bernstein
> > > > > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Apologies for not replying earlier, illness struck last week :(

So, I did some testing with increased Xss and Xmx values for 8.9.0 as
suggested.

Boosting Xss from 256k to 1024k made no impact on any tests.

Boosting the heap from -Xmx7913m to -Xmx15826m did improve some things: Solr
was less likely to become unresponsive when it maxed out on creating threads
during high-intensity testing, and success rates rose significantly in
low-intensity, short-lived tests: from ~70% to 100% at 10 requests/sec. So
that's all good.

Sadly, it hasn't fixed everything. Running the low-rate 10 requests/sec test
on one-hour cycles has shown no improvement: the success rate is still in the
high 80s to low 90s, compared to the high 90s for 8.3.1 running the same
tests with the same settings. The failures are caused by a higher incidence
of "no response" rather than by any error being returned. In this long, slow
test cycle there was also a significant difference in the number of threads
run by the two versions: 8.3.1 hovered at around 350 whilst 8.9.0 had nearly
a thousand. In both cases, the admin interface showed that almost all threads
were waiting on a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject in
TIMED_WAITING state.
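
For anyone wanting to compare the two versions by numbers rather than by
eyeballing the admin page, here is a minimal sketch that tallies thread
states and pool names from a text thread dump. It assumes a jstack-style
dump (a quoted thread name on the header line, followed by a
"java.lang.Thread.State:" line); the function names and the sample text
are illustrative only, not taken from our systems.

```python
import re
from collections import Counter

# State line printed under each thread header in a jstack-style dump, e.g.
#    java.lang.Thread.State: TIMED_WAITING (parking)
STATE_LINE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")

# Thread header line: starts with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^"([^"]+)"', re.MULTILINE)


def count_thread_states(dump_text):
    """Return a Counter mapping thread state -> number of threads."""
    return Counter(STATE_LINE.findall(dump_text))


def count_thread_pools(dump_text):
    """Return a Counter mapping pool name -> number of threads.

    The pool name is the thread name with a trailing "-<number>" removed,
    so "qtp178604517-3891" is counted under "qtp178604517".
    """
    pools = Counter()
    for name in THREAD_HEADER.findall(dump_text):
        pools[re.sub(r"-\d+$", "", name)] += 1
    return pools


# Hypothetical three-thread dump, trimmed to the lines the parser reads.
SAMPLE_DUMP = '''"qtp178604517-3891" #3891 prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)

"qtp178604517-3892" #3892 prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)

"main" #1 prio=5 runnable
   java.lang.Thread.State: RUNNABLE
'''

states = count_thread_states(SAMPLE_DUMP)
pools = count_thread_pools(SAMPLE_DUMP)
```

Running this against dumps taken from 8.3.1 and 8.9.0 under the same load
would show which pool accounts for the extra threads.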

Since the "long & low intensity" tests are the closest reflection of the
demands that would be placed on the instances in production, this remains a
problem for us. On current showing, even though we're running 8.9.0 on
instances with faster CPUs and the same heap settings as 8.3.1, we see more
latency and more failures. Neither RAM nor CPU use is concerningly high on
either type; load is at around 1 on both. I'm at a loss as to why 8.9.0 is
failing to respond to so many more requests than 8.3.1, unless we fall back
to the "threads aren't terminating when they should" possibility. I haven't
found a way to set the timeout threshold for the timed-waiting threads, so if
there is a way, I'd be happy to try it.
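
To check whether idle threads are ever released once load stops, the
thread count can be sampled over time. Below is a rough sketch of the same
idea as the ps loop from my first mail, assuming Linux and reading the
"Threads:" field from /proc/<pid>/status. The function names are
illustrative, and the reader and sleep arguments exist only so the loop
can be exercised without a live process.

```python
import re
import time


def parse_thread_count(status_text):
    """Extract the integer after "Threads:" from /proc/<pid>/status text."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Threads:' line found in status text")
    return int(match.group(1))


def read_thread_count(pid):
    """Read the current thread count of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_thread_count(status_file.read())


def sample_thread_counts(pid, samples, interval_s,
                         reader=read_thread_count, sleep=time.sleep):
    """Collect `samples` thread counts, pausing `interval_s` seconds between them."""
    counts = []
    for i in range(samples):
        counts.append(reader(pid))
        if i < samples - 1:
            sleep(interval_s)
    return counts
```

For example, sample_thread_counts(pid, 30, 10) started just after a test
run ends gives five minutes of readings; a flat line there would support
the "threads aren't terminating" theory, a steady decline would not.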

Thanks

Dominic

On Thu, 14 Oct 2021 at 06:28, Deepak Goel <de...@gmail.com> wrote:

> Hello
>
> This might not be exactly what you are looking for, but it might give some
> clues:
> https://www.eclipse.org/lists/jetty-users/msg07971.html
>
> I think you definitely need to increase your heap size. There might be no
> memory-leaks, it might be just that the load which you are pushing into
> Solr always requires a particular amount of memory.
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:
>
> > I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> > limit (256Kb) and need to manually increase it.
> > I seem to recall that it should be at the very least 1Mb (or so it says
> > here
> > <https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34>)
> >
> > Maybe removing that parameter or setting it to 1024Kb may change your
> > results
> >
> > On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
> >
> > > There could be parameters which specify:
> > >
> > > 1. How long do old threads live? And when they can be reused.
> > >
> > > 2. You might also be reaching memory restrictions for creating new threads.
> > >
> > > 3. This is a bit offbeat thinking. But is there a lot of disk activities
> > > happening as threads are nothing but files.
> > >
> > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <dominic@adzuna.com.invalid>
> > > wrote:
> > >
> > > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > > doesn't just re-use what it has, it keeps trying to create new ones. The
> > > > logs *flood* with these errors:
> > > >
> > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > >
> > > >
> > > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > >         at java.base/java.lang.Thread.start0(Native Method)
> > > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > > >
> > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > > I'm sorry, did you post the thread error? I might have missed it. Can
> > > > > > you please post it again?
> > > > >
> > > > > Deepak
> > > > >
> > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > > <do...@adzuna.com.invalid> wrote:
> > > > >
> > > > > > > Keeping threads around for re-use would certainly make sense to me, but
> > > > > > > when the code starts erroring because it's trying to create ever-more new
> > > > > > > ones rather than re-use old ones, it seems there must be a bug
> > > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > We will have to check the solr logic. Threads waiting might get reused in
> > > > > > > > the future, so might be good for all you know.
> > > > > > >
> > > > > > > Deepak
> > > > > > >
> > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > >
> > > > > > > > > Yep - I edited the two instances to remove those differences and re-ran a
> > > > > > > > > fairly tortuous testing cycle - 100 requests/sec.
> > > > > > > > >
> > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point and
> > > > > > > > > neither hit the threads limit or failed to create new threads. But both
> > > > > > > > > scaled up to well above 2k threads and even after stopping the test, the
> > > > > > > > > threads remained in a timed wait condition. So I'm still thinking there's a
> > > > > > > > > problem with threads not getting terminated somewhere. Test output below if
> > > > > > > > > it's of use:
> > > > > > > >
> > > > > > > > 8.3.1
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > > > Duration      [total, attack, wait]      1m29.97089157s,
> > > > > 59.990075461s,
> > > > > > > > 29.980816109s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s,
> > > > 5.175300857s,
> > > > > > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    37.57%
> > > > > > > > Status Codes  [code:count]               0:3741  200:2254
> > 500:5
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > > Duration      [total, attack, wait]      1m29.931457331s,
> > > > > > 59.990859187s,
> > > > > > > > 29.940598144s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s,
> 57.37µs,
> > > > > > > > 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    16.45%
> > > > > > > > Status Codes  [code:count]               0:5007  200:987
> 500:6
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > > Duration      [total, attack, wait]      1m29.58141878s,
> > > > > 59.990945585s,
> > > > > > > > 29.590473195s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s,
> 58.742µs,
> > > > > > > > 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    17.43%
> > > > > > > > Status Codes  [code:count]               0:4874  200:1046
> > 500:80
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context
> > cancellation
> > > > > while
> > > > > > > > reading body)
> > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > Duration      [total, attack, wait]      1m29.541198374s,
> > > > > > 59.990833778s,
> > > > > > > > 29.550364596s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s,
> > 51.073µs,
> > > > > > > > 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > Status Codes  [code:count]               0:5810  200:152
> > 500:38
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > Duration      [total, attack, wait]      1m29.451307081s,
> > > > > > 59.990601937s,
> > > > > > > > 29.460705144s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s,
> > 50.866µs,
> > > > > > > > 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > Status Codes  [code:count]               0:5907  200:65
> 500:28
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context
> > cancellation
> > > > > while
> > > > > > > > reading body)
> > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > Duration      [total, attack, wait]      1m29.760671622s,
> > > > > > 59.990482467s,
> > > > > > > > 29.770189155s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s,
> > 50.424µs,
> > > > > > > > 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > Status Codes  [code:count]               0:5821  200:148
> > 500:31
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > >
> > > > > > > >
> > > > > > > > 8.9.0
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > Duration      [total, attack, wait]      1m29.930332927s,
> > > > > > 59.990809535s,
> > > > > > > > 29.939523392s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s,
> > > > 5.184437078s,
> > > > > > > > 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > Status Codes  [code:count]               0:3432  200:2563
> > 500:5
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > Duration      [total, attack, wait]      1m29.951119527s,
> > > > > > 59.990310608s,
> > > > > > > > 29.960808919s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s,
> > > > 4.856237339s,
> > > > > > > > 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > Status Codes  [code:count]               0:2565  200:3432
> > 500:3
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > Duration      [total, attack, wait]      1m29.971513653s,
> > > > > > 59.990882146s,
> > > > > > > > 29.980631507s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s,
> > > > 4.086936886s,
> > > > > > > > 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > Status Codes  [code:count]               0:3011  200:2800
> > 500:4
> > > > > > 503:185
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > unexpected EOF
> > > > > > > > 503 Service Unavailable
> > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > Duration      [total, attack, wait]      1m29.871026977s,
> > > > > > 59.990066513s,
> > > > > > > > 29.880960464s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s,
> > > > 5.166973696s,
> > > > > > > > 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > Status Codes  [code:count]               0:3118  200:2715
> > 500:4
> > > > > > 503:163
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > unexpected EOF
> > > > > > > > 503 Service Unavailable
> > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > Duration      [total, attack, wait]      1m29.931899376s,
> > > > > > 59.990583341s,
> > > > > > > > 29.941316035s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s,
> > > > 4.009658048s,
> > > > > > > > 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > Status Codes  [code:count]               0:3448  200:2548
> > 500:4
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context
> > cancellation
> > > > > while
> > > > > > > > reading body)
> > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to
> > > > > > > >
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > Duration      [total, attack, wait]      1m29.540373624s,
> > > > > > 59.990568262s,
> > > > > > > > 29.549805362s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s,
> > > > 5.811191827s,
> > > > > > > > 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > Status Codes  [code:count]               0:2372  200:3623
> > 500:5
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > > Hello
> > > > > > > > > >
> > > > > > > > > > So far I can see two changes:
> > > > > > > > > >
> > > > > > > > > > 1. -Xmx
> > > > > > > > > >
> > > > > > > > > > 2. -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > >
> > > > > > > > > Deepak
> > > > > > > > > "The greatness of a nation can be judged by the way its
> > animals
> > > > are
> > > > > > > > treated
> > > > > > > > > - Mahatma Gandhi"
> > > > > > > > >
> > > > > > > > > +91 73500 12833
> > > > > > > > > deicool@gmail.com
> > > > > > > > >
> > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > >
> > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > >
> > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > > I believe the key items are:
> > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > -Xmx15826m
> > > > > > > > > > > -Xss256k
> > > > > > > > > >
> > > > > > > > > > > And for 8.9.0 is
> > > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > > Key:
> > > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > > -Xmx7913m
> > > > > > > > > > > -Xss256k
> > > > > > > > > >
> > > > > > > > > > > Xmx values are based on the instance RAM. Currently they're running on
> > > > > > > > > > > two different instance types, but we see the same behaviour when
> > > > > > > > > > > they're on identical types too.
> > > > > > > > > >
> > > > > > > > > > Many thanks
> > > > > > > > > >
> > > > > > > > > > Dominic
> > > > > > > > > >
> > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > Hello
> > > > > > > > > > > >
> > > > > > > > > > > > Can you please tell us the JVM heap settings for both versions
> > > > > > > > > > > > (8.3.1 and 8.9.0)?
> > > > > > > > > > > >
> > > > > > > > > > > > I will also have to look into the following code:
> > > > > > > > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > > > > > > > >
> > > > > > > > > > > Deepak
> > > > > > > > > > > "The greatness of a nation can be judged by the way its
> > > > animals
> > > > > > are
> > > > > > > > > > treated
> > > > > > > > > > > - Mahatma Gandhi"
> > > > > > > > > > >
> > > > > > > > > > > +91 73500 12833
> > > > > > > > > > > deicool@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > > > > >
> > > > > > > > > > > "Plant a Tree, Go Green"
> > > > > > > > > > >
> > > > > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > Oh, that's very helpful to know about, thank you.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > > > > > waiting on the same thing:
> > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping
> > > > > > > > > > > > > all queries to the instance and waiting didn't result in any
> > > > > > > > > > > > > noticeable decrease in the number of threads, so it looks like,
> > > > > > > > > > > > > despite being timed, they're simply not getting terminated.
> > > > > > > > > > > >
> > > > > > > > > > > > > Restarting the service takes me back down to just 53 threads;
> > > > > > > > > > > > > re-running a test results in many new threads immediately coming into
> > > > > > > > > > > > > being, this time with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > > - see second screenshot. The stack trace for those is too big for one
> > > > > > > > > > > > > screen, so here's the output:
> > > > > > > > > > > >
> > > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > > >
> > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > >
> > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > >    -
> > > > > > > > > > > >    org.eclipse.jetty.io
> > > > > > > > > > >
> > > > > > > >
> > > > >
> > .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > .FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > >    - org.eclipse.jetty.io
> > > > > > > > > > > .ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > >    -
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > >    - java.base@11.0.5
> > > > /java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > > >
> > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to
> > > > > > > > > > > > >> determine what all those threads are doing and where they are
> > > > > > > > > > > > >> getting stuck. You can post parts of the thread dump back to this
> > > > > > > > > > > > >> email thread as well.
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > > >> Joel Bernstein
> > > > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > > > >>
> > > > > > > > > > > >>

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
Hello

This might not be exactly what you are looking for, but it might give some
clues:
https://www.eclipse.org/lists/jetty-users/msg07971.html

I think you definitely need to increase your heap size. There might be no
memory leaks; it might just be that the load you are pushing into Solr
always requires a particular amount of memory.
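
As a rough sanity check on the memory angle, the sketch below estimates how much address space the thread stacks alone would reserve. It assumes the 256k stack size shown in the pthread_create warnings, the DefaultTasksMax of 4915 from the original post, and the 1024k stack size suggested elsewhere in this thread. The helper name is hypothetical, and the result is reserved native memory, not necessarily resident memory.

```python
# Rough estimate of the address space reserved by thread stacks alone.
# Thread stacks are native memory allocated outside the Java heap.


def stack_reservation_mib(thread_count: int, stack_kib: int) -> float:
    """Return the MiB reserved for `thread_count` stacks of `stack_kib` KiB each."""
    return thread_count * stack_kib * 1024 / (1024 * 1024)


THREAD_LIMIT = 4915  # DefaultTasksMax reported in the original post

if __name__ == "__main__":
    # 256 KiB comes from the log lines; 1024 KiB is the value suggested in the thread.
    for stack_kib in (256, 1024):
        mib = stack_reservation_mib(THREAD_LIMIT, stack_kib)
        print(f"{THREAD_LIMIT} threads x {stack_kib} KiB = {mib:.2f} MiB")
```

Under these assumptions the stacks reserve about 1.2 GiB at 256k and about 4.8 GiB at 1024k. How much of that is actually resident depends on how deep each stack grows.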

Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Wed, Oct 13, 2021 at 11:10 PM Carlos Grappa <cg...@gmail.com> wrote:

> I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM
> limit (256 KB) and need to manually increase it.
> I seem to recall that it should be at the very least 1 MB, or so it says
> here:
> https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34
>
> Maybe removing that parameter or setting it to 1024 KB would change your
> results.
>
> On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:
>
> > There could be parameters which specify:
> >
> > 1. How long old threads live, and when they can be reused.
> >
> > 2. You might also be reaching memory restrictions for creating new
> > threads.
> >
> > 3. This is a bit of offbeat thinking, but is there a lot of disk activity
> > happening, as threads are nothing but files?
> >
> > On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <dominic@adzuna.com.invalid>
> > wrote:
> >
> > > Sure - when solr gets pushed enough to reach the max thread count, it
> > > doesn't just re-use what it has, it keeps trying to create new ones. The
> > > logs *flood* with these errors:
> > >
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > >
> > >
> > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > >         at java.base/java.lang.Thread.start0(Native Method)
> > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> > >
> > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > I'm sorry, did you post the thread error? I might have missed it. Can
> > > > you please post it again?
> > > >
> > > > Deepak
> > > >
> > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > > <do...@adzuna.com.invalid> wrote:
> > > >
> > > > > Keeping threads around for re-use would certainly make sense to me, but
> > > > > when the code starts erroring because it's trying to create ever more
> > > > > new ones rather than re-use old ones, it seems there must be a bug.
> > > > >
> > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> > > > > >
> > > > > > We will have to check the Solr logic. Waiting threads might get
> > > > > > reused in the future, so this might be fine for all you know.
> > > > > >
> > > > > > Deepak
> > > > > >
> > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > >
> > > > > > > Yep - I edited the two instances to remove those differences and
> > > > > > > re-ran a fairly torturous testing cycle - 100 requests/sec.
> > > > > > >
> > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point,
> > > > > > > and neither hit the thread limit nor failed to create new threads.
> > > > > > > But both scaled up to well above 2k threads, and even after stopping
> > > > > > > the test the threads remained in a timed wait condition. So I'm
> > > > > > > still thinking there's a problem with threads not getting terminated
> > > > > > > somewhere. Test output below if it's of use:
> > > > > > >
> > > > > > > 8.3.1
> > > > > > > upload: ./831_tests_results.bin to
> > > > > > >
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > > Duration      [total, attack, wait]      1m29.97089157s,
> > > > 59.990075461s,
> > > > > > > 29.980816109s
> > > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s,
> > > 5.175300857s,
> > > > > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > Success       [ratio]                    37.57%
> > > > > > > Status Codes  [code:count]               0:3741  200:2254
> 500:5
> > > > > > > Error Set:
> > > > > > > 500 Server Error
> > > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > > upload: ./831_tests_results.bin to
> > > > > > >
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > > Duration      [total, attack, wait]      1m29.931457331s,
> > > > > 59.990859187s,
> > > > > > > 29.940598144s
> > > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > > > > > > 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > Success       [ratio]                    16.45%
> > > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > > Error Set:
> > > > > > > 500 Server Error
> > > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > > upload: ./831_tests_results.bin to
> > > > > > >
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > > Duration      [total, attack, wait]      1m29.58141878s,
> > > > 59.990945585s,
> > > > > > > 29.590473195s
> > > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > > > > > > 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > Success       [ratio]                    17.43%
> > > > > > > Status Codes  [code:count]               0:4874  200:1046
> 500:80
> > > > > > > Error Set:
> > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > > > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs, 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    2.53%
> > > > > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > > > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs, 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    1.08%
> > > > > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > > > upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs, 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    2.47%
> > > > > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > >
> > > > > > >
> > > > > > > 8.9.0
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    42.72%
> > > > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    57.20%
> > > > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    46.67%
> > > > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > unexpected EOF
> > > > > > > > 503 Service Unavailable
> > > > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    45.25%
> > > > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > unexpected EOF
> > > > > > > > 503 Service Unavailable
> > > > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    42.47%
> > > > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > > > upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > > > Success       [ratio]                    60.38%
> > > > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > > > Error Set:
> > > > > > > > 500 Server Error
> > > > > > >
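For readers comparing the reports quoted above, the Success ratio can be reproduced from the Status Codes field. This is a minimal sketch, not part of the original test framework: the function names are made up for illustration, and it assumes (as far as vegeta's report format is understood here) that a response counts as a success when its code is 2xx or 3xx, and that code 0 means no HTTP response was received at all (timeouts, dropped connections).

```python
def parse_status_codes(field):
    """Turn a 'code:count' field such as '0:5810  200:152' into a dict."""
    counts = {}
    for pair in field.split():
        code, count = pair.split(":")
        counts[int(code)] = int(count)
    return counts


def success_ratio(counts):
    """Percentage of requests answered with a 2xx or 3xx status (assumed rule)."""
    total = sum(counts.values())
    ok = sum(n for code, n in counts.items() if 200 <= code < 400)
    return 100.0 * ok / total


# Status Codes fields copied from one 8.3.1 run and one 8.9.0 run quoted above.
runs = {
    "8.3.1": "0:5810  200:152  500:38",
    "8.9.0": "0:3432  200:2563  500:5",
}
for version, field in runs.items():
    counts = parse_status_codes(field)
    print(f"{version}: {success_ratio(counts):.2f}% success, "
          f"{counts.get(0, 0)} requests without an HTTP response")
```

In both runs the failures are dominated by code 0 rather than by 5xx responses, which is consistent with the 30s latency ceiling in the reports.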
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello
> > > > > > > >
> > > > > > > > I can as of now see two changes:
> > > > > > > >
> > > > > > > > 1.  -Xmx
> > > > > > > >
> > > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > > >
> > > > > > > > Deepak
> > > > > > > > > "The greatness of a nation can be judged by the way its animals are treated
> > > > > > > > > - Mahatma Gandhi"
> > > > > > > >
> > > > > > > > +91 73500 12833
> > > > > > > > deicool@gmail.com
> > > > > > > >
> > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > >
> > > > > > > > "Plant a Tree, Go Green"
> > > > > > > >
> > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > > -jar start.jar --module=http
> > > > > > > > > > I believe the key items are:
> > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > -Xmx15826m
> > > > > > > > > > -Xss256k
> > > > > > > > >
> > > > > > > > > > And for 8.9.0 is
> > > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > > > > Key:
> > > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > > -XX:+UseG1GC
> > > > > > > > > > -XX:+UseLargePages
> > > > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > > -Xmx7913m
> > > > > > > > > > -Xss256k
> > > > > > > > >
> > > > > > > > > > Xmx values are based on the instance RAM; currently they're
> > > > > > > > > > running on two different instance types, but we see the same
> > > > > > > > > > behaviour when they're on identical types too.
> > > > > > > > >
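The two "key items" lists quoted above can be compared mechanically rather than by eye. The following is a rough sketch only: `jvm_flags`, `KEY_831` and `KEY_890` are names invented here, and the flag strings are copied from the lists in the message. It treats any whitespace-separated token starting with `-X` as a JVM option, so the extra arguments to `oom_solr.sh` are ignored.

```python
def jvm_flags(listing):
    """Return the set of whitespace-separated tokens that look like -X/-XX options."""
    return {tok for tok in listing.split() if tok.startswith("-X")}


# Key items as listed for each version in the message above.
KEY_831 = """
-XX:+AlwaysPreTouch -XX:+ParallelRefProcEnabled -XX:+PerfDisableSharedMem
-XX:+UseG1GC -XX:+UseLargePages -XX:MaxGCPauseMillis=250
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx15826m -Xss256k
"""
KEY_890 = """
-XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem -XX:+UseG1GC -XX:+UseLargePages
-XX:-OmitStackTraceInFastThrow -XX:MaxGCPauseMillis=250
-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx7913m -Xss256k
"""

old, new = jvm_flags(KEY_831), jvm_flags(KEY_890)
print("only in 8.3.1:", sorted(old - new))
print("only in 8.9.0:", sorted(new - old))
```

Both versions run with `-Xss256k`, which matches the `stacksize: 256k` reported in the `pthread_create` warning, so the thread stack size is not one of the differences.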
> > > > > > > > > Many thanks
> > > > > > > > >
> > > > > > > > > Dominic
> > > > > > > > >
> > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello
> > > > > > > > > >
> > > > > > > > > > > Can you please tell us the JVM Heap Setting for both the
> > > > > > > > > > > versions: 8.3.1, 8.9.0?
> > > > > > > > > > >
> > > > > > > > > > > I will also have to look into the following code:
> > > > > > > > > > > FileFloatSource.java:210. (will do it tonite-IST and update)
> > > > > > > > > >
> > > > > > > > > > Deepak
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > > >
> > > > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING,
> > > > > > > > > > > > all waiting on the same thing:
> > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > > >
> > > > > > > > > > > > I've attached a screenshot which includes the stack trace.
> > > > > > > > > > > > Stopping all queries to the instance and waiting didn't result
> > > > > > > > > > > > in any noticeable decrease in the number of threads so it looks
> > > > > > > > > > > > like despite being timed, they're simply not getting terminated.
> > > > > > > > > > >
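Counting threads per state, and per lock being waited on, makes a pattern like the one described above easier to confirm than scrolling through the dump. The sketch below is an assumption-laden illustration: it works on `jstack`-style text rather than on the Solr admin UI output, `summarize` is a made-up helper name, and `SAMPLE` is a hand-written fragment for demonstration, not data from this thread.

```python
import re
from collections import Counter

# Patterns for the "Thread.State" line and the lock lines of a jstack-style dump.
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")
LOCK_RE = re.compile(
    r"- (?:waiting to lock|parking to wait for)\s+<[^>]+> \(a ([^)]+)\)"
)


def summarize(dump_text):
    """Return (threads per state, threads per awaited lock class)."""
    states = Counter(STATE_RE.findall(dump_text))
    locks = Counter(LOCK_RE.findall(dump_text))
    return states, locks


# Hand-written sample in jstack layout (hypothetical thread names and addresses).
SAMPLE = '''
"qtp-example-1" #101 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
	- waiting to lock <0x00000000c0000001> (a org.apache.solr.search.function.FileFloatSource$CreationPlaceholder)

"qtp-example-2" #102 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
	- waiting to lock <0x00000000c0000001> (a org.apache.solr.search.function.FileFloatSource$CreationPlaceholder)

"qtp-example-3" #103 prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
	- parking to wait for  <0x00000000c0000002> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
'''

states, locks = summarize(SAMPLE)
for state, n in states.most_common():
    print(f"{n:5d}  {state}")
for lock, n in locks.most_common():
    print(f"{n:5d}  waiting on {lock}")
```

Run against a real dump (for example one captured with `jstack <pid>`), a large count on a single lock class points at the contended code path.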
> > > > > > > > > > > > Restarting the service takes me back down to just 53 threads;
> > > > > > > > > > > > re-running a test results in many new threads immediately coming
> > > > > > > > > > > > into being, this time with a higher proportion of threads BLOCKED on
> > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > > - See second screenshot. The stack trace for those is too big for
> > > > > > > > > > > > one screen so here's the output:
> > > > > > > > > > >
> > > > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > > > >
> > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > > >
> > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > > > > > > >
> > > > > > > > > > > [image: image.png]
> > > > > > > > > > > [image: image.png]
> > > > > > > > > > >
> > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use that
> > > > > > > > > > > >> to determine what all those threads are doing and where
> > > > > > > > > > > >> they are getting stuck. You can post parts of the thread
> > > > > > > > > > > >> dump back to this email thread as well.
> > > > > > > > > > >>
> > > > > > > > > > >>
> > > > > > > > > > >>
> > > > > > > > > > >> Joel Bernstein
> > > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > > >>
> > > > > > > > > > >>
> > > > > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > > > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > > > > > > >>
> > > > > > > > > > > >> > We run 8.3.1 in prod without any problems, but we're
> > > > > > > > > > > >> > having issues with trying to upgrade.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > I've created an 8.9.0 leader & follower, imported our
> > > > > > > > > > > >> > live data into it, and am testing it via replaying
> > > > > > > > > > > >> > requests made to prod. We're seeing a big problem
> > > > > > > > > > > >> > where fairly moderate request rates are causing the
> > > > > > > > > > > >> > instance to become so slow it fails healthcheck. The
> > > > > > > > > > > >> > logs showed a lot of errors around creating threads:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed
> > > > > > > > > > > >> > to start thread - pthread_create failed (EAGAIN) for
> > > > > > > > > > > >> > attributes: stacksize: 256k, guardsize: 0k, detached.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> > > > > > > > > > > >> > java.lang.OutOfMemoryError: unable to create native
> > > > > > > > > > > >> > thread: possibly out of memory or process/resource
> > > > > > > > > > > >> > limits reached
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > So I monitored thread count for the process whilst
> > > > > > > > > > > >> > running the test suite and saw a persistent pattern:
> > > > > > > > > > > >> > Threads increased until maxed out, the logs flooded
> > > > > > > > > > > >> > with errors as it tried to create still more threads,
> > > > > > > > > > > >> > and the instance slowed down until terminated as
> > > > > > > > > > > >> > unhealthy.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising
> > > > > > > > > > > >> > and lowering it but regardless of value the result is
> > > > > > > > > > > >> > the same: it gets maxed and everything slows down.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Is there anything I can do to stop solr spinning up so
> > > > > > > > > > > >> > many threads it ceases to function? There have been a
> > > > > > > > > > > >> > few test passes where it spontaneously dropped
> > > > > > > > > > > >> > threadcount from thousands to hundreds and stayed up
> > > > > > > > > > > >> > longer, but there seems no pattern to when this
> > > > > > > > > > > >> > happens. Running the same tests on 8.3.1 results in a
> > > > > > > > > > > >> > much slower increase in threads and it never quite
> > > > > > > > > > > >> > maxes them so things continue to function.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > See below for the thread count and healthcheck times
> > > > > > > > > > > >> > seen on a (fairly harsh) test run of 100 requests/sec
> > > > > > > > > > >> >
> > > > > > > > > > >> > Thanks
> > > > > > > > > > >> >
> > > > > > > > > > >> > Dominic
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > Threadcount:
> > > > > > > > > > >> >
> > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps
> > > -eLF |
> > > > > > grep
> > > > > > > > > > >> 'start.jar'
> > > > > > > > > > >> > | wc -l; sleep 10s; done
> > > > > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > > > > > > >> > 52
> > > > > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > > > > > > >> > 52
> > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > >> > 52
> > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > >> > 52
> > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > >> > 569
> > > > > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > > > > > > >> > 899
> > > > > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > > > > > > >> > 1198
> > > > > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > > > > > > >> > 1589
> > > > > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > > > > > > >> > 2016
> > > > > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > > > > > > >> > 2451
> > > > > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > > > > > > >> > 2851
> > > > > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > > > > > > >> > 2934
> > > > > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > > > > > > >> > 3249
> > > > > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > > > > > > >> > 3501
> > > > > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > > > > > > >> > 3734
> > > > > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > > > > > > >> > 4128
> > > > > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > > > > > > >> > 4374
> > > > > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > > > > > > >> > 4637
> > > > > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > > > > > > >> > 4693
> > > > > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > > > > > > >> > 4807
> > > > > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > > > > > > >> > 4916
> > > > > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > > > > > > >> > 4916
> > > > > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > Healthcheck:
> > > > > > > > > > >> >
> > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date;
> curl
> > -v
> > > > > > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo
> > > '----';
> > > > > > sleep
> > > > > > > > > > >> > 10s; done
> > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:02
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:12
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:01
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:23
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:01
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:03
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:09
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:11
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:03
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> >   0     0    0     0    0     0      0      0
> --:--:--
> > > > > > 0:00:02
> > > > > > > > > > --:--:--
> > > > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > > > >> > ----
> > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > > > >> >
> > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Carlos Grappa <cg...@gmail.com>.
I was gonna mention point 2 as well. Maybe you're hitting the limit set by the
-Xss JVM option (256 KB per thread stack) and need to increase it manually.
I seem to recall that the default is 1 MB on 64-bit Linux (or so it says here
<https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34>
)
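
For a sense of scale, the space set aside for thread stacks grows linearly with the -Xss value. The sketch below is only arithmetic: it multiplies the roughly 4916 threads reported in the original post by each candidate stack size. The helper name stack_reserve_mib is hypothetical, and the figure is reserved address space, which the OS generally commits lazily, so treat it as an upper bound rather than resident memory.

```shell
#!/bin/sh
# Hypothetical helper: address space reserved for thread stacks, in MiB.
# $1 = number of threads, $2 = per-thread stack size in KiB (the -Xss value).
stack_reserve_mib() {
    echo $(( $1 * $2 / 1024 ))
}

# 4916 is the peak thread count reported in the original post.
echo "Xss256k:  $(stack_reserve_mib 4916 256) MiB"
echo "Xss1024k: $(stack_reserve_mib 4916 1024) MiB"
```

Whether the larger figure matters depends on which limit is actually being hit, memory or the number of tasks.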

Maybe removing that parameter or setting it to -Xss1024k would change your
results.
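
Before changing flags, it may help to confirm the real thread count and the limit in force. The `ps -eLF | grep 'start.jar' | wc -l` loop from the original post counts matching lines, which can include the grep process itself; that may be why it tops out at 4916 with DefaultTasksMax at 4915. A minimal sketch, assuming Linux with /proc mounted; count_threads and max_processes are hypothetical helper names:

```shell
#!/bin/sh
# Count the threads of one process by listing /proc/<pid>/task (Linux only).
# Unlike "ps -eLF | grep ...", this cannot count the grep process itself.
count_threads() {
    ls "/proc/$1/task" | wc -l | tr -d ' '
}

# Print the soft "Max processes" limit (RLIMIT_NPROC) of a running process.
max_processes() {
    awk '/^Max processes/ { print $3 }' "/proc/$1/limits"
}

# Demo on the current shell; for Solr, pass the PID of the java process,
# for example the one found with: pgrep -f start.jar
echo "threads: $(count_threads $$)"
echo "max processes (soft): $(max_processes $$)"
```

On a systemd host the per-unit task cap can be read with `systemctl show <unit> -p TasksMax`, where the unit name is whatever the Solr service is called locally.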

On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <de...@gmail.com> wrote:

> There could be parameters which specify:
>
> 1. How long old threads live, and when they can be reused.
>
> 2. You might also be reaching memory restrictions for creating new threads.
>
> 3. This is a bit offbeat, but is there a lot of disk activity happening,
> as threads are nothing but files?
>
> On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <do...@adzuna.com.invalid>
> wrote:
>
> > Sure - when solr gets pushed enough to reach the max thread count, it
> > doesn't just re-use what it has, it keeps trying to create new ones. The
> > logs *flood* with these errors:
> >
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> >
> >
> > > 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > >         at java.base/java.lang.Thread.start0(Native Method)
> > > java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
> > >         at java.lang.Thread.start0(Native Method) ~[?:?]
> > >         at java.lang.Thread.start(Thread.java:803) ~[?:?]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
> > >         at java.lang.Thread.run(Thread.java:834) [?:?]
> >
> > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
> >
> > > > I'm sorry, did you post the thread error? I might have missed it. Can you
> > > > please post it again?
> > >
> > > Deepak
> > > "The greatness of a nation can be judged by the way its animals are
> > treated
> > > - Mahatma Gandhi"
> > >
> > > +91 73500 12833
> > > deicool@gmail.com
> > >
> > > Facebook: https://www.facebook.com/deicool
> > > LinkedIn: www.linkedin.com/in/deicool
> > >
> > > "Plant a Tree, Go Green"
> > >
> > > Make In India : http://www.makeinindia.com/home
> > >
> > >
> > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > > <do...@adzuna.com.invalid> wrote:
> > >
> > > > > Keeping threads around for re-use would certainly make sense to me, but
> > > > > when the code starts erroring because it's trying to create ever-more new
> > > > > ones rather than re-use old ones, it seems there must be a bug.
> > > >
> > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > > We will have to check the Solr logic. Waiting threads might get reused in
> > > > > > the future, so it might be good for all you know.
> > > > >
> > > > > Deepak
> > > > > "The greatness of a nation can be judged by the way its animals are
> > > > treated
> > > > > - Mahatma Gandhi"
> > > > >
> > > > > +91 73500 12833
> > > > > deicool@gmail.com
> > > > >
> > > > > Facebook: https://www.facebook.com/deicool
> > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > >
> > > > > "Plant a Tree, Go Green"
> > > > >
> > > > > Make In India : http://www.makeinindia.com/home
> > > > >
> > > > >
> > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > > <do...@adzuna.com.invalid> wrote:
> > > > >
> > > > > > > Yep - I edited the two instances to remove those differences and re-ran a
> > > > > > > fairly torturous testing cycle - 100 requests/sec.
> > > > > > >
> > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point, and
> > > > > > > neither hit the thread limit nor failed to create new threads. But both
> > > > > > > scaled up to well above 2k threads, and even after stopping the test the
> > > > > > > threads remained in a timed wait condition. So I'm still thinking there's a
> > > > > > > problem with threads not getting terminated somewhere. Test output below if
> > > > > > > it's of use:
> > > > > >
> > > > > > 8.3.1
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > > Duration      [total, attack, wait]      1m29.97089157s,
> > > 59.990075461s,
> > > > > > 29.980816109s
> > > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s,
> > 5.175300857s,
> > > > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    37.57%
> > > > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > > Duration      [total, attack, wait]      1m29.931457331s,
> > > > 59.990859187s,
> > > > > > 29.940598144s
> > > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > > > > > 30.000885208s, 30.00105703s, 30.011268185s
> > > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    16.45%
> > > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > > Duration      [total, attack, wait]      1m29.58141878s,
> > > 59.990945585s,
> > > > > > 29.590473195s
> > > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > > > > > 30.000880433s, 30.001052307s, 30.004624884s
> > > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    17.43%
> > > > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > context deadline exceeded (Client.Timeout or context cancellation
> > > while
> > > > > > reading body)
> > > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > > Duration      [total, attack, wait]      1m29.541198374s,
> > > > 59.990833778s,
> > > > > > 29.550364596s
> > > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> > > > > > 30.000892751s, 30.001051441s, 30.009856373s
> > > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    2.53%
> > > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > > Duration      [total, attack, wait]      1m29.451307081s,
> > > > 59.990601937s,
> > > > > > 29.460705144s
> > > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> > > > > > 30.000885386s, 30.001023192s, 30.009992128s
> > > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    1.08%
> > > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > context deadline exceeded (Client.Timeout or context cancellation
> > > while
> > > > > > reading body)
> > > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > > upload: ./831_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > > Duration      [total, attack, wait]      1m29.760671622s,
> > > > 59.990482467s,
> > > > > > 29.770189155s
> > > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> > > > > > 30.000894904s, 30.00104114s, 30.016506845s
> > > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    2.47%
> > > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > >
> > > > > >
> > > > > > 8.9.0
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > > Duration      [total, attack, wait]      1m29.930332927s,
> > > > 59.990809535s,
> > > > > > 29.939523392s
> > > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s,
> > 5.184437078s,
> > > > > > 30.000885573s, 30.001088926s, 30.00946393s
> > > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    42.72%
> > > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > > Duration      [total, attack, wait]      1m29.951119527s,
> > > > 59.990310608s,
> > > > > > 29.960808919s
> > > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s,
> > 4.856237339s,
> > > > > > 30.000863563s, 30.001063417s, 30.013413816s
> > > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    57.20%
> > > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > > Duration      [total, attack, wait]      1m29.971513653s,
> > > > 59.990882146s,
> > > > > > 29.980631507s
> > > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s,
> > 4.086936886s,
> > > > > > 30.000611417s, 30.001022921s, 30.00949772s
> > > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    46.67%
> > > > > > Status Codes  [code:count]               0:3011  200:2800  500:4
> > > > 503:185
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > unexpected EOF
> > > > > > 503 Service Unavailable
> > > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > > Duration      [total, attack, wait]      1m29.871026977s,
> > > > 59.990066513s,
> > > > > > 29.880960464s
> > > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s,
> > 5.166973696s,
> > > > > > 30.000723538s, 30.001043637s, 30.009522646s
> > > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    45.25%
> > > > > > Status Codes  [code:count]               0:3118  200:2715  500:4
> > > > 503:163
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > unexpected EOF
> > > > > > 503 Service Unavailable
> > > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > > Duration      [total, attack, wait]      1m29.931899376s,
> > > > 59.990583341s,
> > > > > > 29.941316035s
> > > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s,
> > 4.009658048s,
> > > > > > 30.000888301s, 30.00107338s, 30.010319307s
> > > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    42.47%
> > > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > > context deadline exceeded (Client.Timeout or context cancellation
> > > while
> > > > > > reading body)
> > > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > > upload: ./890_tests_results.bin to
> > > > > >
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > > Duration      [total, attack, wait]      1m29.540373624s,
> > > > 59.990568262s,
> > > > > > 29.549805362s
> > > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s,
> > 5.811191827s,
> > > > > > 30.000761953s, 30.001041833s, 30.014534484s
> > > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > > Success       [ratio]                    60.38%
> > > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > > Error Set:
> > > > > > 500 Server Error
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> > > > > >
> > > > > > > Hello
> > > > > > >
> > > > > > > I can as of now see two changes:
> > > > > > >
> > > > > > > 1.  -Xmx
> > > > > > >
> > > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > > >
> > > > > > > Deepak
> > > > > > > "The greatness of a nation can be judged by the way its animals
> > are
> > > > > > treated
> > > > > > > - Mahatma Gandhi"
> > > > > > >
> > > > > > > +91 73500 12833
> > > > > > > deicool@gmail.com
> > > > > > >
> > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > >
> > > > > > > "Plant a Tree, Go Green"
> > > > > > >
> > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > >
> > > > > > > > > CLI invocation for 8.3.1 is
> > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > > > -jar start.jar --module=http
> > > > > > > > > I believe the key items are:
> > > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > > -XX:+UseG1GC
> > > > > > > > > -XX:+UseLargePages
> > > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > > -Xmx15826m
> > > > > > > > > -Xss256k
> > > > > > > > >
> > > > > > > > And for 8.9.0 is
> > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250
> > > > > > > > -XX:+UseLargePages
> > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > -Dsolr.jetty.inetaccess.includes=
> > > > > > > > -Dsolr.jetty.inetaccess.excludes=
> > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983
> > > > > > > > -Dsolr.log.dir=/srv/solr/logs
> > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks
> > > > > > > > -Duser.timezone=UTC
> > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > -Djetty.home=/usr/local/solr/server
> > > > > > > > -Dsolr.solr.home=/srv/solr/data
> > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar
> > > > > > > > --module=http
> > > > > > > > Key:
> > > > > > > > -XX:+AlwaysPreTouch
> > > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > > -XX:+UseG1GC
> > > > > > > > -XX:+UseLargePages
> > > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > > -Xmx7913m
> > > > > > > > -Xss256k
> > > > > > > >
> > > > > > > > Xmx values are based on the instance RAM, currently they're running on
> > > > > > > > two different instance types but we see the same behaviour when
> > > > > > > > they're on identical types too.
> > > > > > > >
> > > > > > > > Many thanks
> > > > > > > >
> > > > > > > > Dominic
> > > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deicool@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hello
> > > > > > > > >
> > > > > > > > > Can you please tell us the JVM heap settings for both versions,
> > > > > > > > > 8.3.1 and 8.9.0?
> > > > > > > > >
> > > > > > > > > I will also have to look into the following code:
> > > > > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > > > > > >
> > > > > > > > > Deepak
> > > > > > > > >
> > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > > >
> > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > > waiting on the same thing:
> > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > > >
> > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > > > > > they're simply not getting terminated.
> > > > > > > > > >
> > > > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > > > screen so here's the output:
> > > > > > > > > >
> > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > >
> > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > >
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > > > > > >
> > > > > > > > > > [image: image.png]
> > > > > > > > > > [image: image.png]
> > > > > > > > > >
> > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > > >> what all those threads are doing and where they are getting stuck. You
> > > > > > > > > >> can post parts of the thread dump back to this email thread as well.
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> Joel Bernstein
> > > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > > >>
> > > > > > > > > >>
>
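The thread states described in the quoted reply above (mostly TIMED_WAITING, then a growing share BLOCKED) can be tallied from a text dump instead of read off a screenshot. A minimal Python sketch, assuming the dump is plain text in the format the JDK's jstack tool prints, where each thread carries a "java.lang.Thread.State:" line. The function name and the sample text are illustrative only and do not come from this thread.

```python
import re
from collections import Counter

# jstack prints one line such as
#   "   java.lang.Thread.State: BLOCKED (on object monitor)"
# per thread; capture only the state keyword.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text: str) -> Counter:
    """Return how many threads in the dump are in each JVM thread state."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical three-thread excerpt, used only to exercise the function.
SAMPLE_DUMP = """\
"qtp-1" #31 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-2" #32 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"qtp-3" #33 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"""

if __name__ == "__main__":
    for state, count in count_thread_states(SAMPLE_DUMP).most_common():
        print(f"{state}: {count}")
```

Running the tally on dumps taken a few seconds apart would show whether the BLOCKED share keeps growing under load.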

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
A few things could be at play:

1. There may be parameters that specify how long idle threads live and when
they can be reused.

2. You might also be hitting memory limits when creating new threads.

3. This is a bit of offbeat thinking, but is there a lot of disk activity
happening? Threads are nothing but files.
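On point 2, a rough check is possible with figures quoted elsewhere in this thread: -Xss256k and a DefaultTasksMax of 4915. The Python sketch below estimates the address space reserved for thread stacks and parses the "Threads:" field that Linux exposes in /proc/<pid>/status. The function names are illustrative, and the estimate is an upper bound on reserved stack space, not resident memory.

```python
def stack_reservation_mib(thread_count: int, stack_kib: int) -> float:
    """Upper bound on address space reserved for thread stacks, in MiB."""
    return thread_count * stack_kib / 1024


def parse_thread_count(status_text: str) -> int:
    """Extract the thread count from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' field found")


if __name__ == "__main__":
    # 4915 threads at 256 KiB each (figures quoted in this thread).
    print(stack_reservation_mib(4915, 256))  # 1228.75
    try:
        # Linux only: thread count of the current process.
        with open("/proc/self/status") as fh:
            print(parse_thread_count(fh.read()))
    except FileNotFoundError:
        print("/proc is not available on this platform")
```

With those figures the stacks reserve at most about 1.2 GiB, which is small next to the heap sizes quoted in this thread. That makes the EAGAIN failures look more consistent with the task-count limit than with memory, though this is an inference from the numbers rather than something confirmed here.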

On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <do...@adzuna.com.invalid>
wrote:

> Sure - when solr gets pushed enough to reach the max thread count, it
> doesn't just re-use what it has, it keeps trying to create new ones. The
> logs *flood* with these errors:
>
> Oct 07 15:38:56 ip-10-40-23-148 solr[4734]:
> [5032072.842s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> Oct 07 15:38:56 ip-10-40-23-148 solr[4734]:
> [5032072.843s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> Oct 07 15:38:56 ip-10-40-23-148 solr[4734]:
> [5032072.843s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> Oct 07 15:38:56 ip-10-40-23-148 solr[4734]:
> [5032072.848s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
> Oct 07 15:38:56 ip-10-40-23-148 solr[4734]:
> [5032072.852s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
>
>
> 2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
>         at java.base/java.lang.Thread.start0(Native Method)
> java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
>         at java.lang.Thread.start0(Native Method) ~[?:?]
>         at java.lang.Thread.start(Thread.java:803) ~[?:?]
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
>         at java.lang.Thread.run(Thread.java:834) [?:?]
>
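
For a sense of scale: the EAGAIN lines above report a 256k stack per thread, and the original post gives a DefaultTasksMax of 4915. The sketch below is only a back-of-envelope calculation in Python. It assumes each thread reserves exactly one 256 KiB stack and ignores guard pages and any other per-thread overhead. Under that assumption the stacks come to a little over 1.2 GiB at the task limit, which hints that the thread-count limit, rather than memory, may be what pthread_create is hitting. That reading is a guess, not something the logs confirm.

```python
# Back-of-envelope: stack memory reserved if the thread count reaches the task limit.
# Both inputs are taken from the thread; "one stack per thread, nothing else"
# is a simplifying assumption.
STACK_KIB = 256     # "stacksize: 256k" in the EAGAIN log lines (-Xss256k)
TASK_LIMIT = 4915   # DefaultTasksMax quoted in the original post


def stack_reservation_mib(threads: int, stack_kib: int = STACK_KIB) -> float:
    """Return the total stack reservation in MiB for `threads` threads."""
    return threads * stack_kib / 1024


if __name__ == "__main__":
    total = stack_reservation_mib(TASK_LIMIT)
    print(f"{TASK_LIMIT} threads x {STACK_KIB} KiB = {total:.2f} MiB")
```
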
> On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:
>
> > I'm sorry, did you post the thread error? I might have missed it. Can you
> > please post it again?
> >
> > Deepak
> > "The greatness of a nation can be judged by the way its animals are treated
> > - Mahatma Gandhi"
> >
> > +91 73500 12833
> > deicool@gmail.com
> >
> > Facebook: https://www.facebook.com/deicool
> > LinkedIn: www.linkedin.com/in/deicool
> >
> > "Plant a Tree, Go Green"
> >
> > Make In India : http://www.makeinindia.com/home
> >
> >
> > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> > <do...@adzuna.com.invalid> wrote:
> >
> > > > Keeping threads around for re-use would certainly make sense to me, but
> > > > when the code starts erroring because it's trying to create ever-more new
> > > > ones rather than re-use old ones, it seems there must be a bug.
> > >
> > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > > We will have to check the Solr logic. Waiting threads might get reused in
> > > > > the future, so that might be a good thing for all you know.
> > > >
> > > > Deepak
> > > > "The greatness of a nation can be judged by the way its animals are
> > > treated
> > > > - Mahatma Gandhi"
> > > >
> > > > +91 73500 12833
> > > > deicool@gmail.com
> > > >
> > > > Facebook: https://www.facebook.com/deicool
> > > > LinkedIn: www.linkedin.com/in/deicool
> > > >
> > > > "Plant a Tree, Go Green"
> > > >
> > > > Make In India : http://www.makeinindia.com/home
> > > >
> > > >
> > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > > <do...@adzuna.com.invalid> wrote:
> > > >
> > > > > > Yep - I edited the two instances to remove those differences and re-ran a
> > > > > > fairly torturous testing cycle - 100 requests/sec.
> > > > > >
> > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point, and
> > > > > > neither hit the thread limit or failed to create new threads. But both
> > > > > > scaled up to well above 2k threads, and even after stopping the test the
> > > > > > threads remained in a timed wait condition. So I'm still thinking there's
> > > > > > a problem with threads not getting terminated somewhere. Test output below
> > > > > > if it's of use:
> > > > >
> > > > > 8.3.1
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > Duration      [total, attack, wait]      1m29.97089157s,
> > 59.990075461s,
> > > > > 29.980816109s
> > > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s,
> 5.175300857s,
> > > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    37.57%
> > > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > Wed Oct 13 12:20:27 UTC 2021
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > Duration      [total, attack, wait]      1m29.931457331s,
> > > 59.990859187s,
> > > > > 29.940598144s
> > > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > > > > 30.000885208s, 30.00105703s, 30.011268185s
> > > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    16.45%
> > > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > Wed Oct 13 12:21:58 UTC 2021
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > Duration      [total, attack, wait]      1m29.58141878s,
> > 59.990945585s,
> > > > > 29.590473195s
> > > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > > > > 30.000880433s, 30.001052307s, 30.004624884s
> > > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    17.43%
> > > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation
> > while
> > > > > reading body)
> > > > > Wed Oct 13 12:23:29 UTC 2021
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > Duration      [total, attack, wait]      1m29.541198374s,
> > > 59.990833778s,
> > > > > 29.550364596s
> > > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> > > > > 30.000892751s, 30.001051441s, 30.009856373s
> > > > > Bytes In      [total, mean]              2211649, 368.61
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    2.53%
> > > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > Wed Oct 13 12:24:59 UTC 2021
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > Duration      [total, attack, wait]      1m29.451307081s,
> > > 59.990601937s,
> > > > > 29.460705144s
> > > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> > > > > 30.000885386s, 30.001023192s, 30.009992128s
> > > > > Bytes In      [total, mean]              837808, 139.63
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    1.08%
> > > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation
> > while
> > > > > reading body)
> > > > > Wed Oct 13 12:26:30 UTC 2021
> > > > > upload: ./831_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > Duration      [total, attack, wait]      1m29.760671622s,
> > > 59.990482467s,
> > > > > 29.770189155s
> > > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> > > > > 30.000894904s, 30.00104114s, 30.016506845s
> > > > > Bytes In      [total, mean]              1973103, 328.85
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    2.47%
> > > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > > Error Set:
> > > > > 500 Server Error
> > > > >
> > > > >
> > > > > 8.9.0
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > Duration      [total, attack, wait]      1m29.930332927s,
> > > 59.990809535s,
> > > > > 29.939523392s
> > > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s,
> 5.184437078s,
> > > > > 30.000885573s, 30.001088926s, 30.00946393s
> > > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    42.72%
> > > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > Wed Oct 13 12:20:28 UTC 2021
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > Duration      [total, attack, wait]      1m29.951119527s,
> > > 59.990310608s,
> > > > > 29.960808919s
> > > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s,
> 4.856237339s,
> > > > > 30.000863563s, 30.001063417s, 30.013413816s
> > > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    57.20%
> > > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > Wed Oct 13 12:22:00 UTC 2021
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > Duration      [total, attack, wait]      1m29.971513653s,
> > > 59.990882146s,
> > > > > 29.980631507s
> > > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s,
> 4.086936886s,
> > > > > 30.000611417s, 30.001022921s, 30.00949772s
> > > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    46.67%
> > > > > Status Codes  [code:count]               0:3011  200:2800  500:4
> > > 503:185
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > unexpected EOF
> > > > > 503 Service Unavailable
> > > > > Wed Oct 13 12:23:31 UTC 2021
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > Duration      [total, attack, wait]      1m29.871026977s,
> > > 59.990066513s,
> > > > > 29.880960464s
> > > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s,
> 5.166973696s,
> > > > > 30.000723538s, 30.001043637s, 30.009522646s
> > > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    45.25%
> > > > > Status Codes  [code:count]               0:3118  200:2715  500:4
> > > 503:163
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > unexpected EOF
> > > > > 503 Service Unavailable
> > > > > Wed Oct 13 12:25:02 UTC 2021
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > Duration      [total, attack, wait]      1m29.931899376s,
> > > 59.990583341s,
> > > > > 29.941316035s
> > > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s,
> 4.009658048s,
> > > > > 30.000888301s, 30.00107338s, 30.010319307s
> > > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    42.47%
> > > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > > Error Set:
> > > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation
> > while
> > > > > reading body)
> > > > > Wed Oct 13 12:26:34 UTC 2021
> > > > > upload: ./890_tests_results.bin to
> > > > >
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > Duration      [total, attack, wait]      1m29.540373624s,
> > > 59.990568262s,
> > > > > 29.549805362s
> > > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s,
> 5.811191827s,
> > > > > 30.000761953s, 30.001041833s, 30.014534484s
> > > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > > Bytes Out     [total, mean]              0, 0.00
> > > > > Success       [ratio]                    60.38%
> > > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > > Error Set:
> > > > > 500 Server Error
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> > > > >
> > > > > > Hello
> > > > > >
> > > > > > I can as of now see two changes:
> > > > > >
> > > > > > 1.  -Xmx
> > > > > >
> > > > > > 2. +ExplicitGCInvokesConcurrent
> > > > > >
> > > > > > Deepak
> > > > > > "The greatness of a nation can be judged by the way its animals
> are
> > > > > treated
> > > > > > - Mahatma Gandhi"
> > > > > >
> > > > > > +91 73500 12833
> > > > > > deicool@gmail.com
> > > > > >
> > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > >
> > > > > > "Plant a Tree, Go Green"
> > > > > >
> > > > > > Make In India : http://www.makeinindia.com/home
> > > > > >
> > > > > >
> > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > >
> > > > > > > CLI invocation for 8.3.1 is
> > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > -XX:+AlwaysPreTouch
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > -jar start.jar --module=http
> > > > > > >
> > > > > > > I believe the key items are:
> > > > > > > -XX:+AlwaysPreTouch
> > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > -XX:+UseG1GC
> > > > > > > -XX:+UseLargePages
> > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Xmx15826m
> > > > > > > -Xss256k
> > > > > > >
> > > > > > > And for 8.9.0 is
> > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > >
> > > > > > > Key:
> > > > > > > -XX:+AlwaysPreTouch
> > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > -XX:+UseG1GC
> > > > > > > -XX:+UseLargePages
> > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Xmx7913m
> > > > > > > -Xss256k
> > > > > > >
> > > > > > > Xmx values are based on the instance RAM. Currently they're running on
> > > > > > > two different instance types, but we see the same behaviour when they're
> > > > > > > on identical types too.
> > > > > > >
> > > > > > > Many thanks
> > > > > > >
> > > > > > > Dominic
> > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello
> > > > > > > >
> > > > > > > > > Can you please tell us the JVM heap settings for both versions, 8.3.1 and
> > > > > > > > > 8.9.0?
> > > > > > > > >
> > > > > > > > > I will also have to look into the following code:
> > > > > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > > > > >
> > > > > > > > Deepak
> > > > > > > > "The greatness of a nation can be judged by the way its
> animals
> > > are
> > > > > > > treated
> > > > > > > > - Mahatma Gandhi"
> > > > > > > >
> > > > > > > > +91 73500 12833
> > > > > > > > deicool@gmail.com
> > > > > > > >
> > > > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > > > >
> > > > > > > > "Plant a Tree, Go Green"
> > > > > > > >
> > > > > > > > Make In India : http://www.makeinindia.com/home
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > >
> > > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > > waiting on the same thing:
> > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > >
> > > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > > > decrease in the number of threads, so it looks like, despite being timed,
> > > > > > > > > > they're simply not getting terminated.
> > > > > > > > >
> > > > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > > > screen so here's the output:
> > > > > > > > >
> > > > > > > > > > qtp178604517-861 (861)
> > > > > > > > > >
> > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > >
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > >    -
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > >    - java.base@11.0.5
> /java.lang.Thread.run(Thread.java:834)
> > > > > > > > >
> > > > > > > > > [image: image.png]
> > > > > > > > > [image: image.png]
> > > > > > > > >
> > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> Joel Bernstein
> > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > > > > >>
> > > > > > > > > >> > We run 8.3.1 in prod without any problems, but we're having issues with
> > > > > > > > > >> > trying to upgrade.
> > > > > > > > > >> >
> > > > > > > > > >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> > > > > > > > > >> > and am testing it via replaying requests made to prod. We're seeing a big
> > > > > > > > > >> > problem where fairly moderate request rates are causing the instance to
> > > > > > > > > >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> > > > > > > > > >> > creating threads:
> > > > > > > > > >> >
> > > > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
> > > > > > > > > >> > pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
> > > > > > > > > >> > 0k, detached.
> > > > > > > > > >> >
> > > > > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> > > > > > > > > >> > java.lang.OutOfMemoryError: unable to create native thread: possibly out of
> > > > > > > > > >> > memory or process/resource limits reached
> > > > > > > > > >> >
> > > > > > > > > >> > So I monitored thread count for the process whilst running the test suite
> > > > > > > > > >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> > > > > > > > > >> > flooded with errors as it tried to create still more threads, and the
> > > > > > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > > > > > >> >
> > > > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> > > > > > > > > >> > regardless of value the result is the same: it gets maxed and everything
> > > > > > > > > >> > slows down.
> > > > > > > > > >> >
> > > > > > > > > >> > Is there anything I can do to stop solr spinning up so many threads it
> > > > > > > > > >> > ceases to function? There have been a few test passes where it
> > > > > > > > > >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> > > > > > > > > >> > longer, but there seems no pattern to when this happens. Running the same
> > > > > > > > > >> > tests on 8.3.1 results in a much slower increase in threads and it never
> > > > > > > > > >> > quite maxes them so things continue to function.
> > > > > > > > > >> >
> > > > > > > > > >> > See below for the thread count and healthcheck times seen on a (fairly
> > > > > > > > > >> > harsh) test run of 100 requests/sec
> > > > > > > > > >> >
> > > > > > > > > >> > Thanks
> > > > > > > > > >> >
> > > > > > > > > >> > Dominic
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > >> > Threadcount:
> > > > > > > > > >> >
> > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar'
> > > > > > > > > >> > | wc -l; sleep 10s; done
> > > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > > > > >> > 52
> > > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > > > > >> > 52
> > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > >> > 52
> > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > >> > 52
> > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > >> > 569
> > > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > > > > >> > 899
> > > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > > > > >> > 1198
> > > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > > > > >> > 1589
> > > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > > > > >> > 2016
> > > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > > > > >> > 2451
> > > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > > > > >> > 2851
> > > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > > > > >> > 2934
> > > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > > > > >> > 3249
> > > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > > > > >> > 3501
> > > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > > > > >> > 3734
> > > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > > > > >> > 4128
> > > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > > > > >> > 4374
> > > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > > > > >> > 4637
> > > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > > > > >> > 4693
> > > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > > > > >> > 4807
> > > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > > > > >> > 4916
> > > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > > > > >> > 4916
> > > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > Healthcheck:
> > > > > > > > >> >
> > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> > > > > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep 10s; done
> > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:02
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:12
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:01
> > > > > > > > --:--:--
> > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:23
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:01
> > > > > > > > --:--:--
> > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:03
> > > > > > > > --:--:--
> > > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:09
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:11
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:03
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > > 0:00:02
> > > > > > > > --:--:--
> > > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > > >> > ----
> > > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Sure - when Solr gets pushed hard enough to reach the max thread count, it
doesn't just re-use the threads it has, it keeps trying to create new ones.
The logs *flood* with these errors:

Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.842s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.843s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.848s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: [5032072.852s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached.
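
To put a rough number on how fast these warnings arrive, here is a minimal
Python sketch (not something that was run for this thread). It counts the
"Failed to start thread" warnings per whole second of JVM uptime, keyed on
the [<uptime>s] decoration seen in the lines above. The function name
failures_per_second is made up for illustration, and the pattern only
matches log lines of this particular shape.

```python
import re
from collections import Counter

# Matches the JVM warning shape seen in the journal excerpt, e.g.
#   [5032072.842s][warning][os,thread] Failed to start thread - ...
# and captures the whole-second part of the uptime stamp.
FAILED_START = re.compile(
    r'\[(?P<uptime>\d+)\.\d+s\]\[warning\]\[os,thread\] Failed to start thread'
)


def failures_per_second(log_text):
    """Return a Counter mapping uptime second -> number of failed thread starts."""
    counts = Counter()
    for match in FAILED_START.finditer(log_text):
        counts[int(match.group('uptime'))] += 1
    return counts
```

Calling failures_per_second() on the saved journal text and looking at
most_common() would show which seconds saw the heaviest bursts.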


2021-10-11 14:22:44.214 WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  => java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
        at java.base/java.lang.Thread.start0(Native Method)
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
        at java.lang.Thread.start0(Native Method) ~[?:?]
        at java.lang.Thread.start(Thread.java:803) ~[?:?]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ManagedSelector.destroyEndPoint(ManagedSelector.java:413) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516]
        at java.lang.Thread.run(Thread.java:834) [?:?]
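
Since the open question is which threads pile up and what state they sit in,
a small Python sketch like the one below might help when reading a thread
dump. It is only an illustration under stated assumptions: it expects
jstack-style text (a quoted thread name on a header line, followed by a
"java.lang.Thread.State:" line), which is not the same layout as the Solr
admin UI dump, and the names summarize and pool_of are made up here. It
groups threads by pool name (so every qtp178604517-NNNN thread lands under
qtp178604517) and counts them per state.

```python
import re
from collections import Counter

# Header line of one thread in jstack-style output, e.g.
#   "qtp178604517-3891" #3891 prio=5 os_prio=0 ...
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line that follows a header, e.g.
#   java.lang.Thread.State: TIMED_WAITING (parking)
STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')


def pool_of(thread_name):
    """Drop a trailing "-<number>" so pooled threads share one key."""
    return re.sub(r'-\d+$', '', thread_name)


def summarize(dump_text):
    """Count threads per (pool, state) pair in a jstack-style dump."""
    counts = Counter()
    current_pool = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current_pool = pool_of(header.group('name'))
            continue
        state = STATE.match(line)
        if state and current_pool is not None:
            counts[(current_pool, state.group('state'))] += 1
            current_pool = None  # only the first state line after a header counts
    return counts
```

Passing the saved output of jstack <pid> to summarize() and printing
most_common(10) should show at a glance whether the bulk of the threads are
Jetty qtp workers parked in TIMED_WAITING or something else entirely.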

On Wed, 13 Oct 2021 at 17:24, Deepak Goel <de...@gmail.com> wrote:

> I'm sorry, did you post the thread error? I might have missed it. Can you
> please post it again?
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > Keeping threads around for re-use would certainly make sense to me, but
> > when the code starts erroring because it's trying to create ever-more new
> > ones rather than re-use old ones, it seems there must be a bug
> >
> > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
> >
> > > > We will have to check the Solr logic. Waiting threads might get reused in
> > > > the future, so that might be fine for all you know.
> > >
> > > Deepak
> > >
> > >
> > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > > <do...@adzuna.com.invalid> wrote:
> > >
> > > > > Yep - I edited the two instances to remove those differences and re-ran a
> > > > > fairly torturous testing cycle - 100 requests/sec.
> > > > >
> > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point, and
> > > > > neither hit the thread limit or failed to create new threads. But both
> > > > > scaled up to well above 2k threads, and even after stopping the test the
> > > > > threads remained in a timed wait condition. So I'm still thinking there's a
> > > > > problem with threads not getting terminated somewhere. Test output below if
> > > > > it's of use:
> > > >
> > > > 8.3.1
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > > > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s, 29.980816109s
> > > > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s,
> > > > 30.00091013s, 30.001077592s, 30.014343222s
> > > > Bytes In      [total, mean]              42385846, 7064.31
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    37.57%
> > > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > > Error Set:
> > > > 500 Server Error
> > > > Wed Oct 13 12:20:27 UTC 2021
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > > > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
> > > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > > > 30.000885208s, 30.00105703s, 30.011268185s
> > > > Bytes In      [total, mean]              21287225, 3547.87
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    16.45%
> > > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > > Error Set:
> > > > 500 Server Error
> > > > Wed Oct 13 12:21:58 UTC 2021
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
> > > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > > > 30.000880433s, 30.001052307s, 30.004624884s
> > > > Bytes In      [total, mean]              17434533, 2905.76
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    17.43%
> > > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > > Error Set:
> > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > Wed Oct 13 12:23:29 UTC 2021
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > > > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
> > > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> > > > 30.000892751s, 30.001051441s, 30.009856373s
> > > > Bytes In      [total, mean]              2211649, 368.61
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    2.53%
> > > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > > Error Set:
> > > > 500 Server Error
> > > > Wed Oct 13 12:24:59 UTC 2021
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > > > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
> > > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> > > > 30.000885386s, 30.001023192s, 30.009992128s
> > > > Bytes In      [total, mean]              837808, 139.63
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    1.08%
> > > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > > Error Set:
> > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > Wed Oct 13 12:26:30 UTC 2021
> > > > upload: ./831_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
> > > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> > > > 30.000894904s, 30.00104114s, 30.016506845s
> > > > Bytes In      [total, mean]              1973103, 328.85
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    2.47%
> > > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > > Error Set:
> > > > 500 Server Error
> > > >
> > > >
> > > > 8.9.0
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
> > > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s,
> > > > 30.000885573s, 30.001088926s, 30.00946393s
> > > > Bytes In      [total, mean]              51095634, 8515.94
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    42.72%
> > > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > > Error Set:
> > > > 500 Server Error
> > > > Wed Oct 13 12:20:28 UTC 2021
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
> > > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s,
> > > > 30.000863563s, 30.001063417s, 30.013413816s
> > > > Bytes In      [total, mean]              64455304, 10742.55
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    57.20%
> > > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > > Error Set:
> > > > 500 Server Error
> > > > Wed Oct 13 12:22:00 UTC 2021
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
> > > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s,
> > > > 30.000611417s, 30.001022921s, 30.00949772s
> > > > Bytes In      [total, mean]              49838902, 8306.48
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    46.67%
> > > > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > > Error Set:
> > > > 500 Server Error
> > > > unexpected EOF
> > > > 503 Service Unavailable
> > > > Wed Oct 13 12:23:31 UTC 2021
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
> > > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s,
> > > > 30.000723538s, 30.001043637s, 30.009522646s
> > > > Bytes In      [total, mean]              52508950, 8751.49
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    45.25%
> > > > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > > Error Set:
> > > > 500 Server Error
> > > > unexpected EOF
> > > > 503 Service Unavailable
> > > > Wed Oct 13 12:25:02 UTC 2021
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
> > > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s,
> > > > 30.000888301s, 30.00107338s, 30.010319307s
> > > > Bytes In      [total, mean]              51869384, 8644.90
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    42.47%
> > > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > > Error Set:
> > > > 500 Server Error
> > > > > context deadline exceeded (Client.Timeout or context cancellation while reading body)
> > > > Wed Oct 13 12:26:34 UTC 2021
> > > > upload: ./890_tests_results.bin to
> > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
> > > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s,
> > > > 30.000761953s, 30.001041833s, 30.014534484s
> > > > Bytes In      [total, mean]              64483027, 10747.17
> > > > Bytes Out     [total, mean]              0, 0.00
> > > > Success       [ratio]                    60.38%
> > > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > > Error Set:
> > > > 500 Server Error
> > > >
> > > >
> > > >
> > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > I can as of now see two changes:
> > > > >
> > > > > 1.  -Xmx
> > > > >
> > > > > 2. +ExplicitGCInvokesConcurrent
> > > > >
> > > > > Deepak
> > > > >
> > > > >
> > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > > <do...@adzuna.com.invalid> wrote:
> > > > >
> > > > > > > CLI invocation for 8.3.1 is
> > > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > -XX:+AlwaysPreTouch
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > > -jar start.jar --module=http
> > > > > > I believe the key items are:
> > > > > > -XX:+AlwaysPreTouch
> > > > > > -XX:+ParallelRefProcEnabled
> > > > > > -XX:+PerfDisableSharedMem
> > > > > > -XX:+UseG1GC
> > > > > > -XX:+UseLargePages
> > > > > > -XX:MaxGCPauseMillis=250
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > -Xmx15826m
> > > > > > -Xss256k
> > > > > >
> > > > > > > And for 8.9.0 is
> > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > > Key:
> > > > > > > -XX:+AlwaysPreTouch
> > > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > > -XX:+ParallelRefProcEnabled
> > > > > > > -XX:+PerfDisableSharedMem
> > > > > > > -XX:+UseG1GC
> > > > > > > -XX:+UseLargePages
> > > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > > -XX:MaxGCPauseMillis=250
> > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > > -Xmx7913m
> > > > > > > -Xss256k
> > > > > > >
> > > > > > > Xmx values are based on the instance RAM, currently they're running on two
> > > > > > > different instance types but we see the same behaviour when they're on
> > > > > > > identical types too.
> > > > > >
> > > > > > Many thanks
> > > > > >
> > > > > > Dominic
> > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello
> > > > > > > >
> > > > > > > > Can you please tell us the JVM Heap Setting for both the versions:
> > > > > > > > 8.3.1, 8.9.0?
> > > > > > > >
> > > > > > > > I will also have to look into the following code:
> > > > > > > > FileFloatSource.java:210.
> > > > > > > > (will do it tonite-IST and update)
> > > > > > > >
> > > > > > > > Deepak
> > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Oh, that's very helpful to know about, ty
> > > > > > > > >
> > > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > > waiting on the same thing:
> > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > > >
> > > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > > > > they're simply not getting terminated.
> > > > > > > > >
> > > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > > screen so here's the output:
> > > > > > > > >
> > > > > > > > > qtp178604517-861 (861)
> > > > > > > > >
> > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > >
> > > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > > > > >
> > > > > > > > > [image: image.png]
> > > > > > > > > [image: image.png]
> > > > > > > > >
> > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > > > >>
> > > > > > > > >> Joel Bernstein
> > > > > > > > >> http://joelsolr.blogspot.com/
> > > > > > > >>
> > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > > > >>
> > > > > > > >> > We run 8.3.1 in prod without any problems, but we're
> having
> > > > issues
> > > > > > > with
> > > > > > > >> > trying to upgrade.
> > > > > > > >> >
> > > > > > > >> > I've created an 8.9.0 leader & follower, imported our live
> > > data
> > > > > into
> > > > > > > it,
> > > > > > > >> > and am testing it via replaying requests made to prod.
> We're
> > > > > seeing
> > > > > > a
> > > > > > > >> big
> > > > > > > >> > problem where fairly moderate request rates are causing
> the
> > > > > instance
> > > > > > > to
> > > > > > > >> > become so slow it fails healthcheck. The logs showed a lot
> > of
> > > > > errors
> > > > > > > >> around
> > > > > > > >> > creating threads:
> > > > > > > >> >
> > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to
> > start
> > > > > > thread -
> > > > > > > >> > pthread_create failed (EAGAIN) for attributes: stacksize:
> > > 256k,
> > > > > > > >> guardsize:
> > > > > > > >> > 0k, detached.
> > > > > > > >> >
> > > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector
> =>
> > > > > > > >> > java.lang.OutOfMemoryError: unable to create native
> thread:
> > > > > possibly
> > > > > > > >> out of
> > > > > > > >> > memory or process/resource limits reached
> > > > > > > >> >
> > > > > > > >> > So I monitored thread count for the process whilst running
> > the
> > > > > test
> > > > > > > >> suite
> > > > > > > >> > and saw a persistent pattern: Threads increased until
> maxed
> > > out,
> > > > > the
> > > > > > > >> logs
> > > > > > > >> > flooded with errors as it tried to create still more
> > threads,
> > > > and
> > > > > > the
> > > > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > > > >> >
> > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and
> > > > > lowering
> > > > > > it
> > > > > > > >> but
> > > > > > > >> > regardless of value the result is the same: it gets maxed
> > and
> > > > > > > everything
> > > > > > > >> > slows down.
> > > > > > > >> >
> > > > > > > >> > Is there anything I can do to stop solr spinning up so
> many
> > > > > threads
> > > > > > it
> > > > > > > >> > ceases to function? There have been a few test passes
> where
> > it
> > > > > > > >> > spontaneously dropped threadcount from thousands to
> hundreds
> > > and
> > > > > > > stayed
> > > > > > > >> up
> > > > > > > >> > longer, but there seems no pattern to when this happens.
> > > Running
> > > > > the
> > > > > > > >> same
> > > > > > > >> > tests on 8.3.1 results in a much slower increase in
> threads
> > > and
> > > > it
> > > > > > > never
> > > > > > > >> > quite maxes them so things continue to function.
> > > > > > > >> >
> > > > > > > >> > See below for the thread count and healthcheck times seen
> > on a
> > > > > > (fairly
> > > > > > > >> > harsh) test run of 100 requests/sec
> > > > > > > >> >
> > > > > > > >> > Thanks
> > > > > > > >> >
> > > > > > > >> > Dominic
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > Threadcount:
> > > > > > > >> >
> > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF |
> > > grep
> > > > > > > >> 'start.jar'
> > > > > > > >> > | wc -l; sleep 10s; done
> > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > > > >> > 52
> > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > > > >> > 52
> > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > >> > 52
> > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > >> > 52
> > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > >> > 569
> > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > > > >> > 899
> > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > > > >> > 1198
> > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > > > >> > 1589
> > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > > > >> > 2016
> > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > > > >> > 2451
> > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > > > >> > 2851
> > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > > > >> > 2934
> > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > > > >> > 3249
> > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > > > >> > 3501
> > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > > > >> > 3734
> > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > > > >> > 4128
> > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > > > >> > 4374
> > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > > > >> > 4637
> > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > > > >> > 4693
> > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > > > >> > 4807
> > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > > > >> > 4916
> > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > > > >> > 4916
> > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > Healthcheck:
> > > > > > > >> >
> > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> > > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----';
> > > sleep
> > > > > > > >> > 10s; done
> > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:02
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:12
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:01
> > > > > > > --:--:--
> > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:23
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:01
> > > > > > > --:--:--
> > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:03
> > > > > > > --:--:--
> > > > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:09
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:11
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:03
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> > > 0:00:02
> > > > > > > --:--:--
> > > > > > > >> >   0< HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > >> > > GET /solr/ HTTP/1.1
> > > > > > > >> > < HTTP/1.1 200 OK
> > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > > > >> > ----
> > > > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
I'm sorry, did you post the thread error? I might have missed it. Can you
please post it again?
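
When re-posting, a long thread dump can be summarised by counting threads
per state before pasting the interesting stacks. The following is only a
minimal sketch, assuming the dump is in the plain-text jstack layout (a
"java.lang.Thread.State:" line under each thread header). The sample text
and the function name tally_thread_states are made up for illustration;
they are not part of Solr and not the actual dump from this thread.

```python
import re
from collections import Counter

# Matches the state line that jstack-style dumps print under each thread header.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def tally_thread_states(dump_text):
    """Return a Counter mapping thread state -> number of threads in that state."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical, heavily shortened sample; a real dump has one stanza per thread.
SAMPLE_DUMP = """
"qtp178604517-861" #861 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)

"qtp178604517-862" #862 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
        at java.base@11.0.5/jdk.internal.misc.Unsafe.park(Native Method)

"qtp178604517-863" #863 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
        at java.base@11.0.5/jdk.internal.misc.Unsafe.park(Native Method)
"""

if __name__ == "__main__":
    # Print states from most to least common.
    for state, count in tally_thread_states(SAMPLE_DUMP).most_common():
        print(f"{state}: {count}")
```

Running the same tally on dumps taken before, during and after a test run
would show whether the TIMED_WAITING threads ever go away or only pile up.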

Deepak

On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> Keeping threads around for re-use would certainly make sense to me, but
> when the code starts erroring because it's trying to create ever-more new
> ones rather than re-use old ones, it seems there must be a bug
>
> On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:
>
> > We will have to check the solr logic. Threads waiting might get reused in
> > the future, so might be good for all you know.
> >
> > Deepak
> >
> > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> > <do...@adzuna.com.invalid> wrote:
> >
> > > Yep - I edited the two instances to remove those differences and re-ran a
> > > fairly tortuous testing cycle - 100 requests/sec.
> > >
> > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point and
> > > neither hit the threads limit or failed to create new threads. But both
> > > scaled up to well above 2k threads and even after stopping the test, the
> > > threads remained in a timed wait condition. So I'm still thinking there's a
> > > problem with threads not getting terminated somewhere. Test output below if
> > > it's of use:
> > >
> > > 8.3.1
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s,
> > > 29.980816109s
> > > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s,
> > > 30.00091013s, 30.001077592s, 30.014343222s
> > > Bytes In      [total, mean]              42385846, 7064.31
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    37.57%
> > > Status Codes  [code:count]               0:3741  200:2254  500:5
> > > Error Set:
> > > 500 Server Error
> > > Wed Oct 13 12:20:27 UTC 2021
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > > Duration      [total, attack, wait]      1m29.931457331s,
> 59.990859187s,
> > > 29.940598144s
> > > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > > 30.000885208s, 30.00105703s, 30.011268185s
> > > Bytes In      [total, mean]              21287225, 3547.87
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    16.45%
> > > Status Codes  [code:count]               0:5007  200:987  500:6
> > > Error Set:
> > > 500 Server Error
> > > Wed Oct 13 12:21:58 UTC 2021
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s,
> > > 29.590473195s
> > > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > > 30.000880433s, 30.001052307s, 30.004624884s
> > > Bytes In      [total, mean]              17434533, 2905.76
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    17.43%
> > > Status Codes  [code:count]               0:4874  200:1046  500:80
> > > Error Set:
> > > 500 Server Error
> > > context deadline exceeded (Client.Timeout or context cancellation while
> > > reading body)
> > > Wed Oct 13 12:23:29 UTC 2021
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > > Duration      [total, attack, wait]      1m29.541198374s,
> 59.990833778s,
> > > 29.550364596s
> > > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> > > 30.000892751s, 30.001051441s, 30.009856373s
> > > Bytes In      [total, mean]              2211649, 368.61
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    2.53%
> > > Status Codes  [code:count]               0:5810  200:152  500:38
> > > Error Set:
> > > 500 Server Error
> > > Wed Oct 13 12:24:59 UTC 2021
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > > Duration      [total, attack, wait]      1m29.451307081s,
> 59.990601937s,
> > > 29.460705144s
> > > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> > > 30.000885386s, 30.001023192s, 30.009992128s
> > > Bytes In      [total, mean]              837808, 139.63
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    1.08%
> > > Status Codes  [code:count]               0:5907  200:65  500:28
> > > Error Set:
> > > 500 Server Error
> > > context deadline exceeded (Client.Timeout or context cancellation while
> > > reading body)
> > > Wed Oct 13 12:26:30 UTC 2021
> > > upload: ./831_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s,
> > > 29.770189155s
> > > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> > > 30.000894904s, 30.00104114s, 30.016506845s
> > > Bytes In      [total, mean]              1973103, 328.85
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    2.47%
> > > Status Codes  [code:count]               0:5821  200:148  500:31
> > > Error Set:
> > > 500 Server Error
> > >
> > >
> > > 8.9.0
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s,
> > > 29.939523392s
> > > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s,
> > > 30.000885573s, 30.001088926s, 30.00946393s
> > > Bytes In      [total, mean]              51095634, 8515.94
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    42.72%
> > > Status Codes  [code:count]               0:3432  200:2563  500:5
> > > Error Set:
> > > 500 Server Error
> > > Wed Oct 13 12:20:28 UTC 2021
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s,
> > > 29.960808919s
> > > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s,
> > > 30.000863563s, 30.001063417s, 30.013413816s
> > > Bytes In      [total, mean]              64455304, 10742.55
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    57.20%
> > > Status Codes  [code:count]               0:2565  200:3432  500:3
> > > Error Set:
> > > 500 Server Error
> > > Wed Oct 13 12:22:00 UTC 2021
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s,
> > > 29.980631507s
> > > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s,
> > > 30.000611417s, 30.001022921s, 30.00949772s
> > > Bytes In      [total, mean]              49838902, 8306.48
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    46.67%
> > > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > > Error Set:
> > > 500 Server Error
> > > unexpected EOF
> > > 503 Service Unavailable
> > > Wed Oct 13 12:23:31 UTC 2021
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s,
> > > 29.880960464s
> > > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s,
> > > 30.000723538s, 30.001043637s, 30.009522646s
> > > Bytes In      [total, mean]              52508950, 8751.49
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    45.25%
> > > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > > Error Set:
> > > 500 Server Error
> > > unexpected EOF
> > > 503 Service Unavailable
> > > Wed Oct 13 12:25:02 UTC 2021
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s,
> > > 29.941316035s
> > > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s,
> > > 30.000888301s, 30.00107338s, 30.010319307s
> > > Bytes In      [total, mean]              51869384, 8644.90
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    42.47%
> > > Status Codes  [code:count]               0:3448  200:2548  500:4
> > > Error Set:
> > > 500 Server Error
> > > context deadline exceeded (Client.Timeout or context cancellation while
> > > reading body)
> > > Wed Oct 13 12:26:34 UTC 2021
> > > upload: ./890_tests_results.bin to
> > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s,
> > > 29.549805362s
> > > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s,
> > > 30.000761953s, 30.001041833s, 30.014534484s
> > > Bytes In      [total, mean]              64483027, 10747.17
> > > Bytes Out     [total, mean]              0, 0.00
> > > Success       [ratio]                    60.38%
> > > Status Codes  [code:count]               0:2372  200:3623  500:5
> > > Error Set:
> > > 500 Server Error
> > >
> > >
> > >
> > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > Hello
> > > >
> > > > I can as of now see two changes:
> > > >
> > > > 1.  -Xmx
> > > >
> > > > 2. +ExplicitGCInvokesConcurrent
> > > >
> > > > Deepak
> > > > "The greatness of a nation can be judged by the way its animals are
> > > treated
> > > > - Mahatma Gandhi"
> > > >
> > > > +91 73500 12833
> > > > deicool@gmail.com
> > > >
> > > > Facebook: https://www.facebook.com/deicool
> > > > LinkedIn: www.linkedin.com/in/deicool
> > > >
> > > > "Plant a Tree, Go Green"
> > > >
> > > > Make In India : http://www.makeinindia.com/home
> > > >
> > > >
> > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > > <do...@adzuna.com.invalid> wrote:
> > > >
> > > > > > CLI invocation for 8.3.1 is
> > > > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > -XX:+AlwaysPreTouch
> > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > -Dcom.sun.management.jmxremote
> > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > > > -jar start.jar --module=http
> > > > > > I believe the key items are:
> > > > > > -XX:+AlwaysPreTouch
> > > > > > -XX:+ParallelRefProcEnabled
> > > > > > -XX:+PerfDisableSharedMem
> > > > > > -XX:+UseG1GC
> > > > > > -XX:+UseLargePages
> > > > > > -XX:MaxGCPauseMillis=250
> > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > -Xmx15826m
> > > > > > -Xss256k
> > > > >
> > > > > > And for 8.9.0 is
> > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > > > -Dcom.sun.management.jmxremote
> > > > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > -Dcom.sun.management.jmxremote.port=18983
> > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > > > Key:
> > > > > > -XX:+AlwaysPreTouch
> > > > > > -XX:+ExplicitGCInvokesConcurrent
> > > > > > -XX:+ParallelRefProcEnabled
> > > > > > -XX:+PerfDisableSharedMem
> > > > > > -XX:+UseG1GC
> > > > > > -XX:+UseLargePages
> > > > > > -XX:-OmitStackTraceInFastThrow
> > > > > > -XX:MaxGCPauseMillis=250
> > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > > > -Xmx7913m
> > > > > > -Xss256k
> > > > >
> > > > > > Xmx values are based on the instance RAM, currently they're running on two
> > > > > > different instance types but we see the same behaviour when they're on
> > > > > > identical types too.
> > > > >
> > > > > Many thanks
> > > > >
> > > > > Dominic
> > > > >
> > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> > > > >
> > > > > > Hello
> > > > > >
> > > > > > > Can you please tell us the JVM Heap Setting for both the versions: 8.3.1,
> > > > > > > 8.9.0?
> > > > > > >
> > > > > > > I will also have to look into the following code: FileFloatSource.java:210.
> > > > > > > (will do it tonite-IST and update)
> > > > > >
> > > > > > Deepak
> > > > > > "The greatness of a nation can be judged by the way its animals
> are
> > > > > treated
> > > > > > - Mahatma Gandhi"
> > > > > >
> > > > > > +91 73500 12833
> > > > > > deicool@gmail.com
> > > > > >
> > > > > > Facebook: https://www.facebook.com/deicool
> > > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > > >
> > > > > > "Plant a Tree, Go Green"
> > > > > >
> > > > > > Make In India : http://www.makeinindia.com/home
> > > > > >
> > > > > >
> > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > > <do...@adzuna.com.invalid> wrote:
> > > > > >
> > > > > > > Oh, that's very helpful to know about, ty
> > > > > > >
> > > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > > waiting on the same thing:
> > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > > >
> > > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > > > they're simply not getting terminated.
> > > > > > > >
> > > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > > screen so here's the output:
> > > > > > >
> > > > > > > > qtp178604517-861 (861)
> > > > > > > >
> > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > >
> > > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > > >
> > > > > > > [image: image.png]
> > > > > > > [image: image.png]
> > > > > > >
> > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > > > >> post parts of the thread dump back to this email thread as well.
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> Joel Bernstein
> > > > > > >> http://joelsolr.blogspot.com/
> > > > > > >>
> > > > > > >>

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Keeping threads around for re-use would certainly make sense to me, but
when the code starts erroring because it's trying to create ever more new
ones rather than re-using old ones, it seems there must be a bug.
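
For anyone who wants to check whether idle threads are being re-used or are
just piling up, here is a minimal sketch (Python, not Solr code) that tallies
thread states from the text of a jstack-style thread dump. It relies on the
"java.lang.Thread.State:" line that such dumps print under each thread. The
function name, the regex and the sample dump are all assumptions made for
illustration; the sample is made up and is not output from our instances.

```python
import re
from collections import Counter

# Matches the state line that a jstack-style dump prints under each thread header.
STATE_LINE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text):
    """Tally threads per state in jstack-style thread dump text."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = STATE_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample for illustration only; not taken from a real Solr instance.
SAMPLE_DUMP = """\
"qtp-example-1" #21 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-2" #22 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-3" #23 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"main" #1 prio=5
   java.lang.Thread.State: RUNNABLE
"""

if __name__ == "__main__":
    # Print states from most to least common.
    for state, n in count_thread_states(SAMPLE_DUMP).most_common():
        print(state, n)
```

Running it over dumps saved during a test and again some time after the test
has stopped should show whether the TIMED_WAITING count ever falls back, which
is the behaviour I'd expect if the pool were actually retiring idle threads.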

On Wed, 13 Oct 2021 at 15:03, Deepak Goel <de...@gmail.com> wrote:

> We will have to check the solr logic. Threads waiting might get reused in
> the future, so might be good for all you know.
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > Yep - I edited the two instances to remove those differences and re-ran a
> > fairly torturous testing cycle - 100 requests/sec.
> >
> > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point, and
> > neither hit the thread limit nor failed to create new threads. But both
> > scaled up to well above 2k threads, and even after stopping the test the
> > threads remained in a timed wait condition. So I'm still thinking there's
> > a problem with threads not getting terminated somewhere. Test output below
> > if it's of use:
> >
> > 8.3.1
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> > Requests      [total, rate, throughput]  6000, 100.02, 25.05
> > Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s,
> > 29.980816109s
> > Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s,
> > 30.00091013s, 30.001077592s, 30.014343222s
> > Bytes In      [total, mean]              42385846, 7064.31
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    37.57%
> > Status Codes  [code:count]               0:3741  200:2254  500:5
> > Error Set:
> > 500 Server Error
> > Wed Oct 13 12:20:27 UTC 2021
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> > Requests      [total, rate, throughput]  6000, 100.02, 10.98
> > Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s,
> > 29.940598144s
> > Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> > 30.000885208s, 30.00105703s, 30.011268185s
> > Bytes In      [total, mean]              21287225, 3547.87
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    16.45%
> > Status Codes  [code:count]               0:5007  200:987  500:6
> > Error Set:
> > 500 Server Error
> > Wed Oct 13 12:21:58 UTC 2021
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> > Requests      [total, rate, throughput]  6000, 100.02, 11.68
> > Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s,
> > 29.590473195s
> > Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> > 30.000880433s, 30.001052307s, 30.004624884s
> > Bytes In      [total, mean]              17434533, 2905.76
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    17.43%
> > Status Codes  [code:count]               0:4874  200:1046  500:80
> > Error Set:
> > 500 Server Error
> > context deadline exceeded (Client.Timeout or context cancellation while
> > reading body)
> > Wed Oct 13 12:23:29 UTC 2021
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> > Requests      [total, rate, throughput]  6000, 100.02, 1.70
> > Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s,
> > 29.550364596s
> > Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> > 30.000892751s, 30.001051441s, 30.009856373s
> > Bytes In      [total, mean]              2211649, 368.61
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    2.53%
> > Status Codes  [code:count]               0:5810  200:152  500:38
> > Error Set:
> > 500 Server Error
> > Wed Oct 13 12:24:59 UTC 2021
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> > Requests      [total, rate, throughput]  6000, 100.02, 0.73
> > Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s,
> > 29.460705144s
> > Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> > 30.000885386s, 30.001023192s, 30.009992128s
> > Bytes In      [total, mean]              837808, 139.63
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    1.08%
> > Status Codes  [code:count]               0:5907  200:65  500:28
> > Error Set:
> > 500 Server Error
> > context deadline exceeded (Client.Timeout or context cancellation while
> > reading body)
> > Wed Oct 13 12:26:30 UTC 2021
> > upload: ./831_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> > Requests      [total, rate, throughput]  6000, 100.02, 1.65
> > Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s,
> > 29.770189155s
> > Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> > 30.000894904s, 30.00104114s, 30.016506845s
> > Bytes In      [total, mean]              1973103, 328.85
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    2.47%
> > Status Codes  [code:count]               0:5821  200:148  500:31
> > Error Set:
> > 500 Server Error
> >
> >
> > 8.9.0
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> > Requests      [total, rate, throughput]  6000, 100.02, 28.50
> > Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s,
> > 29.939523392s
> > Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s,
> > 30.000885573s, 30.001088926s, 30.00946393s
> > Bytes In      [total, mean]              51095634, 8515.94
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    42.72%
> > Status Codes  [code:count]               0:3432  200:2563  500:5
> > Error Set:
> > 500 Server Error
> > Wed Oct 13 12:20:28 UTC 2021
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> > Requests      [total, rate, throughput]  6000, 100.02, 38.15
> > Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s,
> > 29.960808919s
> > Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s,
> > 30.000863563s, 30.001063417s, 30.013413816s
> > Bytes In      [total, mean]              64455304, 10742.55
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    57.20%
> > Status Codes  [code:count]               0:2565  200:3432  500:3
> > Error Set:
> > 500 Server Error
> > Wed Oct 13 12:22:00 UTC 2021
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> > Requests      [total, rate, throughput]  6000, 100.02, 31.12
> > Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s,
> > 29.980631507s
> > Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s,
> > 30.000611417s, 30.001022921s, 30.00949772s
> > Bytes In      [total, mean]              49838902, 8306.48
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    46.67%
> > Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> > Error Set:
> > 500 Server Error
> > unexpected EOF
> > 503 Service Unavailable
> > Wed Oct 13 12:23:31 UTC 2021
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> > Requests      [total, rate, throughput]  6000, 100.02, 30.21
> > Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s,
> > 29.880960464s
> > Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s,
> > 30.000723538s, 30.001043637s, 30.009522646s
> > Bytes In      [total, mean]              52508950, 8751.49
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    45.25%
> > Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> > Error Set:
> > 500 Server Error
> > unexpected EOF
> > 503 Service Unavailable
> > Wed Oct 13 12:25:02 UTC 2021
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> > Requests      [total, rate, throughput]  6000, 100.02, 28.33
> > Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s,
> > 29.941316035s
> > Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s,
> > 30.000888301s, 30.00107338s, 30.010319307s
> > Bytes In      [total, mean]              51869384, 8644.90
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    42.47%
> > Status Codes  [code:count]               0:3448  200:2548  500:4
> > Error Set:
> > 500 Server Error
> > context deadline exceeded (Client.Timeout or context cancellation while
> > reading body)
> > Wed Oct 13 12:26:34 UTC 2021
> > upload: ./890_tests_results.bin to
> > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> > Requests      [total, rate, throughput]  6000, 100.02, 40.46
> > Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s,
> > 29.549805362s
> > Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s,
> > 30.000761953s, 30.001041833s, 30.014534484s
> > Bytes In      [total, mean]              64483027, 10747.17
> > Bytes Out     [total, mean]              0, 0.00
> > Success       [ratio]                    60.38%
> > Status Codes  [code:count]               0:2372  200:3623  500:5
> > Error Set:
> > 500 Server Error
> >
> >
> >
> > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
> >
> > > Hello
> > >
> > > I can as of now see two changes:
> > >
> > > 1.  -Xmx
> > >
> > > 2. +ExplicitGCInvokesConcurrent
> > >
> > > Deepak
> > >
> > >
> > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > > <do...@adzuna.com.invalid> wrote:
> > >
> > > > CLI invocation for 8.3.1 is
> > > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > -XX:+AlwaysPreTouch
> > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > -Dcom.sun.management.jmxremote.port=18983
> > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > > -jar start.jar --module=http
> > > > I believe the key items are:
> > > > -XX:+AlwaysPreTouch
> > > > -XX:+ParallelRefProcEnabled
> > > > -XX:+PerfDisableSharedMem
> > > > -XX:+UseG1GC
> > > > -XX:+UseLargePages
> > > > -XX:MaxGCPauseMillis=250
> > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > -Xmx15826m
> > > > -Xss256k
> > > >
> > > > And for 8.9.0 is
> > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > > -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote.local.only=false
> > > > -Dcom.sun.management.jmxremote.ssl=false
> > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > -Dcom.sun.management.jmxremote.port=18983
> > > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > > -XX:-OmitStackTraceInFastThrow
> > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > > > Key:
> > > > -XX:+AlwaysPreTouch
> > > > -XX:+ExplicitGCInvokesConcurrent
> > > > -XX:+ParallelRefProcEnabled
> > > > -XX:+PerfDisableSharedMem
> > > > -XX:+UseG1GC
> > > > -XX:+UseLargePages
> > > > -XX:-OmitStackTraceInFastThrow
> > > > -XX:MaxGCPauseMillis=250
> > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > > -Xmx7913m
> > > > -Xss256k
> > > >
> > > > Xmx values are based on the instance RAM. Currently they're running on
> > > > two different instance types, but we see the same behaviour when they're
> > > > on identical types too.
> > > >
> > > > Many thanks
> > > >
> > > > Dominic
> > > >
> > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > > Can you please tell us the JVM heap settings for both versions, 8.3.1
> > > > > > and 8.9.0?
> > > > > >
> > > > > > I will also have to look into the following code:
> > > > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > > >
> > > > > Deepak
> > > > >
> > > > >
> > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > > <do...@adzuna.com.invalid> wrote:
> > > > >
> > > > > > > Oh, that's very helpful to know about, ty
> > > > > > >
> > > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > > waiting on the same thing:
> > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > > >
> > > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > > they're simply not getting terminated.
> > > > > > >
> > > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > > test results in many new threads immediately coming into being, this time
> > > > > > > with a higher proportion of threads BLOCKED on
> > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > > screen so here's the output:
> > > > > >
> > > > > > > qtp178604517-861 (861)
> > > > > > >
> > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > >
> > > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > > >
> > > > > > [image: image.png]
> > > > > > [image: image.png]
> > > > > >
> > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <joelsolr@gmail.com> wrote:
> > > > > >
> > > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > > >> what all those threads are doing and where they are getting stuck. You
> > > > > > >> can post parts of the thread dump back to this email thread as well.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> Joel Bernstein
> > > > > >> http://joelsolr.blogspot.com/
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > > >> <do...@adzuna.com.invalid> wrote:
> > > > > >>
> > > > > > >> > We run 8.3.1 in prod without any problems, but we're having issues with
> > > > > > >> > trying to upgrade.
> > > > > > >> >
> > > > > > >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> > > > > > >> > and am testing it via replaying requests made to prod. We're seeing a big
> > > > > > >> > problem where fairly moderate request rates are causing the instance to
> > > > > > >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> > > > > > >> > creating threads:
> > > > > > >> >
> > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
> > > > > > >> > pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
> > > > > > >> > 0k, detached.
> > > > > > >> >
> > > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> > > > > > >> > java.lang.OutOfMemoryError: unable to create native thread: possibly out of
> > > > > > >> > memory or process/resource limits reached
> > > > > > >> >
> > > > > > >> > So I monitored thread count for the process whilst running the test suite
> > > > > > >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> > > > > > >> > flooded with errors as it tried to create still more threads, and the
> > > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > > >> >
> > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> > > > > > >> > regardless of value the result is the same: it gets maxed and everything
> > > > > > >> > slows down.
> > > > > > >> >
> > > > > > >> > Is there anything I can do to stop solr spinning up so many threads it
> > > > > > >> > ceases to function? There have been a few test passes where it
> > > > > > >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> > > > > > >> > longer, but there seems no pattern to when this happens. Running the same
> > > > > > >> > tests on 8.3.1 results in a much slower increase in threads and it never
> > > > > > >> > quite maxes them so things continue to function.
> > > > > > >> >
> > > > > > >> > See below for the thread count and healthcheck times seen on a (fairly
> > > > > > >> > harsh) test run of 100 requests/sec
> > > > > > >> >
> > > > > > >> > Thanks
> > > > > > >> >
> > > > > > >> > Dominic
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > Threadcount:
> > > > > > >> >
> > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar'
> > > > > > >> > | wc -l; sleep 10s; done
> > > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > > >> > 52
> > > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > > >> > 52
> > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > >> > 52
> > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > >> > 52
> > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > >> > 569
> > > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > > >> > 899
> > > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > > >> > 1198
> > > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > > >> > 1589
> > > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > > >> > 2016
> > > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > > >> > 2451
> > > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > > >> > 2851
> > > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > > >> > 2934
> > > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > > >> > 3249
> > > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > > >> > 3501
> > > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > > >> > 3734
> > > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > > >> > 4128
> > > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > > >> > 4374
> > > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > > >> > 4637
> > > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > > >> > 4693
> > > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > > >> > 4807
> > > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > > >> > 4916
> > > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > > >> > 4916
> > > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > > >> >
> > > > > >> >
> > > > > >> > Healthcheck:
> > > > > >> >
> > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep
> > > > > > >> > 10s; done
> > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:02
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:12
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:01
> > > > > --:--:--
> > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:23
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:01
> > > > > --:--:--
> > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:03
> > > > > --:--:--
> > > > > >> >   0> GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:09
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:11
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:03
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> >   0     0    0     0    0     0      0      0 --:--:--
> 0:00:02
> > > > > --:--:--
> > > > > >> >   0< HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > > >> > ----
> > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > >> > > GET /solr/ HTTP/1.1
> > > > > >> > < HTTP/1.1 200 OK
> > > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > > >> > ----
> > > > > >> > Connection to 10.40.22.166 closed by remote host.

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
We will have to check the Solr logic. Waiting threads might get reused in
the future, so this might be fine for all you know.

Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home
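
The point about waiting threads being reused can be illustrated with a toy pool. The sketch below is not Jetty's or Solr's implementation; `IdleTimeoutPool`, `wait_until` and the timeout values are hypothetical names and numbers chosen for illustration. Workers sit in a timed wait between tasks: if a task arrives in time the thread is reused, and if nothing arrives within the idle timeout the thread exits.

```python
import queue
import threading
import time


class IdleTimeoutPool:
    """Toy pool (hypothetical): workers start on demand and exit after
    idle_timeout seconds without work."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.tasks = queue.Queue()
        self.lock = threading.Lock()
        self.idle = 0     # workers currently waiting for a task
        self.workers = 0  # workers currently alive

    def submit(self, fn):
        self.tasks.put(fn)
        with self.lock:
            if self.idle > 0:
                return            # a waiting worker will pick the task up
            self.workers += 1     # nobody is waiting: start a new thread
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            with self.lock:
                self.idle += 1
            try:
                # The worker sits here in a timed wait between tasks.
                fn = self.tasks.get(timeout=self.idle_timeout)
            except queue.Empty:
                fn = None
            with self.lock:
                self.idle -= 1
                if fn is None and self.tasks.empty():
                    self.workers -= 1
                    return        # idle for too long: the thread ends
            if fn is not None:
                fn()


def wait_until(predicate, timeout=5.0):
    """Poll until predicate() is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(0.01)
    return predicate()


pool = IdleTimeoutPool(idle_timeout=0.2)
done = []
for i in range(5):
    pool.submit(lambda i=i: (time.sleep(0.05), done.append(i)))

finished = wait_until(lambda: len(done) == 5)
drained = wait_until(lambda: pool.workers == 0)
print("tasks finished:", finished, "| workers alive after idle timeout:", pool.workers)
```

In a pool of this shape, a thread count that stays high long after the load stops would suggest either a long idle timeout or threads that are waiting on something other than the task queue.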


On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> Yep - I edited the two instances to remove those differences and re-ran a
> fairly torturous testing cycle - 100 requests/sec.
>
> Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point and
> neither hit the thread limit nor failed to create new threads. But both
> scaled up to well above 2k threads and even after stopping the test, the
> threads remained in a timed wait condition. So I'm still thinking there's a
> problem with threads not getting terminated somewhere. Test output below if
> it's of use:
>
> 8.3.1
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
> Requests      [total, rate, throughput]  6000, 100.02, 25.05
> Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s,
> 29.980816109s
> Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s,
> 30.00091013s, 30.001077592s, 30.014343222s
> Bytes In      [total, mean]              42385846, 7064.31
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    37.57%
> Status Codes  [code:count]               0:3741  200:2254  500:5
> Error Set:
> 500 Server Error
> Wed Oct 13 12:20:27 UTC 2021
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
> Requests      [total, rate, throughput]  6000, 100.02, 10.98
> Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s,
> 29.940598144s
> Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs,
> 30.000885208s, 30.00105703s, 30.011268185s
> Bytes In      [total, mean]              21287225, 3547.87
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    16.45%
> Status Codes  [code:count]               0:5007  200:987  500:6
> Error Set:
> 500 Server Error
> Wed Oct 13 12:21:58 UTC 2021
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
> Requests      [total, rate, throughput]  6000, 100.02, 11.68
> Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s,
> 29.590473195s
> Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs,
> 30.000880433s, 30.001052307s, 30.004624884s
> Bytes In      [total, mean]              17434533, 2905.76
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    17.43%
> Status Codes  [code:count]               0:4874  200:1046  500:80
> Error Set:
> 500 Server Error
> context deadline exceeded (Client.Timeout or context cancellation while
> reading body)
> Wed Oct 13 12:23:29 UTC 2021
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
> Requests      [total, rate, throughput]  6000, 100.02, 1.70
> Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s,
> 29.550364596s
> Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs,
> 30.000892751s, 30.001051441s, 30.009856373s
> Bytes In      [total, mean]              2211649, 368.61
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    2.53%
> Status Codes  [code:count]               0:5810  200:152  500:38
> Error Set:
> 500 Server Error
> Wed Oct 13 12:24:59 UTC 2021
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
> Requests      [total, rate, throughput]  6000, 100.02, 0.73
> Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s,
> 29.460705144s
> Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs,
> 30.000885386s, 30.001023192s, 30.009992128s
> Bytes In      [total, mean]              837808, 139.63
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    1.08%
> Status Codes  [code:count]               0:5907  200:65  500:28
> Error Set:
> 500 Server Error
> context deadline exceeded (Client.Timeout or context cancellation while
> reading body)
> Wed Oct 13 12:26:30 UTC 2021
> upload: ./831_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
> Requests      [total, rate, throughput]  6000, 100.02, 1.65
> Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s,
> 29.770189155s
> Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs,
> 30.000894904s, 30.00104114s, 30.016506845s
> Bytes In      [total, mean]              1973103, 328.85
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    2.47%
> Status Codes  [code:count]               0:5821  200:148  500:31
> Error Set:
> 500 Server Error
>
>
> 8.9.0
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
> Requests      [total, rate, throughput]  6000, 100.02, 28.50
> Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s,
> 29.939523392s
> Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s,
> 30.000885573s, 30.001088926s, 30.00946393s
> Bytes In      [total, mean]              51095634, 8515.94
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    42.72%
> Status Codes  [code:count]               0:3432  200:2563  500:5
> Error Set:
> 500 Server Error
> Wed Oct 13 12:20:28 UTC 2021
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
> Requests      [total, rate, throughput]  6000, 100.02, 38.15
> Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s,
> 29.960808919s
> Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s,
> 30.000863563s, 30.001063417s, 30.013413816s
> Bytes In      [total, mean]              64455304, 10742.55
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    57.20%
> Status Codes  [code:count]               0:2565  200:3432  500:3
> Error Set:
> 500 Server Error
> Wed Oct 13 12:22:00 UTC 2021
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
> Requests      [total, rate, throughput]  6000, 100.02, 31.12
> Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s,
> 29.980631507s
> Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s,
> 30.000611417s, 30.001022921s, 30.00949772s
> Bytes In      [total, mean]              49838902, 8306.48
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    46.67%
> Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
> Error Set:
> 500 Server Error
> unexpected EOF
> 503 Service Unavailable
> Wed Oct 13 12:23:31 UTC 2021
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
> Requests      [total, rate, throughput]  6000, 100.02, 30.21
> Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s,
> 29.880960464s
> Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s,
> 30.000723538s, 30.001043637s, 30.009522646s
> Bytes In      [total, mean]              52508950, 8751.49
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    45.25%
> Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
> Error Set:
> 500 Server Error
> unexpected EOF
> 503 Service Unavailable
> Wed Oct 13 12:25:02 UTC 2021
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
> Requests      [total, rate, throughput]  6000, 100.02, 28.33
> Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s,
> 29.941316035s
> Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s,
> 30.000888301s, 30.00107338s, 30.010319307s
> Bytes In      [total, mean]              51869384, 8644.90
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    42.47%
> Status Codes  [code:count]               0:3448  200:2548  500:4
> Error Set:
> 500 Server Error
> context deadline exceeded (Client.Timeout or context cancellation while
> reading body)
> Wed Oct 13 12:26:34 UTC 2021
> upload: ./890_tests_results.bin to
> s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
> Requests      [total, rate, throughput]  6000, 100.02, 40.46
> Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s,
> 29.549805362s
> Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s,
> 30.000761953s, 30.001041833s, 30.014534484s
> Bytes In      [total, mean]              64483027, 10747.17
> Bytes Out     [total, mean]              0, 0.00
> Success       [ratio]                    60.38%
> Status Codes  [code:count]               0:2372  200:3623  500:5
> Error Set:
> 500 Server Error
>
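
The figures in the reports above are related to each other, which makes them easy to sanity-check. The layout matches the text report of the vegeta load-testing tool; assuming that is what produced it, the success ratio is the share of responses with a 2xx or 3xx status, throughput is successful requests divided by the total duration, and status code 0 means no HTTP response was received at all. The sketch below recomputes both values for the first run of each version; `summarize` is a hypothetical helper and the numbers are copied from the reports.

```python
def summarize(status_counts, total_seconds):
    """Return (success ratio in percent, successful requests per second)."""
    total = sum(status_counts.values())
    ok = sum(n for code, n in status_counts.items() if 200 <= code < 400)
    return ok / total * 100, ok / total_seconds


# Status code counts and total duration, copied from the first run of each version.
runs = {
    "8.3.1 first run": ({0: 3741, 200: 2254, 500: 5}, 89.97089157),
    "8.9.0 first run": ({0: 3432, 200: 2563, 500: 5}, 89.930332927),
}

for name, (codes, seconds) in runs.items():
    ratio, throughput = summarize(codes, seconds)
    print(f"{name}: success {ratio:.2f}%, throughput {throughput:.2f}/s")
```

Both results agree with the reported 37.57% / 25.05 and 42.72% / 28.50, so the large `0:` counts are requests that never got a response, most likely because they hit the roughly 30 s limit visible in the latency maxima.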
>
>
> On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:
>
> > Hello
> >
> > I can as of now see two changes:
> >
> > 1.  -Xmx
> >
> > 2. +ExplicitGCInvokesConcurrent
> >
> > Deepak
> >
> >
> > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> > <do...@adzuna.com.invalid> wrote:
> >
> > > CLI invocation for 8.3.1 is
> > > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > -XX:+AlwaysPreTouch
> > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > -Dcom.sun.management.jmxremote
> > > -Dcom.sun.management.jmxremote.local.only=false
> > > -Dcom.sun.management.jmxremote.ssl=false
> > > -Dcom.sun.management.jmxremote.authenticate=false
> > > -Dcom.sun.management.jmxremote.port=18983
> > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > > -jar start.jar --module=http
> > >
> > > I believe the key items are:
> > > -XX:+AlwaysPreTouch
> > > -XX:+ParallelRefProcEnabled
> > > -XX:+PerfDisableSharedMem
> > > -XX:+UseG1GC
> > > -XX:+UseLargePages
> > > -XX:MaxGCPauseMillis=250
> > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > -Xmx15826m
> > > -Xss256k
> > >
> > > And for 8.9.0 is
> > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > > -Dcom.sun.management.jmxremote
> > > -Dcom.sun.management.jmxremote.local.only=false
> > > -Dcom.sun.management.jmxremote.ssl=false
> > > -Dcom.sun.management.jmxremote.authenticate=false
> > > -Dcom.sun.management.jmxremote.port=18983
> > > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > > -XX:-OmitStackTraceInFastThrow
> > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > >
> > > Key:
> > > -XX:+AlwaysPreTouch
> > > -XX:+ExplicitGCInvokesConcurrent
> > > -XX:+ParallelRefProcEnabled
> > > -XX:+PerfDisableSharedMem
> > > -XX:+UseG1GC
> > > -XX:+UseLargePages
> > > -XX:-OmitStackTraceInFastThrow
> > > -XX:MaxGCPauseMillis=250
> > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > > -Xmx7913m
> > > -Xss256k
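
The two "key" flag lists above can be compared mechanically rather than by eye. The following is a minimal sketch using plain set differences; the flag strings are copied from the two lists, and the variable names are only illustrative.

```python
# Key JVM flags as listed for each version (copied from the two lists above).
GC_LOG = "-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M"

flags_831 = {
    "-XX:+AlwaysPreTouch",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:MaxGCPauseMillis=250",
    GC_LOG,
    "-Xmx15826m",
    "-Xss256k",
}

flags_890 = {
    "-XX:+AlwaysPreTouch",
    "-XX:+ExplicitGCInvokesConcurrent",
    "-XX:+ParallelRefProcEnabled",
    "-XX:+PerfDisableSharedMem",
    "-XX:+UseG1GC",
    "-XX:+UseLargePages",
    "-XX:-OmitStackTraceInFastThrow",
    "-XX:MaxGCPauseMillis=250",
    "-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs",
    GC_LOG,
    "-Xmx7913m",
    "-Xss256k",
}

only_831 = sorted(flags_831 - flags_890)
only_890 = sorted(flags_890 - flags_831)

print("only in 8.3.1:", only_831)
print("only in 8.9.0:", only_890)
```

Besides the heap size and `-XX:+ExplicitGCInvokesConcurrent`, the comparison also surfaces `-XX:-OmitStackTraceInFastThrow` and the `-XX:OnOutOfMemoryError` hook on the 8.9.0 side. The full command lines differ in a few `-D` properties as well, which this comparison of the key lists does not cover.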
> > >
> > > Xmx values are based on the instance RAM; currently they're running on two
> > > different instance types, but we see the same behaviour when they're on
> > > identical types too.
> > >
> > > Many thanks
> > >
> > > Dominic
> > >
> > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> > >
> > > > Hello
> > > >
> > > > Can you please tell us the JVM heap settings for both versions, 8.3.1
> > > > and 8.9.0?
> > > >
> > > > I will also have to look into the following code:
> > > > FileFloatSource.java:210 (will do it tonight, IST, and update).
> > > >
> > > > Deepak
> > > >
> > > >
> > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > > <do...@adzuna.com.invalid> wrote:
> > > >
> > > > > Oh, that's very helpful to know about, ty
> > > > >
> > > > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > > > waiting on the same thing:
> > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > > > >
> > > > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > > > queries to the instance and waiting didn't result in any noticeable
> > > > > > decrease in the number of threads so it looks like despite being timed,
> > > > > > they're simply not getting terminated.
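
Counting threads by state and by the object they wait on, as described above, can be scripted once a dump is available as text. The sketch below assumes the layout printed by the JDK's `jstack` tool (a `java.lang.Thread.State:` line per thread and `- waiting to lock` / `- parking to wait for` lines naming the lock class). The sample dump is hand-written for illustration and is not taken from this thread; `tally` is a hypothetical helper.

```python
import re
from collections import Counter

# Hand-written sample in jstack-like layout; NOT real output from this thread.
SAMPLE_DUMP = '''\
"qtp-example-1" #101 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
        - waiting to lock <0x00000000aaaa0001> (a org.apache.solr.search.function.FileFloatSource$CreationPlaceholder)

"qtp-example-2" #102 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
        - waiting to lock <0x00000000aaaa0001> (a org.apache.solr.search.function.FileFloatSource$CreationPlaceholder)

"qtp-example-3" #103 prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
        at jdk.internal.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000bbbb0002> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
'''

STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")
LOCK_RE = re.compile(
    r"- (?:waiting to lock|parking to wait for|waiting on)\s+<[^>]+> \(a ([^)]+)\)"
)


def tally(dump_text):
    """Count threads per state, and per (state, class of the awaited object)."""
    states = Counter()
    waits = Counter()
    state = None
    for line in dump_text.splitlines():
        match = STATE_RE.search(line)
        if match:
            state = match.group(1)
            states[state] += 1
            continue
        match = LOCK_RE.search(line)
        if match and state is not None:
            waits[(state, match.group(1))] += 1
    return states, waits


states, waits = tally(SAMPLE_DUMP)
for state, count in states.most_common():
    print(state, count)
for (state, cls), count in waits.most_common():
    print(f"{count} x {state} on {cls}")
```

Running the same tally on dumps taken during and after a test would show whether the TIMED_WAITING population shrinks over time or stays constant.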
> > > > >
> > > > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > > > test results in many new threads immediately coming into being, this time
> > > > > > with a higher proportion of threads BLOCKED on
> > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > > - See second screenshot. The stack trace for those is too big for one
> > > > > > screen so here's the output:
> > > > >
> > > > > > qtp178604517-861 (861)
> > > > > >
> > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > > >
> > > > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > > >
> > > > > [image: image.png]
> > > > > [image: image.png]
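
The BLOCKED frames in the trace above all end in `FileFloatSource$Cache.get`, waiting on a `CreationPlaceholder` object. That is consistent with a common cache pattern in which the first caller for a key installs a placeholder and loads the value while holding the placeholder's lock, so every later caller for the same key blocks until the load finishes. The sketch below illustrates that pattern only; it is not Solr's code, and `PlaceholderCache`, `_Placeholder`, `slow_load` and the key name are hypothetical.

```python
import threading
import time


class _Placeholder:
    """Stands in for a value that is still being created."""

    def __init__(self):
        self.lock = threading.Lock()
        self.ready = False
        self.value = None


class PlaceholderCache:
    def __init__(self, loader):
        self._loader = loader
        self._entries = {}
        self._map_lock = threading.Lock()

    def get(self, key):
        # Short critical section: find the entry or install a placeholder.
        with self._map_lock:
            entry = self._entries.get(key)
            if entry is None:
                entry = self._entries[key] = _Placeholder()
        # Long critical section: the first caller loads the value, and every
        # other caller for the same key blocks here until it is done.
        with entry.lock:
            if not entry.ready:
                entry.value = self._loader(key)
                entry.ready = True
            return entry.value


load_calls = []


def slow_load(key):
    load_calls.append(key)
    time.sleep(0.2)  # stands in for an expensive load, e.g. parsing a large file
    return [1.0, 2.0, 3.0]


cache = PlaceholderCache(slow_load)
results = []


def request():
    results.append(cache.get("boost_field"))


threads = [threading.Thread(target=request) for _ in range(8)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

print(f"{len(results)} requests served, loader ran {len(load_calls)} time(s), "
      f"{elapsed:.2f}s elapsed")
```

With this shape, a slow load right after a restart holds up every request that needs the same key, and a server that starts a thread per incoming request will accumulate blocked threads for as long as the load takes.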
> > > > >
> > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <jo...@gmail.com> wrote:
> > > > >
> > > > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > > > >> what all those threads are doing and where they are getting stuck. You can
> > > > > >> post parts of the thread dump back to this email thread as well.
> > > > >>
> > > > >>
> > > > >>
> > > > >> Joel Bernstein
> > > > >> http://joelsolr.blogspot.com/
> > > > >>
> > > > >>
> > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> > > > >> <do...@adzuna.com.invalid> wrote:
> > > > >>
> > > > > >> > We run 8.3.1 in prod without any problems, but we're having issues with
> > > > > >> > trying to upgrade.
> > > > > >> >
> > > > > >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> > > > > >> > and am testing it via replaying requests made to prod. We're seeing a big
> > > > > >> > problem where fairly moderate request rates are causing the instance to
> > > > > >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> > > > > >> > creating threads:
> > > > > >> >
> > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
> > > > > >> > pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
> > > > > >> > 0k, detached.
> > > > > >> >
> > > > > >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> > > > > >> > java.lang.OutOfMemoryError: unable to create native thread: possibly out of
> > > > > >> > memory or process/resource limits reached
> > > > > >> >
> > > > > >> > So I monitored thread count for the process whilst running the test suite
> > > > > >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> > > > > >> > flooded with errors as it tried to create still more threads, and the
> > > > > >> > instance slowed down until terminated as unhealthy.
> > > > > >> >
> > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> > > > > >> > regardless of value the result is the same: it gets maxed and everything
> > > > > >> > slows down.
> > > > > >> >
> > > > > >> > Is there anything I can do to stop solr spinning up so many threads it
> > > > > >> > ceases to function? There have been a few test passes where it
> > > > > >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> > > > > >> > longer, but there seems no pattern to when this happens. Running the same
> > > > > >> > tests on 8.3.1 results in a much slower increase in threads and it never
> > > > > >> > quite maxes them so things continue to function.
> > > > > >> >
> > > > > >> > See below for the thread count and healthcheck times seen on a (fairly
> > > > > >> > harsh) test run of 100 requests/sec
> > > > > >> >
> > > > > >> > Thanks
> > > > > >> >
> > > > > >> > Dominic
> > > > > >> >
> > > > > >> >
> > > > > >> > Threadcount:
> > > > > >> >
> > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar'
> > > > > >> > | wc -l; sleep 10s; done
> > > > >> > Tue Oct 12 14:27:33 UTC 2021
> > > > >> > 52
> > > > >> > Tue Oct 12 14:27:43 UTC 2021
> > > > >> > 52
> > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > >> > 52
> > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > >> > 52
> > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > >> > 569
> > > > >> > Tue Oct 12 14:28:24 UTC 2021
> > > > >> > 899
> > > > >> > Tue Oct 12 14:28:34 UTC 2021
> > > > >> > 1198
> > > > >> > Tue Oct 12 14:28:44 UTC 2021
> > > > >> > 1589
> > > > >> > Tue Oct 12 14:28:54 UTC 2021
> > > > >> > 2016
> > > > >> > Tue Oct 12 14:29:05 UTC 2021
> > > > >> > 2451
> > > > >> > Tue Oct 12 14:29:15 UTC 2021
> > > > >> > 2851
> > > > >> > Tue Oct 12 14:29:26 UTC 2021
> > > > >> > 2934
> > > > >> > Tue Oct 12 14:29:36 UTC 2021
> > > > >> > 3249
> > > > >> > Tue Oct 12 14:29:46 UTC 2021
> > > > >> > 3501
> > > > >> > Tue Oct 12 14:29:57 UTC 2021
> > > > >> > 3734
> > > > >> > Tue Oct 12 14:30:07 UTC 2021
> > > > >> > 4128
> > > > >> > Tue Oct 12 14:30:18 UTC 2021
> > > > >> > 4374
> > > > >> > Tue Oct 12 14:30:29 UTC 2021
> > > > >> > 4637
> > > > >> > Tue Oct 12 14:30:39 UTC 2021
> > > > >> > 4693
> > > > >> > Tue Oct 12 14:30:50 UTC 2021
> > > > >> > 4807
> > > > >> > Tue Oct 12 14:31:01 UTC 2021
> > > > >> > 4916
> > > > >> > Tue Oct 12 14:31:11 UTC 2021
> > > > >> > 4916
> > > > >> > Tue Oct 12 14:31:22 UTC 2021
> > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > >> >
> > > > >> >
> > > > >> > Healthcheck:
> > > > >> >
> > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep
> > > > >> > 10s; done
> > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:27:34 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:27:44 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:27:54 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:28:04 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:28:14 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:28:16 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:28:26 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:12
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:28:39 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:28:49 UTC 2021
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01
> > > > --:--:--
> > > > >> >   0> GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:23
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:29:13 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:29:23 UTC 2021
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01
> > > > --:--:--
> > > > >> >   0> GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:29:25 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:29:35 UTC 2021
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03
> > > > --:--:--
> > > > >> >   0> GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:09
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:29:44 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:29:54 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:11
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:30:06 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:30:16 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:30:20 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:30:30 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02
> > > > --:--:--
> > > > >> >   0< HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:30:33 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:30:43 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:30:53 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > Tue Oct 12 14:30:55 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:31:05 UTC 2021
> > > > >> > ----
> > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > >> > > GET /solr/ HTTP/1.1
> > > > >> > < HTTP/1.1 200 OK
> > > > >> > Tue Oct 12 14:31:15 UTC 2021
> > > > >> > ----
> > > > >> > Connection to 10.40.22.166 closed by remote host.
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Yep - I edited the two instances to remove those differences and re-ran a
fairly torturous testing cycle - 100 requests/sec.

Interestingly, 8.9.0 outperformed 8.3.1 significantly at this point, and
neither hit the thread limit nor failed to create new threads. But both
scaled up to well above 2k threads, and even after stopping the test the
threads remained in a timed wait condition. So I'm still thinking there's a
problem with threads not getting terminated somewhere. Test output below if
it's of use:

8.3.1
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25
Requests      [total, rate, throughput]  6000, 100.02, 25.05
Duration      [total, attack, wait]      1m29.97089157s, 59.990075461s, 29.980816109s
Latencies     [mean, 50, 95, 99, max]    11.011900024s, 5.175300857s, 30.00091013s, 30.001077592s, 30.014343222s
Bytes In      [total, mean]              42385846, 7064.31
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    37.57%
Status Codes  [code:count]               0:3741  200:2254  500:5
Error Set:
500 Server Error
Wed Oct 13 12:20:27 UTC 2021
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57
Requests      [total, rate, throughput]  6000, 100.02, 10.98
Duration      [total, attack, wait]      1m29.931457331s, 59.990859187s, 29.940598144s
Latencies     [mean, 50, 95, 99, max]    7.74211412s, 57.37µs, 30.000885208s, 30.00105703s, 30.011268185s
Bytes In      [total, mean]              21287225, 3547.87
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    16.45%
Status Codes  [code:count]               0:5007  200:987  500:6
Error Set:
500 Server Error
Wed Oct 13 12:21:58 UTC 2021
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27
Requests      [total, rate, throughput]  6000, 100.02, 11.68
Duration      [total, attack, wait]      1m29.58141878s, 59.990945585s, 29.590473195s
Latencies     [mean, 50, 95, 99, max]    9.05339281s, 58.742µs, 30.000880433s, 30.001052307s, 30.004624884s
Bytes In      [total, mean]              17434533, 2905.76
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    17.43%
Status Codes  [code:count]               0:4874  200:1046  500:80
Error Set:
500 Server Error
context deadline exceeded (Client.Timeout or context cancellation while reading body)
Wed Oct 13 12:23:29 UTC 2021
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58
Requests      [total, rate, throughput]  6000, 100.02, 1.70
Duration      [total, attack, wait]      1m29.541198374s, 59.990833778s, 29.550364596s
Latencies     [mean, 50, 95, 99, max]    8.155874514s, 51.073µs, 30.000892751s, 30.001051441s, 30.009856373s
Bytes In      [total, mean]              2211649, 368.61
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    2.53%
Status Codes  [code:count]               0:5810  200:152  500:38
Error Set:
500 Server Error
Wed Oct 13 12:24:59 UTC 2021
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29
Requests      [total, rate, throughput]  6000, 100.02, 0.73
Duration      [total, attack, wait]      1m29.451307081s, 59.990601937s, 29.460705144s
Latencies     [mean, 50, 95, 99, max]    7.257427504s, 50.866µs, 30.000885386s, 30.001023192s, 30.009992128s
Bytes In      [total, mean]              837808, 139.63
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    1.08%
Status Codes  [code:count]               0:5907  200:65  500:28
Error Set:
500 Server Error
context deadline exceeded (Client.Timeout or context cancellation while reading body)
Wed Oct 13 12:26:30 UTC 2021
upload: ./831_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00
Requests      [total, rate, throughput]  6000, 100.02, 1.65
Duration      [total, attack, wait]      1m29.760671622s, 59.990482467s, 29.770189155s
Latencies     [mean, 50, 95, 99, max]    8.288506559s, 50.424µs, 30.000894904s, 30.00104114s, 30.016506845s
Bytes In      [total, mean]              1973103, 328.85
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    2.47%
Status Codes  [code:count]               0:5821  200:148  500:31
Error Set:
500 Server Error
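
For anyone reading these reports: the Success and throughput figures can be
recomputed from the status-code counts. The sketch below is only a sanity
check, assuming that a "success" is any response with a 2xx/3xx code and that
throughput is successful responses divided by the total duration. The counts
and the duration are copied from the first 8.3.1 report above; the function
name is made up for illustration.

```python
def summarize(status_counts, total_seconds):
    """Return (success %, successful responses per second), rounded to 2 places.

    Assumption: a response counts as a success when its code is in 200..399.
    Code 0 is how the report lists requests that got no HTTP response at all.
    """
    total = sum(status_counts.values())
    ok = sum(n for code, n in status_counts.items() if 200 <= code < 400)
    success_pct = 100.0 * ok / total
    throughput = ok / total_seconds
    return round(success_pct, 2), round(throughput, 2)


# First 8.3.1 run: Status Codes 0:3741 200:2254 500:5, total duration 1m29.97089157s
first_831_run = {0: 3741, 200: 2254, 500: 5}
success_pct, throughput = summarize(first_831_run, 89.97089157)
print(f"success {success_pct}%  throughput {throughput}/s")
```

Under those assumptions the result agrees with the 37.57% and 25.05 shown in
that report, so the low throughput is down to the large number of code-0
(no response) requests rather than to slow successful ones.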


8.9.0
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27
Requests      [total, rate, throughput]  6000, 100.02, 28.50
Duration      [total, attack, wait]      1m29.930332927s, 59.990809535s, 29.939523392s
Latencies     [mean, 50, 95, 99, max]    10.958871049s, 5.184437078s, 30.000885573s, 30.001088926s, 30.00946393s
Bytes In      [total, mean]              51095634, 8515.94
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    42.72%
Status Codes  [code:count]               0:3432  200:2563  500:5
Error Set:
500 Server Error
Wed Oct 13 12:20:28 UTC 2021
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58
Requests      [total, rate, throughput]  6000, 100.02, 38.15
Duration      [total, attack, wait]      1m29.951119527s, 59.990310608s, 29.960808919s
Latencies     [mean, 50, 95, 99, max]    10.089597212s, 4.856237339s, 30.000863563s, 30.001063417s, 30.013413816s
Bytes In      [total, mean]              64455304, 10742.55
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    57.20%
Status Codes  [code:count]               0:2565  200:3432  500:3
Error Set:
500 Server Error
Wed Oct 13 12:22:00 UTC 2021
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30
Requests      [total, rate, throughput]  6000, 100.02, 31.12
Duration      [total, attack, wait]      1m29.971513653s, 59.990882146s, 29.980631507s
Latencies     [mean, 50, 95, 99, max]    7.982882635s, 4.086936886s, 30.000611417s, 30.001022921s, 30.00949772s
Bytes In      [total, mean]              49838902, 8306.48
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    46.67%
Status Codes  [code:count]               0:3011  200:2800  500:4  503:185
Error Set:
500 Server Error
unexpected EOF
503 Service Unavailable
Wed Oct 13 12:23:31 UTC 2021
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01
Requests      [total, rate, throughput]  6000, 100.02, 30.21
Duration      [total, attack, wait]      1m29.871026977s, 59.990066513s, 29.880960464s
Latencies     [mean, 50, 95, 99, max]    9.187948603s, 5.166973696s, 30.000723538s, 30.001043637s, 30.009522646s
Bytes In      [total, mean]              52508950, 8751.49
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    45.25%
Status Codes  [code:count]               0:3118  200:2715  500:4  503:163
Error Set:
500 Server Error
unexpected EOF
503 Service Unavailable
Wed Oct 13 12:25:02 UTC 2021
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33
Requests      [total, rate, throughput]  6000, 100.02, 28.33
Duration      [total, attack, wait]      1m29.931899376s, 59.990583341s, 29.941316035s
Latencies     [mean, 50, 95, 99, max]    10.643567102s, 4.009658048s, 30.000888301s, 30.00107338s, 30.010319307s
Bytes In      [total, mean]              51869384, 8644.90
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    42.47%
Status Codes  [code:count]               0:3448  200:2548  500:4
Error Set:
500 Server Error
context deadline exceeded (Client.Timeout or context cancellation while reading body)
Wed Oct 13 12:26:34 UTC 2021
upload: ./890_tests_results.bin to s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03
Requests      [total, rate, throughput]  6000, 100.02, 40.46
Duration      [total, attack, wait]      1m29.540373624s, 59.990568262s, 29.549805362s
Latencies     [mean, 50, 95, 99, max]    9.938485443s, 5.811191827s, 30.000761953s, 30.001041833s, 30.014534484s
Bytes In      [total, mean]              64483027, 10747.17
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    60.38%
Status Codes  [code:count]               0:2372  200:3623  500:5
Error Set:
500 Server Error
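
One way to check whether the idle threads ever get reaped is to take thread
dumps a few minutes apart after the test stops and compare the per-state
counts. A minimal sketch, assuming the dumps are saved in the text format
printed by the JDK's jstack tool, where each thread is followed by a
"java.lang.Thread.State: ..." line. The sample dump and the thread names in
it are invented for illustration and are not taken from these instances.

```python
import re
from collections import Counter

# Matches the per-thread state line of a jstack-style dump, e.g.
#   java.lang.Thread.State: TIMED_WAITING (parking)
STATE_LINE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def tally_states(dump_text):
    """Count threads per java.lang.Thread.State in a jstack-style dump."""
    return Counter(STATE_LINE.findall(dump_text))


# Invented sample dump (hypothetical thread names), just to exercise the parser.
SAMPLE_DUMP = """\
"qtp-example-1" #31 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-2" #32 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-3" #33 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"qtp-example-4" #34 prio=5
   java.lang.Thread.State: RUNNABLE
"""

counts = tally_states(SAMPLE_DUMP)
for state, n in counts.most_common():
    print(f"{state:15} {n}")
```

If the TIMED_WAITING count in two dumps taken some minutes apart stays flat
with no traffic, that would support the suspicion that the threads are not
being terminated.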



On Wed, 13 Oct 2021 at 13:06, Deepak Goel <de...@gmail.com> wrote:

> Hello
>
> I can as of now see two changes:
>
> 1.  -Xmx
>
> 2. +ExplicitGCInvokesConcurrent
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > CLI invocation for 8.3.1 is
> > java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > -XX:+AlwaysPreTouch
> > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > -Dcom.sun.management.jmxremote
> > -Dcom.sun.management.jmxremote.local.only=false
> > -Dcom.sun.management.jmxremote.ssl=false
> > -Dcom.sun.management.jmxremote.authenticate=false
> > -Dcom.sun.management.jmxremote.port=18983
> > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> > -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> > -jar start.jar --module=http
> > I believe the key items are:
> > -XX:+AlwaysPreTouch
> > -XX:+ParallelRefProcEnabled
> > -XX:+PerfDisableSharedMem
> > -XX:+UseG1GC
> > -XX:+UseLargePages
> > -XX:MaxGCPauseMillis=250
> > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > -Xmx15826m
> > -Xss256k
> >
> > And for 8.9.0 is
> > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> > -Dcom.sun.management.jmxremote
> > -Dcom.sun.management.jmxremote.local.only=false
> > -Dcom.sun.management.jmxremote.ssl=false
> > -Dcom.sun.management.jmxremote.authenticate=false
> > -Dcom.sun.management.jmxremote.port=18983
> > -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > -XX:-OmitStackTraceInFastThrow
> > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> > -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> > Key:
> > -XX:+AlwaysPreTouch
> > -XX:+ExplicitGCInvokesConcurrent
> > -XX:+ParallelRefProcEnabled
> > -XX:+PerfDisableSharedMem
> > -XX:+UseG1GC
> > -XX:+UseLargePages
> > -XX:-OmitStackTraceInFastThrow
> > -XX:MaxGCPauseMillis=250
> > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > -Xmx7913m
> > -Xss256k
> >
> > Xmx values are based on the instance RAM, currently they're running on two
> > different instance types but we see the same behaviour when they're on
> > identical types too.
> >
> > Many thanks
> >
> > Dominic
> >
> > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:
> >
> > > Hello
> > >
> > > Can you please tell us the JVM Heap Setting for both the versions: 8.3.1,
> > > 8.9.0?
> > >
> > > I will also have to look into the following code: FileFloatSource.java:210.
> > > (will do it tonite-IST and update)
> > >
> > > Deepak
> > >
> > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> > > <do...@adzuna.com.invalid> wrote:
> > >
> > > > Oh, that's very helpful to know about, ty
> > > >
> > > > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > > > waiting on the same thing:
> > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> > > >
> > > > I've attached a screenshot which includes the stack trace. Stopping all
> > > > queries to the instance and waiting didn't result in any noticeable
> > > > decrease in the number of threads so it looks like despite being timed,
> > > > they're simply not getting terminated.
> > > >
> > > > Restarting the service takes me back down to just 53 threads; re-running a
> > > > test results in many new threads immediately coming into being, this time
> > > > with a higher proportion of threads BLOCKED on
> > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > > - See second screenshot. The stack trace for those is too big for one
> > > > screen so here's the output:
> > > >
> > > > qtp178604517-861 (861)
> > > >
> > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > > >
> > > >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> > > >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> > > >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> > > >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> > > >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> > > >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> > > >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> > > >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> > > >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> > > >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> > > >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> > > >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> > > >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> > > >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> > > >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> > > >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> > > >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> > > >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> > > >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> > > >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> > > >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > > >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> > > >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> > > >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> > > >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> > > >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> > > >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> > > >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> > > >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> > > >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> > > >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> > > >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> > > >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> > > >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> > > >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> > > >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> > > >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> > > >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> > > >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> > > >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> > > >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> > > >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> > > >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> > > >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> > > >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> > > >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> > > >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> > > >
> > > > [image: image.png]
> > > > [image: image.png]
> > > >
> > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <jo...@gmail.com> wrote:
> > > >
> > > >> There is a thread dump on the Solr admin. You can use that to determine
> > > >> what all those threads are doing and where they are getting stuck. You can
> > > >> post parts of the thread dump back to this email thread as well.
> > > >>
> > > >>
> > > >>
> > > >> Joel Bernstein
> > > >> http://joelsolr.blogspot.com/
> > > >
> > >
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
Hello

So far I can see two differences between the two invocations:

1. -Xmx (15826m on 8.3.1, 7913m on 8.9.0)

2. -XX:+ExplicitGCInvokesConcurrent (set only on 8.9.0)

Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> CLI invocation for 8.3.1 is
> java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> -XX:+AlwaysPreTouch
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.port=18983
> -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
> -Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
> -jar start.jar --module=http
> I believe the key items are:
> -XX:+AlwaysPreTouch
> -XX:+ParallelRefProcEnabled
> -XX:+PerfDisableSharedMem
> -XX:+UseG1GC
> -XX:+UseLargePages
> -XX:MaxGCPauseMillis=250
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> -Xmx15826m
> -Xss256k
>
> And for 8.9.0 is
> java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
> -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
> -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.port=18983
> -Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
> -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> -XX:-OmitStackTraceInFastThrow
> -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> -Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
> -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
> -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
> -Dlog4j.configurationFile=/srv/solr/log4j2.xml
> -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
> Key:
> -XX:+AlwaysPreTouch
> -XX:+ExplicitGCInvokesConcurrent
> -XX:+ParallelRefProcEnabled
> -XX:+PerfDisableSharedMem
> -XX:+UseG1GC
> -XX:+UseLargePages
> -XX:-OmitStackTraceInFastThrow
> -XX:MaxGCPauseMillis=250
> -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
> -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> -Xmx7913m
> -Xss256k
>
> Xmx values are based on the instance RAM, currently they're running on two
> different instance types but we see the same behaviour when they're on
> identical types too.
>
> Many thanks
>
> Dominic
>

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
CLI invocation for 8.3.1 is
java -server -Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem
-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
-XX:+AlwaysPreTouch
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=18983
-Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
-Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
-Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
-Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
-Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
-Dlog4j.configurationFile=file:/srv/solr/log4j2.xml
-Dsolr.disable.shardsWhitelist=true -Xss256k -Dsolr.jetty.https.port=8983
-jar start.jar --module=http
I believe the key items are:
-XX:+AlwaysPreTouch
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:MaxGCPauseMillis=250
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx15826m
-Xss256k

And for 8.9.0 is
java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem
-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages
-XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=18983
-Dcom.sun.management.jmxremote.rmi.port=18983 -Dsolr.log.dir=/srv/solr/logs
-Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
-XX:-OmitStackTraceInFastThrow
-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
-Djetty.home=/usr/local/solr/server -Dsolr.solr.home=/srv/solr/data
-Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr
-Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf
-Dlog4j.configurationFile=/srv/solr/log4j2.xml
-Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar --module=http
Key:
-XX:+AlwaysPreTouch
-XX:+ExplicitGCInvokesConcurrent
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:-OmitStackTraceInFastThrow
-XX:MaxGCPauseMillis=250
-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs
-Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xmx7913m
-Xss256k

Xmx values are based on the instance RAM. They're currently running on two
different instance types, but we see the same behaviour when they're on
identical types too.
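
For anyone comparing the two invocations above by eye, a small script can do it mechanically. This is only a sketch in Python: the helper name jvm_flag_diff is made up for illustration, and it looks only at tokens starting with -X (so the -D system properties are ignored, as are the two space-separated arguments that follow the OnOutOfMemoryError script path). The flag strings are copied from the command lines in this message.

```python
def jvm_flag_diff(old_cmd: str, new_cmd: str):
    """Return (only_in_old, only_in_new) for -X / -XX style tokens.

    Hypothetical helper for illustration; it splits on whitespace and
    keeps only tokens that start with "-X".
    """
    def jvm_flags(cmd: str) -> set:
        return {tok for tok in cmd.split() if tok.startswith("-X")}

    old, new = jvm_flags(old_cmd), jvm_flags(new_cmd)
    return sorted(old - new), sorted(new - old)


# JVM options taken from the 8.3.1 and 8.9.0 command lines in this message.
FLAGS_831 = (
    "-Xmx15826m -XX:+UseG1GC -XX:+PerfDisableSharedMem "
    "-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages "
    "-XX:+AlwaysPreTouch -Xss256k"
)
FLAGS_890 = (
    "-Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem "
    "-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages "
    "-XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent "
    "-XX:-OmitStackTraceInFastThrow "
    "-XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 /srv/solr/logs "
    "-Xss256k"
)

only_old, only_new = jvm_flag_diff(FLAGS_831, FLAGS_890)
print("only on 8.3.1:", only_old)
print("only on 8.9.0:", only_new)
```

Note that -Xss256k appears on both sides, so the thread stack size is not one of the differences.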

Many thanks

Dominic

On Wed, 13 Oct 2021 at 12:07, Deepak Goel <de...@gmail.com> wrote:

> Hello
>
> Can you please tell us the JVM Heap Setting for both the versions: 8.3.1,
> 8.9.0?
>
> I will also have to look into the following code: FileFloatSource.java:210.
> (will do it tonite-IST and update)
>
> Deepak
>
>
> On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
> <do...@adzuna.com.invalid> wrote:
>
> > Oh, that's very helpful to know about, ty
> >
> > The overwhelming majority appear to be threads in TIMED_WAITING, all
> > waiting on the same thing:
> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
> >
> > I've attached a screenshot which includes the stack trace. Stopping all
> > queries to the instance and waiting didn't result in any noticeable
> > decrease in the number of threads so it looks like despite being timed,
> > they're simply not getting terminated.
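
A tally of thread states can make "overwhelming majority" concrete and easy to compare between runs. The following is a minimal sketch in Python, assuming the dump is saved as text in the usual jstack layout, where each thread has a line of the form "java.lang.Thread.State: STATE (detail)". The function name and the sample dump are hypothetical, not taken from the instance described here.

```python
import re
from collections import Counter

# Matches the per-thread state line that jstack-style dumps contain.
STATE_LINE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text: str) -> Counter:
    """Count how many threads are in each state in a text thread dump."""
    return Counter(STATE_LINE.findall(dump_text))


# Hypothetical sample; a real dump would come from jstack or a saved file.
SAMPLE_DUMP = """\
"qtp-example-1" #101 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-2" #102 prio=5
   java.lang.Thread.State: TIMED_WAITING (parking)
"qtp-example-3" #103 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"main" #1 prio=5
   java.lang.Thread.State: RUNNABLE
"""

states = count_thread_states(SAMPLE_DUMP)
for state, n in states.most_common():
    print(f"{n:5d}  {state}")
```

Running it on dumps taken before, during and after a test pass would show whether the TIMED_WAITING count ever drops once queries stop.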
> >
> > Restarting the service takes me back down to just 53 threads; re-running a
> > test results in many new threads immediately coming into being, this time
> > with a higher proportion of threads BLOCKED on
> > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> > - See second screenshot. The stack trace for those is too big for one
> > screen so here's the output:
> >
> > qtp178604517-861 (861)
> >
> > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> >
> >    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
> >    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
> >    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
> >    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
> >    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
> >    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
> >    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
> >    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> >    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> >    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
> >    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
> >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
> >    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> >    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
> >    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
> >    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
> >    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
> >    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
> >    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> >    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
> >    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
> >    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
> >    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
> >    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
> >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> >    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
> >    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
> >    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
> >    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
> >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> >    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> >    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> >    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
> >    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> >    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
> >    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
> >    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> >    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> >    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> >    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> >    - org.eclipse.jetty.server.Server.handle(Server.java:516)
> >    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
> >    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
> >    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
> >    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
> >    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
> >    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
> >    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
> >    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
> >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> >    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> >    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
> >    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
> >    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
> >    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
> >
> > [image: image.png]
> > [image: image.png]
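
The trace shows request threads BLOCKED on a FileFloatSource$CreationPlaceholder inside Cache.get. One general caching pattern that would produce this picture is a per-key placeholder: the first caller for a key loads the value while holding the placeholder's lock, and every other caller for that key waits on the same lock until the load finishes. The sketch below illustrates that general pattern in Python. It is an assumption about the shape of the problem, not a copy of Solr's code, and PlaceholderCache, slow_loader and the "boost_field" key are hypothetical names.

```python
import threading
import time


class _Placeholder:
    """Holds one cache entry plus the lock that guards its creation."""

    def __init__(self):
        self.lock = threading.Lock()
        self.loaded = False
        self.value = None


class PlaceholderCache:
    def __init__(self, loader):
        self._loader = loader
        self._map_lock = threading.Lock()   # guards the dict only
        self._entries = {}
        self.load_calls = 0

    def get(self, key):
        # Short critical section: find or create the placeholder.
        with self._map_lock:
            entry = self._entries.get(key)
            if entry is None:
                entry = _Placeholder()
                self._entries[key] = entry
        # Long critical section: while the first caller loads the value,
        # every other caller for this key is blocked right here.
        with entry.lock:
            if not entry.loaded:
                entry.value = self._loader(key)
                entry.loaded = True
                self.load_calls += 1
        return entry.value


def slow_loader(key):
    time.sleep(0.2)          # stands in for an expensive load
    return f"values-for-{key}"


cache = PlaceholderCache(slow_loader)
results = []
results_lock = threading.Lock()


def worker():
    value = cache.get("boost_field")
    with results_lock:
        results.append(value)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("loads:", cache.load_calls, "callers served:", len(results))
```

With this shape, a slow or frequently repeated load means each incoming request parks one more thread behind the same lock, which would be consistent with the thread count climbing under steady request rates.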
> >
> > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <jo...@gmail.com> wrote:
> >
> >> There is a thread dump on the Solr admin. You can use that to determine
> >> what all those threads are doing and where they are getting stuck. You can
> >> post parts of the thread dump back to this email thread as well.
> >>
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
> >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries
> >> <do...@adzuna.com.invalid> wrote:
> >>
> >> > We run 8.3.1 in prod without any problems, but we're having issues with
> >> > trying to upgrade.
> >> >
> >> > I've created an 8.9.0 leader & follower, imported our live data into it,
> >> > and am testing it via replaying requests made to prod. We're seeing a big
> >> > problem where fairly moderate request rates are causing the instance to
> >> > become so slow it fails healthcheck. The logs showed a lot of errors around
> >> > creating threads:
> >> >
> >> > solr[4507]: [124136.511s][warning][os,thread] Failed to start thread -
> >> > pthread_create failed (EAGAIN) for attributes: stacksize: 256k, guardsize:
> >> > 0k, detached.
> >> >
> >> > WARN  (qtp178604517-3891) [   ] o.e.j.i.ManagedSelector  =>
> >> > java.lang.OutOfMemoryError: unable to create native thread: possibly out of
> >> > memory or process/resource limits reached
> >> >
> >> > So I monitored thread count for the process whilst running the test suite
> >> > and saw a persistent pattern: Threads increased until maxed out, the logs
> >> > flooded with errors as it tried to create still more threads, and the
> >> > instance slowed down until terminated as unhealthy.
> >> >
> >> > The DefaultTasksMax is set to 4915, I've tried raising and lowering it but
> >> > regardless of value the result is the same: it gets maxed and everything
> >> > slows down.
> >> >
> >> > Is there anything I can do to stop solr spinning up so many threads it
> >> > ceases to function? There have been a few test passes where it
> >> > spontaneously dropped threadcount from thousands to hundreds and stayed up
> >> > longer, but there seems no pattern to when this happens. Running the same
> >> > tests on 8.3.1 results in a much slower increase in threads and it never
> >> > quite maxes them so things continue to function.
> >> >
> >> > See below for the thread count and healthcheck times seen on a (fairly
> >> > harsh) test run of 100 requests/sec
> >> >
> >> > Thanks
> >> >
> >> > Dominic
> >> >
> >> >
> >> > Threadcount:
> >> >
> >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps -eLF | grep 'start.jar'
> >> > | wc -l; sleep 10s; done
> >> > Tue Oct 12 14:27:33 UTC 2021
> >> > 52
> >> > Tue Oct 12 14:27:43 UTC 2021
> >> > 52
> >> > Tue Oct 12 14:27:54 UTC 2021
> >> > 52
> >> > Tue Oct 12 14:28:04 UTC 2021
> >> > 52
> >> > Tue Oct 12 14:28:14 UTC 2021
> >> > 569
> >> > Tue Oct 12 14:28:24 UTC 2021
> >> > 899
> >> > Tue Oct 12 14:28:34 UTC 2021
> >> > 1198
> >> > Tue Oct 12 14:28:44 UTC 2021
> >> > 1589
> >> > Tue Oct 12 14:28:54 UTC 2021
> >> > 2016
> >> > Tue Oct 12 14:29:05 UTC 2021
> >> > 2451
> >> > Tue Oct 12 14:29:15 UTC 2021
> >> > 2851
> >> > Tue Oct 12 14:29:26 UTC 2021
> >> > 2934
> >> > Tue Oct 12 14:29:36 UTC 2021
> >> > 3249
> >> > Tue Oct 12 14:29:46 UTC 2021
> >> > 3501
> >> > Tue Oct 12 14:29:57 UTC 2021
> >> > 3734
> >> > Tue Oct 12 14:30:07 UTC 2021
> >> > 4128
> >> > Tue Oct 12 14:30:18 UTC 2021
> >> > 4374
> >> > Tue Oct 12 14:30:29 UTC 2021
> >> > 4637
> >> > Tue Oct 12 14:30:39 UTC 2021
> >> > 4693
> >> > Tue Oct 12 14:30:50 UTC 2021
> >> > 4807
> >> > Tue Oct 12 14:31:01 UTC 2021
> >> > 4916
> >> > Tue Oct 12 14:31:11 UTC 2021
> >> > 4916
> >> > Tue Oct 12 14:31:22 UTC 2021
> >> > Connection to 10.40.22.166 closed by remote host.
> >> >
> >> >
> >> > Healthcheck:
> >> >
> >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl -v
> >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo '----'; sleep
> >> > 10s; done
> >> > Tue Oct 12 14:27:34 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:27:34 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:27:44 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:27:44 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:27:54 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:27:54 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:28:04 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:28:04 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:28:14 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:28:16 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:28:26 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:12
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:28:39 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:28:49 UTC 2021
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01
> --:--:--
> >> >   0> GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:23
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:29:13 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:29:23 UTC 2021
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:01
> --:--:--
> >> >   0> GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:29:25 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:29:35 UTC 2021
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03
> --:--:--
> >> >   0> GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:09
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:29:44 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:29:54 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:11
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:30:06 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:30:16 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:03
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:30:20 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:30:30 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> >   0     0    0     0    0     0      0      0 --:--:--  0:00:02
> --:--:--
> >> >   0< HTTP/1.1 200 OK
> >> > Tue Oct 12 14:30:33 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:30:43 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:30:43 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:30:53 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > Tue Oct 12 14:30:55 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:31:05 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:31:05 UTC 2021
> >> > ----
> >> > Tue Oct 12 14:31:15 UTC 2021
> >> > > GET /solr/ HTTP/1.1
> >> > < HTTP/1.1 200 OK
> >> > Tue Oct 12 14:31:15 UTC 2021
> >> > ----
> >> > Connection to 10.40.22.166 closed by remote host.
> >> >
> >>
> >
>

Re: Solr keeps creating new threads until it dies

Posted by Deepak Goel <de...@gmail.com>.
Hello

Can you please tell us the JVM heap settings for both versions, 8.3.1 and
8.9.0?

I will also look into the code at FileFloatSource.java:210 tonight (IST)
and post an update.

Deepak


On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries
<do...@adzuna.com.invalid> wrote:

> Oh, that's very helpful to know about, ty
>
> The overwhelming majority appear to be threads in TIMED_WAITING, all
> waiting on the same
> thing: java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb
>
> I've attached a screenshot which includes the stack trace. Stopping all
> queries to the instance and waiting didn't result in any noticeable
> decrease in the number of threads so it looks like despite being timed,
> they're simply not getting terminated.
>
> Restarting the service takes me back down to just 53 threads; re-running a
> test results in many new threads immediately coming into being, this time
> with a higher proportion of threads BLOCKED on
> org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
> - See second screenshot. The stack trace for those is too big for one
> screen so here's the output:
>
> qtp178604517-861 (861)
>
>
> org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
>
>    - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
>    - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
>    - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
>    - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
>    - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
>    - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
>    - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
>    - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
>    - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
>    - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
>    - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
>    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
>    - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
>    - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
>    - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
>    - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
>    - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
>    - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
>    - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
>    - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
>    - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
>    - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
>    - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
>    - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
>    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
>    - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
>    - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
>    - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>    - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
>    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>    - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
>    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>    - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
>    - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
>    - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
>    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
>    - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
>    - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
>    - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
>    - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
>    - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>    - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
>    - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
>    - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
>    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>    - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
>    - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>    - org.eclipse.jetty.server.Server.handle(Server.java:516)
>    - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
>    - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
>    - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
>    - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
>    - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
>    - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
>    - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
>    - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
>    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
>    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
>    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
>    - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
>    - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
>    - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
>    - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
>    - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)
>
> [image: image.png]
> [image: image.png]

Re: Solr keeps creating new threads until it dies

Posted by Dominic Humphries <do...@adzuna.com.INVALID>.
Oh, that's very helpful to know about, thank you.

The overwhelming majority of threads are in TIMED_WAITING, all waiting on
the same object:
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb

I've attached a screenshot which includes the stack trace. Stopping all
queries to the instance and waiting didn't noticeably reduce the number of
threads, so it looks like, despite the waits being timed, the threads are
simply not getting terminated.

Restarting the service takes me back down to just 53 threads. Re-running a
test results in many new threads being created immediately, this time with
a higher proportion of threads BLOCKED on
org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de
(see second screenshot). The stack trace for those is too big for one
screen, so here's the output:

qtp178604517-861 (861)

org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de

   - org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210)
   - org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158)
   - org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97)
   - org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203)
   - org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261)
   - org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224)
   - org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
   - org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
   - org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
   - org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
   - org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
   - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
   - org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
   - org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211)
   - org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705)
   - org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408)
   - org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
   - org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500)
   - org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
   - org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369)
   - org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
   - org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
   - org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794)
   - org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567)
   - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
   - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
   - org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
   - org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
   - org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
   - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
   - org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
   - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
   - org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
   - org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
   - org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
   - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
   - org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
   - org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
   - org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
   - org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
   - org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
   - org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
   - org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
   - org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
   - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   - org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
   - org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   - org.eclipse.jetty.server.Server.handle(Server.java:516)
   - org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
   - org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown Source)
   - org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
   - org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
   - org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
   - org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
   - org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
   - org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
   - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
   - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
   - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
   - org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
   - org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
   - org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
   - org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
   - java.base@11.0.5/java.lang.Thread.run(Thread.java:834)

[image: image.png]
[image: image.png]
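
For anyone wanting to repeat the tally of thread states without counting by
eye, here is a minimal sketch. It assumes the dump was captured as text with
`jstack <pid>` (that format differs from the admin UI output pasted above),
and the function name `summarize` is purely illustrative. It counts threads
per state and, where the dump names a lock or condition, counts how many
threads are waiting on each class.

```python
import re
from collections import Counter

# Matches the per-thread state line printed by jstack, e.g.
#    java.lang.Thread.State: TIMED_WAITING (parking)
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)")

# Matches the lock annotations jstack prints under a frame, e.g.
#    - waiting to lock <0x...> (a some.Class$Inner)
#    - parking to wait for  <0x...> (a some.Class$Inner)
LOCK_RE = re.compile(
    r"^\s*-\s+(?:waiting to lock|parking to wait for|waiting on)"
    r"\s+<[^>]+>\s+\(a ([^)]+)\)"
)


def summarize(dump_text):
    """Return (states, waits): thread counts per state and per awaited class."""
    states = Counter()
    waits = Counter()
    for line in dump_text.splitlines():
        state = STATE_RE.match(line)
        if state:
            states[state.group(1)] += 1
            continue
        lock = LOCK_RE.match(line)
        if lock:
            waits[lock.group(1)] += 1
    return states, waits

# Example use (file name is an assumption):
#   states, waits = summarize(open("solr-jstack.txt").read())
#   print(states.most_common())
#   print(waits.most_common(5))
```

Comparing the two counters between an 8.3.1 run and an 8.9.0 run should make
it easier to see whether the BLOCKED share really differs.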

On Wed, 13 Oct 2021 at 00:03, Joel Bernstein <jo...@gmail.com> wrote:

> There is a thread dump on the Solr admin. You can use that to determine
> what all those threads are doing and where they are getting stuck. You can
> post parts of the thread dump back to this email thread as well.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/

Re: Solr keeps creating new threads until it dies

Posted by Joel Bernstein <jo...@gmail.com>.
The Solr Admin UI includes a thread dump. You can use it to see what all
those threads are doing and where they are getting stuck. You can also
post parts of the thread dump back to this email thread.
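
If the Admin UI is too slow to load while the instance is struggling, a
command-line dump can give a similar picture. Below is a minimal sketch,
assuming the JDK's jstack tool is available on the host; the function name
summarise_threads is made up for illustration. It reads a thread dump on
stdin and counts threads by name with the trailing numeric id stripped, so
a runaway pool (such as the qtp threads in the original report) stands out.

```shell
# Summarise thread names from a JVM thread dump read on stdin.
# Assumption: the input looks like jstack output, where each thread's
# header line starts with its name in double quotes.
summarise_threads() {
  grep '^"' \
    | sed -E 's/^"([^"]*)".*/\1/' \
    | sed -E 's/-[0-9]+$//' \
    | sort | uniq -c | sort -rn
}
```

It could be used as `jstack <solr-pid> | summarise_threads`, where
<solr-pid> is a placeholder for the Solr process id. The stack traces for
the largest group are usually the part worth posting.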



Joel Bernstein
http://joelsolr.blogspot.com/

