Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2018/08/08 13:26:40 UTC

7.2.1 Solr collection sluggish

Hello,

We've got, again, a little mystery here. Our main text collection has been running at a snail's pace since very early Monday morning, when the monitoring graph for response time went up. This is not unusual for Solr, so the JVMs were all restarted. That always fixes a sluggish collection, but not this time. They were restarted again yesterday, with no change. The VMs Solr is running on were rebooted today, also with no change.

Not all queries are slow all the time. A random query is just slow sometimes, or sometimes most of the time. All 6 replicas are sometimes slow.

We also took a good look at our monitoring: JVM heap was normal, IO was normal, and CPU was normal until the first restart. Since the first restart CPU usage has been erratic, not worryingly off the charts, just not 'normal' as usual.

No changes were made to the collection for days before it became sluggish.

CPU sampling with VisualVM is not helpful either. Nothing really stands out, especially when I compare it to another cluster that is still healthy. GC is also normal.

So, any ideas out here?

Many thanks,
Markus


Re: 7.2.1 Solr collection sluggish

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/8/2018 7:26 AM, Markus Jelsma wrote:
> We also took a good look at our monitoring: JVM heap was normal, IO was normal, and CPU was normal until the first restart. Since the first restart CPU usage has been erratic, not worryingly off the charts, just not 'normal' as usual.

I've seen systems with severe performance issues where the user did not
see anything out of the ordinary for these metrics.  Sometimes this is
because they do not know what to look for.  What exactly does "normal"
mean to you?

> No changes were made to the collection for days before it became sluggish.
>
> CPU sampling with VisualVM is not helpful either. Nothing really stands out, especially when I compare it to another cluster that is still healthy. GC is also normal.
>
> So, any ideas out here?

Here are the initial questions for a performance issue, to see whether
it's related to available memory or not:

* What OS is it running on?
* How much memory does the server have?
* How much index data is being handled by all Solr instances on that
machine?
* What is the total size of all Solr heaps on that machine?
* Is there any other software besides Solr on the machine?
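
As a rough illustration of how the answers to these questions fit
together, here is a minimal Python sketch.  It assumes (this is not
stated in the list above) that whatever memory is not claimed by the
Solr heaps or by other software is what the OS has left for caching
index data.  The function names and the numbers are hypothetical, and
the result is a back-of-the-envelope figure, not a diagnosis.

```python
def page_cache_headroom_gb(total_ram_gb, solr_heaps_gb, other_software_gb=0.0):
    """Memory (GB) left for the OS disk cache after all heaps and other software."""
    return total_ram_gb - sum(solr_heaps_gb) - other_software_gb


def cacheable_fraction(index_size_gb, headroom_gb):
    """Rough fraction of the index that could fit in the OS disk cache (0.0 to 1.0)."""
    if index_size_gb <= 0:
        return 1.0
    return max(0.0, min(1.0, headroom_gb / index_size_gb))


# Hypothetical numbers, for illustration only (not taken from this thread).
headroom = page_cache_headroom_gb(
    total_ram_gb=64, solr_heaps_gb=[8, 8], other_software_gb=2
)
print(f"left for disk cache: {headroom} GB")
print(f"index fraction that could be cached: {cacheable_fraction(120, headroom):.2f}")
```

A small fraction does not prove that memory is the problem, but it is
a hint that the memory questions deserve a closer look.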

If the OS is Linux or another POSIX operating system that has the GNU
version of "top" installed, then the following information is
*extremely* helpful, and can answer most of the questions asked above:

Run the "top" program.  Don't use htop or some other variant; it must be
the actual program named "top", and it should be the version of that
program from the GNU project so that GNU keyboard shortcuts work.

Press shift-M to sort the listing by resident memory size.  If your
version of top is not from the GNU project, this might not work ... but
this is an extremely important step in these instructions, so if you
don't have GNU top, you should see if you can get your version to sort
by the resident memory column, descending.
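
If the sort cannot be made to work in top at all, a rough stand-in for
the sorted process list can be sketched in Python.  This assumes a
"ps" that accepts "-eo pid,rss,comm" and reports RSS in KiB, as the
Linux procps version does; the function names are hypothetical.  It
only reproduces the per-process ordering, not the memory summary that
top prints above the list, so it is not a replacement for the
screenshot.

```python
import subprocess


def parse_ps_rss(ps_output):
    """Parse `ps -eo pid,rss,comm` text into (pid, rss_kib, command) tuples."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def top_by_resident_memory(rows, limit=10):
    """Return the `limit` largest processes by resident memory, descending."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]


def snapshot(limit=10):
    """Run ps once and return the largest processes by resident memory."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"], capture_output=True, text=True, check=True
    ).stdout
    return top_by_resident_memory(parse_ps_rss(out), limit)


if __name__ == "__main__":
    for pid, rss_kib, command in snapshot():
        print(f"{pid:>8} {rss_kib / 1024:>10.1f} MiB  {command}")
```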

Grab a screenshot of the top listing and share it via a file-sharing
website.  Dropbox is usually a good choice.

If you're running Solr on Windows, you can use the program named
"Resource Monitor" to get something very similar.  In that program,
click on the Memory tab, click the "Working Set" column until it's
sorted descending, and grab a screenshot.  If necessary, expand the
columns so all the numbers can be seen clearly.

Thanks,
Shawn