Posted to solr-user@lucene.apache.org by zoolette <ga...@gmail.com> on 2018/10/16 12:04:36 UTC

Device I/O trouble with solr 7.5

Hi folks,

We are currently running Solr 6.6 in our production environment.
At the end of August, I planned to upgrade Solr to 7.4 (now 7.5, which has
been released since), but I ran into some trouble.
Our master Solr is replicated to a slave Solr. I tried to upgrade the
replica first, and that is the one giving me trouble.
It is a shared server, split between a MySQL replication server and the
Solr replication server.
The server runs Debian 7 (wheezy) and Java 1.8.0_45.
The Solr Java heap is configured with an Xmx value of 12G.

There are 6 cores on this Solr instance:
- 2 cores are dedicated to the main search on 2 different websites (20 GB
each)
- 2 cores are dedicated to the autocompletion feature of these 2 websites
(~2 GB each)
- 2 other, very small cores are occasionally used by one of the websites

The Solr 7.5 instance is up and ready, but no traffic is sent to it yet.
Of the 2 websites, one generates approximately 5000 to 8000
requests / minute on Solr across 2 handlers.
One search handler is dedicated to complex searches from the search bar, and
the other handles back-end searches, such as returning the document for a
specified id or returning the chained documents, that kind of thing.

The second website uses the same handlers as the first one; the only
difference is that it generates fewer Solr requests: 1000 to 2000 requests
/ minute.

To upgrade the master, I need to send all the Solr traffic to this instance.
I first redirected the bigger website. The response time grew a lot, but Solr
stabilized quickly. After 10 minutes, as all was OK, I redirected the
website with the lower traffic rate. Immediately, the number of Java
processes increased quickly, the device busy graph in munin rose to 100%
(read operations), and the server's load average grew drastically, reaching
120. Solr then stopped responding.

I tried this several times; sometimes it happens immediately, sometimes
after 10 minutes.
I don't really understand what's going on.
For this upgrade, I also changed the basic field types from Trie fields to
Point fields, but I don't think that makes a difference.

What is most puzzling is that everything works fine in Solr 6.6: I can
switch all the traffic without any issue.

Does anybody have an idea of what could be going wrong? The Debian version?
The Java version? A configuration problem?

Thanks for your helpful answers,
Regards,

Sébastien

Re: Device I/O trouble with solr 7.5

Posted by zoolette <ga...@gmail.com>.
Hi Shawn,
Thanks for your quick answer.
I know it's not ideal to have MySQL and Solr on the same server. The setup
is fine for replication, but it is actually too limited to handle a high
query rate.
And of course I didn't see the warning about Point fields and performance
in the documentation until you pointed it out, my bad.
So I will first try going back to Trie fields and see if I can switch
traffic without any issue.
If not, I will consider splitting my server into one for MySQL and one for
Solr.

Regards,
Sébastien

On Tue, Oct 16, 2018 at 14:29, Shawn Heisey <ap...@elyograg.org> wrote:


Re: Device I/O trouble with solr 7.5

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/16/2018 6:04 AM, zoolette wrote:
> We are currently running Solr 6.6 in our production environment.
> At the end of August, I planned to upgrade Solr to 7.4 (now 7.5, which has
> been released since), but I ran into some trouble.
> Our master Solr is replicated to a slave Solr. I tried to upgrade the
> replica first, and that is the one giving me trouble.
> It is a shared server, split between a MySQL replication server and the
> Solr replication server.
> The server runs Debian 7 (wheezy) and Java 1.8.0_45.
> The Solr Java heap is configured with an Xmx value of 12G.
>
> There are 6 cores on this Solr instance:
> - 2 cores are dedicated to the main search on 2 different websites (20 GB
> each)
> - 2 cores are dedicated to the autocompletion feature of these 2 websites
> (~2 GB each)
> - 2 other, very small cores are occasionally used by one of the websites

With about 45GB of index data and a 12GB heap, ideal performance is 
going to require 64GB of total memory -- and that's if Solr is the only 
software on the machine.  You might be able to get good performance with 
less memory, but your query rates are very high, so that's probably not 
a good idea.

Adding MySQL, if the databases are of any significant size, could 
require significantly more memory.
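
To make the sizing above concrete, here is a rough sketch of the arithmetic
in Python. The 1 GB figure for the two small cores is an assumption (the
original post only calls them very small), and the function names and the
list of common RAM sizes are illustrative. The idea is that the heap plus
the full index should fit in memory, so that the OS disk cache can hold all
of the index data.

```python
# Rough memory estimate for a dedicated Solr host: JVM heap plus enough
# free RAM for the OS disk cache to hold the whole index.
INDEX_SIZES_GB = {
    "search_site_1": 20,
    "search_site_2": 20,
    "autocomplete_site_1": 2,
    "autocomplete_site_2": 2,
    "small_cores_total": 1,  # assumption: the post only says "very small"
}
HEAP_GB = 12  # the Xmx value from the original post


def ideal_ram_gb(index_sizes_gb, heap_gb):
    """Heap plus full index size: the 'everything cached' ideal."""
    return sum(index_sizes_gb.values()) + heap_gb


def next_common_ram_size(required_gb, sizes=(16, 32, 64, 128, 256)):
    """Smallest common RAM configuration that covers the requirement."""
    for size in sizes:
        if size >= required_gb:
            return size
    raise ValueError("requirement exceeds the listed sizes")


if __name__ == "__main__":
    needed = ideal_ram_gb(INDEX_SIZES_GB, HEAP_GB)
    print(f"index + heap: {needed} GB, next common size: "
          f"{next_common_ram_size(needed)} GB")
```

This does not account for MySQL or anything else running on the machine,
which would come on top.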

> The Solr 7.5 instance is up and ready, but no traffic is sent to it yet.
> Of the 2 websites, one generates approximately 5000 to 8000
> requests / minute on Solr across 2 handlers.
> One search handler is dedicated to complex searches from the search bar, and
> the other handles back-end searches, such as returning the document for a
> specified id or returning the chained documents, that kind of thing.
>
> The second website uses the same handlers as the first one; the only
> difference is that it generates fewer Solr requests: 1000 to 2000 requests
> / minute.

As I mentioned above, this is a significant query rate. Handling that 
with only two servers will require a LOT of memory for caching 
purposes, and you'll want the servers to be dedicated to Solr -- not 
running MySQL as well.
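
For a sense of scale, the per-minute figures from the original post can be
converted to queries per second. This is only a sketch of the arithmetic;
the dictionary keys and function names are made up for illustration.

```python
# Request rates from the original post, as (low, high) requests per minute.
SITE_RATES_PER_MINUTE = {
    "site_1": (5000, 8000),
    "site_2": (1000, 2000),
}


def per_second(requests_per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return requests_per_minute / 60.0


def combined_range_qps(rates):
    """Sum the low and high ends across sites, in queries per second."""
    low = sum(low_rate for low_rate, _ in rates.values())
    high = sum(high_rate for _, high_rate in rates.values())
    return per_second(low), per_second(high)


if __name__ == "__main__":
    low_qps, high_qps = combined_range_qps(SITE_RATES_PER_MINUTE)
    print(f"combined load: {low_qps:.0f} to {high_qps:.0f} queries/second")
```

With both sites pointed at one instance, that is roughly 100 to 167 queries
per second on average, before any peaks.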

> To upgrade the master, I need to send all the Solr traffic to this instance.
> I first redirected the bigger website. The response time grew a lot, but Solr
> stabilized quickly. After 10 minutes, as all was OK, I redirected the
> website with the lower traffic rate. Immediately, the number of Java
> processes increased quickly, the device busy graph in munin rose to 100%
> (read operations), and the server's load average grew drastically, reaching
> 120. Solr then stopped responding.

A high load average often means that there's a lot of disk I/O, and 
processes are spending a lot of time waiting for that I/O. On Linux, run 
the "top" program and look for the iowait percentage, sometimes 
abbreviated "wa".  This should be as close to zero as you can get it.  
Even a small number in iowait can cause major performance issues.  For 
Solr, whenever Solr must actually read the disk (instead of reading 
index data from memory -- the OS disk cache) performance is going to be 
terrible.

https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
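
If watching "top" by hand is inconvenient, the same iowait figure can be
sampled from /proc/stat. The following is a minimal sketch, assuming a Linux
host; the function names and the 5-second interval are illustrative choices.
It reads the aggregate "cpu" line twice and reports the share of elapsed CPU
time spent waiting on I/O between the two samples.

```python
import time


def parse_cpu_line(line):
    """Parse 'cpu user nice system idle iowait ...' into a list of ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def read_cpu_times(path="/proc/stat"):
    """Return the aggregate CPU counters from the first line of /proc/stat."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [new - old for old, new in zip(before, after)]
    # Guest time is already counted in user/nice, so only the first
    # eight fields (user .. steal) make up the total.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval_seconds=5.0):
    """Take two samples interval_seconds apart and return iowait in percent."""
    before = read_cpu_times()
    time.sleep(interval_seconds)
    return iowait_percent(before, read_cpu_times())
```

Calling sample_iowait() while redirecting traffic should show whether the
load spike lines up with time spent waiting on the disk.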

> For this upgrade, I also changed the basic field types from Trie fields to
> Point fields, but I don't think that makes a difference.

Trie fields have really good performance for both range queries and 
single value lookups.  Point fields have better performance for range 
queries, but absolutely terrible performance for field:value (single 
value lookup) queries.
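
To find out which fields are affected, one option is to ask the Schema API
which fields use a Point-based type. This is a sketch, not a tested tool:
it assumes the read-only schema endpoints are reachable on the instance, and
the base URL and core name passed in are placeholders. Fields it lists that
are also used in field:value lookups would be the ones to review.

```python
import json
import urllib.request


def fetch_json(url):
    """GET a URL and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def point_field_names(field_types, fields):
    """Names of fields whose type is backed by a *PointField class."""
    point_types = {
        field_type["name"]
        for field_type in field_types
        if field_type.get("class", "").endswith("PointField")
    }
    return sorted(
        field["name"] for field in fields if field.get("type") in point_types
    )


def find_point_fields(base_url, core):
    """Query the schema of one core and list its Point-typed fields.

    base_url and core are placeholders, for example
    "http://localhost:8983/solr" and the name of one of the cores.
    """
    types = fetch_json(f"{base_url}/{core}/schema/fieldtypes")["fieldTypes"]
    fields = fetch_json(f"{base_url}/{core}/schema/fields")["fields"]
    return point_field_names(types, fields)
```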

> What is most puzzling is that everything works fine in Solr 6.6: I can
> switch all the traffic without any issue.
>
> Does anybody have an idea of what could be going wrong? The Debian version?
> The Java version? A configuration problem?

Best guess is one (or both) of these problems:
1) Limitations of Point field types
2) Not enough memory.

Another possibility, which I think is less likely but I can't rule out 
with the info I have, is that a 12GB heap is big enough for 6.6, but not 
quite big enough for the same indexes on 7.x.  Making the heap larger 
would answer that question.

Thanks,
Shawn


Re: Device I/O trouble with solr 7.5

Posted by Toke Eskildsen <to...@kb.dk>.
On Tue, 2018-10-16 at 14:04 +0200, zoolette wrote:
> The Solr 7.5 instance is up and ready, but no traffic is sent to it yet.
> Of the 2 websites, one generates approximately 5000 to 8000
> requests / minute on Solr across 2 handlers.
> One search handler is dedicated to complex searches from the search bar,
> and the other handles back-end searches, such as returning the document
> for a specified id or returning the chained documents, that kind of thing.

I am currently working on a DocValues performance regression in Solr 7,
where one of the symptoms is a lot of read activity (LUCENE-8374).

With that in mind, could you tell me

* How many documents you have in your index?
* Whether you use stored or docValues for the fields that you retrieve
  as part of the search result?
* If you perform heavy faceting, grouping or stats?

Maybe provide a sample query, if you are able?
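
For the document count, a match-all query with rows=0 returns the total in
numFound without fetching any documents. Here is a small sketch of that; the
base URL and core name are placeholders for whatever the instance uses, and
the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def count_query_url(base_url, core):
    """Build a match-all query that returns no rows, only the hit count."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url}/{core}/select?{params}"


def num_found(select_response):
    """Extract the total hit count from a parsed /select JSON response."""
    return select_response["response"]["numFound"]


def count_documents(base_url, core):
    """Ask one core how many documents it holds.

    base_url and core are placeholders, for example
    "http://localhost:8983/solr" and the name of one of the cores.
    """
    url = count_query_url(base_url, core)
    with urllib.request.urlopen(url, timeout=10) as response:
        return num_found(json.load(response))
```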

Thanks,
Toke Eskildsen, Royal Danish Library