Posted to solr-user@lucene.apache.org by "Rudenko, Artur" <Ar...@verint.com> on 2020/02/11 11:17:05 UTC

Possible performance issue in my environment setup

I am currently investigating a performance issue in our environment (20M large PARENT documents and 800M small nested CHILD documents). The system inserts about 400K PARENT documents and 16M CHILD documents per day.
This is a SolrCloud 8.3 environment with 7 servers (64 vCPUs and 128 GB RAM each, 24 GB allocated to Solr) and a single collection (32 shards, replication factor 2).

Relevant Solr config:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
  <maxDocs>${solr.autoCommit.maxDocs:50000}</maxDocs>
  <openSearcher>true</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
</autoSoftCommit>

I found the following line in the Solr log:

[2020-02-10T00:01:00.522] INFO [qtp1686100174-100525] org.apache.solr.search.SolrIndexSearcher Opening [Searcher@37c9205b[0_shard29_replica_n112] realtime]

In a log of 100K records, this record appears 65K times.

We are experiencing extremely slow query times, while indexing is fast enough for our needs.

Is this a reasonable direction to keep investigating? If so, any advice?


Thanks,
Artur Rudenko


This electronic message may contain proprietary and confidential information of Verint Systems Inc., its affiliates and/or subsidiaries. The information is intended to be for the use of the individual(s) or entity(ies) named above. If you are not the intended recipient (or authorized to receive this e-mail for the intended recipient), you may not use, copy, disclose or distribute to anyone this message or any information contained in this message. If you have received this electronic message in error, please notify us by replying to this e-mail.

RE: Possible performance issue in my environment setup

Posted by "Rudenko, Artur" <Ar...@verint.com>.
Thanks for helping, I will keep investigating.

Just to note, we did stop indexing and we did not see any significant change.

Artur Rudenko
Analytics Developer
Customer Engagement Solutions, VERINT
T +972.74.747.2536 | M +972.52.425.4686


Re: Possible performance issue in my environment setup

Posted by Erick Erickson <er...@gmail.com>.
My first bit of advice would be to fix your autocommit intervals. There’s not much point
in having openSearcher set to true _and_ having your soft commit times also set; all
soft commit does is open a searcher, and your autocommit already does that.

I’d also reduce the time for autoCommit. You’re _probably_ being saved by the
maxDocs entry.

The fix here is to set openSearcher=false in autoCommit and reduce the time, and let
soft commit handle opening searchers. Here’s more than you want to know about how
all this works:

https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Given your observation that you see a new searcher being opened
65K times, my bet is that you’re somehow committing far, far too
often. What’s the rate of opening new searchers? Do those 65K
entries span an hour? 10 days? Either you’re sending 50K docs very
frequently or your client is sending commits.
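
One rough way to answer the rate question is to pull the timestamps out of the log itself. The sketch below is only an illustration: it assumes every relevant line has the same shape as the single "Opening [Searcher@...]" line quoted in the original post, the function name searcher_open_stats is made up for this example, and the second sample line is invented to give the code something to measure.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line shape, taken from the one log line quoted in the thread:
# [timestamp] LEVEL [thread] ...SolrIndexSearcher Opening [Searcher@id[core] ...]
OPEN_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\].*SolrIndexSearcher Opening "
    r"\[Searcher@\w+\[(?P<core>[^\]]+)\]"
)

def searcher_open_stats(lines):
    """Return (opens per core, opens per minute over the observed span).

    The rate is None when fewer than two matching lines are found or when
    all matches share the same timestamp.
    """
    per_core = Counter()
    stamps = []
    for line in lines:
        m = OPEN_RE.match(line.strip())
        if not m:
            continue  # not a searcher-open record
        per_core[m.group("core")] += 1
        stamps.append(datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f"))
    if len(stamps) < 2:
        return per_core, None
    span_minutes = (max(stamps) - min(stamps)).total_seconds() / 60
    rate = len(stamps) / span_minutes if span_minutes > 0 else None
    return per_core, rate

# First line is the one from the thread; the second is a made-up follow-up
# 30 seconds later; the third is an unrelated, made-up record.
SAMPLE_LOG = [
    "[2020-02-10T00:01:00.522] INFO [qtp1686100174-100525] "
    "org.apache.solr.search.SolrIndexSearcher Opening "
    "[Searcher@37c9205b[0_shard29_replica_n112] realtime]",
    "[2020-02-10T00:01:30.522] INFO [qtp1686100174-100526] "
    "org.apache.solr.search.SolrIndexSearcher Opening "
    "[Searcher@1a2b3c4d[0_shard29_replica_n112] realtime]",
    "[2020-02-10T00:01:31.000] INFO [qtp1686100174-100527] some.other.Logger hello",
]

per_core, rate = searcher_open_stats(SAMPLE_LOG)
print(per_core.most_common(5), rate)
```

Run over the real log, the per-core counts show whether a few replicas account for most of the 65K entries, and the rate shows whether they span an hour or ten days.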

So here’s what I’d do as a quick-n-dirty triage of where to look first:

- First, turn off indexing. Does your query performance improve? If so, consider autowarming and tuning your commit interval.

- Next, add &debug=timing to some of your queries. That’ll tell you if a particular _component_ is taking a long time, such as faceting, say.

- If nothing jumps out, throw a profiler at Solr to see where it’s spending its time.
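
For the debug=timing step, a small helper can build the request and rank components by time spent. This is a sketch under assumptions rather than a definitive recipe: the base URL and collection name are placeholders, the helper names are hypothetical, and the sample response is invented. It assumes the usual JSON layout of the timing section, with per-component "time" entries under "prepare" and "process".

```python
from urllib.parse import urlencode

def timing_query_url(base_url, collection, params):
    """Build a /select URL with debug=timing (and wt=json) added to params."""
    query = dict(params, debug="timing", wt="json")
    return f"{base_url}/{collection}/select?{urlencode(query)}"

def slowest_components(response, top=3):
    """Rank search components by prepare + process time, slowest first."""
    timing = response["debug"]["timing"]
    totals = {}
    for phase in ("prepare", "process"):
        for name, value in timing.get(phase, {}).items():
            if isinstance(value, dict):  # skips the phase-level "time" total
                totals[name] = totals.get(name, 0.0) + value.get("time", 0.0)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Placeholder host and collection name; substitute your own.
url = timing_query_url("http://localhost:8983/solr", "mycollection", {"q": "*:*"})

# Invented response fragment, shaped like the assumed timing section.
SAMPLE_RESPONSE = {
    "debug": {
        "timing": {
            "time": 950.0,
            "prepare": {"time": 5.0, "query": {"time": 4.0}, "facet": {"time": 1.0}},
            "process": {"time": 945.0, "query": {"time": 120.0}, "facet": {"time": 825.0}},
        }
    }
}

print(url)
print(slowest_components(SAMPLE_RESPONSE))
```

In the invented sample, faceting dominates; on a real response the same ranking points at the component worth looking at first.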

Best,
Erick
