Posted to users@solr.apache.org by Modassar Ather <mo...@gmail.com> on 2022/03/26 08:49:06 UTC

High CPU utilisation on Solr-8.11.0

Hi,

We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are the
details of Solr installation.

Server : EC2 instance with 32 CPUs and 512 GB RAM
Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
Ubuntu :Ubuntu 20.04.3 LTS
Java : openjdk 11.0.14
SolrCloud : 12 shards with a total index of 4+ TB. Each node has a 30 GB
max heap.
GC setting on Solr : G1GC
Solr query timeout : 5 minutes
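For reference, a node configured as described above would look roughly like
this in solr.in.sh (SOLR_HEAP and GC_TUNE are the stock solr.in.sh
variables; the values just mirror the description above, this is a sketch
rather than our exact file):

```shell
# Sketch of solr.in.sh settings matching the setup described above.
SOLR_HEAP="30g"          # 30 GB max heap per node
GC_TUNE="-XX:+UseG1GC"   # G1 collector, as noted above
```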

During testing we observed high CPU utilisation, and a few of the wildcard
queries are timing out. The same queries execute to completion on
Solr-6.5.1.
After tuning a few GC parameters the CPU utilisation came down, but it is
still high compared with Solr-6.5.1, and some wildcard queries are still
failing.

Kindly provide your suggestions.

Thanks,
Modassar

Re: High CPU utilisation on Solr-8.11.0

Posted by Modassar Ather <mo...@gmail.com>.
Hi,

I tried separate volumes on each instance and the results are still slow.
Requesting more rows in the search query causes the search time to
increase several-fold.
Both QTime and elapsed time increase for the GET_TOP_GROUPS request.
Following is the group field definition.

<fieldType name="string" class="solr.StrField" sortMissingLast="true"
stored="false" omitNorms="true"/>

I also tried GC tuning with some experimental VM options and noticed that
search speed improved a little with Provisioned IOPS SSD (io2) and 10000
IOPS, but it still does not match the speed of Solr-6.5.1 on EFS.

I have tried many options but am not getting the desired performance. I
have also validated the configurations and schema.
I am not sure what the reason for the slowness is, or whether I am missing
some configuration.

Kindly advise.

Thanks,
Modassar

On Fri, Apr 8, 2022 at 11:04 AM Modassar Ather <mo...@gmail.com>
wrote:

> [...]

Re: High CPU utilisation on Solr-8.11.0

Posted by Modassar Ather <mo...@gmail.com>.
Thanks, Walter, for your reply. Yes, it is the same disk shared across all
instances.

Thanks,
Modassar

On Fri, Apr 8, 2022 at 10:54 AM Walter Underwood <wu...@wunderwood.org>
wrote:

> [...]

Re: High CPU utilisation on Solr-8.11.0

Posted by Walter Underwood <wu...@wunderwood.org>.
Are you sharing the same disk volume on all instances? I would expect that to be slow and cause index corruption. Each instance should have its own disk volumes. I’m looking at this part of your config.

Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000 IOPS.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 7, 2022, at 10:07 PM, Modassar Ather <mo...@gmail.com> wrote:
> 
> > [...]


Re: High CPU utilisation on Solr-8.11.0

Posted by Modassar Ather <mo...@gmail.com>.
Hi,

I tried a few different GC settings and observed the following. The best
result came with the environment and GC settings below, but it is still
slower than the previous Solr-6.5.1 setup.

Total index : 4+ TB
Servers : 3 instances of x2gd.4xlarge systems each having 16 CPUs and 256
GB RAM.
Storage : Multi-attach EBS Volume. Provisioned IOPS SSD (io1) with 3000
IOPS.

GC settings
1. "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
-XX:+UseStringDeduplication -XX:MaxTenuringThreshold=8
-XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
-XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=55"

2. "-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g
-XX:MaxTenuringThreshold=8 -XX:+PerfDisableSharedMem
-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=100
-XX:InitiatingHeapOccupancyPercent=55"

The second GC setting works better in our environment, but even with it we
are seeing slowness, even for simple term queries.
Memory is not heavily utilised and CPU% is also not very high. The
slowness also increases when we fetch more rows per request.
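For reference, the second setting above would be applied through the
standard GC_TUNE variable in solr.in.sh, roughly like this (same flags as
listed above, just reformatted for the file):

```shell
# The second GC configuration above, as it would appear in solr.in.sh.
GC_TUNE="-XX:+UseG1GC -XX:InitialHeapSize=30g -XX:MaxHeapSize=30g \
  -XX:MaxTenuringThreshold=8 -XX:+PerfDisableSharedMem \
  -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=100 \
  -XX:InitiatingHeapOccupancyPercent=55"
```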

Please provide your inputs.

Thanks,
Modassar



On Mon, Mar 28, 2022 at 2:51 AM Shawn Heisey <ap...@elyograg.org> wrote:

> [...]

Re: High CPU utilisation on Solr-8.11.0

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/27/2022 1:20 PM, Modassar Ather wrote:
> Just to add one point, even the queries without the wildcards e.g. a
> boolean query or a query with 10000 ids ORed has also become slow and it is
> also taking more CPU and finally ending up taking more time.
> I understand this is due to many GC pauses so if we fine tune the GC
> settings the CPU utilisation should go down.

The GC settings that Solr 8.x comes with out of the box are already very 
good.  Disclaimer: They are very similar to settings that I came up with 
after some intense testing work.

If you are running a recent release of Java 11 or OpenJDK 11 and have a 
test environment, you could try Shenandoah.  My testing shows that this 
collector makes a significant difference in GC pause activity, but that 
throughput takes a definite hit.  If your indexing speed is sufficiently 
fast, you could give these settings a try in your solr.in.sh file:

GC_TUNE=" \
   -XX:+UseShenandoahGC \
   -XX:+AlwaysPreTouch \
   -XX:+PerfDisableSharedMem \
   -XX:+ParallelRefProcEnabled \
   -XX:+UseStringDeduplication \
   -XX:ParallelGCThreads=2 \
   -XX:+UseNUMA"

Note that UseNUMA will only make a difference if your server has more 
than one NUMA node.  But it will not harm anything if the server does 
not have it.
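One quick way to check on Linux is to count the NUMA nodes the kernel
exposes under sysfs (this path is standard Linux sysfs, nothing
Solr-specific):

```shell
# Count NUMA nodes exposed by the kernel; -XX:+UseNUMA only changes
# behavior when this prints a number greater than 1.
ls -d /sys/devices/system/node/node* 2>/dev/null | wc -l
```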

I mention indexing speed and throughput because indexing speed is where 
I noticed a decrease with Shenandoah.  Fully reindexing my dovecot 
install (about 150K messages) takes about 8 minutes with G1 and about 9 
minutes with Shenandoah.  GC analysis revealed a larger number of 
significantly smaller GC pauses, with the total pause time a little 
lower.  But my co-conspirator on Shenandoah testing (with a much larger 
index than mine) said that their re-indexing process failed to complete 
with Shenandoah, so if you have a test system you can try it on, I would 
recommend doing that before deploying to production.

My Shenandoah testing was done with OpenJDK 11.0.3, and that server now 
has 11.0.14, which in theory should be a lot more stable.

I also came up with some good CMS settings.  But I think the CMS 
collector has been deprecated, though I do not know what version of Java 
might ultimately remove it.

https://cwiki.apache.org/confluence/display/solr/ShawnHeisey

Thanks,
Shawn


Re: High CPU utilisation on Solr-8.11.0

Posted by Modassar Ather <mo...@gmail.com>.
Thanks for your replies.

Yes, adding more physical memory will help, but in the current situation
even the GC settings we have used may not be optimal. Can you please
provide some suggestions on GC settings?
We are also planning to add more shards and create smaller indexes per
shard on different, smaller EC2 instances. We will evaluate the memory
requirement and the search performance.

Just to add one point: even queries without wildcards, e.g. a boolean
query or a query with 10000 ORed ids, have also become slow, taking more
CPU and ultimately more time.
I understand this is due to many GC pauses, so if we fine-tune the GC
settings the CPU utilisation should go down.

I will try setting pf to an empty string and check the performance, and
also look at the "enableGraphQueries" setting. Will a custom word
delimiter filter that creates fewer tokens, as per our needs, help search
performance?

Thanks,
Modassar

On Mon, Mar 28, 2022 at 12:10 AM Michael Gibney <mi...@michaelgibney.net>
wrote:

> [...]

Re: High CPU utilisation on Solr-8.11.0

Posted by Michael Gibney <mi...@michaelgibney.net>.
I agree with Shawn about ideally wanting more memory for the OS.

That said, the WordDelimiterFilter config you sent aligns with my suspicion
that "graph phrase" issues are likely to explain the difference between 6.5
and 8.11. At query-time, WordDelimiterFilter (and also equally
WordDelimiterGraphFilter) both trigger "graph phrase" behavior on `pf`
(phrase fields), and in 6.5 these would I'm fairly certain have been
completely ignored.

So 6.5 as a point of comparison is unlikely to be helpful going forward,
since the "better performance" of 6.5 was a consequence of a bug that
caused `pf` "graph phrase" queries not being executed at all.

This mailing list exchange from June 2021 [1] should be helpful/relevant.
(Also note that wrt the issue you're encountering, there's no real
difference between WordDelimiterFilter and WordDelimiterGraphFilter).

[1] https://lists.apache.org/thread/kbjgztckqdody9859knq05swvx5xj20f

On Sun, Mar 27, 2022 at 11:51 AM Shawn Heisey <ap...@elyograg.org> wrote:

> [...]

Re: High CPU utilisation on Solr-8.11.0

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/27/2022 5:30 AM, Modassar Ather wrote:
> The wildcard queries are executed against the text data and yes there are a
> huge number of possible expansions of the wildcard query.
> All the 12 shards are on a single machine with 512 GB memory and each shard
> is started with SOLR_JAVA_MEM="-Xmx30g". So the 512 GB memory is shared by
> all the 12 shards.

I believe that my initial thought is correct -- you need more memory to 
handle 4TB of index data.  I'm talking about more memory available to 
the OS, not Solr.  This would have most likely been a problem in 6.x 
too, but I've seen situations where upgrading Solr can mean that 
insufficient memory is even more of a noticeable problem than it was in 
an older version.

Something you could try is increasing the heap size to 31g.  I wouldn't 
suggest going any higher unless you see evidence that you actually need 
more ... Java switches to 64-bit pointers at a heap size of 32GB, and you 
probably need to go to something like 48GB before things break even.  I 
actually don't expect going to a 31GB heap to make things better ... but 
if it does, then you might also be running into the other main problem 
mentioned on the wiki page -- a heap size that's too small.  That makes 
it so Java spends more time collecting garbage than it does running the 
application.

I didn't know about the things Michael mentioned regarding Solr not 
utilizing the full capability of WordDelimiterFilter and 
WordDelimiterGraphFilter in older versions.  Those filters tend to 
greatly increase cardinality, and apparently also increase heap memory 
utilization in recent Solr versions.

Thanks,
Shawn


Re: High CPU utilisation on Solr-8.11.0

Posted by Modassar Ather <mo...@gmail.com>.
Thank you all for the suggestions.

I will try to profile and find the bottleneck.

I am getting the following exception, which I understand may be due to the
multiterm field expansion for the wildcard query. Please correct me if I
am wrong.
*The request took too long to iterate over doc values.*

WordDelimiterGraphFilter is not used, but WordDelimiterFilter is, and the
following is its configuration. As WordDelimiterFilter is deprecated, we
will remove it in the next step.
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="1" catenateNumbers="1"
catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>

The wildcard queries are executed against the text data and yes, there are
a huge number of possible expansions of the wildcard query.
All the 12 shards are on a single machine with 512 GB memory and each shard
is started with SOLR_JAVA_MEM="-Xmx30g". So the 512 GB memory is shared by
all the 12 shards.

Following is the cache configuration.
<filterCache class="solr.FastLRUCache" size="128" initialSize="128"
autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="128" initialSize="128"
autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="128" initialSize="128"
autowarmCount="0"/>

Thanks,
Modassar





On Sat, Mar 26, 2022 at 9:43 PM Shawn Heisey <ap...@elyograg.org> wrote:

> On 3/26/2022 6:24 AM, Mike Drob wrote:
> > Can you provide more details on what the CPU time is spent on? Maybe
> look
> > at some JFR profiles or collect several jstacks to see where they
> > bottlenecks are.
> >
> > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <mo...@gmail.com>
> > wrote:
> >
> >> We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are
> the
> >> details of Solr installation.
> The output from "vmstat 5 -w" run for several minutes while the problem
> is happening will also help pinpoint bottlenecks.   This is best run in
> a terminal that is very wide -- say 132 columns or more.  I'm not saying
> Mike is wrong, just giving you another data point you can look at and
> share.
>
> Wildcard queries tend to be VERY inefficient unless they are executed on
> fields with very low cardinality.  I bet you're running them on the
> highest cardinality fields you have, fields which probably have millions
> or billions of unique tokens.
>
> What I suspect here is that you are running on the very edge of
> "insufficient system memory for disk caching" ... to the point where
> 6.5.1 was just barely able to handle things well, but changes since then
> have shifted things a little bit and now you're over the line into
> performance problems.
>
> Is a full copy of all 12 shards (totaling 4TB) resident on each machine,
> with 512GB memory?  If so, you probably need more memory installed in
> each server.
>
> https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems
>
> (I wrote that wiki page, so if there are errors they are mine)
>
> Thanks,
> Shawn
>
>

Re: High CPU utilisation on Solr-8.11.0

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/26/2022 6:24 AM, Mike Drob wrote:
> Can you provide more details on what the CPU time is spent on? Maybe look
> at some JFR profiles or collect several jstacks to see where the
> bottlenecks are.
>
> On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <mo...@gmail.com>
> wrote:
>
>> We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are the
>> details of Solr installation.
The output from "vmstat 5 -w" run for several minutes while the problem 
is happening will also help pinpoint bottlenecks.   This is best run in 
a terminal that is very wide -- say 132 columns or more.  I'm not saying 
Mike is wrong, just giving you another data point you can look at and share.

Wildcard queries tend to be VERY inefficient unless they are executed on 
fields with very low cardinality.  I bet you're running them on the 
highest cardinality fields you have, fields which probably have millions 
or billions of unique tokens.
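To illustrate the cost: before it can score anything, a wildcard query must
first enumerate every indexed term matching the pattern, so its work grows
with field cardinality. A toy sketch in plain Python (not Solr internals;
`fnmatch` stands in for Lucene's term enumeration):

```python
import fnmatch

# Toy term dictionary; a high-cardinality text field may hold millions of terms.
terms = ["search", "searcher", "searching", "seed", "shard", "solr"]

def expand_wildcard(pattern, term_dict):
    """Enumerate every indexed term matching the pattern; each match
    effectively becomes another clause of the rewritten query."""
    return [t for t in term_dict if fnmatch.fnmatch(t, pattern)]

print(expand_wildcard("sea*", terms))   # ['search', 'searcher', 'searching']
```

With millions of unique tokens, this enumeration alone can dominate CPU time
before any scoring happens.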

What I suspect here is that you are running on the very edge of 
"insufficient system memory for disk caching" ... to the point where 
6.5.1 was just barely able to handle things well, but changes since then 
have shifted things a little bit and now you're over the line into 
performance problems.

Is a full copy of all 12 shards (totaling 4TB) resident on each machine, 
with 512GB memory?  If so, you probably need more memory installed in 
each server.

https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems

(I wrote that wiki page, so if there are errors they are mine)

Thanks,
Shawn


Re: High CPU utilisation on Solr-8.11.0

Posted by Satya Nand <sa...@indiamart.com.INVALID>.
Thanks, Michael.

I missed that mail thread among all responses. I will check that too.



On Thu, May 5, 2022 at 6:26 PM Michael Gibney <mi...@michaelgibney.net>
wrote:

> Did you yet dig through the mailing list thread link that I posted earlier
> in this thread? It explains in more depth, suggests a number of possible
> mitigations, and has a bunch of links to jira issues that provide extra
> context. Off the cuff, I'd say that setting `enableGraphQueries=false` may
> be most immediately helpful in terms of restoring performance.
>
> (As an aside: from my perspective though, even if you can restore
> performance, it would be at the expense of nuances of functionality. Longer
> term I'd really like to help solve this properly, involving some
> combination of the issues linked to in the above thread ...)
>
> Michael
>
> On Thu, May 5, 2022 at 3:01 AM Satya Nand <satya.nand@indiamart.com
> .invalid>
> wrote:
>
> > Hi Michael,
> >
> >
> > 1. set `pf=` (phrase field empty string), disabling implicit phrase query
> > > building. This would help give a sense of whether phrase queries are
> > > involved in the performance issues you're seeing.
> >
> >
> > We are also in the process of moving from standalone Solr 6.6 to a Solr
> > 8.7 cloud, and we also noticed a huge response time increase (from 91 ms
> > to 170+ ms).
> >
> > We tried the tweak of disabling the pf field and response time was back
> > to normal, so pf was somehow responsible for the increased response time.
> >
> > We are using both query-time multi-term synonyms and
> > WordDelimiter[Graph]Filter.
> >
> > What should we do next from here as we can't disable the pf field?
> >
> > Cluster Configuration:
> >
> > 3 Solr Nodes: 5 CPU, 42 GB Ram (Each)
> > 3 Zookeeper Nodes:  1 CPU, 2 GB Ram (Each)
> > 3 Shards: 42m Documents, 42 GB (Each)
> > Heap: 8 GB
> >
> >
> > There are no deleted documents in the cluster and no updates going on. We
> > are trying to match the performance first.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, Mar 26, 2022 at 9:42 PM Michael Gibney <
> michael@michaelgibney.net>
> > wrote:
> >
> > > Are you using query-time multi-term synonyms or
> > WordDelimiter[Graph]Filter?
> > > -- these can trigger "graph phrase" queries, which are handled _quite_
> > > differently in Solr 8.11 vs 6.5 (and although unlikely to directly
> cause
> > > the performance issues you're observing, might well explain the
> > performance
> > > discrepancy). If you're _not_ using either of those, then the rest of
> > this
> > > message is likely irrelevant.
> > >
> > > One thing to possibly keep an eye out for (in addition to gathering
> more
> > > evidence, as Mike Drob suggests): 6.5 started using span queries for
> > "graph
> > > phrase" queries (LUCENE-7699), but the resulting phrase queries were
> > > completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely
> > > ignoring complex phrase queries did however greatly reduce latency and
> > CPU
> > > load on 6.5!
> > >
> > > 7.6 started paying attention to these queries again (SOLR-12243), but
> > also
> > > went back to "fully-enumerated" combinatoric approach to phrase queries
> > > when `ps` (phrase slop) is greater than 0 (LUCENE-8531).
> > >
> > > Some parameters you could tweak, assuming you're using edismax:
> > > 1. set `pf=` (phrase field empty string), disabling implicit phrase
> query
> > > building. This would help give a sense of whether phrase queries are
> > > involved in the performance issues you're seeing.
> > > 2. set `ps=0` (phrase slop 0), this should allow span queries to be
> > built,
> > > which should generally be more efficient than analogous non-span-query
> > > approach (basically this would make the change introduced by
> LUCENE-8531
> > > irrelevant); tangentially: the special case building span queries for
> > > `ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 --
> not
> > > directly relevant to this issue though).
> > >
> > > Michael
> > >
> > > On Sat, Mar 26, 2022 at 8:26 AM Mike Drob <md...@mdrob.com> wrote:
> > >
> > > > Can you provide more details on what the CPU time is spent on? Maybe
> > > > look at some JFR profiles or collect several jstacks to see where the
> > > > bottlenecks are.
> > > >
> > > > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <
> modather1981@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following
> > are
> > > > the
> > > > > details of Solr installation.
> > > > >
> > > > > Server : EC2 instance with 32 CPUs and 521 GB
> > > > > Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
> > > > > Ubuntu :Ubuntu 20.04.3 LTS
> > > > > Java : openjdk 11.0.14
> > > > > SolrCloud : 12 shards having a total 4+ TB index. Each node has a
> > 30GB
> > > > max
> > > > > memory limit.
> > > > > GC setting on Solr : G1GC
> > > > > Solr query timeout : 5 minutes
> > > > >
> > > > > During testing we observed high CPU utilisation, and a few queries
> > > > > with wildcards are timing out. These queries execute completely on
> > > > > Solr-6.5.1.
> > > > > After tuning a few GC parameters the CPU utilisation came down, but
> > > > > it is still high compared with Solr-6.5.1, and some wildcard
> > > > > queries are still failing.
> > > > >
> > > > > Kindly provide your suggestions.
> > > > >
> > > > > Thanks,
> > > > > Modassar
> > > > >
> > > >
> > >
> >
>

Re: High CPU utilisation on Solr-8.11.0

Posted by Michael Gibney <mi...@michaelgibney.net>.
Did you yet dig through the mailing list thread link that I posted earlier
in this thread? It explains in more depth, suggests a number of possible
mitigations, and has a bunch of links to jira issues that provide extra
context. Off the cuff, I'd say that setting `enableGraphQueries=false` may
be most immediately helpful in terms of restoring performance.
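For reference, `enableGraphQueries` is set per field type in the schema; a
sketch with a hypothetical field type name (existing analyzer chain omitted):

```xml
<!-- managed-schema fragment (sketch) -->
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" enableGraphQueries="false">
  <!-- analyzer definitions unchanged -->
</fieldType>
```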

(As an aside: from my perspective though, even if you can restore
performance, it would be at the expense of nuances of functionality. Longer
term I'd really like to help solve this properly, involving some
combination of the issues linked to in the above thread ...)

Michael

On Thu, May 5, 2022 at 3:01 AM Satya Nand <sa...@indiamart.com.invalid>
wrote:

> Hi Michael,
>
>
> 1. set `pf=` (phrase field empty string), disabling implicit phrase query
> > building. This would help give a sense of whether phrase queries are
> > involved in the performance issues you're seeing.
>
>
> We are also in the process of moving from standalone Solr 6.6 to a Solr 8.7
> cloud, and we also noticed a huge response time increase (from 91 ms to
> 170+ ms).
>
> We tried the tweak of disabling the pf field and response time was back to
> normal, so pf was somehow responsible for the increased response time.
>
> We are using both query-time multi-term synonyms and
> WordDelimiter[Graph]Filter.
>
> What should we do next from here as we can't disable the pf field?
>
> Cluster Configuration:
>
> 3 Solr Nodes: 5 CPU, 42 GB Ram (Each)
> 3 Zookeeper Nodes:  1 CPU, 2 GB Ram (Each)
> 3 Shards: 42m Documents, 42 GB (Each)
> Heap: 8 GB
>
>
> There are no deleted documents in the cluster and no updates going on. We
> are trying to match the performance first.
>
>
>
>
>
>
>
> On Sat, Mar 26, 2022 at 9:42 PM Michael Gibney <mi...@michaelgibney.net>
> wrote:
>
> > Are you using query-time multi-term synonyms or
> WordDelimiter[Graph]Filter?
> > -- these can trigger "graph phrase" queries, which are handled _quite_
> > differently in Solr 8.11 vs 6.5 (and although unlikely to directly cause
> > the performance issues you're observing, might well explain the
> performance
> > discrepancy). If you're _not_ using either of those, then the rest of
> this
> > message is likely irrelevant.
> >
> > One thing to possibly keep an eye out for (in addition to gathering more
> > evidence, as Mike Drob suggests): 6.5 started using span queries for
> "graph
> > phrase" queries (LUCENE-7699), but the resulting phrase queries were
> > completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely
> > ignoring complex phrase queries did however greatly reduce latency and
> CPU
> > load on 6.5!
> >
> > 7.6 started paying attention to these queries again (SOLR-12243), but
> also
> > went back to "fully-enumerated" combinatoric approach to phrase queries
> > when `ps` (phrase slop) is greater than 0 (LUCENE-8531).
> >
> > Some parameters you could tweak, assuming you're using edismax:
> > 1. set `pf=` (phrase field empty string), disabling implicit phrase query
> > building. This would help give a sense of whether phrase queries are
> > involved in the performance issues you're seeing.
> > 2. set `ps=0` (phrase slop 0), this should allow span queries to be
> built,
> > which should generally be more efficient than analogous non-span-query
> > approach (basically this would make the change introduced by LUCENE-8531
> > irrelevant); tangentially: the special case building span queries for
> > `ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 -- not
> > directly relevant to this issue though).
> >
> > Michael
> >
> > On Sat, Mar 26, 2022 at 8:26 AM Mike Drob <md...@mdrob.com> wrote:
> >
> > > Can you provide more details on what the CPU time is spent on? Maybe
> > > look at some JFR profiles or collect several jstacks to see where the
> > > bottlenecks are.
> > >
> > > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <modather1981@gmail.com
> >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following
> are
> > > the
> > > > details of Solr installation.
> > > >
> > > > Server : EC2 instance with 32 CPUs and 521 GB
> > > > Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
> > > > Ubuntu :Ubuntu 20.04.3 LTS
> > > > Java : openjdk 11.0.14
> > > > SolrCloud : 12 shards having a total 4+ TB index. Each node has a
> 30GB
> > > max
> > > > memory limit.
> > > > GC setting on Solr : G1GC
> > > > Solr query timeout : 5 minutes
> > > >
> > > > During testing we observed high CPU utilisation, and a few queries
> > > > with wildcards are timing out. These queries execute completely on
> > > > Solr-6.5.1.
> > > > After tuning a few GC parameters the CPU utilisation came down, but it
> > > > is still high compared with Solr-6.5.1, and some wildcard queries are
> > > > still failing.
> > > >
> > > > Kindly provide your suggestions.
> > > >
> > > > Thanks,
> > > > Modassar
> > > >
> > >
> >
>

Re: High CPU utilisation on Solr-8.11.0

Posted by Satya Nand <sa...@indiamart.com.INVALID>.
Hi Michael,


1. set `pf=` (phrase field empty string), disabling implicit phrase query
> building. This would help give a sense of whether phrase queries are
> involved in the performance issues you're seeing.


We are also in the process of moving from standalone Solr 6.6 to a Solr 8.7
cloud, and we also noticed a huge response time increase (from 91 ms to
170+ ms).

We tried the tweak of disabling the pf field and response time was back to
normal, so pf was somehow responsible for the increased response time.

We are using both query-time multi-term synonyms and
WordDelimiter[Graph]Filter.

What should we do next from here as we can't disable the pf field?

Cluster Configuration:

3 Solr Nodes: 5 CPU, 42 GB Ram (Each)
3 Zookeeper Nodes:  1 CPU, 2 GB Ram (Each)
3 Shards: 42m Documents, 42 GB (Each)
Heap: 8 GB


There are no deleted documents in the cluster and no updates going on. We
are trying to match the performance first.







On Sat, Mar 26, 2022 at 9:42 PM Michael Gibney <mi...@michaelgibney.net>
wrote:

> Are you using query-time multi-term synonyms or WordDelimiter[Graph]Filter?
> -- these can trigger "graph phrase" queries, which are handled _quite_
> differently in Solr 8.11 vs 6.5 (and although unlikely to directly cause
> the performance issues you're observing, might well explain the performance
> discrepancy). If you're _not_ using either of those, then the rest of this
> message is likely irrelevant.
>
> One thing to possibly keep an eye out for (in addition to gathering more
> evidence, as Mike Drob suggests): 6.5 started using span queries for "graph
> phrase" queries (LUCENE-7699), but the resulting phrase queries were
> completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely
> ignoring complex phrase queries did however greatly reduce latency and CPU
> load on 6.5!
>
> 7.6 started paying attention to these queries again (SOLR-12243), but also
> went back to "fully-enumerated" combinatoric approach to phrase queries
> when `ps` (phrase slop) is greater than 0 (LUCENE-8531).
>
> Some parameters you could tweak, assuming you're using edismax:
> 1. set `pf=` (phrase field empty string), disabling implicit phrase query
> building. This would help give a sense of whether phrase queries are
> involved in the performance issues you're seeing.
> 2. set `ps=0` (phrase slop 0), this should allow span queries to be built,
> which should generally be more efficient than analogous non-span-query
> approach (basically this would make the change introduced by LUCENE-8531
> irrelevant); tangentially: the special case building span queries for
> `ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 -- not
> directly relevant to this issue though).
>
> Michael
>
> On Sat, Mar 26, 2022 at 8:26 AM Mike Drob <md...@mdrob.com> wrote:
>
> > Can you provide more details on what the CPU time is spent on? Maybe
> > look at some JFR profiles or collect several jstacks to see where the
> > bottlenecks are.
> >
> > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <mo...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are
> > the
> > > details of Solr installation.
> > >
> > > Server : EC2 instance with 32 CPUs and 521 GB
> > > Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
> > > Ubuntu :Ubuntu 20.04.3 LTS
> > > Java : openjdk 11.0.14
> > > SolrCloud : 12 shards having a total 4+ TB index. Each node has a 30GB
> > max
> > > memory limit.
> > > GC setting on Solr : G1GC
> > > Solr query timeout : 5 minutes
> > >
> > > During testing we observed high CPU utilisation, and a few queries
> > > with wildcards are timing out. These queries execute completely on
> > > Solr-6.5.1.
> > > After tuning a few GC parameters the CPU utilisation came down, but it
> > > is still high compared with Solr-6.5.1, and some wildcard queries are
> > > still failing.
> > >
> > > Kindly provide your suggestions.
> > >
> > > Thanks,
> > > Modassar
> > >
> >
>

Re: High CPU utilisation on Solr-8.11.0

Posted by Michael Gibney <mi...@michaelgibney.net>.
Are you using query-time multi-term synonyms or WordDelimiter[Graph]Filter?
-- these can trigger "graph phrase" queries, which are handled _quite_
differently in Solr 8.11 vs 6.5 (and although unlikely to directly cause
the performance issues you're observing, might well explain the performance
discrepancy). If you're _not_ using either of those, then the rest of this
message is likely irrelevant.

One thing to possibly keep an eye out for (in addition to gathering more
evidence, as Mike Drob suggests): 6.5 started using span queries for "graph
phrase" queries (LUCENE-7699), but the resulting phrase queries were
completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely
ignoring complex phrase queries did however greatly reduce latency and CPU
load on 6.5!

7.6 started paying attention to these queries again (SOLR-12243), but also
went back to "fully-enumerated" combinatoric approach to phrase queries
when `ps` (phrase slop) is greater than 0 (LUCENE-8531).

Some parameters you could tweak, assuming you're using edismax:
1. set `pf=` (phrase field empty string), disabling implicit phrase query
building. This would help give a sense of whether phrase queries are
involved in the performance issues you're seeing.
2. set `ps=0` (phrase slop 0), this should allow span queries to be built,
which should generally be more efficient than analogous non-span-query
approach (basically this would make the change introduced by LUCENE-8531
irrelevant); tangentially: the special case building span queries for
`ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 -- not
directly relevant to this issue though).
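For concreteness, the two tweaks above might be exercised with request
parameters like the following (the query and field names are hypothetical;
only the pf/ps settings matter here):

```python
from urllib.parse import urlencode

# Hypothetical edismax request; pf and ps reflect the two diagnostic steps.
params = {
    "q": "wireless headphones",
    "defType": "edismax",
    "qf": "title description",
    "pf": "",   # 1. empty pf: disable implicit phrase-query building
    "ps": "0",  # 2. phrase slop 0: allow span queries to be built
}
print(urlencode(params))
# q=wireless+headphones&defType=edismax&qf=title+description&pf=&ps=0
```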

Michael

On Sat, Mar 26, 2022 at 8:26 AM Mike Drob <md...@mdrob.com> wrote:

> Can you provide more details on what the CPU time is spent on? Maybe look
> at some JFR profiles or collect several jstacks to see where the
> bottlenecks are.
>
> On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <mo...@gmail.com>
> wrote:
>
> > Hi,
> >
> > We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are
> the
> > details of Solr installation.
> >
> > Server : EC2 instance with 32 CPUs and 521 GB
> > Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
> > Ubuntu :Ubuntu 20.04.3 LTS
> > Java : openjdk 11.0.14
> > SolrCloud : 12 shards having a total 4+ TB index. Each node has a 30GB
> max
> > memory limit.
> > GC setting on Solr : G1GC
> > Solr query timeout : 5 minutes
> >
> > During testing we observed high CPU utilisation, and a few queries with
> > wildcards are timing out. These queries execute completely on Solr-6.5.1.
> > After tuning a few GC parameters the CPU utilisation came down, but it is
> > still high compared with Solr-6.5.1, and some wildcard queries are still
> > failing.
> >
> > Kindly provide your suggestions.
> >
> > Thanks,
> > Modassar
> >
>

Re: High CPU utilisation on Solr-8.11.0

Posted by Mike Drob <md...@mdrob.com>.
Can you provide more details on what the CPU time is spent on? Maybe look
at some JFR profiles or collect several jstacks to see where the
bottlenecks are.

On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather <mo...@gmail.com>
wrote:

> Hi,
>
> We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are the
> details of Solr installation.
>
> Server : EC2 instance with 32 CPUs and 521 GB
> Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS
> Ubuntu :Ubuntu 20.04.3 LTS
> Java : openjdk 11.0.14
> SolrCloud : 12 shards having a total 4+ TB index. Each node has a 30GB max
> memory limit.
> GC setting on Solr : G1GC
> Solr query timeout : 5 minutes
>
> During testing we observed high CPU utilisation, and a few queries with
> wildcards are timing out. These queries execute completely on Solr-6.5.1.
> After tuning a few GC parameters the CPU utilisation came down, but it is
> still high compared with Solr-6.5.1, and some wildcard queries are still
> failing.
>
> Kindly provide your suggestions.
>
> Thanks,
> Modassar
>