Posted to users@solr.apache.org by Andy Lester <an...@petdance.com> on 2021/10/18 16:03:03 UTC

How can I performance-tune my warming queries?

I’m trying to figure out why my warming is taking so long.  It’s taking about 20-40 seconds on average.   Can I measure where it’s spending its time?


I’ve got my firstSearcher and newSearcher set up like this:

<listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
        <lst>
            ...
            <str name="q">world</str>
            <str name="sort">popular_score desc, grouping asc, copyrightyear desc, flrid asc</str>
            <str name="rows">2500</str>

            <str name="fq">(languagecode:"eng")</str>
            <str name="fq">(titletype:"BK")</str>
            <str name="fq">((grouping:"1" OR grouping:"2" OR grouping:"4"))</str>
            <str name="fq">(languagecode:"eng" OR solrtype:"N")</str>
            <str name="fq">(ib_searchable:"Y")</str>
            <str name="fq">((grouping:"1" OR grouping:"2"))</str>
            …
            <str name="facet.range">arrl</str>
            <str name="f.arrl.facet.range.start">0</str>
            <str name="f.arrl.facet.range.end">17.9</str>
            <str name="f.arrl.facet.range.gap">2</str>
            <str name="f.arrl.facet.range.other">before</str>
            <str name="facet.field">itemtypesubcode</str>
            <str name="f.itemtypesubcode.facet.method">fc</str>
            ...
        </lst>
    </arr>
</listener>

All the FQs are the most common FQs that come out of analyzing our app logs.  There are about 35 of them.  The facet queries are all the facets that our app requests.  There are about 25 of them, spread across facet.field, facet.range and facet.query.

What I’m afraid of is that one of the warming facets or FQs is taking up all the time.  Can I tell where the warmer is spending its time?

Thanks,
Andy

Re: How can I performance-tune my warming queries?

Posted by Shawn Heisey <el...@elyograg.org>.

On 10/18/21 1:43 PM, Andy Lester wrote:
> This sounds like you’re saying there is no value in having warming 
> queries in the newSearcher. Is that correct? 

If my understanding is completely correct, then I have to concur with
that statement. It's better to rely on filterCache autowarming for fq
parameters, and on OS disk caching for facets.  Something to share from
my own experience with filterCache:  I found that I had to go with a
very small number for autowarmCount -- four. And the cache warming would
still take up to 15 seconds even with that small number.  This was on
index shards (manual sharding, no SolrCloud) with core size at about
50GB.
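
For reference, a small autowarmCount like that would be set on the filterCache element in solrconfig.xml roughly as follows. This is a minimal sketch: the cache class and the size values are assumptions (solr.CaffeineCache is available in Solr 8.3 and later; older versions typically use solr.FastLRUCache), and only the autowarmCount of 4 comes from the description above.

```xml
<!-- solrconfig.xml, inside the <query> section -->
<!-- size and initialSize are placeholder values; tune them for your index -->
<filterCache class="solr.CaffeineCache"
             size="512"
             initialSize="512"
             autowarmCount="4"/>
```

Each autowarmed entry is regenerated against the new searcher, so warming time grows with autowarmCount.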

Also if my understanding is correct, facet entries in firstSearcher will 
probably only have value if you reboot or do something else that clears 
the OS disk cache.  But fq entries in firstSearcher would still have 
value when Solr restarts (and probably also on core reload) -- 
populating an empty filterCache.

If your testing reveals that I have erred in my understanding, I would 
definitely appreciate knowing.

Thanks,
Shawn

Re: How can I performance-tune my warming queries?

Posted by Andy Lester <an...@petdance.com>.


> On Oct 18, 2021, at 2:38 PM, Shawn Heisey <el...@elyograg.org> wrote:
> 
>> What should we have in the newSearcher startup query, if the new searcher is going to bring over the cached FQs from an existing searcher?
> 
> I know that filterCache handles autowarming for fq parameters.  I do not know whether queryResultCache stores anything related to facets, but I would guess that it doesn't.
> 
> If your facet fields all have docValues, then I would expect OS disk caching to be the most important thing to have to speed those up, not Solr/Lucene caching.  In the absence of docValues, the data structures required for faceting must be generated in the Java heap, which takes time and memory, potentially a large amount.


This sounds like you’re saying there is no value in having warming queries in the newSearcher.  Is that correct?

Re: How can I performance-tune my warming queries?

Posted by Shawn Heisey <el...@elyograg.org>.

On 10/18/21 12:53 PM, Andy Lester wrote:
> What should we have in the newSearcher startup query, if the new searcher is going to bring over the cached FQs from an existing searcher?

I know that filterCache handles autowarming for fq parameters.  I do not 
know whether queryResultCache stores anything related to facets, but I 
would guess that it doesn't.

If your facet fields all have docValues, then I would expect OS disk 
caching to be the most important thing to have to speed those up, not 
Solr/Lucene caching.  In the absence of docValues, the data structures 
required for faceting must be generated in the Java heap, which takes 
time and memory, potentially a large amount.

It is always possible that I have some details wrong there.

Thanks,
Shawn

Re: How can I performance-tune my warming queries?

Posted by Andy Lester <an...@petdance.com>.

Thanks very much for this.  This is a huge help.


> What I would recommend is that (at a time when query traffic is lowest) you turn off all warming, restart, and then do some manual queries where you check each fq and facet individually.  Rebooting or clearing the OS disk cache before each query test would give you worst-case information.

That’s my plan, but I wanted to check first to see if there was a tool to save me from the drudgery.


> I would personally remove all the fqs from the newSearcher config and let the filterCache autowarming take care of warming the most commonly used fq values.  Leave them in the firstSearcher and configure Solr to use a cold searcher.

What should we have in the newSearcher startup query, if the new searcher is going to bring over the cached FQs from an existing searcher?

Thanks,
Andy

Re: How can I performance-tune my warming queries?

Posted by Shawn Heisey <ap...@elyograg.org>.

On 10/18/21 10:03 AM, Andy Lester wrote:
> I’m trying to figure out why my warming is taking so long.  It’s taking about 20-40 seconds on average.   Can I measure where it’s spending its time?

I do not know of any way to figure out which little parts of a massive
query are the most time-consuming.  You could try doing the query
manually with debug turned on and see whether the reported timings are
useful.  They may be useful in general terms, but not for pinpointing
which specific fq or facet is slow.

What I would recommend is that (at a time when query traffic is lowest) 
you turn off all warming, restart, and then do some manual queries where 
you check each fq and facet individually.  Rebooting or clearing the OS 
disk cache before each query test would give you worst-case information.

Initially you could just try removing all the facets or removing all the 
fqs to see where you need to spend your testing time.

I would personally remove all the fqs from the newSearcher config and 
let the filterCache autowarming take care of warming the most commonly 
used fq values.  Leave them in the firstSearcher and configure Solr to
use a cold searcher.  I am not sure whether autowarming can handle 
facets ... it seems unlikely to me.
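
The arrangement described above might look roughly like this in solrconfig.xml. It is only a sketch: the q, sort and fq values are copied from the original question, only two of the roughly 35 fqs are shown, and which non-fq parameters are worth keeping in newSearcher is an assumption that testing would have to confirm.

```xml
<!-- solrconfig.xml, inside the <query> section -->

<!-- If no searcher is registered yet (e.g. at startup), serve requests from
     the still-warming searcher rather than blocking until warming finishes. -->
<useColdSearcher>true</useColdSearcher>

<!-- firstSearcher: the filterCache starts empty, so keep the common fqs here -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
        <lst>
            <str name="q">world</str>
            <str name="fq">(languagecode:"eng")</str>
            <str name="fq">(titletype:"BK")</str>
        </lst>
    </arr>
</listener>

<!-- newSearcher: no fq entries; filterCache autowarming carries them over -->
<listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
        <lst>
            <str name="q">world</str>
            <str name="sort">popular_score desc, grouping asc, copyrightyear desc, flrid asc</str>
        </lst>
    </arr>
</listener>
```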

You'll want to be sure that any field you're using for facets is 
configured with docValues, and reindex from scratch if you need to add 
that config.  If you are using any TextField based fields for faceting, 
you won't be able to configure those with docValues.  I should say that 
a field using TextField is probably a poor candidate for facets anyway, 
because the cardinality on such fields is usually very high.  High 
cardinality makes for extremely slow facets.
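
A facet field with docValues enabled might be declared like this in the schema. This is a sketch only: the field names come from the original question, but the types (string, pfloat) and the indexed/stored settings are assumptions about a schema that is not shown in this thread.

```xml
<!-- managed-schema or schema.xml -->
<!-- Field types and indexed/stored flags are assumed, not taken from the thread -->
<field name="itemtypesubcode" type="string" indexed="true" stored="true" docValues="true"/>
<field name="arrl"            type="pfloat" indexed="true" stored="true" docValues="true"/>
```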

And it will be exceedingly important that you have enough spare memory 
so the OS can cache the index data effectively, especially for facets.

I would bet on facets being the big trouble here.  If you find that a 
few of the facets are major problems, you could remove those from your 
warming and advise your users that to improve general performance for 
the majority of users, changes were required that make those specific 
queries slow.

Thanks,
Shawn