Posted to solr-user@lucene.apache.org by "Permakoff, Vadim" <Va...@verisk.com> on 2020/10/05 16:12:46 UTC

RE: Slow Solr 8 response for long query

Hi Erick,
Thank you for looking into my question.

Below are the timings for Solr 6 and Solr 8. I see that the search time depends on grouping: without grouping the query is very fast and takes approximately the same time in both Solr 6 and Solr 8, but with grouping Solr 8 is much slower. The difference grows with the number of returned results (groups). For 30 results the difference is not that big, but for 300 results the Solr 6 speed is almost the same, while Solr 8 is about 10 times slower. The data is the same, and the indexing was done from scratch.
The documents are nested. We are searching children and grouping on a field that may group children from different parents, but in this particular case the groups come from only one parent.
This is an example query:
qt=/select&wt=json&indent=true&start=0&rows=30&df=_text_sp_&q=VERY_LONG_BOOLEAN_QUERY_USING_SEVERAL_INDEXED_STRING_FIELDS_FROM_CHILDREN&q.op=OR&logParamsList=q&preferLocalShards=true&fq=_nested_id:child&group=true&group.ngroups=true&group.field=uniqueId&group.main=true&fl=id,score&debug=timing
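
One way to isolate the grouping cost is to send the same request with and without the group.* parameters and compare the timings. A minimal sketch in Python, assuming the parameter values from the query above; the helper name build_query_string is hypothetical, the q value is a shortened placeholder, and qt, indent, logParamsList and preferLocalShards are left out for brevity:

```python
from urllib.parse import urlencode

# Parameters shared by both runs, copied from the example query.
# The q value is a placeholder for the real long boolean query.
BASE_PARAMS = {
    "wt": "json",
    "start": 0,
    "rows": 30,
    "df": "_text_sp_",
    "q": "VERY_LONG_BOOLEAN_QUERY",
    "q.op": "OR",
    "fq": "_nested_id:child",
    "fl": "id,score",
    "debug": "timing",
}

# Grouping parameters, also copied from the example query.
GROUP_PARAMS = {
    "group": "true",
    "group.ngroups": "true",
    "group.field": "uniqueId",
    "group.main": "true",
}


def build_query_string(grouped: bool, rows: int = 30) -> str:
    """Build the URL query string, with or without the group.* parameters."""
    params = dict(BASE_PARAMS, rows=rows)
    if grouped:
        params.update(GROUP_PARAMS)
    return urlencode(params)


if __name__ == "__main__":
    for grouped in (False, True):
        for rows in (30, 300):
            print(grouped, rows, build_query_string(grouped, rows))
```

Running the four combinations (grouped or not, 30 or 300 rows) against each Solr version gives a small grid of timings to compare.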

Solr-8:
  "debug":{
    "timing":{
      "time":22258.0,
      "prepare":{
        "time":20.0,
        "query":{
          "time":20.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "terms":{
          "time":0.0},
        "debug":{
          "time":0.0}},
      "process":{
        "time":22210.0,
        "query":{
          "time":22210.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "terms":{
          "time":0.0},
        "debug":{
          "time":0.0}}}}}

Solr-6:
  "debug":{
    "timing":{
      "time":16157.0,
      "prepare":{
        "time":14.0,
        "query":{
          "time":14.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "terms":{
          "time":0.0},
        "debug":{
          "time":0.0}},
      "process":{
        "time":16133.0,
        "query":{
          "time":16133.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "terms":{
          "time":0.0},
        "debug":{
          "time":0.0}}}}}
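
The two timing sections can be compared mechanically rather than by eye. A minimal sketch in Python, assuming the structure of the debug output shown above; the helper names component_times and slowest_component are hypothetical, and the embedded JSON is a trimmed copy of the sections above with most zero-time components omitted:

```python
import json


def component_times(timing: dict, phase: str) -> dict:
    """Return {component: milliseconds} for one phase ("prepare" or "process")."""
    return {
        name: value["time"]
        for name, value in timing[phase].items()
        if isinstance(value, dict)  # skips the phase's own "time" total
    }


def slowest_component(timing: dict, phase: str = "process") -> tuple:
    """Return (component name, milliseconds) for the slowest component."""
    times = component_times(timing, phase)
    name = max(times, key=times.get)
    return name, times[name]


# Trimmed copies of the "timing" sections above.
solr8 = json.loads(
    '{"time": 22258.0,'
    ' "prepare": {"time": 20.0, "query": {"time": 20.0}},'
    ' "process": {"time": 22210.0, "query": {"time": 22210.0},'
    ' "facet": {"time": 0.0}}}'
)
solr6 = json.loads(
    '{"time": 16157.0,'
    ' "prepare": {"time": 14.0, "query": {"time": 14.0}},'
    ' "process": {"time": 16133.0, "query": {"time": 16133.0},'
    ' "facet": {"time": 0.0}}}'
)

if __name__ == "__main__":
    for label, timing in (("Solr-8", solr8), ("Solr-6", solr6)):
        name, ms = slowest_component(timing)
        print(f"{label}: slowest process component is {name} at {ms} ms")
    ratio = solr8["process"]["time"] / solr6["process"]["time"]
    print(f"process time ratio (Solr-8 / Solr-6): {ratio:.2f}")
```

In both outputs above, the whole process time is spent in the query component.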

Best Regards,
Vadim Permakoff


-----Original Message-----
From: Erick Erickson <er...@gmail.com> 
Sent: Wednesday, September 30, 2020 8:04 AM
To: solr-user@lucene.apache.org
Subject: Re: Slow Solr 8 response for long query

Increasing the number of rows should not have this kind of impact in either version of Solr, so I think there’s something fundamentally strange in your setup.

Whether returning 10 or 300 documents, every document has to be scored. There are two differences between 10 and 300 rows:

1> when returning 10 rows, Solr keeps a sorted list of 10 docs, just IDs and scores (assuming you’re sorting by relevance); when returning 300, the list is 300 long. I find it hard to believe that keeping a list 300 items long is making that much of a difference.

2> Solr needs to fetch/decompress/assemble 300 documents vs. 10 documents for the response. Regardless of the fields returned, the entire document will be decompressed if you return any fields that are not docValues=true. So it’s possible that what you’re seeing is related.
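
The first point can be illustrated with a bounded min-heap: every document is scored once no matter how many rows are requested, and only the size of the kept list changes. This is a minimal sketch in Python of the general top-N technique, not Solr's actual collector code; the function name top_n and the sample data are hypothetical:

```python
import heapq


def top_n(scored_docs, n):
    """Return the n highest-scoring (score, doc_id) pairs, best first.

    scored_docs is an iterable of (doc_id, score) pairs. Each pair is
    looked at exactly once; at most n entries are kept at any time.
    """
    heap = []  # min-heap: heap[0] is the weakest hit currently kept
    for doc_id, score in scored_docs:
        if len(heap) < n:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # The new hit beats the weakest kept hit, so it takes its place.
            heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    docs = [("a", 1.0), ("b", 3.0), ("c", 2.0), ("d", 0.5)]
    print(top_n(docs, 2))
```

Going from n=10 to n=300 only changes the heap size, so the extra cost per document is a slightly deeper heap operation.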

Try adding &debug to the query, as Alexandre suggests. Pay particular attention to the “timings” section too; that’ll show you the time each component took _exclusive_ of step 2> above and should give a clue.


All that said, fq clauses don’t score, so scoring is certainly involved in why the query takes so long to return even 10 rows but gets faster when you move the clause to a filter query. My intuition, though, is that there’s something else going on as well to account for the difference when you return 300 rows.
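
To make the q versus fq comparison concrete, here is a minimal sketch in Python of rewriting a parameter set so the main clause runs as a filter. The helper name move_query_to_filter is hypothetical, and using the match-all query *:* as the replacement q is an assumption about how the test was run:

```python
def move_query_to_filter(params: dict) -> dict:
    """Return a copy of params with the main query moved into fq.

    The original q becomes an additional fq clause (filters are not
    scored), and q is replaced by the match-all query.
    """
    moved = dict(params)
    existing = moved.get("fq", [])
    if isinstance(existing, str):
        existing = [existing]
    moved["fq"] = list(existing) + [moved["q"]]
    moved["q"] = "*:*"
    return moved


if __name__ == "__main__":
    scored = {"q": "a OR b", "fq": "_nested_id:child", "rows": 10}
    print(move_query_to_filter(scored))
```

With every hit receiving the same score under a match-all q, the result order differs from the scored query, so this is useful for timing comparisons rather than as a drop-in replacement.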

Best,
Erick

> On Sep 29, 2020, at 8:52 PM, Alexandre Rafalovitch <ar...@gmail.com> wrote:
>
> What do the debug versions of the query show between the two versions?
>
> One thing that changed, among many, is the sow (split on whitespace)
> parameter. It is unlikely to be the cause, but I am mentioning it just
> in case.
> https://lucene.apache.org/solr/guide/8_6/the-standard-query-parser.html#standard-query-parser-parameters
>
> Regards,
>   Alex
>
> On Tue, 29 Sep 2020 at 20:47, Permakoff, Vadim 
> <Va...@verisk.com> wrote:
>>
>> Hi Solr Experts!
>> We are moving from Solr 6.5.1 to Solr 8.5.0 and are having a problem with a long query, which has a search text plus many OR and AND conditions (all in one place, the query is about 20KB long).
>> For the same set of data (about 500K docs) and the same schema, the query in Solr 6 returns results in less than 2 sec, while Solr 8 takes more than 10 sec to get 10 results. If I increase the number of rows to 300, in Solr 6 it takes about 10 sec, and in Solr 8 it takes more than 1 min. The results are small, just IDs. It looks like the relevancy scoring plays a role, because if I move this query to a filter query, both Solr versions work pretty fast.
>> The right way should be to change the query, but unfortunately it is difficult to modify the application which creates these queries, so I want to find some temporary workaround.
>>
>> What changed from Solr 6 to Solr 8 in terms of scoring with many conditions that affects the search speed negatively?
>> Is there anything to configure in Solr 8 to get the same performance for such a query as in Solr 6?
>>
>> Thank you,
>> Vadim