Posted to users@solr.apache.org by kumar gaurav <kg...@gmail.com> on 2023/04/24 07:42:38 UTC

Help regarding solr request timeout because of spellcheck component performance.

Hi everyone,

I am getting a Solr socket timeout exception on select search queries
because of poor spellcheck performance.

I am using the spellcheck component in the Solr /select request handler.
solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler" lazy="true">

  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="q.op">AND</str>
    <str name="mm">100</str>
    <str name="sow">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.maxCollations">10</str>
    <str name="spellcheck.maxCollationTries">150</str>
    <str name="spellcheck.collateParam.mm">100%</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>


Is there a time-limit parameter for spellcheck, like the query timeAllowed
parameter?

How can I tell whether a query timed out because of the spellcheck component?

Please help. Thanks in advance.



-- 
Thanks & Regards
Kumar Gaurav

Re: Help regarding solr request timeout because of spellcheck component performance.

Posted by Chris Hostetter <ho...@fucit.org>.
: I have tried timeAllowed; it does not limit spellcheck.

Hmmm ... that doesn't sound right -- unless you are using a really old
version of Solr, any index-based spellchecker (like Direct and WordBreak)
should be respecting timeAllowed due to the underlying Lucene IndexReader
enforcing it.

- What version of Solr are you using?

- Do you have enough query volume (and do these timeouts happen often
enough) that you can take a lot of thread dumps and identify any "hot
spots" in the spellchecking code?

- If the problem is sporadic, do you see any "patterns" in the requests
that cause the problem? (I'm specifically wondering about long query
strings that might be triggering the WordBreak bug I linked to before.)

- Have you tried using only one dictionary or the other to narrow
down the problem?

- I know it comes from a documented example, but maxChanges=10 with
WordBreak is excessive for most "real world" word combinations I've seen
in practice, and it exacerbates the problem in the WordBreak bug I linked
to before. Does lowering that to something like 2 or 3 reduce this problem?
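
A minimal sketch of the last suggestion, assuming the wordbreak definition from the posted configuration is otherwise left unchanged. The value 3 is only one point in the suggested 2-3 range, not a tested recommendation:

```xml
<!-- Same wordbreak spellchecker as in the posted config; only maxChanges is lowered -->
<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">name</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <!-- was 10; assumed value, try 2 or 3 and compare timings -->
  <int name="maxChanges">3</int>
</lst>
```

To try one dictionary at a time without editing the handler, passing spellcheck.dictionary=default (or spellcheck.dictionary=wordbreak) on the request should take precedence over the two values in the handler defaults.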


: 
: Following is the spellcheck configuration. Can you suggest something?
: 
: <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
:   <str name="queryAnalyzerFieldType">text_general</str>
: 
:   <!-- a spellchecker built from a field of the main index -->
:   <lst name="spellchecker">
:     <str name="name">default</str>
:     <str name="field">text</str>
:     <str name="classname">solr.DirectSolrSpellChecker</str>
:     <str name="distanceMeasure">internal</str>
:     <float name="accuracy">0.5</float>
:     <int name="maxEdits">2</int>
:     <int name="minPrefix">1</int>
:     <int name="maxInspections">5</int>
:     <int name="minQueryLength">4</int>
:     <float name="maxQueryFrequency">0.01</float>
:     <!-- uncomment this to require suggestions to occur in 1% of the documents
:       <float name="thresholdTokenFrequency">.01</float>
:     -->
:   </lst>
: 
:   <!-- a spellchecker that can break or combine words.  See "/spell"
: handler below for usage -->
:   <lst name="spellchecker">
:     <str name="name">wordbreak</str>
:     <str name="classname">solr.WordBreakSolrSpellChecker</str>
:     <str name="field">name</str>
:     <str name="combineWords">true</str>
:     <str name="breakWords">true</str>
:     <int name="maxChanges">10</int>
:   </lst>
: </searchComponent>
: 
: 
: On Thu, 4 May 2023 at 06:34, Chris Hostetter <ho...@fucit.org>
: wrote:
: 
: >
: > 1) timeAllowed does limit spellcheck (at least in all the code paths i can
: > think of that may be "slow") ... have you tried it?
: >
: > 2) what is your configuration for the dictionaries you are using?
: >
: > 3) be wary of https://github.com/apache/lucene/issues/12077
: >
: >
: > : Date: Tue, 2 May 2023 00:04:27 +0530
: > : From: kumar gaurav <kg...@gmail.com>
: > : Reply-To: users@solr.apache.org
: > : To: solr-user@lucene.apache.org, users@solr.apache.org
: > : Subject: Re: Help regarding solr request timeout because of spellcheck
: > :     component performance.
: > :
: > : Just a reminder if someone can help here.

-Hoss
http://www.lucidworks.com/

Re: Help regarding solr request timeout because of spellcheck component performance.

Posted by kumar gaurav <kg...@gmail.com>.
Hi Chris,

Thanks a lot for your reply.

I have tried timeAllowed; it does not limit spellcheck.

Following is the spellcheck configuration. Can you suggest something?

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>

  <!-- a spellchecker built from a field of the main index -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">text</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <!-- uncomment this to require suggestions to occur in 1% of the documents
      <float name="thresholdTokenFrequency">.01</float>
    -->
  </lst>

  <!-- a spellchecker that can break or combine words.  See "/spell"
handler below for usage -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">name</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>


Re: Help regarding solr request timeout because of spellcheck component performance.

Posted by Chris Hostetter <ho...@fucit.org>.
1) timeAllowed does limit spellcheck (at least in all the code paths i can 
think of that may be "slow") ... have you tried it?

2) what is your configuration for the dictionaries you are using?

3) be wary of https://github.com/apache/lucene/issues/12077
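
For point 1, a minimal sketch of setting timeAllowed on the posted /select handler. The 2000 ms value is an assumption for illustration, and the other defaults from the posted handler are meant to stay as they are:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Assumed budget in milliseconds; tune to your own latency target -->
    <int name="timeAllowed">2000</int>
    <!-- existing defaults (defType, mm, spellcheck.* and so on) stay unchanged -->
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The same parameter can also be sent per request (timeAllowed=2000). When the limit is hit, the response header should carry partialResults=true. Adding debug=timing to a slow request reports the time spent in each search component, which can help show whether spellcheck is where the time goes.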


-Hoss
http://www.lucidworks.com/

Re: Help regarding solr request timeout because of spellcheck component performance.

Posted by kumar gaurav <kg...@gmail.com>.
Just a reminder if someone can help here.

On Mon, 24 Apr 2023 at 13:40, kumar gaurav <kg...@gmail.com> wrote:

> ++ users@solr.apache.org
>

Re: Help regarding solr request timeout because of spellcheck component performance.

Posted by kumar gaurav <kg...@gmail.com>.
++ users@solr.apache.org
