Posted to users@solr.apache.org by kumar gaurav <kg...@gmail.com> on 2023/04/24 07:42:38 UTC
Help regarding solr request timeout because of spellcheck component performance.
Hi everyone,
I am getting a Solr socket timeout exception on select search queries
because of poor spellcheck performance.
I am using the spellcheck component in the Solr /select request handler.
solrconfig.xml:
<requestHandler name="/select" class="solr.SearchHandler" lazy="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="q.op">AND</str>
    <str name="mm">100</str>
    <str name="sow">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.maxCollations">10</str>
    <str name="spellcheck.maxCollationTries">150</str>
    <str name="spellcheck.collateParam.mm">100%</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
Is there a time-limit parameter for spellcheck, similar to the query
timeAllowed parameter?
How can I identify whether a query timed out because of the spellcheck
component?
Please help. Thanks in advance.
--
Thanks & Regards
Kumar Gaurav
Re: Help regarding solr request timeout because of spellcheck component performance.
Posted by Chris Hostetter <ho...@fucit.org>.
: timeAllowed does not limit spellcheck i have tried.
Hmmm .... that doesn't sound right -- unless you are using a really old
version of Solr, any index-based spellchecker (like Direct and WordBreak)
should be respecting timeAllowed due to the underlying Lucene IndexReader
enforcing it.
- What version of Solr are you using?
- Do you have enough query volume (and do these timeouts happen often
enough) that you can take a lot of thread dumps and identify any "hot
spots" in the spellchecking code?
- If the problem is sporadic, do you see any "patterns" in the requests
that cause the problem? (I'm specifically wondering about long query
strings that might be triggering the WordBreak bug I linked to before.)
- Have you tried using only one dictionary or the other to narrow
down the problem?
- I know it comes from a documented example, but maxChanges=10 with
WordBreak is excessive for most "real world" word combinations I've seen
in practice, and exacerbates the problem in the WordBreak bug I linked to
before. Does lowering that to something like 2 or 3 reduce this problem?
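A minimal sketch of the last two suggestions, assuming the wordbreak spellchecker definition posted in this thread. The value 3 for maxChanges is only an illustrative starting point to experiment with, and the handler name /select-wordbreak-test is hypothetical, added so that a single dictionary can be exercised in isolation:

```xml
<!-- Inside the existing <searchComponent name="spellcheck">:
     same wordbreak spellchecker, with maxChanges lowered from 10 to 3
     (illustrative value, not a tested recommendation). -->
<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">name</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <int name="maxChanges">3</int>
</lst>

<!-- Hypothetical test handler: only one dictionary enabled, to narrow
     down which spellchecker is slow. Swap "wordbreak" for "default"
     to test the other one. -->
<requestHandler name="/select-wordbreak-test" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">wordbreak</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

Passing a single spellcheck.dictionary parameter on an individual request should have the same effect as the test handler, since request parameters take precedence over handler defaults.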
-Hoss
http://www.lucidworks.com/
Re: Help regarding solr request timeout because of spellcheck component performance.
Posted by kumar gaurav <kg...@gmail.com>.
Hi Chris,
Thanks a lot for your reply.
I have tried timeAllowed, but it does not limit spellcheck.
Following is the spellcheck configuration. Can you suggest something?
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>

  <!-- a spellchecker built from a field of the main index -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">text</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <!-- uncomment this to require suggestions to occur in 1% of the documents
    <float name="thresholdTokenFrequency">.01</float>
    -->
  </lst>

  <!-- a spellchecker that can break or combine words. See "/spell"
       handler below for usage -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">name</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>
On Thu, 4 May 2023 at 06:34, Chris Hostetter <ho...@fucit.org>
wrote:
>
> 1) timeAllowed does limit spellcheck (at least in all the code paths i can
> think of that may be "slow") ... have you tried it?
>
> 2) what is your configuration for the dictionaries you are using?
>
> 3) be wary of https://github.com/apache/lucene/issues/12077
Re: Help regarding solr request timeout because of spellcheck component performance.
Posted by Chris Hostetter <ho...@fucit.org>.
1) timeAllowed does limit spellcheck (at least in all the code paths i can
think of that may be "slow") ... have you tried it?
2) what is your configuration for the dictionaries you are using?
3) be wary of https://github.com/apache/lucene/issues/12077
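For reference, a sketch of how timeAllowed could be set in the defaults of the /select handler from the original post. The 2000 ms budget is an assumption chosen for illustration; to be useful it would need to sit below the client's socket timeout. Only the spellcheck-related defaults are repeated here:

```xml
<requestHandler name="/select" class="solr.SearchHandler" lazy="true">
  <lst name="defaults">
    <!-- assumed time budget in milliseconds; choose a value below
         the client socket timeout -->
    <int name="timeAllowed">2000</int>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

When the limit is reached, Solr normally marks the response header with partialResults=true. Adding debug=timing to a request reports the time spent in each search component, which is one way to check whether spellcheck is where the time goes.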
: Date: Tue, 2 May 2023 00:04:27 +0530
: From: kumar gaurav <kg...@gmail.com>
: Reply-To: users@solr.apache.org
: To: solr-user@lucene.apache.org, users@solr.apache.org
: Subject: Re: Help regarding solr request timeout because of spellcheck
: component performance.
:
: Just a reminder if someone can help here.
-Hoss
http://www.lucidworks.com/
Re: Help regarding solr request timeout because of spellcheck component performance.
Posted by kumar gaurav <kg...@gmail.com>.
Just a reminder if someone can help here.
On Mon, 24 Apr 2023 at 13:40, kumar gaurav <kg...@gmail.com> wrote:
> ++ users@solr.apache.org
Re: Help regarding solr request timeout because of spellcheck component performance.
Posted by kumar gaurav <kg...@gmail.com>.
++ users@solr.apache.org