Posted to solr-user@lucene.apache.org by Sebastian Riemer <s....@littera.eu> on 2020/07/24 07:53:00 UTC

solr suggester.rebuild takes forever and eventually runs out of memory on production

Dear mailing list community,

we are having trouble starting the suggester build on one of our production servers.

1. We execute the required query with the suggest.build parameter.
2. Solr appears to pick up the task of rebuilding the suggester index (CPU usage rises significantly).
3. The build takes forever (and seems never to finish).
4. Sometimes the Linux OOM killer strikes, usually picking the Solr process and killing it.
5. During the rebuild, calling the suggester results in a "suggester not built" exception.
6. Restarting the Solr service has no effect; it just continues the rebuild.

How long should the task take, given that our index currently holds approximately 7.2 million documents in a parent/child structure?
Is it possible to query the progress of the suggest.build task after it has been started?
How can we tell whether the suggest.build task is still running or has finished?

Which factors have the most significant impact on the duration of the rebuild, given the config below? (Let me know if you need additional information.)
Can we speed up the process somehow?

Best regards,
Sebastian

solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">infixSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="indexPath">infix_suggestions</str>
    <!--<str name="dictionaryImpl">DocumentDictionaryFactory</str>-->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">SUGGEST</str>
    <str name="suggestAnalyzerFieldType">textSuggest</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">infixSuggester</str>
    <str name="suggest.onlyMorePopular">true</str>
    <str name="suggest.count">500</str>
    <str name="suggest.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
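
For reference, the build request from step 1 against this configuration might be assembled as in the sketch below. The handler path (/suggest) and the dictionary name (infixSuggester) come from the config above; the host, port, and core name "mycore" are placeholders that were not given in this thread, and the helper name is made up for illustration.

```python
from urllib.parse import urlencode


def suggest_build_url(base_url, core, dictionary="infixSuggester", handler="/suggest"):
    """Return the URL that asks Solr to (re)build one suggester dictionary."""
    params = {
        "suggest": "true",
        "suggest.build": "true",           # triggers the (re)build
        "suggest.dictionary": dictionary,  # name from the <lst name="suggester"> block
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}{handler}?{urlencode(params)}"


# Placeholder host and core name; substitute the real ones.
url = suggest_build_url("http://localhost:8983/solr", "mycore")
print(url)
```

As far as I understand it, the build runs as part of that request, so the HTTP call returning is one signal that the build has finished; that is worth verifying on the Solr version in use before relying on it.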


Kind regards,
Sebastian Riemer, BSc

Re: solr suggester.rebuild takes forever and eventually runs out of memory on production

Posted by Anthony Groves <ag...@oreilly.com>.
Hi Sebastian,
I saw similarly undesirable results with the standard SuggestComponent when
we upgraded across multiple Solr versions. For our index of 2.8M docs, it takes
6-7 minutes to build a suggestion index using SuggestComponent out of the
box, so I wouldn't be surprised if your 7.2M doc index takes a good while.

Do you have any custom Solr plugins? Suggestions are an area where
customization really seems to come in handy. We handle that "suggester not
built" exception here by simply returning an empty suggest result, so that
clients do not blow up:
https://github.com/oreillymedia/ifpress-solr-plugin/blob/bf3b07c5be32fbcfa7b6fdfd439d511ef60dab68/src/main/java/com/ifactory/press/db/solr/spelling/suggest/SafariInfixSuggester.java#L141
I was also able to speed up the build slightly by excluding specific
suggest contexts at build time rather than at query time (a specific type of
document we didn't care to include in suggestions).
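
If a server-side plugin is not an option, a similar "return empty instead of failing" policy can be approximated on the client. The sketch below is not the code from the linked plugin; it assumes the client has already parsed Solr's JSON response into a dict, that errors arrive under an "error" key with a "msg" field, and that the message text contains one of the two phrases checked for. The function name is made up for illustration.

```python
def extract_suggestions(response, dictionary="infixSuggester"):
    """Return a flat list of suggested terms, or [] if the suggester is not built yet."""
    error = response.get("error")
    if error is not None:
        msg = str(error.get("msg", "")).lower()
        # Assumption: the "not built" failure can be recognised by its message text.
        if "suggester not built" in msg or "suggester was not built" in msg:
            return []
        # Any other error is surfaced to the caller unchanged.
        raise RuntimeError(f"Solr error: {error.get('msg')}")

    terms = []
    # Suggest responses are keyed by dictionary name, then by the query string.
    per_query = response.get("suggest", {}).get(dictionary, {})
    for result in per_query.values():
        for suggestion in result.get("suggestions", []):
            terms.append(suggestion["term"])
    return terms


not_built = {"error": {"msg": "suggester was not built", "code": 500}}
print(extract_suggestions(not_built))
```

Only the "not built" case is swallowed; other errors still raise, so a real outage is not hidden behind an empty result.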

We did way more sophisticated things with the older SpellCheckComponent
(see MultiSuggester in that same repo) like progress logging and returning
the existing suggest index while a new one is building. If you do choose to
write a custom plugin, feel free to re-use any code there that may help
(Apache License 2.0).
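
The general idea of serving the existing suggestions while a new set is being built, with progress logging, can be sketched independently of Solr. The following is only an illustration of that pattern, not the MultiSuggester code; the class and method names are hypothetical, and a plain in-memory list with substring matching stands in for the real suggest index.

```python
import logging
import threading

log = logging.getLogger("suggest.rebuild")


class SwappableIndex:
    """Keeps answering from the live term list while a replacement is built."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._live = list(initial or [])

    def lookup(self, text, limit=10):
        # Take a reference to the current list; a rebuild never mutates it.
        with self._lock:
            live = self._live
        needle = text.lower()
        return [term for term in live if needle in term.lower()][:limit]

    def rebuild(self, terms, log_every=1000):
        fresh = []
        for count, term in enumerate(terms, start=1):
            fresh.append(term)
            if count % log_every == 0:
                log.info("suggest rebuild: %d terms processed", count)
        fresh.sort()
        # Swap in the new list only once it is complete.
        with self._lock:
            self._live = fresh
        return len(fresh)

    def rebuild_async(self, terms, **kwargs):
        worker = threading.Thread(
            target=self.rebuild, args=(terms,), kwargs=kwargs, daemon=True
        )
        worker.start()
        return worker


index = SwappableIndex(["solr", "lucene"])
worker = index.rebuild_async(["suggester", "solr cloud"])
worker.join()  # lookups made before the swap are answered from the old list
print(index.lookup("s"))
```

Because the replacement list is built off to the side and swapped in under a lock, readers never see a half-built index and never get a "not built" error during a rebuild.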


Anthony Groves | Technical Lead, Search
O'Reilly Media, Inc. | 724.255.7323 | oreilly.com


On Fri, Jul 24, 2020 at 4:52 AM Sebastian Riemer <s....@littera.eu>
wrote:

> Oh, I am sorry, I totally forgot to mention our solr version, it's 7.7.3.

Re: solr suggester.rebuild takes forever and eventually runs out of memory on production

Posted by Sebastian Riemer <s....@littera.eu>.
Oh, I am sorry, I totally forgot to mention our solr version, it's 7.7.3.
