Posted to solr-user@lucene.apache.org by Ratnadeep Rakshit <ra...@qedrix.com> on 2018/07/01 15:00:45 UTC

Re: Solr Suggest Component and OOM

Has anyone ever successfully processed 150M records using the
Suggester Component? Could the makers of the component please comment?

On Tue, Jun 26, 2018 at 1:37 AM, Ratnadeep Rakshit <ra...@qedrix.com>
wrote:

> The site_address field holds all the addresses in the United States. The
> idea is to build something similar to Google Places autosuggest.
>
> Here's an example query:
> curl "http://localhost/solr/addressbook/suggest?suggest.q=1054%20club&wt=json"
>
> Response:
>
> {
> "responseHeader": {
> "status": 0,
> "QTime": 3125,
> "params": {
> "suggest.q": "1054 club",
> "wt": "json"
> }
> },
> "suggest": {
> "mySuggester2": {
> "1054 club": {
> "numFound": 3,
> "suggestions": [{
> "term": "<b>1054</b> null N COUNTRY <b>CLUB</b> null BLVD null STOCKTON CA 95204 5008",
> "weight": 0,
> "payload": "0023865882|06077|37.970769,-121.310433"
> }, {
> "term": "<b>1054</b> null E HERITAGE <b>CLUB</b> null CIR null DELRAY BEACH FL 33483 3482",
> "weight": 0,
> "payload": "0117190535|12099|26.445485,-80.069336"
> }, {
> "term": "<b>1054</b> null null CORAL <b>CLUB</b> null DR <b>1054</b> CORAL SPRINGS FL 33071 5657",
> "weight": 0,
> "payload": "0111342342|12011|26.243918,-80.267577"
> }]
> }
> },
> "mySuggester1": {
> "1054 club": {
> "numFound": 0,
> "suggestions": []
> }
> }
> }
> }
>
> Now, when I start building with 25M address records in the addressbook
> core, the process runs smoothly. I can see heap utilization peak at 56%
> of the 20GB allotted to Solr.
> I am not very experienced in measuring Solr performance, but it looks like
> the build process fails when I increase the number of records in the core
> beyond 25M. The query side of the suggester still works.
>
> Did that answer your questions correctly?
>
> On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti <
> a.benedetti@sease.io> wrote:
>
>> Hi,
>> First of all, the two suggesters you are using are based on different
>> data structures (with different memory utilisation):
>>
>> - FuzzyLookupFactory -> FST (in memory, and stored in binary form on disk)
>> - AnalyzingInfixLookupFactory -> auxiliary Lucene index
>>
>> Both data structures should be very memory efficient (both when
>> building and in storage).
>> What is the cardinality of the fields you are building suggestions from
>> (site_address and site_address_other)?
>> What is the memory situation in Solr when you start building the
>> suggester?
>> You are allocating much more memory to the Solr JVM process than you
>> leave to the OS, which in your situation cannot fit the entire index
>> (the ideal scenario).
>>
>> I would recommend putting some monitoring in place (there are plenty of
>> open source tools to do that).
>>
>> Regards
>>
>>
>>
>> -----
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>
>
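
For anyone reproducing the query quoted in this thread, the response could be unpacked roughly as follows. This is only a sketch: the function names are made up for illustration, the base URL and suggester name are copied from the example above, and the pipe-separated payload is split without assuming what each field means, since the thread does not say.

```python
import re
from urllib.parse import quote, urlencode


def suggest_url(base, query):
    """Build the suggest request URL (hypothetical helper).

    quote_via=quote encodes spaces as %20, matching the curl example.
    """
    params = {"suggest.q": query, "wt": "json"}
    return base + "?" + urlencode(params, quote_via=quote)


def extract_suggestions(response, suggester, query):
    """Pull the suggestions for one suggester out of a parsed JSON response.

    Highlight tags (<b>...</b>) are stripped from each term, and the
    payload string is split on '|' into a list of raw fields.
    """
    entry = response.get("suggest", {}).get(suggester, {}).get(query, {})
    results = []
    for item in entry.get("suggestions", []):
        results.append({
            "term": re.sub(r"</?b>", "", item["term"]),
            "weight": item.get("weight"),
            "payload": item.get("payload", "").split("|"),
        })
    return results
```

Calling extract_suggestions(parsed_json, "mySuggester2", "1054 club") on the response above would give one dictionary per suggestion; a suggester with numFound 0, such as mySuggester1, yields an empty list.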