Posted to dev@lucene.apache.org by "Jan Høydahl (JIRA)" <ji...@apache.org> on 2017/05/12 11:54:04 UTC

[jira] [Closed] (SOLR-10676) Optimize the reindexing of sunspot solr

     [ https://issues.apache.org/jira/browse/SOLR-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl closed SOLR-10676.
------------------------------
    Resolution: Invalid

Please direct your questions to the Sunspot community. This JIRA is a bug tracker for bugs and new features in the core Solr product.

If you still need to ask a question about Solr itself, please post it to the solr-user@lucene.apache.org mailing list, not to this bug tracker. See http://lucene.apache.org/solr/community.html#mailing-lists-irc

PS: I would imagine that the Sunspot code could be optimized to send documents in batches rather than one by one, and perhaps to avoid explicit commits.
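For illustration, a minimal sketch of that batching idea in plain Ruby. The client class and the numbers are hypothetical stand-ins so the sketch runs without a Solr server; with the real gem the two calls would be Sunspot.index(batch) for each slice and a single Sunspot.commit at the end:

```ruby
# Hypothetical stand-in for a Sunspot/Solr client: instead of one round
# trip (and possibly one commit) per record, slice the records into
# batches, send each batch in one call, and commit once at the end.
class FakeSolrClient
  attr_reader :round_trips, :commits

  def initialize
    @round_trips = 0
    @commits = 0
  end

  def index(docs)
    @round_trips += 1 # one request per batch, regardless of batch size
  end

  def commit
    @commits += 1
  end
end

def reindex_in_batches(client, records, batch_size)
  records.each_slice(batch_size) { |batch| client.index(batch) }
  client.commit # a single explicit commit at the very end
end

client = FakeSolrClient.new
reindex_in_batches(client, (1..5_000).to_a, 1_000)
puts client.round_trips # => 5
puts client.commits     # => 1
```

With per-record calls the same 5,000 records would cost 5,000 round trips; batching reduces that to one request per batch, which is where most of the reindexing time usually goes.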

Closing this issue as invalid.

> Optimize the reindexing of sunspot solr
> ---------------------------------------
>
>                 Key: SOLR-10676
>                 URL: https://issues.apache.org/jira/browse/SOLR-10676
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: clients - ruby - flare
>    Affects Versions: 5.0
>            Reporter: Krishna Sahoo
>
> We are using Solr 5.0 (<luceneMatchVersion>5.0.0</luceneMatchVersion>).
> We have more than 5 million products, and it takes around 3 hours 30 minutes to reindex all of them.
> To optimize reindexing speed, we have used the following configuration:
> <indexConfig>
>     <ramBufferSizeMB>960</ramBufferSizeMB>
>     <mergeFactor>100</mergeFactor>
>     <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>  </indexConfig>
> <autoCommit> 
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> 
>        <openSearcher>false</openSearcher> 
>      </autoCommit>
> <autoSoftCommit> 
>        <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> 
>      </autoSoftCommit>
> We are indexing with the following option 
> { :batch_commit => false, :batch_size => 20000 }
> We have set autocommit to false in our model, so a newly inserted record is not automatically added to the Solr index. When a record is updated, we manually call Sunspot.index! for that particular product. Every day we insert around 0.2 million (200,000) records, and our target is 50 million products.
> Is there any way to add only new or updated records to the index?
> Can we increase the indexing speed by changing any of the current configuration?
> Adding new products to Solr from Ruby code in a loop fails miserably because it takes too much time.
> Please help us find the best way to improve Solr's indexing speed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org