Posted to solr-user@lucene.apache.org by TomWilliamson <to...@hotmail.co.uk> on 2009/03/26 23:29:42 UTC

Initial query performance poor after update / delete

I'm developing a site (currently single server) that uses localsolr to
perform geo searches on ~200,000 small records although I'm expecting this
to grow significantly once I go live.

So far, so good, but I've noticed that after any updates / deletions to the
index the first query is very slow (typically 3 secs compared to 0.01 secs).
Subsequent queries are then fine until the next update / deletion.

Having read the documentation, it seems that this is expected behaviour, but
the best way to resolve it is unclear to me. I'm expecting many
updates throughout the day and obviously don't want query performance to
suffer.

Am I right in assuming that if I add an additional server and set up
replication (using one for updates and the other for queries) this will
resolve my issue? It doesn't need to be realtime, but I would like
updates to be live within ~5 minutes.

I'm quickly getting confused with multi-core, replication options,
distributed SOLR, Collection Scripts etc. etc. Is there any documentation on
the best circumstances for using each of these technologies?

Many thanks,
Tom

PS. I'm using SOLR1.3/LocalSolrR2.
-- 
View this message in context: http://www.nabble.com/Initial-query-performance-poor-after-update---delete-tp22732463p22732463.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Initial query performance poor after update / delete

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Tom,

 
> Thanks Otis. After some further testing - I've noticed that initial searches
> are only slow if I include the qt=geo parameter. Searches without this
> parameter appear to show no slow down whatsoever after updates - so I'm
> wondering if the problem is actually a localsolr one.
> 
> Can you tell me where I can specify the configuration to set up the
> parameters for swapping the searchers? Is this within solrconfig.xml? Any
> light you could shed on this would be really appreciated.

In a single server environment searchers should be swapped whenever you issue a commit.

> Thanks again,
> Tom
> 
> PS. If you wrote a "SOLR in Action" - I would buy it today!

Careful what you wish for! ;)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: Initial query performance poor after update / delete

Posted by TomWilliamson <to...@hotmail.co.uk>.
Thanks Otis. After some further testing - I've noticed that initial searches
are only slow if I include the qt=geo parameter. Searches without this
parameter appear to show no slow down whatsoever after updates - so I'm
wondering if the problem is actually a localsolr one.

Can you tell me where I can specify the configuration to set up the
parameters for swapping the searchers? Is this within solrconfig.xml? Any
light you could shed on this would be really appreciated.

Thanks again,
Tom

PS. If you wrote a "SOLR in Action" - I would buy it today!


Re: Initial query performance poor after update / delete

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Tom,

Aha, so you are using a single server for index updates, deletes, and searches.  This is OK for small setups and in itself is not the source of this slowness.  The problem is likely caused by swapping searchers after each index update/delete, probably without warming up the new searcher before exposing it to new searches.  So if you don't swap searchers every single time and instead do it every N minutes, and if you warm up your new searcher first, you should be able to stick to your single server setup.
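
To make the "every N minutes" part concrete, here is a minimal client-side sketch, not anything that ships with Solr. It forwards add/delete messages straight away but sends <commit/> at most once per interval, so a new searcher is opened only then. The class name BatchedCommitter, the helper post_xml, the 300 second interval and the URL http://localhost:8983/solr/update are all assumptions for illustration; warming of the new searcher is still configured on the Solr side and is not shown here.

```python
import time
import urllib.request

# Assumed update endpoint for a default single-server install; adjust as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def post_xml(body, url=SOLR_UPDATE_URL):
    """POST an XML update message (<add>, <delete>, <commit/>) to Solr."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


class BatchedCommitter:
    """Hypothetical helper: sends updates immediately, but commits at most
    once per interval_seconds, and only when something has changed."""

    def __init__(self, interval_seconds=300, send=post_xml, clock=time.monotonic):
        self.interval_seconds = interval_seconds
        self.send = send          # injectable so the logic can be tested offline
        self.clock = clock        # injectable time source, in seconds
        self.last_commit = clock()
        self.pending = False

    def update(self, xml_message):
        """Send an add/delete message without committing it."""
        self.send(xml_message)
        self.pending = True

    def maybe_commit(self):
        """Commit if there are pending changes and the interval has elapsed."""
        elapsed = self.clock() - self.last_commit
        if self.pending and elapsed >= self.interval_seconds:
            self.send("<commit/>")
            self.last_commit = self.clock()
            self.pending = False
            return True
        return False
```

Calling maybe_commit() from a timer or cron-style job (rather than only when an update arrives) keeps the delay before changes go live bounded by roughly the interval, which fits a "live within ~5 minutes" requirement.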

As for multi-server setup, you listed several things, but all you need to look for (on the Wiki and ML archives) is info about:
- index/collection replication
- master/slave setup (which is really all about replication)


You can ignore distributed search and multi-core, you don't need that for what you described.

Good luck!

Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


