Posted to solr-user@lucene.apache.org by Tom Hill <so...@zvents.com> on 2007/05/29 20:31:10 UTC

Optimizing frequently updated index

Hi -

I have an index that is updated fairly frequently (every few seconds), and
I'm replicating to several slave servers.

Because of the frequent updates, I'm usually pushing an index that is not
optimized. And, as it takes several minutes to optimize, I don't want to do
it every time I replicate (at least not on the master).

I was wondering if it makes sense to replicate to a slave instance, optimize
it there, and then distribute the optimized index from the first level
slave?

Any thoughts?

Thanks,

Tom

Re: Optimizing frequently updated index

Posted by Chris Hostetter <ho...@fucit.org>.
: I have an index that is updated fairly frequently (every few seconds), and
: I'm replicating to several slave servers.

how often do you replicate?

: I was wondering if it makes sense to replicate to a slave instance, optimize
: it there, and then distribute the optimized index from the first level
: slave?

I think that could work assuming you run snappuller from the "master" to
the intermediate "slave" infrequently enough ... but you'd have to be really
careful that optimize never runs longer than your snappuller
interval or you could get a nasty collision ... when you tell Solr to
update/optimize an index, it assumes it's the only thing writing to that
directory (which makes sense on a master) and when you run snapinstaller
it assumes it's the only thing modifying that directory (which makes sense
on a slave) ... I don't imagine those two will play very nicely with
each other if they collide.
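
One way to guard against that collision on the intermediate box is to make
both jobs take the same lock before touching the index directory. The
following is only a rough sketch in Python, not anything Solr or the
replication scripts provide: the lock path and the helper names
(index_lock, run_exclusive) are made up for illustration, and it relies
on Unix flock, so it assumes both jobs run on the same machine. Whatever
wraps your optimize call and your snapinstaller call would be passed in
as the action.

```python
import fcntl
import os
from contextlib import contextmanager

# Hypothetical lock file; Solr and the snap* scripts know nothing about it.
LOCK_PATH = "/tmp/solr-index.lock"


@contextmanager
def index_lock(path=LOCK_PATH):
    """Yield True if the exclusive lock was acquired, False if it is held elsewhere."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking: fail immediately instead of waiting for the other job.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        # Closing the descriptor releases the lock if we held it.
        os.close(fd)


def run_exclusive(action, path=LOCK_PATH):
    """Run action() only if no other job holds the lock; report whether it ran."""
    with index_lock(path) as acquired:
        if not acquired:
            return False
        action()
        return True
```

With this, the job that loses the race simply skips its turn (returns
False) and tries again on its next scheduled run, rather than having
optimize and snapinstaller modify the directory at the same time.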

My advice: don't worry about having optimized indexes if you're really
updating continuously every few seconds (if there is downtime once a day
or week when you get no updates for a little while, optimize then)
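
If you want to automate the "optimize during a lull" idea, the check
itself is tiny. This is a sketch under assumptions: the 15 minute
threshold is arbitrary, and should_optimize is a hypothetical helper
that expects you to track the time of your last update yourself (as a
Unix timestamp); nothing here comes from Solr.

```python
import time

# Assumed threshold: treat 15 minutes without updates as a quiet period.
QUIET_SECONDS = 15 * 60


def should_optimize(last_update_ts, now=None, quiet_seconds=QUIET_SECONDS):
    """Return True once at least quiet_seconds have passed since the last update."""
    if now is None:
        now = time.time()
    return (now - last_update_ts) >= quiet_seconds
```

A cron job could call this every few minutes and only kick off the
optimize when it returns True.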

If you find the performance of an unoptimized index is really bad in some
use cases, it's likely a code path in Lucene that no one has ever bothered
trying to optimize, and the time you spend getting your system to optimize
continuously might be better spent helping to improve the slow code path
in Lucene :)


-Hoss