Posted to solr-user@lucene.apache.org by Mike Austin <mi...@juggle.com> on 2011/08/31 20:16:16 UTC

Solr commit process and read downtime

I've set up a master/slave configuration and it's working great!  I know
this is the better setup, but if I had just one index due to requirements,
I'd like to know more about the performance hit of the commit. Let's just
assume I have a decent-sized index of a few gigabytes of normal-sized
documents, with high traffic.  A few questions:

- (main question) When you do a commit on a single index, is there any time
when the reads will not have an index to search on?
- With the rebuilding of caches and whatever else happens, is the only
downside the fact that the server performance will be degraded due to file
copy, cache warming, etc., or will the index actually be locked at some
point?
- On a commit, do the files get copied so you need double the space, or is
that just the optimize?

I know a master/slave setup is used to reduce these issues, but if I had
only one server, I'd need to know the potential risks.

Thanks,
Mike

Re: Solr commit process and read downtime

Posted by Mike Austin <mi...@juggle.com>.
Wow.. thanks for the great answers Erick!  This answered my concerns
perfectly.

Mike

On Thu, Sep 1, 2011 at 7:54 AM, Erick Erickson <er...@gmail.com> wrote:


Re: Solr commit process and read downtime

Posted by Erick Erickson <er...@gmail.com>.
See below:

On Wed, Aug 31, 2011 at 2:16 PM, Mike Austin <mi...@juggle.com> wrote:
> I've set up a master/slave configuration and it's working great!  I know
> this is the better setup, but if I had just one index due to requirements,
> I'd like to know more about the performance hit of the commit. Let's just
> assume I have a decent-sized index of a few gigabytes of normal-sized
> documents, with high traffic.  A few questions:
>
> - (main question) When you do a commit on a single index, is there any time
> when the reads will not have an index to search on?
No. While the new searcher is warming up, all incoming searches are
handled by the old searcher. When the new searcher is warmed up,
new requests are routed to it, and when the last search is completed
in the old searcher, it's shut down.
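
To make that hand-off concrete, here is a toy Python sketch of the idea.
It is not Solr's actual code: the class and method names are made up for
illustration, and warming is a no-op. It only shows the ordering: searches
keep using the old searcher while the new one warms, and the old one is
closed once its last in-flight search finishes.

```python
import threading


class Searcher:
    """Hypothetical stand-in for an index searcher."""

    def __init__(self, name):
        self.name = name
        self.in_flight = 0    # searches currently using this searcher
        self.retired = False  # True once a newer searcher has replaced it
        self.closed = False

    def warm(self):
        # A real searcher would run warming queries and fill caches here.
        pass


class SearcherHolder:
    """Hands out the current searcher and swaps in a new one on commit."""

    def __init__(self, searcher):
        self._lock = threading.Lock()
        self._current = searcher

    def acquire(self):
        # Every search takes whichever searcher is current right now.
        with self._lock:
            searcher = self._current
            searcher.in_flight += 1
            return searcher

    def release(self, searcher):
        with self._lock:
            searcher.in_flight -= 1
            self._close_if_idle(searcher)

    def commit(self, new_searcher):
        # Warm outside the lock so searches continue on the old searcher.
        new_searcher.warm()
        with self._lock:
            old = self._current
            self._current = new_searcher
            old.retired = True
            self._close_if_idle(old)

    def _close_if_idle(self, searcher):
        # A retired searcher is closed only after its last search completes.
        if searcher.retired and searcher.in_flight == 0:
            searcher.closed = True
```

At no point is `acquire()` left without a searcher to return, which is
the property the question was asking about.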

> - With the rebuilding of caches and whatever else happens, is the only
> downside the fact that the server performance will be degraded due to file
> copy, cache warming, etc., or will the index actually be locked at some
> point?
The index will not be locked, if by locked you mean the searches will
not happen. See above. The server will certainly have more work to
do, and if you're running close to the limits you might notice some
slowdown. But often there is no noticeable pause. Note that while
all this goes on, you will have *two* copies of the caches etc. in
memory...

> - On a commit, do the files get copied so you need double the space or is
> that just the optimize?
You have to allow for the relatively rare instance when the merge
process combines all your segments into one, which will require
at least double the disk space. Optimize guarantees this
will happen, but it can (and will) happen on commit occasionally.
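
If you want to keep an eye on that headroom, a rough check could look
like the Python sketch below. The function names and the safety factor
are my own assumptions, not anything Solr provides; it simply compares
the free space on the index volume with the current size of the index
files, on the reasoning that a full merge may temporarily need about as
much again as the index already occupies.

```python
import os
import shutil


def index_size_bytes(index_dir):
    """Total size of the regular files directly inside index_dir."""
    total = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if entry.is_file():
                total += entry.stat().st_size
    return total


def has_merge_headroom(index_dir, safety_factor=1.0):
    """True if free space is at least safety_factor times the index size.

    safety_factor=1.0 corresponds to "double the disk space" overall:
    the existing index plus room for one full rewritten copy.
    """
    needed = index_size_bytes(index_dir) * safety_factor
    free = shutil.disk_usage(index_dir).free
    return free >= needed
```

You would call `has_merge_headroom()` with the path to your own index
directory, and pick a safety factor above 1.0 if you want some slack.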

>
> I know a master/slave setup is used to reduce these issues, but if I had
> only one server, I'd need to know the potential risks.
Well, you're just putting lots of stuff on a server. Solr will quite
happily deal with this situation and, depending upon how much traffic
you have and your machine's size, this may be fine. Do be aware of the
"warmup hell" problem and don't commit too frequently or your warming
searchers may tie their knickers in a knot.
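
One way to avoid committing too frequently is to rate-limit commits in
the indexing client. The Python sketch below is only an illustration
under that assumption: `CommitThrottle` and its parameters are
hypothetical names, and the actual commit call to Solr is left out. It
allows a commit only if a minimum interval has passed since the last one.

```python
import time


class CommitThrottle:
    """Allows a commit at most once every min_interval_s seconds."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock        # injectable so the logic can be tested
        self._last_commit = None   # time of the last allowed commit

    def should_commit(self):
        now = self._clock()
        if (self._last_commit is None
                or now - self._last_commit >= self.min_interval_s):
            self._last_commit = now
            return True
        # Too soon: pending documents wait for the next allowed commit.
        return False
```

The indexing loop would add documents as usual and issue a commit only
when `should_commit()` returns True, so a new searcher is not opened
while the previous one is likely still warming.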

And one risk in this setup is that you have no way to quickly bring up
a server if your one machine crashes; you have to re-index *all* your data.

Best
Erick

>
> Thanks,
> Mike
>