Posted to solr-user@lucene.apache.org by Charlie Jackson <Ch...@cision.com> on 2007/05/03 17:55:40 UTC

Index corruptions?

I have a couple of questions regarding index corruptions. 

 

1) Has anyone using Solr in a production environment ever experienced an
index corruption? If so, how frequently do they occur?

 

2) It seems like the CollectionDistribution setup would be a good way to
put in place a recovery plan for (or at least have some viable backups
of) the index. However, I have a small concern that if the index gets
corrupted on the master server, the corruption would propagate down to
the slave servers as well. Is this concern unfounded? Also, each of the
snapshots taken by snapshooter are viable full indexes, correct? If so,
that means I'd have a backup of the index each and every time a commit
(or optimize for that matter) is done, which would be awesome.

 

One of our biggest requirements for the indexing process is to have a
good backup/recover strategy in place and I want to make sure Solr will
be able to provide that. 

 

Thanks in advance!

 

Charlie


Re: Index corruptions?

Posted by Bill Au <bi...@gmail.com>.
In addition to snapshots, you can also make backup copies of your Solr
index using the backup script.
Backups are created the same way as snapshots, using hard links.  Each one is
a viable full index.

Bill
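
As a rough illustration of the hard-link approach Bill describes (the same
idea as the "cp -lr" mentioned elsewhere in this thread), here is a minimal
Python sketch. It is not the Solr snapshooter or backup script; the function
name hardlink_snapshot and the directory names are made up for the example.
It assumes the index files are never modified in place once written, and that
both directories are on the same filesystem, since hard links cannot cross
filesystems.

```python
import os
import tempfile


def hardlink_snapshot(index_dir, snapshot_dir):
    """Mirror index_dir's directory tree under snapshot_dir, hard-linking each file.

    Hypothetical helper for illustration only; not part of Solr.
    """
    for root, _dirs, files in os.walk(index_dir):
        rel = os.path.relpath(root, index_dir)
        target_root = os.path.normpath(os.path.join(snapshot_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            # os.link adds a second name for the same file data; nothing is copied.
            os.link(os.path.join(root, name), os.path.join(target_root, name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = os.path.join(tmp, "index")
        os.makedirs(index_dir)
        segment = os.path.join(index_dir, "segment.dat")  # stand-in for an index file
        with open(segment, "w") as fh:
            fh.write("index data")

        snapshot_dir = os.path.join(tmp, "snapshot.1")
        hardlink_snapshot(index_dir, snapshot_dir)

        # Removing the file from the live index leaves the snapshot's link intact.
        os.remove(segment)
        with open(os.path.join(snapshot_dir, "segment.dat")) as fh:
            print(fh.read())
```

Because each snapshot entry is just another name for the same file data,
taking one is fast and uses almost no extra disk space, and deleting a file
from the live index later does not remove it from the snapshot.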

On 5/3/07, Charlie Jackson <Ch...@cision.com> wrote:
>
> I have a couple of questions regarding index corruptions.
>
>
>
> 1) Has anyone using Solr in a production environment ever experienced an
> index corruption? If so, how frequently do they occur?
>
>
>
> 2) It seems like the CollectionDistribution setup would be a good way to
> put in place a recovery plan for (or at least have some viable backups
> of) the index. However, I have a small concern that if the index gets
> corrupted on the master server, the corruption would propagate down to
> the slave servers as well. Is this concern unfounded? Also, each of the
> snapshots taken by snapshooter are viable full indexes, correct? If so,
> that means I'd have a backup of the index each and every time a commit
> (or optimize for that matter) is done, which would be awesome.
>
>
>
> One of our biggest requirements for the indexing process is to have a
> good backup/recover strategy in place and I want to make sure Solr will
> be able to provide that.
>
>
>
> Thanks in advance!
>
>
>
> Charlie
>
>

Re: Index corruptions?

Posted by Yonik Seeley <yo...@apache.org>.
On 5/7/07, Tom Hill <so...@zvents.com> wrote:
> Is the "cp -lr" in snapshot really guaranteed to be atomic? Or is it just
> fast, and unlikely to be interrupted?

It's called from Solr within a synchronized context, and it's
guaranteed that no index changes (via Solr at least) will happen
concurrently.

-Yonik
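
The distinction in this exchange, between a copy that is atomic and one that
simply cannot overlap with index changes, can be sketched with a lock. This is
a toy Python model, not Solr's code (Solr is Java, and Yonik is referring to a
synchronized context there); the class IndexWithSnapshots and its methods are
hypothetical.

```python
import threading


class IndexWithSnapshots:
    """Toy model: a single lock serializes index changes and snapshots."""

    def __init__(self):
        self._lock = threading.Lock()
        self._files = []      # stand-in for the list of files in the index
        self.snapshots = []   # each snapshot is a tuple of file names

    def commit(self, new_files):
        # A multi-step change: files are added one at a time under the lock.
        with self._lock:
            for name in new_files:
                self._files.append(name)

    def snapshot(self):
        # The copy itself is not atomic, but no commit can run while it does.
        with self._lock:
            snap = tuple(self._files)
            self.snapshots.append(snap)
            return snap


if __name__ == "__main__":
    index = IndexWithSnapshots()

    def writer():
        for i in range(100):
            index.commit([f"seg{i}.a", f"seg{i}.b"])

    def snapshooter():
        for _ in range(100):
            index.snapshot()

    threads = [threading.Thread(target=writer), threading.Thread(target=snapshooter)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Every snapshot contains only whole commits (an even number of files).
    print(all(len(s) % 2 == 0 for s in index.snapshots))
```

Anything that changes the files without taking the same lock is not covered,
which matches the "via Solr at least" caveat above.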

Re: Index corruptions?

Posted by Tom Hill <so...@zvents.com>.
Hi Charlie,

On 5/3/07, Charlie Jackson <Ch...@cision.com> wrote:
>
> I have a couple of questions regarding index corruptions.
>
> 1) Has anyone using Solr in a production environment ever experienced an
> index corruption? If so, how frequently do they occur?


I once had all slaves complain about a missing file in the index. The master
never had a problem. The problem went away at the next snapshot.

Is the "cp -lr" in snapshot really guaranteed to be atomic? Or is it just
fast, and unlikely to be interrupted?

This has only occurred once over the last 5 months.

> 2) It seems like the CollectionDistribution setup would be a good way to
> put in place a recovery plan for (or at least have some viable backups
> of) the index. However, I have a small concern that if the index gets
> corrupted on the master server, the corruption would propagate down to
> the slave servers as well. Is this concern unfounded?


I would expect this to be true.

> Also, each of the
> snapshots taken by snapshooter are viable full indexes, correct? If so,
> that means I'd have a backup of the index each and every time a commit
> (or optimize for that matter) is done, which would be awesome.


That's my understanding.

Tom