Posted to solr-user@lucene.apache.org by athir nuaimi <at...@nuaim.com> on 2010/01/24 16:35:07 UTC

Strategy for snapshotting the Solr data directory on EC2

We are running two Solr servers (master/slave) on EC2 and have the Solr home directories on EBS volumes that we snapshot every 12 hours. That means we could lose up to 12 hours of data, so I am wondering whether there is a way to reduce that window. With our MySQL servers, we snapshot every 12 hours but also copy the binary logs to S3 every 5 minutes.

We commit every 10 minutes on the master and will be using the built-in Java replication (today we replicate via snapshots, but we are in the process of migrating from 1.3 to 1.4).
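
One possible way to narrow the window after the move to 1.4, sketched under assumptions: the Solr 1.4 ReplicationHandler accepts an HTTP `command=backup` request that writes a backup of the last committed index on the master, and that backup could then be copied to S3 on a schedule, much like the MySQL binary logs. The Python below only builds and calls that URL. The host, port, and the `/replication` handler path are assumptions about the deployment, and the helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(base_url, command, **params):
    """Build a ReplicationHandler URL, e.g. .../replication?command=backup.

    base_url is the Solr root (assumed here to look like
    http://host:8983/solr); the handler is assumed to be registered
    at /replication in solrconfig.xml.
    """
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/replication?{query}"


def trigger_backup(base_url, timeout=30):
    """Ask the master to write a backup of its last committed index.

    Returns the raw response body. Copying the resulting backup
    directory to S3 is left to a separate step.
    """
    with urlopen(replication_url(base_url, "backup"), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical master address; replace with the real one.
    print(replication_url("http://localhost:8983/solr", "backup"))
```

Run from cron every few minutes on the master, something like this would bound the loss to roughly the commit interval plus the copy interval, rather than the 12-hour snapshot interval.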

On a related note, are we doing the right thing in having our slave's Solr home directory on an EBS volume? If the slave were to die and we had to create a fresh one, would it just resync the entire index from the master? Is the reason to put the slave on an EBS volume that it then has less data to resync on startup?

Thanks in advance,
Athir

Re: Strategy for snapshotting the Solr data directory on EC2

Posted by William Pierce <ev...@hotmail.com>.
Our setup on EC2 is as follows:

a) MySQL master on an EBS volume.
b) Solr master on its own EBS volume.
c) Solr slaves do not use EBS; they use the ephemeral instance stores 
instead. There is a small period of time during which a Solr slave has to 
re-sync the data from the Solr master.
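
As a rough way to judge how small that re-sync period is, the full-copy time can be estimated as index size divided by sustained throughput between master and slave. The Python below is only that arithmetic; the function name and the example figures (5 GiB index, 50 MiB/s) are hypothetical, not measurements from our setup.

```python
def resync_seconds(index_bytes, throughput_bytes_per_sec):
    """Estimate the time for a slave to pull a full index copy.

    Ignores connection setup, disk write speed on the slave, and any
    post-download work, so treat the result as a lower bound.
    """
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return index_bytes / throughput_bytes_per_sec


if __name__ == "__main__":
    GIB = 1024 ** 3
    MIB = 1024 ** 2
    # Hypothetical figures: 5 GiB index, 50 MiB/s between instances.
    print(f"{resync_seconds(5 * GIB, 50 * MIB):.1f} seconds")
```

If the estimate comes out at a minute or two, keeping slaves on ephemeral storage costs little; for a much larger index the trade-off may look different.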

Cheers,

Bill
