Posted to solr-user@lucene.apache.org by adfel70 <ad...@gmail.com> on 2013/11/07 13:28:25 UTC

Re: solrcloud shards backup/restoration

Did you solve this eventually?


Aditya Sakhuja wrote
> How does one recover from index corruption? That's what I am ultimately
> trying to tackle here.
> 
> Thanks
> Aditya
> 
> On Thursday, September 19, 2013, Aditya Sakhuja wrote:
> 
>> Hi,
>>
>> Sorry for the late follow-up on this. Let me add more details here.
>>
>> *The problem:*
>>
>> I cannot successfully restore the index backed up with
>> '/replication?command=backup'. The backup was generated as
>> *snapshot.yyyymmdd*.
>>
>> *My setup and steps:*
>>
>> 6 SolrCloud instances
>> 7 ZooKeeper instances
>>
>> Steps:
>>
>> 1.> Take a snapshot using
>> http://host1:8893/solr/replication?command=backup on one host only. Move
>> snapshot.yyyymmdd to some reliable storage.
>>
>> 2.> Stop all 6 Solr instances and all 7 ZooKeeper instances.
>>
>> 3.> Delete ../collectionname/data/* on all SolrCloud nodes, i.e. delete
>> the index data completely.
>>
>> 4.> Delete zookeeper/data/version*/* on all zookeeper nodes.
>>
>> 5.> Copy the index back from the backup to one of the nodes.
>>      $ cp snapshot.yyyymmdd/* ../collectionname/data/index/
>>
>> 6.> Restart all zk instances. Restart all solrcloud instances.
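
For reference, step 5 above could be scripted roughly as below. This is only a sketch in Python: the function name and both directory paths are placeholders, not something from the original steps, and it does nothing beyond a flat file copy. As the outcome reported in the quoted message shows, copying the files alone did not make SolrCloud serve the documents.

```python
import shutil
from pathlib import Path


def copy_snapshot_into_index(snapshot_dir, index_dir):
    """Copy every regular file from a snapshot directory into an index directory.

    Both arguments are placeholder paths (hypothetical, for illustration).
    Returns the sorted names of the files that were copied.
    """
    src = Path(snapshot_dir)
    dst = Path(index_dir)
    # Create the target directory if it was deleted in an earlier step.
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(src.iterdir()):
        # Snapshot directories are expected to be flat, so skip anything else.
        if entry.is_file():
            shutil.copy2(entry, dst / entry.name)
            copied.append(entry.name)
    return copied
```

It would be called with the two directories from step 5, for example copy_snapshot_into_index("snapshot.yyyymmdd", "../collectionname/data/index"), while the Solr instance on that node is stopped.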
>>
>>
>> *Outcome:*
>>
>> All Solr instances are up. However, *num of docs = 0* for all nodes.
>> Looking at the node where the index was restored, a new
>> index.yymmddhhmmss directory has been created and index.properties points
>> to it. That explains why no documents are reported.
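
A quick way to check which index directory a core would open is to read index.properties from the data directory. The sketch below assumes the file is a Java-style properties file whose `index` key names the active directory, and that plain `index` is used when the file or key is missing. Both assumptions should be verified against the Solr version in use, and the function name is hypothetical.

```python
from pathlib import Path


def active_index_dir(data_dir):
    """Return the directory name that index.properties points to.

    Assumption: the file holds a line such as 'index=index.<timestamp>'.
    Falls back to 'index' when the file or the key is missing.
    """
    props = Path(data_dir) / "index.properties"
    if not props.exists():
        return "index"
    for line in props.read_text().splitlines():
        line = line.strip()
        # Skip comment lines and anything that is not key=value.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "index":
            return value.strip()
    return "index"
```

If the returned name is not the directory the snapshot files were copied into, the restored files are simply not the ones being searched, which would match the zero document count described above.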
>>
>>
>> How do I get SolrCloud to pick up data from the index directory on a
>> restart?
>>
>> Thanks in advance,
>> Aditya
>>
>>
>>
>> On Fri, Sep 6, 2013 at 3:41 PM, Aditya Sakhuja <aditya.sakhuja@> wrote:
>>
>> Thanks Shalin and Mark for your responses. I am on the same page about the
>> conventions for taking the backup. However, I am less sure about the
>> restoration of the index. Let's say we have 3 shards across 3 SolrCloud
>> servers.
>>
>> 1.> I am assuming we should take a backup from each of the shard leaders
>> to get a complete collection. Do you think that will get the complete
>> index (not worrying about what is not hard committed at the time of
>> backup)?
>>
>> 2.> How do we go about restoring the index in a fresh SolrCloud cluster?
>> From the structure of the snapshot I took, I did not see any
>> replication.properties or index.properties, which I normally see on
>> healthy SolrCloud cluster nodes.
>> If I have the snapshot named snapshot.20130905, does the
>> snapshot.20130905/* go into data/index?
>>
>> Thanks
>> Aditya
>>
>>
>>
>> On Fri, Sep 6, 2013 at 7:28 AM, Mark Miller <markrmiller@> wrote:
>>
>> Phone typing. The end should not say "don't hard commit" - it should say
>> "do a hard commit and take a snapshot".
>>
>> Mark
>>
>> Sent from my iPhone
>>
>> On Sep 6, 2013, at 7:26 AM, Mark Miller <markrmiller@> wrote:
>>
>> > I don't know that it's too bad though - it's always been the case that
>> > if you do a backup while indexing, it's just going to get up to the last
>> > hard commit. With SolrCloud that will still be the case. So just make
>> > sure you do a hard commit right before taking the backup - yes, it might
>> > miss a few docs in the tran log, but if you are taking a backup while
>> > indexing, you don't have great precision in any case - you will roughly
>> > get a snapshot for around that time - even without SolrCloud, if you are
>> > worried about precision and getting every update into that backup, you
>> > want to stop indexing and commit first. But if you just want a rough
>> > snapshot for around that time, in both cases you can still just don't
>> > hard commit and take a snapshot.
>> >
>> > Mark
>> >
>> > Sent from my iPhone
>> >
>> > On Sep 6, 2013, at 1:13 AM, Shalin Shekhar Mangar <shalinmangar@> wrote:
>> >
>> >> The replication handler's backup command was built for pre-SolrCloud.
>> >> It takes a snapshot of the index but it is unaware of the transaction
>> >> log which is a key component in SolrCloud. Hence unless you stop
>> >> updates, commit your changes and then take a backup, you will likely
>> >> miss some updates.
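
The ordering Shalin describes (commit, then snapshot) could be sketched as below once updates have been stopped. The base URL reuses the host and port quoted elsewhere in this thread. The helper name is made up for illustration, and the sketch only builds the two request URLs rather than sending them.

```python
from urllib.parse import urlencode


def backup_request_urls(base_url):
    """Return the URLs to call, in order: hard commit first, then snapshot.

    base_url is the Solr root for the node, e.g. 'http://host1:8893/solr'
    (taken from this thread; adjust for a named core or collection).
    """
    # Hard commit so the snapshot includes everything indexed so far.
    commit_url = base_url + "/update?" + urlencode({"commit": "true"})
    # Replication handler backup command, as used earlier in the thread.
    backup_url = base_url + "/replication?" + urlencode({"command": "backup"})
    return [commit_url, backup_url]
```

Each URL can then be fetched in order, for example with urllib.request.urlopen, confirming that the commit request succeeded before asking for the backup.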
>> >>
>> >> That being said, I'm curious to see how peer sync behaves when you try
>> >> to restore from a snapshot. When you say that you haven't been
>> >> successful in restoring, what exactly is the behaviour you observed?
>> >>
>> >> On Fri, Sep 6, 2013 at 5:14 AM, Aditya Sakhuja <aditya.sakhuja@> wrote:
>> >>> Hello,
>> >>>
>> >>> I was looking for a good backup / recovery solution for the SolrCloud
>> >>> indexes. I am mostly looking to restore the indexes from the index
>> >>> snapshot, which can be taken using the replicationHandler's backup
>> >>> command.
>> >>>
>> >>> I am looking for something that works with SolrCloud 4.3 eventually,
>> >>> but it is still relevant if you tested with a previous version.
>> >>>
>> >>> I haven't been successful in having the restored index replicate across
>> >>> the new replicas after I restart all the nodes, with one node having
>> >>> the restored index.
>> >>>
>> >>> Is restoring the indexes on all the nodes the best way to do it?
>> >>> --
>> >>> Regards,
>> >>> -Aditya Sakhuja
>> >>
>> >>
>> >>
>> >> --
>> >> Regards,
>> >> Shalin Shekhar Mangar.
>>
>>
>>
>>
>> --
>> Regards,
>> -Aditya Sakhuja
>>
> 
> 
> -- 
> Regards,
> -Aditya Sakhuja





--
View this message in context: http://lucene.472066.n3.nabble.com/solrcloud-shards-backup-restoration-tp4088447p4099789.html
Sent from the Solr - User mailing list archive at Nabble.com.