Posted to solr-user@lucene.apache.org by Aditya Sakhuja <ad...@gmail.com> on 2013/09/06 01:44:41 UTC

solrcloud shards backup/restoration

Hello,

I am looking for a good backup / recovery solution for SolrCloud
indexes. Specifically, I want to restore the indexes from an index
snapshot, which can be taken using the replicationHandler's backup command.

Ideally the solution should work with SolrCloud 4.3, but experience
with a previous version is still relevant.

I haven't been successful in getting the restored index to replicate
across the new replicas after I restart all the nodes with one node
holding the restored index.

Is restoring the indexes on all the nodes the best way to do it?
-- 
Regards,
-Aditya Sakhuja

Re: solrcloud shards backup/restoration

Posted by rulinma <ru...@gmail.com>.
I would also like to know how to accomplish this.



--
View this message in context: http://lucene.472066.n3.nabble.com/solrcloud-shards-backup-restoration-tp4088447p4138358.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solrcloud shards backup/restoration

Posted by Greg Walters <gr...@answers.com>.
We've had some success restoring existing/backed-up indexes into SolrCloud, and even building the indexes offline and dropping the Lucene files into the directories that Solr expects. The general steps we follow are:

1) Round up your files. It doesn't matter if you pull from a master or slave so long as you've committed and get a consistent copy of the data.

2) Use the Collections API to create a collection in Solr. The collection you're creating must have the same number of shards as the collection you've backed up and are restoring.

3) Stop all Solr nodes.

4) Remove the <index_name>/data/ directory from the shards you're going to make the leaders. In our case we've got 6 shards and a replication factor of 3 on a 6-node cluster, so each server/JVM has three shards on it. Conveniently, the shards are all either even or odd per JVM.

5) Populate the <index_name>/data/ directories on your intended leaders. As mentioned above, since we've got six shards and any two JVMs contain the entire index, we only populate the data on two servers.

6) Start up *JUST* the servers that you've just populated. The goal here is to make these servers the leaders for the new collection and to have them hold the official "full copy" of the index. Upon startup you might have to wait $leaderVoteWait for previously non-leader servers to time out and become leaders.

7) Once you've got at least one core up in each shard of your collection, go ahead and start the others up.

I think Aditya was failing by removing all the ZooKeeper data and starting everything up at once. If you force Solr's hand a bit to pick leaders with the data that you want, you'll have success when it replicates out to other nodes. It might also be possible to do this online by not stopping Solr after creating the empty collection, then copying the files into place on the leaders and issuing a RELOAD to pick up the changed indexes. I'm not sure how replicas would handle that, though.
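
As a rough illustration of step 2 and of the RELOAD idea above, the Collections API requests could be built as follows. This is only a sketch: the host, port, and collection name are hypothetical, and maxShardsPerNode=3 is an assumption derived from the 6-shard, replication-factor-3, 6-node layout described above. The code only builds the URLs; it does not send them.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL, e.g. .../admin/collections?action=CREATE&..."""
    query = {"action": action}
    query.update(params)
    return f"{base_url}/admin/collections?{urlencode(query)}"


# Hypothetical node address and collection name.
base = "http://host1:8983/solr"

# Step 2: create an empty collection with the same shard count as the backup.
create = collections_api_url(
    base, "CREATE",
    name="restored", numShards=6, replicationFactor=3, maxShardsPerNode=3,
)

# Online variant: ask Solr to reload the collection after files are copied in.
reload_ = collections_api_url(base, "RELOAD", name="restored")

print(create)
print(reload_)
```

Either URL can then be requested with any HTTP client once the cluster is in the state the corresponding step expects.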

Thanks,
Greg




Re: solrcloud shards backup/restoration

Posted by Allan Mascarenhas <al...@gmail.com>.
Any update on this?

I am stuck with the same problem: I want to install a snapshot of the master Solr
server in my local environment, but I couldn't. :(

I have spent almost 2 days trying to figure it out. Please help!




Re: solrcloud shards backup/restoration

Posted by adfel70 <ad...@gmail.com>.
Did you solve this eventually?



Re: solrcloud shards backup/restoration

Posted by Aditya Sakhuja <ad...@gmail.com>.
How does one recover from index corruption? That's what I am ultimately
trying to tackle here.

Thanks
Aditya



-- 
Regards,
-Aditya Sakhuja

Re: solrcloud shards backup/restoration

Posted by Aditya Sakhuja <ad...@gmail.com>.
Hi,

Sorry for the late followup on this. Let me put in more details here.

*The problem:*

I cannot restore the index backed up with
'/replication?command=backup'. The backup was generated as
snapshot.yyyymmdd.

*My setup and steps:*

6 SolrCloud instances
7 ZooKeeper instances

Steps:

1.> Take a snapshot using http://host1:8893/solr/replication?command=backup
on one host only. Move snapshot.yyyymmdd to some reliable storage.

2.> Stop all 6 Solr instances and all 7 ZooKeeper instances.

3.> Delete ../collectionname/data/* on all SolrCloud nodes, i.e. delete
the index data completely.

4.> Delete zookeeper/data/version*/* on all ZooKeeper nodes.

5.> Copy the index back from the backup to one of the nodes.
     \> cp snapshot.yyyymmdd/*  ../collectionname/data/index/

6.> Restart all ZooKeeper instances. Restart all SolrCloud instances.


*Outcome:*

All Solr instances are up. However, num of docs = 0 for all nodes.
On the node where the index was restored, a new index.yymmddhhmmss
directory has been created, with index.properties pointing to
it. That explains why no documents are reported.
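
To check which directory a core is actually using after a restart, one could read index.properties directly. The sketch below assumes that the file is a simple key=value properties file whose `index` key names the active directory, and that Solr falls back to `data/index` when the file is absent. Both points match the behaviour described above but are assumptions here, and the helper name `active_index_dir` is made up for illustration.

```python
import tempfile
from pathlib import Path


def active_index_dir(data_dir):
    """Return the index directory that index.properties points to, if any."""
    props = Path(data_dir) / "index.properties"
    if props.exists():
        for line in props.read_text().splitlines():
            line = line.strip()
            # Skip comments and anything that is not a key=value pair.
            if line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip() == "index":
                return Path(data_dir) / value.strip()
    # Assumed default when no index.properties entry exists.
    return Path(data_dir) / "index"


# Demonstration against a throwaway directory standing in for .../data/.
with tempfile.TemporaryDirectory() as d:
    no_props = active_index_dir(d).name
    Path(d, "index.properties").write_text("#comment\nindex=index.20130919101530\n")
    with_props = active_index_dir(d).name

print(no_props, with_props)
```

If the result is a timestamped directory rather than `index`, files copied into `data/index/` are not the ones being served.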


How do I get SolrCloud to pick up data from the index directory on a restart?

Thanks in advance,
Aditya





-- 
Regards,
-Aditya Sakhuja

Re: solrcloud shards backup/restoration

Posted by Tim Vaillancourt <ti...@elementspace.com>.
I wouldn't say I love this idea, but wouldn't it be safe to LVM snapshot
the Solr index? I think this may even work on a live server, depending on
some file I/O details. Has anyone tried this?

An in-Solr solution sounds more elegant, but considering the tlog concern
Shalin mentioned, I think this may work as an interim solution.
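
For reference, an LVM snapshot of the volume holding the index would typically be created with `lvcreate`. The sketch below only assembles the command line. The volume group, logical volume, snapshot name, and size are all hypothetical, and whether the resulting copy is consistent on a live server is exactly the open question raised above.

```python
def lvm_snapshot_cmd(origin, snap_name, size="1G"):
    """Build the lvcreate argv for a copy-on-write snapshot of `origin`."""
    return ["lvcreate", "--snapshot", "--size", size, "--name", snap_name, origin]


# Hypothetical volume that holds the Solr data directory.
cmd = lvm_snapshot_cmd("/dev/vg0/solr_data", "solr_snap_20130906", size="5G")
print(" ".join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` as root; issuing a hard commit first would follow the same advice given for the replication handler backup.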

Cheers!

Tim



Re: solrcloud shards backup/restoration

Posted by Aditya Sakhuja <ad...@gmail.com>.
Thanks Shalin and Mark for your responses. I am on the same page about the
conventions for taking the backup. However, I am less sure about the
restoration of the index. Let's say we have 3 shards across 3 SolrCloud
servers.

1.> I am assuming we should take a backup from each of the shard leaders to
get a complete collection. Do you think that will capture the complete index
(not worrying about what is not hard committed at the time of backup)?

2.> How do we go about restoring the index in a fresh SolrCloud cluster?
From the structure of the snapshot I took, I did not see any
replication.properties or index.properties, which I normally see on
healthy SolrCloud cluster nodes.
If I have a snapshot named snapshot.20130905, does snapshot.20130905/*
go into data/index?
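
The "one backup per shard leader" idea in point 1 could be sketched as below. The dictionary layout is an assumption modelled on the cluster state that SolrCloud 4.x keeps in ZooKeeper (shards, then replicas, with the leader flagged as `"leader": "true"`). The hosts and core names are invented, and the function only produces the backup URLs without calling them.

```python
def leader_backup_urls(cluster_state, collection):
    """Map each shard name to the backup URL of its current leader core."""
    urls = {}
    for shard_name, shard in cluster_state[collection]["shards"].items():
        for replica in shard["replicas"].values():
            # In the assumed layout, only the leader carries leader == "true".
            if replica.get("leader") == "true":
                urls[shard_name] = (
                    f"{replica['base_url']}/{replica['core']}"
                    "/replication?command=backup"
                )
    return urls


# Hypothetical two-shard collection; hosts and core names are made up.
state = {
    "collection1": {
        "shards": {
            "shard1": {"replicas": {
                "core_node1": {"base_url": "http://host1:8983/solr",
                               "core": "collection1_shard1_replica1",
                               "leader": "true"},
                "core_node2": {"base_url": "http://host2:8983/solr",
                               "core": "collection1_shard1_replica2"},
            }},
            "shard2": {"replicas": {
                "core_node3": {"base_url": "http://host2:8983/solr",
                               "core": "collection1_shard2_replica1"},
                "core_node4": {"base_url": "http://host3:8983/solr",
                               "core": "collection1_shard2_replica2",
                               "leader": "true"},
            }},
        }
    }
}

urls = leader_backup_urls(state, "collection1")
for shard, url in sorted(urls.items()):
    print(shard, url)
```

Taking one snapshot per shard this way yields one snapshot directory per shard, each of which would need to be restored to a core of the matching shard.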

Thanks
Aditya






-- 
Regards,
-Aditya Sakhuja

Re: solrcloud shards backup/restoration

Posted by Mark Miller <ma...@gmail.com>.
Phone typing. The end should not say "don't hard commit" - it should say "do a hard commit and take a snapshot". 

Mark

Sent from my iPhone


Re: solrcloud shards backup/restoration

Posted by Mark Miller <ma...@gmail.com>.
I don't know that it's too bad though. It's always been the case that if you do a backup while indexing, it's just going to get up to the last hard commit. With SolrCloud that will still be the case. So just make sure you do a hard commit right before taking the backup. Yes, it might miss a few docs in the transaction log, but if you are taking a backup while indexing, you don't have great precision in any case; you will roughly get a snapshot for around that time. Even without SolrCloud, if you are worried about precision and getting every update into that backup, you want to stop indexing and commit first. But if you just want a rough snapshot for around that time, in both cases you can still just don't hard commit and take a snapshot.

Mark

Sent from my iPhone


Re: solrcloud shards backup/restoration

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
The replication handler's backup command was built for pre-SolrCloud.
It takes a snapshot of the index but it is unaware of the transaction
log which is a key component in SolrCloud. Hence unless you stop
updates, commit your changes and then take a backup, you will likely
miss some updates.
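
The "commit, then back up" order described above could be sketched as two requests against the same core. The `/replication?command=backup` path and the host and port come from this thread. The `commit=true` update parameter and the optional `location` parameter are standard Solr request parameters as far as I know, but treat their use here, and the `/backups/solr` path, as assumptions. The code only builds the URLs, and updates should already be stopped before the first one is sent.

```python
from urllib.parse import urlencode


def commit_url(core_url):
    """URL that asks the core to perform a hard commit."""
    return f"{core_url}/update?{urlencode({'commit': 'true'})}"


def backup_url(core_url, location=None):
    """URL that asks the replication handler to snapshot the index."""
    params = {"command": "backup"}
    if location is not None:
        # Hypothetical target directory for the snapshot.
        params["location"] = location
    return f"{core_url}/replication?{urlencode(params)}"


core = "http://host1:8893/solr"  # host and port as used in this thread
steps = [commit_url(core), backup_url(core, "/backups/solr")]
for url in steps:
    print(url)
```

Requesting the URLs in this order, for example with `urllib.request.urlopen`, keeps the snapshot aligned with the last hard commit.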

That being said, I'm curious to see how peer sync behaves when you try
to restore from a snapshot. When you say that you haven't been
successful in restoring, what exactly is the behaviour you observed?




-- 
Regards,
Shalin Shekhar Mangar.