Posted to solr-user@lucene.apache.org by Ian Rose <ia...@fullstory.com> on 2014/11/07 16:24:19 UTC

Migrating shards

Howdy -

What is the current best practice for migrating shards to another machine?
I have heard suggestions that it is "add replica on new machine, wait for
it to catch up, delete original replica on old machine".  But I wanted to
check to make sure...

And if that is the best method, two follow-up questions:

1. Is there a best practice for knowing when the new replica has "caught
up" or do you just do a "*:*" query on both, compare counts, and call it a
day when they are the same (or nearly so, since the slave replica might lag
a little bit)?

2. When deleting the original (old) replica, since that one could be the
leader, is the replica deletion done in a safe manner such that no
documents will be lost (e.g. ones that were recently received by the leader
and not yet synced over to the slave replica before the leader is deleted)?

Thanks as always,
Ian
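
[Editor's sketch] The count comparison Ian describes in question 1 could look roughly like the following. This is a minimal sketch, not something posted in the thread: the core URL passed in is a hypothetical placeholder, and it assumes each core is queried directly with distrib=false so that the request is not fanned out across the collection. Counts only reflect documents visible to each core's current searcher, so a small difference between replicas is possible while indexing is ongoing.

```python
from urllib.parse import urlencode


def count_query_url(core_url):
    """Build a match-all query against one core that returns only the hit count.

    core_url is a hypothetical placeholder such as
    http://host:8983/solr/c1_shard1_replica1
    """
    # rows=0 skips document retrieval; distrib=false keeps the query on this core.
    query = urlencode({"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"})
    return f"{core_url}/select?{query}"


def num_found(select_response):
    """Extract numFound from a parsed JSON /select response."""
    return select_response["response"]["numFound"]


def counts_close(old_response, new_response, tolerance=0):
    """True when the two cores report the same count, within an optional tolerance."""
    return abs(num_found(old_response) - num_found(new_response)) <= tolerance
```

Fetch each URL with any HTTP client, parse the JSON, and pass the two results to counts_close. The tolerance parameter is an illustrative knob for the "or nearly so" case in the question.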

Re: Migrating shards

Posted by Ian Rose <ia...@fullstory.com>.

Sounds great - thanks all.

On Fri, Nov 7, 2014 at 2:06 PM, Erick Erickson <er...@gmail.com>
wrote:

> bq: I think ADD/DELETE replica APIs are best for within a SolrCloud
>
> I second this, if for no other reason than I'd expect this to get
> more attention than the underlying core admin API.
>
> That said, I believe ADD/DELETE replica just makes use of the core
> admin API under the covers, in which case you'd get all the goodness
> baked into the core admin API plus whatever extra is written into
> the collections api processing.
>
> Best,
> Erick

Re: Migrating shards

Posted by Erick Erickson <er...@gmail.com>.

bq: I think ADD/DELETE replica APIs are best for within a SolrCloud

I second this, if for no other reason than I'd expect this to get
more attention than the underlying core admin API.

That said, I believe ADD/DELETE replica just makes use of the core
admin API under the covers, in which case you'd get all the goodness
baked into the core admin API plus whatever extra is written into
the collections api processing.

Best,
Erick
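
[Editor's sketch] For readers who have not used the Collections API calls discussed here, the add-then-delete sequence amounts to two HTTP requests. The sketch below only builds the URLs; the host, collection, shard, node, and replica names in the usage note are hypothetical placeholders, not values from this thread. It says nothing about whether deleting a leader is safe, which is the open part of question 2.

```python
from urllib.parse import urlencode


def collections_api_url(solr_url, action, **params):
    """Build a Collections API request URL for the given action and parameters."""
    query = urlencode({"action": action, "wt": "json", **params})
    return f"{solr_url}/admin/collections?{query}"


def add_replica_url(solr_url, collection, shard, node):
    """ADDREPLICA: create a new replica of `shard` on `node` (a live_nodes-style name)."""
    return collections_api_url(
        solr_url, "ADDREPLICA", collection=collection, shard=shard, node=node
    )


def delete_replica_url(solr_url, collection, shard, replica):
    """DELETEREPLICA: remove `replica` (a core node name) from `shard`."""
    return collections_api_url(
        solr_url, "DELETEREPLICA", collection=collection, shard=shard, replica=replica
    )
```

For example, add_replica_url("http://oldhost:8983/solr", "c1", "shard1", "newhost:8983_solr") yields the request that asks the cluster to place a replica on the new machine. The delete request should only be issued once the new replica reports as active.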

On Fri, Nov 7, 2014 at 8:28 AM, ralph tice <ra...@gmail.com> wrote:
> I think ADD/DELETE replica APIs are best for within a SolrCloud;
> however, if you need to move data across SolrClouds you will have to
> resort to older APIs, for which I found many references but no good
> documentation.  So I wrote up instructions for doing so here:
> https://gist.github.com/ralph-tice/887414a7f8082a0cb828
>
> I haven't had much time to think about how to translate this to more
> generic documentation for inclusion in the community wiki but I would
> love to hear some feedback if anyone else has a similar use case for
> moving Solr indexes across SolrClouds.

Re: Migrating shards

Posted by ralph tice <ra...@gmail.com>.

I think ADD/DELETE replica APIs are best for within a SolrCloud;
however, if you need to move data across SolrClouds you will have to
resort to older APIs, for which I found many references but no good
documentation.  So I wrote up instructions for doing so here:
https://gist.github.com/ralph-tice/887414a7f8082a0cb828

I haven't had much time to think about how to translate this to more
generic documentation for inclusion in the community wiki but I would
love to hear some feedback if anyone else has a similar use case for
moving Solr indexes across SolrClouds.



On Fri, Nov 7, 2014 at 10:18 AM, Michael Della Bitta
<mi...@appinions.com> wrote:
> 1. The new replica will not begin serving data until it's all there and
> caught up. You can watch the replica status on the Cloud screen to see it
> catch up; when it's green, you're done. If you're trying to automate this,
> you're going to look for the replica that says "recovering" in
> clusterstate.json and wait until it's "active."
>
> 2. I believe this to be the case, but I'll wait for someone else to chime in
> who knows better. Also, I wonder if there's a difference between
> DELETEREPLICA and unloading the core directly.
>
> Michael

Re: Migrating shards

Posted by Michael Della Bitta <mi...@appinions.com>.

1. The new replica will not begin serving data until it's all there and 
caught up. You can watch the replica status on the Cloud screen to see 
it catch up; when it's green, you're done. If you're trying to automate 
this, you're going to look for the replica that says "recovering" in 
clusterstate.json and wait until it's "active."

2. I believe this to be the case, but I'll wait for someone else to 
chime in who knows better. Also, I wonder if there's a difference 
between DELETEREPLICA and unloading the core directly.

Michael
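
[Editor's sketch] The automation described in point 1 (wait for the new replica to go from "recovering" to "active") might be scripted along these lines. This is a sketch under assumptions: it reads the cluster state through the Collections API CLUSTERSTATUS action rather than fetching clusterstate.json from ZooKeeper directly, and the collection, shard, and replica names used by a caller are hypothetical. It also checks that the replica's node appears in live_nodes, on the assumption that a recorded state of "active" means little if the node itself is down.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_cluster_status(solr_url, collection):
    """Fetch and parse the CLUSTERSTATUS response for one collection."""
    query = urllib.parse.urlencode(
        {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    )
    with urllib.request.urlopen(f"{solr_url}/admin/collections?{query}") as resp:
        return json.load(resp)


def replica_is_serving(cluster_status, collection, shard, replica):
    """True if the replica is 'active' and its node is listed in live_nodes."""
    cluster = cluster_status.get("cluster", {})
    replicas = (cluster.get("collections", {})
                .get(collection, {})
                .get("shards", {})
                .get(shard, {})
                .get("replicas", {}))
    info = replicas.get(replica)
    if info is None:
        return False
    return (info.get("state") == "active"
            and info.get("node_name") in cluster.get("live_nodes", []))


def wait_until_serving(fetch, collection, shard, replica,
                       timeout_s=600, poll_s=5, sleep=time.sleep):
    """Poll fetch() until the replica is serving; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if replica_is_serving(fetch(), collection, shard, replica):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

A caller would pass something like lambda: fetch_cluster_status("http://host:8983/solr", "c1") as fetch. Taking fetch and sleep as parameters keeps the polling loop testable without a running cluster.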

On 11/7/14 10:24, Ian Rose wrote:
> Howdy -
>
> What is the current best practice for migrating shards to another machine?
> I have heard suggestions that it is "add replica on new machine, wait for
> it to catch up, delete original replica on old machine".  But I wanted to
> check to make sure...
>
> And if that is the best method, two follow-up questions:
>
> 1. Is there a best practice for knowing when the new replica has "caught
> up" or do you just do a "*:*" query on both, compare counts, and call it a
> day when they are the same (or nearly so, since the slave replica might lag
> a little bit)?
>
> 2. When deleting the original (old) replica, since that one could be the
> leader, is the replica deletion done in a safe manner such that no
> documents will be lost (e.g. ones that were recently received by the leader
> and not yet synced over to the slave replica before the leader is deleted)?
>
> Thanks as always,
> Ian
>