Posted to solr-user@lucene.apache.org by "Natarajan, Rajeswari" <ra...@sap.com> on 2019/10/09 04:32:37 UTC

Re: Solr 7.7 restore issue

I am also facing the same issue. With Solr 7.6, restore fails when the rule below is in place. We would like to place one replica per node with this rule:

"set-cluster-policy": [{
        "replica": "<2",
        "shard": "#EACH",
        "node": "#ANY"
    }]

Without the rule, the restore works, but we need this rule. Any suggestions to overcome this issue?

Thanks,
Rajeswari

On 7/12/19, 11:00 AM, "Mark Thill" <ma...@gmail.com> wrote:

    I have a 4 node cluster.  My goal is to have 2 shards with two replicas
    each and only allowing 1 core on each node.  I have a cluster policy set to:
    
    [{"replica":"2", "shard": "#EACH", "collection":"test",
    "port":"8983"},{"cores":"1", "node":"#ANY"}]
    
    I then manually create a collection with:
    
    name: test
    config set: test
    numShards: 2
    replicationFactor: 2
    
    This works and I get a collection that looks like what I expect.  I then
    back up this collection.  But when I try to restore the collection, it fails
    and says
    
    "Error getting replica locations : No node can satisfy the rules"
    [{"replica":"2", "shard": "#EACH", "collection":"test",
    "port":"8983"},{"cores":"1", "node":"#ANY"}]
    
    If I set my cluster-policy rules back to [] and try to restore, it then
    successfully restores my collection exactly how I expect it to be.  It
    appears that having any cluster-policy rules in place is affecting my
    restore, but the "error getting replica locations" is strange.
    
    Any suggestions?
    
    mark <ma...@gmail.com>
    


Re: Solr 7.7 restore issue

Posted by mirei <a1...@bestbuy.com.INVALID>.
Unfortunately, upon further testing, my above suggestion about using only the
set-policy does not actually solve the issue.

The reason my testing above worked with restore was only because I left out
the problematic set-cluster-policy autoscaling rule for replica count. And
the truth is that restore was already always working when that cluster
autoscaling rule was missing. The key point is that I only tested restore
functionality above. When I tested to see if autoscaling itself still worked
by adding/placing replicas within our intended limits, that part wasn't
working.

Now it seems we are back to the two workarounds mentioned previously:
1. Clear out the cluster-policy for replica count, restore, then add back
the cluster-policy.
2. Create or modify your collections attaching a 'rule=replica:<2,node:*' to
match your autoscaling policy.
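
The first workaround can be scripted roughly as follows. This is a minimal sketch, not a tested procedure. It assumes a local node at http://localhost:8983, the backup name and location from the repro steps in this thread, and that posting an empty list to set-cluster-policy clears the cluster policy (the thread describes setting the rules back to []). The names SOLR, COLLECTION, BACKUP_NAME, BACKUP_LOCATION, DRY_RUN and the run helper are illustrative, not part of Solr. By default it only prints the commands (without shell quoting); set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Workaround 1 sketch: clear the cluster policy, restore, then put the policy back.
# All variable names and defaults below are illustrative assumptions.
SOLR="${SOLR:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-gettingstarted}"
BACKUP_NAME="${BACKUP_NAME:-myBackupName}"
BACKUP_LOCATION="${BACKUP_LOCATION:-/choose/location/}"
POLICY='[{"replica": "<2", "shard": "#EACH", "node": "#ANY"}]'

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Post a cluster policy (a JSON list of rules) to the autoscaling endpoint.
set_cluster_policy() {
    run curl -X POST "$SOLR/admin/autoscaling" --data-binary "{\"set-cluster-policy\": $1}"
}

restore_with_policy_cleared() {
    # 1. Remove the replica-count rule so it cannot block the restore.
    set_cluster_policy '[]'
    # 2. Restore the collection from the backup.
    run curl "$SOLR/admin/collections?action=RESTORE&name=$BACKUP_NAME&location=$BACKUP_LOCATION&collection=$COLLECTION"
    # 3. Put the original rule back.
    set_cluster_policy "$POLICY"
}

restore_with_policy_cleared
```

Note that curl does not return a failing exit status on HTTP error responses unless asked to, so the response of each step should be checked before relying on the next one. The cluster also runs without the rule between steps 1 and 3.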


Out of curiosity, I did try using both the set-cluster-policy and the
set-policy, where I used MODIFYCOLLECTION to attach the policy to my
collection (similar to attaching a rule to the collection), but that produced
the same error when attempting to restore:

"Error getting replica locations :  No node can satisfy the rules
"[{replica=<2, node=#ANY, shard=#EACH, collection=gettingstarted}]"




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr 7.7 restore issue

Posted by mirei <a1...@bestbuy.com.INVALID>.
I am now noticing a difference when using set-policy instead of
set-cluster-policy. I'm still testing it to make sure, but just wanted to
report early that the bug may depend on the type of policy being put in place.

So if you run into this issue when using set-cluster-policy, consider trying
a set-policy like below:

curl -X POST "http://localhost:8983/solr/admin/autoscaling" --data-binary \
'{"set-policy": {"policy1": [{"replica": "<2", "shard": "#EACH", "node":
"#ANY"}]}}'

Whether this is a bug or not, I'll leave to the Solr devs.




Re: Solr 7.7 restore issue

Posted by Koen De Groote <ko...@limecraft.com>.
Oh hey, I once talked in this thread.

I ended up just not using any rules. There's a logic there, but I could not
tell if the behavior was bugged or not.

On Wed, Aug 19, 2020 at 7:24 PM mirei <a1...@bestbuy.com.invalid> wrote:

> Would anyone be able to confirm that this is indeed a Solr bug between
> restore and autoscaling?

Re: Solr 7.7 restore issue

Posted by mirei <a1...@bestbuy.com.INVALID>.
Would anyone be able to confirm that this is indeed a Solr bug between
restore and autoscaling?

I have done some local testing and found the following patterns and
discoveries.

When using a replica count autoscaling policy {"replica": "<2","shard":
"#EACH","node": "#ANY"}, it breaks Solr restore functionality because, for
some reason, the Solr code needs room for double the replica count for restore
to work from an existing backup.

If 1 replica exists on a node on a backup, restore with autoscaling requires
a rule that allows 2 replicas to exist on any node.

If 2 replicas exist on a node on a backup, restore with autoscaling requires
a rule that allows 4 replicas to exist on any node.

If 3 replicas exist on a node on a backup, restore with autoscaling requires
a rule that allows 6 replicas to exist on any node.

NOTE: When given double the room, the collection comes up exactly as it was
when it was backed up, so nothing is actually duplicated after restore. It
seems that, for some reason, the current restore code may be bugged and needs
more room than necessary to restore when there is a replica count autoscaling
policy.
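
The pattern reported above can be written down as simple arithmetic. The sketch below only illustrates the observed behaviour, not Solr's actual restore logic: given the number of replicas a node held in the backup, it prints the value that the "replica" attribute of the rule appeared to need for the restore to pass. The function name required_replica_limit is hypothetical.

```shell
#!/bin/sh
# Observed pattern only: restore seemed to need room for twice the replicas
# that each node held in the backup. A "<N" rule allows at most N-1 replicas.
required_replica_limit() {
    replicas_on_node=$1
    allowed=$((replicas_on_node * 2))    # room the restore appeared to need
    printf '<%d\n' "$((allowed + 1))"    # value for the "replica" attribute
}

for n in 1 2 3; do
    printf '%s replica(s) per node in backup -> "replica": "%s"\n' \
        "$n" "$(required_replica_limit "$n")"
done
```

For one replica per node this gives "<3", which is looser than the intended "<2" rule and is why the restore fails with the intended rule in place.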



Rajeswari posted a separate reply to this thread that brought me to another
discovery.
https://lucene.472066.n3.nabble.com/Re-CAUTION-Re-Solr-7-7-restore-issue-tp4450714.html

In it, they reference the legacy rule based replica placement documentation:
https://lucene.apache.org/solr/guide/7_6/rule-based-replica-placement.html

After doing some more local testing, I found that adding the same replica
count constraint as a rule on the collection somehow allows Solr restore to
work as intended. Example below:

collection rule: replica:<2,node:*
autoscaling policy: {"replica": "<2","node": "#ANY"}

When both are in place, restore functionality finally works.
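
A sketch of putting both constraints in place before taking the backup. It assumes a local node at http://localhost:8983 and the gettingstarted collection, and it uses the shard-qualified forms of the rule and policy from the repro steps in this message. The names SOLR, COLLECTION, DRY_RUN and the run helper are illustrative. curl's -G and --data-urlencode options are used so that the `<` and `*` characters in the rule are percent-encoded rather than sent raw. By default it only prints the commands (without shell quoting); set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Workaround 2 sketch: keep the autoscaling policy and add a matching legacy rule.
# Variable names and defaults are illustrative assumptions.
SOLR="${SOLR:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-gettingstarted}"

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Cluster-wide autoscaling policy: fewer than 2 replicas of each shard per node.
apply_cluster_policy() {
    run curl -X POST "$SOLR/admin/autoscaling" --data-binary \
        '{"set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}]}'
}

# Matching legacy placement rule attached to the collection itself.
apply_collection_rule() {
    run curl -G "$SOLR/admin/collections" \
        --data-urlencode "action=MODIFYCOLLECTION" \
        --data-urlencode "collection=$COLLECTION" \
        --data-urlencode "rule=shard:*,replica:<2,node:*"
}

apply_cluster_policy
apply_collection_rule
```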



Ideally, we should not have to do anything beyond placing the original
autoscaling replica count policy. But as of right now, the two workarounds
appear to be either removing the cluster policy during restore, or adding a
legacy collection rule in addition to the autoscaling policy for restore to
work.

Please let me know if something crucial is being missed, otherwise I hope
the above can help in tracking down any actual bug. Thanks.



Repeatable steps if you want to test locally using Solr tutorial:
./bin/solr stop -all ; rm -Rf example/cloud/
./bin/solr start -e cloud
(choose 1 node for gettingstarted with 1 shard 1 replica)

curl -X POST "http://localhost:8983/solr/admin/autoscaling" --data-binary \
'{"set-cluster-policy": [{"replica": "<2","shard": "#EACH","node":
"#ANY"}]}'

curl 'http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackupName&collection=gettingstarted&location=/choose/location/'
curl 'http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted'
curl 'http://localhost:8983/solr/admin/collections?action=RESTORE&name=myBackupName&location=/choose/location/&collection=gettingstarted'


(use this before backup, and then restore works)
curl 'http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=gettingstarted&rule=shard:*,replica:<2,node:*'






Re: Solr 7.7 restore issue

Posted by Koen De Groote <ko...@limecraft.com>.
I also ran into this while researching cluster policies. Solr 7.6

Except same situation: introduce a rule to control placement of
collections. Backup. Delete. Restore. Solr complains it can't do it.

I don't need them just yet, so I stopped there, but reading this is quite
disturbing.

Does deleting the rule, restoring, and then immediately reinstating the rule
work?


