Posted to solr-user@lucene.apache.org by "Oakley, Craig (NIH/NLM/NCBI) [C]" <cr...@nih.gov.INVALID> on 2021/01/12 19:00:55 UTC

RE: disallowing delete through security.json

Does anyone yet have any examples or suggestions for using the "method" section in lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html ?

Also, if anyone has any other suggestions of how to provide high availability while completely dropping and recreating and reloading a large collection (as required in order to complete the upgrade to a new release), let me know.

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <cr...@nih.gov.INVALID> 
Sent: Tuesday, November 24, 2020 1:56 PM
To: solr-user@lucene.apache.org
Subject: RE: disallowing delete through security.json

Thank you for the response

The use case I have in mind is trying to approximate incremental updates (as are available in Sybase or MSSQL, to which I am more accustomed).

We want to upgrade a large collection from Solr7.4 to Solr8.5. It turns out that Solr8.5 cannot run against the current data, because the collection was created under Solr6.6. We want to migrate in such a way that, in a year or so, we will be able to migrate to Solr9 without worrying about Solr7.4, let alone Solr6.6. We want to create a new collection (of the same name) in a brand new Solr8.5 SolrCloud, and then select everything from the current Solr7.4 collection in json format and load it into the new Solr8.5 collection. All of the fields have stored="true", with the exception of fields populated by copyField. The select will be done by ranges of id values, so as to avoid OutOfMemory errors. That process will take several days; in the meanwhile, users will continue to add data. Once all the data has been copied (including that which is described below), we can switch port numbers so that the new Solr8.5 SolrCloud takes the place of the old Solr7.4 SolrCloud.

The plan is to find a value of _version_ (call it V1) which was in the Solr7.4 collection when we started the first select, but which is greater than almost all values of _version_ in the collection (we are fine with having an overlap of _version_ values, but we want to avoid losing anything by having a gap in _version_ values). After the initial selects are complete, we can run other selects by ranges of id with the additional criterion that the _version_ be no lower than the V1 value. As we have seen in test runs, this will involve less data and will run faster. We will also keep note of a new value of _version_ (call it V2) which was in the Solr7.4 collection when we start the V1 select, but which is greater than almost all values of _version_ in the V1 select. Following this procedure through various iterations (V3, V4, however many it takes), we can load the V1 set of selects once we have finished loading the initial set of selects. We can then load the V2 set of selects once we have finished loading the V1 set. The plan is that the selecting and loading of the last Vn set of selects will involve a maintenance window measured in minutes rather than in days.
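
For illustration only, here is a minimal Python sketch of how the query parameters for one id-range select might be built, with an optional lower bound on _version_ for the incremental passes. The function names (build_select_params, select_url), the rows and sort settings, and the base URL in the usage note are assumptions made for this sketch, not part of the actual procedure. It also assumes that the id values need no escaping inside a Solr range query.

```python
from urllib.parse import urlencode


def build_select_params(id_lo, id_hi, min_version=None, rows=1000):
    """Build query parameters for one id-range select.

    If min_version is given, add a filter so that only documents with
    _version_ >= min_version are returned (the V1, V2, ... passes).
    """
    params = [
        ("q", f"id:[{id_lo} TO {id_hi}]"),  # inclusive range on id
        ("wt", "json"),
        ("rows", str(rows)),                 # hypothetical page size
        ("sort", "id asc"),
    ]
    if min_version is not None:
        # Inclusive lower bound: an overlap in _version_ is fine, a gap is not.
        params.append(("fq", f"_version_:[{min_version} TO *]"))
    return params


def select_url(base_url, collection, params):
    """Join a base URL, a collection name and encoded parameters."""
    return f"{base_url}/{collection}/select?{urlencode(params)}"
```

The initial pass would call build_select_params(lo, hi) for each id range; a later pass would add min_version=V1 (then V2, and so on), for example select_url("http://localhost:8983/solr", "col_name", build_select_params(lo, hi, min_version=V1)) with a hypothetical host.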

The users claim that they never do deletes: which is good, because a delete would be something which would be missed by this plan. If (as you describe) the users were to update a record so that only the id field (and the _version_ field) are left, that update would get picked up by one of these incremental selects and would be applied to the new collection. A delete, however, would not be noticed: and the new Solr8.5 collection would still have the record which had been deleted from the old Solr7.4 collection. The users claim that they never do deletes: but it would seem safer to actually disallow deletes during the maintenance.

Let me know if you have any suggestions.

Thank you again for your reply.


-----Original Message-----
From: Jason Gerlowski <ge...@gmail.com> 
Sent: Tuesday, November 24, 2020 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: disallowing delete through security.json

Hey Craig,

I think this will be tricky to do with the current Rule-Based
Authorization support.  As you pointed out in your initial post -
there are lots of ways to delete documents.  The Rule-Based Auth code
doesn't inspect request bodies (AFAIK), so it's going to have trouble
differentiating deletes from other "/update" requests: both use
method=POST and are request-body driven.
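
To make the point concrete, here is a small Python sketch (a toy model, not Solr code). It assumes, as stated above, that a rule sees only the path and HTTP method of a request. UpdateRequest and visible_to_method_rule are hypothetical names invented for this sketch. Note that curl with --data sends a POST, so the delete in the original post never used the HTTP DELETE method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRequest:
    """Toy model of an HTTP request sent to a collection's update handler."""
    method: str
    path: str
    body: str


def visible_to_method_rule(req):
    """Return the parts of a request that a path+method rule could match on.

    Assumption: the body is not inspected, so it is left out here.
    """
    return (req.method, req.path)


# An ordinary add and a delete-by-query, both sent the way curl --data sends them.
add_doc = UpdateRequest(
    "POST", "/update", '<add><doc><field name="id">11</field></doc></add>'
)
delete_doc = UpdateRequest(
    "POST", "/update", "<delete><query>id:11</query></delete>"
)

# Both requests present the same (method, path) pair; only the bodies differ.
same_to_rule = visible_to_method_rule(add_doc) == visible_to_method_rule(delete_doc)
```

Under that assumption, any rule that blocks the delete request on path and method alone would block the add request as well.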

But to zoom out a bit, does it really make sense to lock down deletes,
but not updates more broadly?  After all, "updates" can remove and add
fields.  Users might submit an update that strips everything but "id"
from your documents.  In many/most usecases that'd be equally
concerning.  Just wondering what your usecase is - if it's generally
applicable this is probably worth a JIRA ticket.

Best,

Jason

On Thu, Nov 19, 2020 at 10:34 AM Oakley, Craig (NIH/NLM/NCBI) [C]
<cr...@nih.gov.invalid> wrote:
>
> Having not heard back, I thought I would ask again whether anyone else has been able to use security.json to disallow deletes, and/or if anyone has examples of using the "method" section in lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html
>
> -----Original Message-----
> From: Oakley, Craig (NIH/NLM/NCBI) [C] <cr...@nih.gov.INVALID>
> Sent: Monday, October 26, 2020 6:23 PM
> To: solr-user@lucene.apache.org
> Subject: disallowing delete through security.json
>
> I am interested in disallowing delete through security.json
>
> After seeing the "method" section in lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html my first attempt was as follows:
>
> {"set-permission":{
> "name":"NO_delete",
> "path":["/update/*","/update"],
> "collection":"col_name",
> "role":"NoSuchRole",
> "method":"DELETE",
> "before":4}}
>
> I found, however, that this did not disallow deletes: I could still run
> curl -u ... "http://.../solr/col_name/update?commit=true" --data "<delete><query>id:11</query></delete>"
>
> After further experimentation, I seemed to have success with
> {"set-permission":
> {"name":"NO_delete6",
> "path":"/update/*",
> "collection":"col_name",
> "role":"NoSuchRole",
> "method":["REGEX:(?i)DELETE"],
> "before":4}}
>
> My initial impression was that this did what I wanted; but now I find that this disallows *any* updates to this collection (which had previously been allowed). Other attempts to tweak this strategy, such as granting permissions for "/update/*" for methods other than DELETE to a role which is granted to the desired user, have not yet been successful.
>
> Does anyone have an example of security.json disallowing a delete while still allowing an update?
>
> Thanks