You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@directory.apache.org by "bdemers (via GitHub)" <gi...@apache.org> on 2023/11/08 15:40:22 UTC

[PR] Idea: Pass all search information into the repository implementation [directory-scimple]

bdemers opened a new pull request, #410:
URL: https://github.com/apache/directory-scimple/pull/410

   This change is backward compatible, but it's possible this should be a breaking change, and force implementations to use this method, instead of the other?
   When developer overrides this method, the original `find` method must still be defined, but is not used.
   
   Questions:
   -  I'm not really a fan of the FindRequest nameing, `SearchRequest` or `QueryRequest` might work better, but `SearchRequest` is used at the REST layer, so I didn't thing we should have two classes named `SearchRequest` (maybe we should rename the other...)
   - Backwards compatibilty, should we break this? (the migration guide would be simple)
   - Passing the exclude args down the the search is probably the right thing to do, but that potentially changes the meaning of the etag,  this is currently done at the REST layer, but maybe this should move down a layer?  We could still handle this by default 🤷
   
   We probably need to dig into how the etag should work when attributes are included/excluded
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "bdemers (via GitHub)" <gi...@apache.org>.
bdemers commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1804283550

   Oh, I see, this may not be related to the inclusion/exclusion attributes.
   
   Can you include a sample PATCH request you receive? (body and URL)
   (sanitized of course)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "kjthorpe18 (via GitHub)" <gi...@apache.org>.
kjthorpe18 commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1804273632

   Since I think it was my email that opened up this conversation, let me describe what I was facing. 
   
   For each of these requests, SCIMple uses the `IGroupService.get(id)` method to fetch the group and apply updates to the group's attributes and members, essentially a `GET /Groups/<id>` request for each membership change. This is then passed to the `IGroupService.update(updateRequest)` method.
   
   Azure sends group membership updates via the Group PATCH endpoint one-by-one for each user in a group. So, for each user added to a group, we're seeing a query to fetch every member of the group, resulting in latency, DB CPU utilization spiking, etc. For large groups (we did a test with 90k users each for multiple groups assigned to a SCIM application), this means 90k queries for members of a group, each increasing in time as the group grows... not great.
   
   Ideally, Azure would batch group updates together so this did not have to be done _for each user in a group_, but I don't expect them to change that quickly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "bdemers (via GitHub)" <gi...@apache.org>.
bdemers closed pull request #410: Idea: Pass all search information into the repository implementation
URL: https://github.com/apache/directory-scimple/pull/410


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "bdemers (via GitHub)" <gi...@apache.org>.
bdemers commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1808608057

   Replaced with: #411 
   
   @kjthorpe18 I'll watch both this issue and the newer one for your feedback.
   #411 supersedes this one, it moves the logic of the include/exclude attributes, and etags into the repository layer (and more importantly, it doesn't require a call to `repository.get(id)`, so you can optimize your queries. (note: you could still call get and [apply a patch](https://github.com/apache/directory-scimple/pull/411/files#diff-1e7d518ada8bd0781a1b6683f452c59f33374609a6ace8ce76bea598530dcbafR135-R136) if you choose to)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "kjthorpe18 (via GitHub)" <gi...@apache.org>.
kjthorpe18 commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1804346345

   @bdemers Here's an example pulled from a test. This is identical to Azure's requests (see https://learn.microsoft.com/en-us/entra/identity/app-provisioning/use-scim-to-provision-users-and-groups#update-group-add-members)
   ```
   Request method:	PATCH
   Request URI:	http://localhost:62028/api/scim/v2/Groups/035a3341-5cc0-4abd-983b-d4540064afc6
   Body:
   {
       "schemas": [
           "urn:ietf:params:scim:api:messages:2.0:PatchOp"
       ],
       "Operations": [
           {
               "op": "add",
               "path": "members",
               "value": [
                   {
                       "display": "Barbara",
                       "$ref": "https://example.com/v2/Users/aaebb169-b91d-481c-9957-097faeaf649a",
                       "value": "aaebb169-b91d-481c-9957-097faeaf649a"
                   }
               ]
           }
       ]
   }
   ```
   This request uses the `get(id)` method of the GroupService, where we fetch members of the group along with the group attributes. That result is given to the `update`. 
   
   So yeah, this isn't really an issue with `excludedAttributes`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "bdemers (via GitHub)" <gi...@apache.org>.
bdemers commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1804812728

   Very helpful!
   
   I agree, I  think this is separate from the `attributes` change I suggested in this PR.
   
   It is possibly related, in both cases I think the REST layer is doing too much.
   In your case the call to `T stored = repository.get(id);` is triggering a lookup without any other context (e.g. do you need to include all the members)
   
   The currently logic:
   https://github.com/apache/directory-scimple/blob/b91fee906b585a7832b6f45f0ea1e70092761d7d/scim-server/src/main/java/org/apache/directory/scim/server/rest/BaseResourceTypeResourceImpl.java#L292-L302
   Is basically doing:
   
   1. Get Object 
   2. Check if the object exists (404)
   3. Calculate the etag
   4. Update the Object
   
   IMHO, all of this should probably be pushed into the repository
   The etag usage should be an optional optimization if you can/want to support it.  That logic may need to be pushed down into a the datastore (pulling the object into memory, and then calculating the hash of the object could be expensive)
   
   
   I'm thinking the REST layer should delegate all of that logic into the `Repository` layer, possibly with separate `update` and `patch` methods 🤔 
   - Those method could throw a new `ResourceNotFoundException` (which the REST layer could map back to a 404), or return `null` to match the behavior of `repository.get(id)`.
   - Not sure about the case where the ETAG dictates nothing needs to be done (throw some sort of exception, return an object with the version/etag set, and the REST layer could do the comparison 🤷)  I'm not sure any of those are great options...
   
   Anyway, something like this would put all the control into the implementors hands:
   - Need to support etags, implement it efficiently in your datastore
   - Optimized queries for patch requests
   
   
   Questions:
   
   - Would separate `update` and `patch` methods be useful to you?
   - Do you care about etags?
   - Do you support patch requests for objects other than Groups? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org


Re: [PR] Idea: Pass all search information into the repository implementation [directory-scimple]

Posted by "kjthorpe18 (via GitHub)" <gi...@apache.org>.
kjthorpe18 commented on PR #410:
URL: https://github.com/apache/directory-scimple/pull/410#issuecomment-1811587872

   - Separate `update` and `patch` methods could work. We have some workarounds in our SCIM server that determine if an update request has one or more operations, and optimize based on that. This would make it easier to do that, plus avoid the intermediate `get(id)`
   - We don't use etags, as of now
   - Yes, we do support PATCH for other objects (Users). However, the main concern is Groups due to the inefficiency of getting all Group members on the intermediate `get`. We have created some workarounds to not fetch Group members every time we get a group, but did not have control over the `get` from the `update` call


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@directory.apache.org
For additional commands, e-mail: dev-help@directory.apache.org