Posted to issues@solr.apache.org by "James Ashbourne (Jira)" <ji...@apache.org> on 2021/03/24 10:46:00 UTC

[jira] [Commented] (SOLR-15213) Add support for "merge" atomic update operation for child documents

    [ https://issues.apache.org/jira/browse/SOLR-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307748#comment-17307748 ] 

James Ashbourne commented on SOLR-15213:
----------------------------------------

[~thomas.woeckinger] Essentially yes. With the attached patch, the application calling Solr no longer has to fetch the document, check its children, manually merge the incoming changes into the matching child, and then send the result back as an update with '_root_' specified.

This approach should perform much better for everyone involved. The downside is that it pushes some of the complexity into Solr, but overall I think that's a good tradeoff. It is also worth noting that, from Solr's point of view, this doesn't require any extra fetching of child docs, because the existing operations already fetch them.
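
To make the saved work concrete, here is a rough sketch of the client-side read-modify-write an application has to do today. It is only an illustration: merge_child_client_side is a hypothetical helper, the parent is assumed to have been fetched already as a plain dict, the returned payload uses the existing "set" atomic update on the child field, and the '_root_' handling mentioned above is left out.

```python
import copy


def merge_child_client_side(parent, child_field, incoming_child):
    """Hypothetical client-side workaround (not part of the patch).

    parent: the parent document as already fetched from Solr, children included.
    Returns an atomic-update payload that replaces the whole child list via "set".
    """
    # Work on a copy so the fetched document is left untouched.
    children = copy.deepcopy(parent.get(child_field, []))
    for child in children:
        if child["id"] == incoming_child["id"]:
            # Overlay the incoming fields on the existing child.
            child.update(incoming_child)
            break
    else:
        # No child with that id yet, so append it.
        children.append(dict(incoming_child))
    return {"id": parent["id"], child_field: {"set": children}}


ocean = {
    "id": "ocean1",
    "fish": [
        {"id": "fish1", "type_s": "fish", "name_s": "Doe"},
        {"id": "fish2", "type_s": "fish", "name_s": "Hans"},
    ],
}
payload = merge_child_client_side(ocean, "fish", {"id": "fish1", "name_s": "James"})
```

With "merge", the application would send only the changed child and skip the fetch and the loop entirely.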

> Add support for "merge" atomic update operation for child documents
> -------------------------------------------------------------------
>
>                 Key: SOLR-15213
>                 URL: https://issues.apache.org/jira/browse/SOLR-15213
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: James Ashbourne
>            Priority: Major
>         Attachments: SOLR-15213.patch
>
>
> Solr has "add", "set" and "add-distinct", which work but all have their limitations. Namely, there is currently no way to atomically update a document that may or may not already be present, merging if it is present and inserting if it isn't.
> For example, consider a document with two nested children:
>   
> {noformat}
> {"id": "ocean1", 
> "_isParent":"true", 
> "fish": [ 
>     {
>      "id": "fish1", 
>      "type_s": "fish", 
>      "name_s": "Doe", 
>      "_isParent":"false"}, 
>     {
>      "id": "fish2", 
>      "type_s": "fish", 
>      "name_s": "Hans", 
>      "_isParent":"false"}]
> }{noformat}
>  
>  If we later want to update one of those child docs, e.g.:
> {noformat}
> {"id": "ocean1", 
> "_isParent":"true", 
> "fish": [ 
>     {
>      "id": "fish1", 
>      "type_s": "fish", 
>      "name_s": "James", // new name
>      "_isParent":"false"}
> ]
> }{noformat}
>  
>  Existing operations:
>  - "add" - will add another nested doc with the same id leaving us with two children with the same id.
>  - "set" - replaces the whole list of child docs with the single doc, we could use this but would first have to fetch all the existing children.
>  - "add-distinct" - will reject the update based on the doc already being present.
> I've got some changes (see patch) that add a new option, "merge", which matches child docs by id and merges the new document into the old one, falling back to "add" if no id matches.
>  
>  
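
For readers comparing the four operations from the description above, the following toy model replays them on an in-memory child list. It is not Solr code and not the patch: apply_child_op is a hypothetical name, the behaviour of each branch is taken only from the wording of the issue, and "add-distinct" rejecting the update is modelled as leaving the list unchanged, which is an assumption.

```python
def apply_child_op(children, op, incoming):
    """Toy model of the child-document operations as described in the issue."""
    ids = [c["id"] for c in children]
    if op == "add":
        # Appends unconditionally, so a duplicate id is possible.
        return children + [incoming]
    if op == "set":
        # Replaces the whole child list with the single incoming doc.
        return [incoming]
    if op == "add-distinct":
        # Assumption: a rejected update is modelled as "no change".
        return list(children) if incoming["id"] in ids else children + [incoming]
    if op == "merge":
        if incoming["id"] not in ids:
            # No id match: fall back to add.
            return children + [incoming]
        # Id match: overlay incoming fields on the existing child.
        return [{**c, **incoming} if c["id"] == incoming["id"] else c
                for c in children]
    raise ValueError(f"unknown operation: {op}")


fish = [
    {"id": "fish1", "type_s": "fish", "name_s": "Doe"},
    {"id": "fish2", "type_s": "fish", "name_s": "Hans"},
]
update = {"id": "fish1", "type_s": "fish", "name_s": "James"}

results = {op: apply_child_op(fish, op, update)
           for op in ("add", "set", "add-distinct", "merge")}
```

Only the "merge" entry ends up with two children where fish1 carries the new name; "add" yields three children, "set" yields one, and "add-distinct" leaves the old name in place.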



--
This message was sent by Atlassian Jira
(v8.3.4#803005)