Posted to oak-issues@jackrabbit.apache.org by "Marcel Reutegger (JIRA)" <ji...@apache.org> on 2013/09/26 11:21:02 UTC

[jira] [Commented] (OAK-1044) Reduce traffic between MongoMK and MongoDB

    [ https://issues.apache.org/jira/browse/OAK-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778594#comment-13778594 ] 

Marcel Reutegger commented on OAK-1044:
---------------------------------------

Another area where an improvement is possible is the background write operation. MongoMK currently updates nodes one by one, which means one request to MongoDB for each node it updates. While MongoDB does not support batch updates, it can update multiple documents that match a given query in a single request. This is something we might be able to leverage here. The background write usually updates multiple documents the same way, e.g. it sets a new _lastRev to a given revision. Updates with the same revision could be packed together into one multi-document update.
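
To illustrate the idea, here is a minimal sketch of the grouping step only. It is not taken from the Oak code base: the class name LastRevBatcher, the method groupByRevision and the plain string ids and revisions are all made up for this example. It collects the pending _lastRev changes by target revision, so that each distinct revision would need just one request instead of one request per document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Oak: groups pending _lastRev updates so
// that every distinct revision can be written with one multi-document update.
class LastRevBatcher {

    /**
     * @param pending maps a document id to the revision its _lastRev should be set to
     * @return maps each revision to the ids of all documents that should receive it,
     *         preserving the order in which revisions and ids were first seen
     */
    static Map<String, List<String>> groupByRevision(Map<String, String> pending) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            String id = entry.getKey();
            String revision = entry.getValue();
            // Create the batch for this revision on first use, then append the id.
            batches.computeIfAbsent(revision, r -> new ArrayList<>()).add(id);
        }
        return batches;
    }
}
```

Each resulting batch could then be sent as a single update with the multi option enabled, using a query that selects the ids with $in and a $set modifier for _lastRev. How that maps onto the actual _lastRev layout and the MongoMK document store is left open here.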
                
> Reduce traffic between MongoMK and MongoDB
> ------------------------------------------
>
>                 Key: OAK-1044
>                 URL: https://issues.apache.org/jira/browse/OAK-1044
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Marcel Reutegger
>
> There is quite a lot of redundant traffic between MongoMK and MongoDB, which can have a major impact on performance. One problem we already saw in the past occurs when there are many changes on a node: the document in MongoDB grows, and with every modification to the node the cost increases because MongoMK always requests the complete old document from MongoDB in the response. This is done for several reasons:
> 1) MongoMK looks at the returned old document and checks that the applied updates do not conflict.
> 2) MongoMK updates its cache with the old document and the updates it applied.
> Splitting the documents when they reach a certain size keeps the cost for an update within bounds, but most of the time the response just contains redundant information already present in MongoMK. E.g. when no other MongoMK instance modified the node, the returned document is the same as the one potentially already in the cache.
