Posted to oak-issues@jackrabbit.apache.org by "Davide Giannella (JIRA)" <ji...@apache.org> on 2014/05/11 00:06:51 UTC

[jira] [Commented] (OAK-1717) Concurrent updates of ordered index in cluster may fail

    [ https://issues.apache.org/jira/browse/OAK-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992671#comment-13992671 ] 

Davide Giannella commented on OAK-1717:
---------------------------------------

Outcome of a long thread on oak-dev and offline discussions:
[Avoiding conflicts (OAK-1717)|http://markmail.org/thread/qdrvm6rwnblu3rdn]

We're going to go for one of the following solutions. The basic idea
is that the CommitHook adds the node, in a non-conflicting manner, to
a safe area. An asynchronous process then merges this area into the
actual index.

The key requirement here is that it must be possible to serve
_merged_ results from the actual index and the _pending_ area. If
that is not possible, there is no difference from simply making the
index asynchronous.

h4. Solution 1

{code}
:index : {
   _pending : {
      prop-550e8400-e29b-41d4-a716-446655440000 : {
        foo="/path/to/the/content" },
      prop-550e8400-e29b-41d4-a716-446655440000 : {
        bar="-/path/to/the/content" } <== deletion
   }
}
{code}

where each cluster node/session adds a

{code}
prop-UUID : { $property-value = "${operation-if-needed}/pathtocontent" }
{code}

The async merger then parses those entries, applies them to the actual
index structure and deletes them from the _pending area.
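A minimal sketch of this flow, in Python rather than Oak's Java and with plain dicts standing in for the repository. The names {{add_pending}} and {{merge_pending}} are hypothetical, and applying entries in insertion order is an assumption (the ordering of entries written by different cluster nodes is not settled here):

```python
import uuid


def add_pending(pending, value, path, delete=False):
    """Record one change under a fresh prop-UUID key (hypothetical helper).

    Every writer creates its own uniquely named entry, so two writers
    never modify the same node.
    """
    key = "prop-" + str(uuid.uuid4())
    # A leading "-" marks a deletion, as in the structure above.
    pending[key] = {value: ("-" if delete else "") + path}
    return key


def merge_pending(index, pending):
    """Apply each pending entry to the actual index, then remove it.

    Assumption: entries are applied in dict insertion order.
    """
    for key in list(pending):
        for value, target in pending[key].items():
            if target.startswith("-"):
                paths = index.get(value, set())
                paths.discard(target[1:])
                if value in index and not paths:
                    del index[value]  # drop index keys left without paths
            else:
                index.setdefault(value, set()).add(target)
        del pending[key]


index = {"bar": {"/path/to/the/content"}}
pending = {}
add_pending(pending, "foo", "/path/to/the/content")
add_pending(pending, "bar", "/path/to/the/content", delete=True)
merge_pending(index, pending)
```

After the merge, the index holds only the {{foo}} entry and the pending area is empty.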

h5. Pros

* definitely avoids conflicts
* works as a sort of ChangeLog

h5. Cons

* merging the results at query time will be harder.
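
To illustrate why merging is harder here, a rough Python sketch (not Oak code; {{query_with_changelog}} is a hypothetical name, and dicts stand in for the repository). Because pending entries are keyed by UUID rather than by property value, a query has to scan the whole _pending area and replay every entry that matches:

```python
def query_with_changelog(index, pending, value):
    """Return the paths for `value`, replaying pending changes over the index.

    Assumption: pending entries are replayed in dict insertion order.
    """
    result = set(index.get(value, ()))
    # Full scan: nothing in the entry name says which value it carries.
    for entry in pending.values():
        target = entry.get(value)
        if target is None:
            continue
        if target.startswith("-"):
            result.discard(target[1:])  # pending deletion
        else:
            result.add(target)          # pending addition
    return sorted(result)


index = {"bar": ["/path/to/the/content"]}
pending = {
    "prop-1": {"foo": "/path/to/the/content"},
    "prop-2": {"bar": "-/path/to/the/content"},
}
foo_paths = query_with_changelog(index, pending, "foo")
bar_paths = query_with_changelog(index, pending, "bar")
```

The cost of each query grows with the size of the _pending area, so the merger would need to keep that area small.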

h4. Solution 2

{code}
:index : {
    keyA : {
        path : { to : { content : {match=true} } }
    },
    keyB : {
        path : { to : { content : {deleted=true} } }
    }
}
{code}

where each cluster node uses the _pending area as if it were a real
index (without ordering, though) and never physically deletes a node.

The async merger will parse the _pending area and merge it into the
actual index, cleaning up as it goes.
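
A rough sketch of the on-the-fly merge, in Python with nested dicts standing in for the node tree. The names {{pending_paths}} and {{query}} are hypothetical; the {{match}} and {{deleted}} markers are the ones from the structure above:

```python
def pending_paths(tree, prefix=""):
    """Yield (path, marker) for every marked node in a pending path tree."""
    for name, child in tree.items():
        if isinstance(child, dict):
            # A child node: descend and extend the path.
            yield from pending_paths(child, prefix + "/" + name)
        elif child:
            # A property such as match=true or deleted=true.
            yield prefix, name


def query(index, pending, key):
    """Return the paths for `key`, overlaying pending markers on the index."""
    result = set(index.get(key, ()))
    for path, marker in pending_paths(pending.get(key, {})):
        if marker == "match":
            result.add(path)
        elif marker == "deleted":
            result.discard(path)
    return sorted(result)


index = {"keyB": ["/path/to/content", "/other"]}
pending = {
    "keyA": {"path": {"to": {"content": {"match": True}}}},
    "keyB": {"path": {"to": {"content": {"deleted": True}}}},
}
key_a = query(index, pending, "keyA")
key_b = query(index, pending, "keyB")
```

Since the pending area is laid out by key like the index itself, a query only needs to look at the subtree for the key it is serving.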

h5. Pros

* Easy for on-the-fly merging while serving queries

h5. Cons

* May not avoid conflicts as well as we expect.


> Concurrent updates of ordered index in cluster may fail
> -------------------------------------------------------
>
>                 Key: OAK-1717
>                 URL: https://issues.apache.org/jira/browse/OAK-1717
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.20
>            Reporter: Marcel Reutegger
>         Attachments: OAK-1717-IT.patch
>
>
> In a clustered deployment with DocumentNodeStore on MongoDB it may happen that concurrent updates on the new ordered index fail because of conflicts.
> A common use case is maintaining an ordered index on a last modified date. When nodes with such a date are added concurrently on multiple cluster nodes, then all of them will try to update the ordered index at one end of the key list. The DocumentNodeStore will perform a couple of retries but there is no guarantee that the cluster nodes will sync within that time frame or some other session conflicts yet another time.
> A possible workaround is to declare the index as asynchronous.


