You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Vikas Saurabh (JIRA)" <ji...@apache.org> on 2016/02/09 18:31:18 UTC

[jira] [Commented] (OAK-3653) Incorrect last revision of cached node state

    [ https://issues.apache.org/jira/browse/OAK-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139262#comment-15139262 ] 

Vikas Saurabh commented on OAK-3653:
------------------------------------

bq. Depending issue OAK-3634 is fixed. Resolving this issue as well.
Updating fixVersion to include 1.2.11 according to OAK-3634

> Incorrect last revision of cached node state
> --------------------------------------------
>
>                 Key: OAK-3653
>                 URL: https://issues.apache.org/jira/browse/OAK-3653
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: documentmk
>            Reporter: Vikas Saurabh
>            Assignee: Marcel Reutegger
>             Fix For: 1.3.14, 1.2.11
>
>         Attachments: Capture.JPG, Capture1.JPG, Capture2.JPG, diagram.png, oplog.json
>
>
> While installing a package in one of the systems (Oak-1.0.16), we observed broken workflow model. Upon further investigation, it was found that {{baseVersion}} of the model was pointing to a node in version storage which wasn't visible on the instance. This node was available in mongo and could also be seen on a different instance of the cluster. 
> The node was created on the same instance where it wasn't visible (cluster id - 3)
> Further investigation showed that for some (yet unknown) reason, {{lastRevision}} for cached node state of node's parent at revision where the node was created was stale (an older revision) which must have led to invalid children list and hence unavailable node.
> Attaching a few files which were captured during the investigation:
> * [^oplog.json] - oplog for +-10s for r151191c7601-0-3
> * [^Capture.JPG]- snapshot of groovy console output using script at \[0] to list node children cache entry for parent
> * [^Capture1.JPG]- snapshot of groovy console output using script at \[1] to traverse to invisible node
> * [^Capture2.JPG]- Almost same as Capture.jpg but this time for node cache instead (using script at \[2])
> The node which wasn't visible in mongo:
> {noformat}
> db.nodes.findOne({_id: "7:/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7"})
> {
>         "_id" : "7:/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7",
>         "_deleted" : {
>                 "r151191c7601-0-3" : "false"
>         },
>         "_commitRoot" : {
>                 "r151191c7601-0-3" : "0"
>         },
>         "_lastRev" : {
>                 "r0-0-3" : "r151191c7601-0-3"
>         },
>         "jcr:primaryType" : {
>                 "r151191c7601-0-3" : "\"nam:nt:version\""
>         },
>         "jcr:uuid" : {
>                 "r151191c7601-0-3" : "\"40234f1d-2435-4c6b-9962-20af22b1c948\""
>         },
>         "_modified" : NumberLong(1447825270),
>         "jcr:created" : {
>                 "r151191c7601-0-3" : "\"dat:2015-11-17T21:41:14.058-08:00\""
>         },
>         "_children" : true,
>         "jcr:predecessors" : {
>                 "r151191c7601-0-3" : "[\"ref:6cecc77b-3020-4a47-b9cc-084a618aa957\"]"
>         },
>         "jcr:successors" : {
>                 "r151191c7601-0-3" : "\"[0]:Reference\""
>         },
>         "_modCount" : NumberLong(1)
> }
> {noformat}
> Last revision entry of cached node state of {{/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e}} (parent) at revision {{r151191c7601-0-3}} was {{r14f035ebdf8-0-3}} (the last revs were propagated correctly up till root for r151191c7601-0-3).
> As a work-around, doing {{cache.invalidate(k)}} for node cache (inside the loop of script at \[2]) revived the visibility of the node.
> *Disclaimer*: _The issue was reported and investigated on a cluster backed by mongo running on Oak-1.0.16. But, the problem itself doesn't seem like any which have been fixed in later versions_
> \[0]: {noformat}
> def printNodeChildrenCache(def path) {
>   def session = osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
>   try {
>     def rootNode = session.getRootNode()
>     def cache = rootNode.sessionDelegate.root.store.nodeChildrenCache
>     def cacheMap = cache.asMap()
>     cacheMap.each{k,v -> 
>       if (k.toString().startsWith(path + "@")) {
>         println "${k}:${v}"
>       }
>     }
>   } finally {
>     session.logout()
>   }
> }
>  
> printNodeChildrenCache("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e")
> {noformat}
> \[1]: {noformat}
> def traverse(def path) {
>   def session = osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
>   try {
>     def rootNode = session.getRootNode()
>     def nb = rootNode.sessionDelegate.root.store.root
>     def p = ''
>     path.tokenize('/').each() {
>       p = p + '/' + it
>       nb = nb.getChildNode(it)
>       println "${p} ${nb}"
>     }
>   } finally {
>     session.logout()
>   }
> }
> traverse("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e/1.7")
> {noformat}
> \[2]: {noformat}
> def printNodeChildrenCache(def path) {
>   def session = osgi.getService(org.apache.sling.jcr.api.SlingRepository.class).loginAdministrative(null)
>   try {
>     def rootNode = session.getRootNode()
>     def cache = rootNode.sessionDelegate.root.store.nodeCache
>     def cacheMap = cache.asMap()
>     cacheMap.each{k,v -> 
>       if (k.toString().startsWith(path + "@")) {
>         println "${k}:${v}"
>         println "lastRev -> ${v.lastRevision}"
>       }
>     }
>   } finally {
>     session.logout()
>   }
> }
>  
> printNodeChildrenCache("/jcr:system/jcr:versionStorage/d9/9f/22/d99f2279-dcc9-40a0-9aa4-c91fe310287e")
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)