Posted to dev@jackrabbit.apache.org by "Julian Reschke (JIRA)" <ji...@apache.org> on 2016/11/22 13:49:59 UTC

[jira] [Updated] (JCR-3597) Dead lock when using LockManager session scoped locks on nodes

     [ https://issues.apache.org/jira/browse/JCR-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated JCR-3597:
--------------------------------
    Priority: Minor  (was: Blocker)

> Dead lock when using LockManager session scoped locks on nodes
> --------------------------------------------------------------
>
>                 Key: JCR-3597
>                 URL: https://issues.apache.org/jira/browse/JCR-3597
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: jackrabbit-core
>    Affects Versions: 2.6.1
>         Environment: Sling + Jackrabbit
>            Reporter: Davide Maestroni
>            Priority: Minor
>         Attachments: thread_dump.txt
>
>
> I have a repository created with Jackrabbit and accessed through HTTP APIs via Sling. I use the LockManager class to lock nodes before adding new children and updating properties. While load-testing the system with multiple threads adding child nodes to different parent nodes, I ran into an issue: a deadlock occurs quite consistently when one thread is logging out of a JCR session while another is saving its own session (thread dump attached to this issue).
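> For reference, this is roughly the pattern each worker thread follows (a minimal sketch with placeholder names like parentPath and childName, not my actual application code):
> {code}
> import javax.jcr.Node;
> import javax.jcr.Session;
> import javax.jcr.lock.LockManager;
>
> void addChildUnderLock(Session session, String parentPath, String childName)
>         throws Exception {
>     LockManager lockManager = session.getWorkspace().getLockManager();
>     // Session-scoped, shallow lock: released automatically on logout.
>     lockManager.lock(parentPath, false, true, Long.MAX_VALUE, null);
>     try {
>         session.getNode(parentPath).addNode(childName);
>         session.save();
>     } finally {
>         lockManager.unlock(parentPath);
>     }
> }
> {code}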
> So I started inspecting the source code to understand what was going on, and I think I located the problem. It all happens around the SharedItemStateManager.beginUpdate() method: one thread, inside Update.begin(), acquires the monitor of the LocalItemStateManager instance and then waits for the ISMLocking lock to be released (thread t@295 in the dump); the other, which already owns the ISMLocking lock, triggers an event inside Update.end() which in turn tries to synchronize on the same LocalItemStateManager instance (thread t@293 in the dump).
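> To make the lock-order inversion concrete, here is a self-contained reduction in plain Java (schematic only, not the Jackrabbit sources: the plain monitor stands in for the LocalItemStateManager instance and the ReentrantLock for the ISMLocking write lock):
> {code}
> import java.util.concurrent.locks.ReentrantLock;
>
> public class LockInversionDemo {
>     static final Object monitor = new Object();               // ~ LocalItemStateManager
>     static final ReentrantLock ismLock = new ReentrantLock(); // ~ ISMLocking write lock
>
>     public static void main(String[] args) {
>         // Like t@295 in Update.begin(): monitor first, then ISM lock.
>         Thread t295 = new Thread(() -> {
>             synchronized (monitor) {
>                 pause();
>                 ismLock.lock();   // blocks forever: t293 holds ismLock
>                 ismLock.unlock();
>             }
>         });
>         // Like t@293 in Update.end(): ISM lock first, then monitor.
>         Thread t293 = new Thread(() -> {
>             ismLock.lock();
>             pause();
>             synchronized (monitor) { // blocks forever: t295 holds monitor
>             }
>             ismLock.unlock();
>         });
>         t295.start();
>         t293.start(); // both threads are now deadlocked
>     }
>
>     static void pause() {
>         try { Thread.sleep(100); } catch (InterruptedException ignored) {}
>     }
> }
> {code}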
> I noticed that the LocalItemStateManager implementation uses synchronization in only two places: edit(), which is a synchronized method, and a synchronized block in getItemState() that guards access to the cache object. Both look odd: edit() is the only synchronized method even though editMode and changeLog are modified in several other places, so it is hard to see what that synchronization actually protects; and the cache access only needs to be atomic, so it could synchronize on a dedicated object instead of locking the whole instance.
> Another suspicious piece of code is in the LockManagerImpl class, which contains two blocks synchronized on the LocalItemStateManager instance. Again, since no method of that class except edit() is synchronized, nothing prevents another thread from calling into the LocalItemStateManager at the same time.
> I finally identified the root cause of my issue in the "synchronized (this)" block (line 170) inside the LocalItemStateManager.getItemState() method. Since the comment states that the lock is only there to make the calls to the cache atomic, I tried adding a simple object field named "monitor" to the class and synchronizing on that instead. The patch appears to resolve the deadlock, but I am still puzzled by the pieces of code mentioned above.
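> The change I tried, reduced to a minimal standalone sketch (class and method names here are assumptions of mine for illustration, not the verbatim Jackrabbit patch):
> {code}
> import java.util.HashMap;
> import java.util.Map;
>
> public class CacheHolderSketch {
>     // Dedicated guard for the cache only; getItemState() no longer
>     // contends for the instance monitor that edit() and the
>     // LockManagerImpl blocks synchronize on.
>     private final Object monitor = new Object();
>     private final Map<String, Object> cache = new HashMap<>();
>
>     public Object getItemState(String id) {
>         synchronized (monitor) {          // was: synchronized (this)
>             return cache.get(id);
>         }
>     }
>
>     public void cacheItemState(String id, Object state) {
>         synchronized (monitor) {
>             cache.put(id, state);
>         }
>     }
> }
> {code}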



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)