Posted to dev@jackrabbit.apache.org by Olivier Dony <ol...@denali.be> on 2007/03/13 18:19:19 UTC

Possible deadlock of jcr-server 1.2.1 (rmi)

Hi,

We are using the Repository Server deployment model for one of our  
systems, with 3 different web applications using the same jackrabbit  
server.
Each webapp is running in a separate Tomcat5 server, and jackrabbit  
1.2.1 is running as a jcr server in a 3rd Tomcat server.

Everything had been running fine for weeks, but yesterday the
jackrabbit server suddenly stopped responding to all requests,
seemingly deadlocked.
We had the opportunity to take a thread dump of the jackrabbit server
before performing an emergency restart, which solved the situation.

The thread dump is attached. I tried to make some sense out of it,  
but the read/write locks are hard to follow.
Looks like all RMI-handling threads are waiting to acquire a reader
lock on the SharedItemStateManager, except one which is waiting for a
writer lock.
None appear to be ready to release a lock, which is why I suppose  
they were deadlocked.

Is this maybe related to a lock that isn't reentrant but should be?  
Or not?
Can anybody see anything there?

Thanks a lot for having a look!

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Tobias Bocanegra <to...@day.com>.

> > imo, we can't fix the transaction/concurrency issues that occur
> > together with versioning without a bigger redesign of some of the core
> > parts of jackrabbit.
>
> Do we have some directions that seem worth pursuing? Would rethinking
> the locking mechanisms be enough, or do we need to fundamentally
> modify the basic ItemStateManager and VersionManager designs?

well. one of the problems is the 'virtual item states'. in the first
place they were introduced in the belief that the version manager
could be replaced by another implementation that does not use the
repository itself for storing the versions. but now i think this will
never happen (and will not work anyway with the current architecture).
with a proper separation of the version (i.e. system) workspace from
the other workspaces and an overlay mechanism that perhaps takes place
on a higher layer (e.g. item manager) we could circumvent some of the
concurrency issues.
the transaction handling also needs further inspection - it should
provide a more global transaction that could span updates on multiple
workspaces (and should allow including the persistence manager
backends as xa resources).

regards, toby
-- 
-----------------------------------------< tobias.bocanegra@day.com >---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
-----------------------------------------------< http://www.day.com >---

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Marcel Reutegger <ma...@gmx.net>.

Jukka Zitting wrote:
> Do we have some directions that seem worth pursuing? Would rethinking
> the locking mechanisms be enough, or do we need to fundamentally
> modify the basic ItemStateManager and VersionManager designs?

I haven't thought about this in much detail, but IMO the sequence in which locks
are acquired is the most problematic part of this issue. If we can ensure that
locks are always acquired in the same sequence, a deadlock shouldn't occur that
easily.

Here's what I've been thinking about:

- Add a check to the SharedItemStateManager (SISM) for whether it has a
VirtualItemStateProvider (VISP). This will be the case for the workspace SISM,
but not for the SISM in the version manager. Furthermore, if the change log
contains references into one of the VISPs, those VISPs must be write locked
*before* this SISM is write locked. Otherwise only this SISM needs to be locked.

This should ensure that the lock sequence is always: VISP and then SISM.

I'm not sure about the lock in AbstractVersionManager (AVM), but since the AVM 
is on a higher layer than the VISP the overall lock sequence should be: AVM, 
VISP then SISM.

Thoughts?

regards
  marcel
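
A minimal sketch of the lock sequence proposed above (VISP first, then SISM, with the VISP write lock taken only when the change log references virtual states). The class name, method name and boolean flag are hypothetical and not Jackrabbit API, and the java.util.concurrent locks are an assumption made for brevity; Jackrabbit's own lock classes may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the workspace SISM and one VISP; not Jackrabbit code.
class LockOrderSketch {
    private final ReentrantReadWriteLock vispLock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock sismLock = new ReentrantReadWriteLock();

    /**
     * Runs an update under write locks taken in a fixed order.
     * Returns the names of the locks in the order they were acquired.
     */
    List<String> update(boolean changeLogReferencesVisp, Runnable work) {
        List<Lock> held = new ArrayList<>();
        List<String> order = new ArrayList<>();
        try {
            if (changeLogReferencesVisp) {
                // The VISP is always locked before the SISM, never after.
                vispLock.writeLock().lock();
                held.add(vispLock.writeLock());
                order.add("VISP");
            }
            sismLock.writeLock().lock();
            held.add(sismLock.writeLock());
            order.add("SISM");
            work.run();
            return order;
        } finally {
            // Release in reverse order of acquisition.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }
}
```

Because every writer takes the locks in the same order, two updates can still block each other, but they cannot wait on each other in a cycle.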

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On 3/14/07, Tobias Bocanegra <to...@day.com> wrote:
> imo, we can't fix the transaction/concurrency issues that occur
> together with versioning without a bigger redesign of some of the core
> parts of jackrabbit.

Do we have some directions that seem worth pursuing? Would rethinking
the locking mechanisms be enough, or do we need to fundamentally
modify the basic ItemStateManager and VersionManager designs?

BR,

Jukka Zitting

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Tobias Bocanegra <to...@day.com>.

wow, cool.
can you attach the patch to the jira issue? that would be great.
thanx.

regards, toby

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Shane Preater <sh...@googlemail.com>.

Thanks for that Tobias.

We have now implemented the fix proposed by Marcel and this has sorted out
our deadlock issue (based on the tests we created to verify that our issue
was the same as the one found by Olivier), so if anyone else is experiencing
this issue then Marcel's fix is the way to go temporarily.

Regards,
Shane.

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Tobias Bocanegra <to...@day.com>.

hi,
a quick search in jira shows that the following issues deal with
deadlocked repositories:

http://issues.apache.org/jira/browse/JCR-546
http://issues.apache.org/jira/browse/JCR-672
http://issues.apache.org/jira/browse/JCR-447
http://issues.apache.org/jira/browse/JCR-443
http://issues.apache.org/jira/browse/JCR-335

the hacks i mentioned earlier were fixes for some of those issues.
the solution that marcel proposed seems reasonable and could help
solve these issues in the short run.

regards, toby

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Shane Preater <sh...@googlemail.com>.

Tobias,
We are also experiencing this problem with deadlocks on our system. Could you
outline the "hacks" you have used to fix this issue? We are using versioning
in a production environment so if we need to hack it temporarily to get over
this issue then so be it for the moment.

Also I will keep an eye on the JIRA issue for when the proper fix is
implemented.

Thanks very much,
Shane.

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Tobias Bocanegra <to...@day.com>.

hi,
we analyzed the issue several times and most of the fixes were hacks
to prevent deadlocks and data corruption.
imo, we can't fix the transaction/concurrency issues that occur
together with versioning without a bigger redesign of some of the core
parts of jackrabbit.

regards, toby

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Miro Walker <mi...@gmail.com>.

We've been aware of this issue for a while. Unfortunately, the locking
implementation is pretty hard to disentangle, and we haven't been able
to come up with a fix. However, we have been able to work around it by
adding an extra level of synchronisation in our own application that
ensures only one versioning operation can occur at a time. I guess
it depends how big a hit this would be as to whether it would be a
suitable solution for anyone else.

Miro
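
The application-level workaround described above might look roughly like this: a single gate that every versioning call passes through. The class and method names are hypothetical, and this is only a sketch of the idea, not the code used in the application mentioned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper; serialises versioning operations within one JVM.
class VersioningGate {
    // Fair lock so waiting callers are served roughly in arrival order.
    private static final ReentrantLock GATE = new ReentrantLock(true);

    /** Runs the given operation while holding the gate, so only one runs at a time. */
    static <T> T serialized(Callable<T> versioningOperation) throws Exception {
        GATE.lock();
        try {
            return versioningOperation.call();
        } finally {
            GATE.unlock();
        }
    }
}
```

A caller would wrap each checkin, checkout or restore, for example `VersioningGate.serialized(() -> { node.checkin(); return null; })`, where `node` is a versionable `javax.jcr.Node`. The gate only covers callers inside the same JVM, so it does not help when several applications or load-balanced instances talk to the same repository server.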

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Olivier Dony <ol...@denali.be>.

Hi,

For the record I've created JCR-790 and attached the thread dump and  
Marcel's lock explanation.

Miro's workaround of putting an additional level of synchronization  
on all write operations on the repository is not quite suitable in  
our case. Not only because of the performance hit, but we will soon  
need to load-balance the backoffice application too, and implementing  
an additional cross-application synchronization mechanism does not  
really make sense.

As for this specific deadlock, it seems that it comes from the fact  
that a new versionable node is being initialized while another one is  
being saved.
I suppose it may not be a good idea to fix this with a hack if a  
bigger redesign is needed. However if that redesign is only coming in  
several months, a little hack might be ok for a while ;-)
It really depends on the frequency of occurrence, so we'll see how it  
goes for us and if we can gather more info.

Thanks for the quick answers!



--
Olivier Dony

Denali s.a., "Bridging the gap between Business and IT"
Rue de Clairvaux 8, B-1348 Louvain-la-Neuve, Belgium
Office: +32 10 43 99 51  Fax: +32 10 43 99 52
www.denali.be

Legal Notice: This message may contain confidential and/or privileged  
information. If you are not the addressee or authorized to receive  
this for the addressee, you must not use, copy, disclose or take any  
action based on this message or any information herein. If you have  
received this message by mistake, please advise the sender  
immediately by return e-mail and delete this message from your  
system. Thank you for your cooperation.

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

Seems like another case of the age-old JCR-18 issue with concurrent
versioning. Both of the updates contain some versioning operations,
and since concurrent versioning is at the moment still a rather
dangerous sport, I'm not surprised if bad things like a deadlock can
occur.

Any contributions in further diagnosing and resolving the concurrent
versioning issues would be very much appreciated!

BR,

Jukka Zitting

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Marcel Reutegger <ma...@gmx.net>.

Olivier Dony wrote:
> The thread dump is attached. I tried to make some sense out of it, but 
> the read/write locks are hard to follow.
> Looks like all RMI-handling thread are waiting to acquire a reader lock 
> on the SharedItemStateManager, except one which is waiting for a writer 
> lock.
> None appear to be ready to release a lock, which is why I suppose they 
> were deadlocked.

the tcp connections 11541 and 11537 are in a deadlock situation. see attached 
text file.

> Is this maybe related to a lock that isn't reentrant but should be? Or not?

all locks are reentrant, but they are acquired in different order by the two 
threads.

regards
  marcel
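
To illustrate the point that reentrant locks still deadlock when two threads acquire them in opposite order, here is a small stand-alone sketch. It does not use Jackrabbit at all: the two locks and the class name are hypothetical, and the detection step relies on the JVM supporting monitoring of ownable synchronizers through ThreadMXBean, which is an assumption about the runtime.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderDeadlockDemo {
    /**
     * Starts two daemon threads that take two reentrant locks in opposite
     * order, then asks the JVM how many threads it reports as deadlocked.
     */
    static int provokeAndDetect() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = worker(a, b, bothHoldFirst);
        Thread t2 = worker(b, a, bothHoldFirst);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        try {
            // Poll for up to about five seconds until the cycle is visible.
            for (int i = 0; i < 100 && ids == null; i++) {
                Thread.sleep(50);
                ids = mx.findDeadlockedThreads();
            }
            // lockInterruptibly() lets us break the cycle so the demo can end.
            t1.interrupt();
            t2.interrupt();
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids == null ? 0 : ids.length;
    }

    private static Thread worker(ReentrantLock first, ReentrantLock second,
                                 CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            first.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();      // wait until the other thread holds its first lock
                second.lockInterruptibly(); // blocks: the other thread owns this lock
                second.unlock();
            } catch (InterruptedException e) {
                // Interrupted by the detector; fall through and release.
            } finally {
                first.unlock();
            }
        });
        t.setDaemon(true);
        return t;
    }
}
```

Reentrancy only helps a thread that asks again for a lock it already owns. Here each thread asks for a lock the other one owns, so neither can proceed until one of them gives up.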

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Shane Preater <sh...@googlemail.com>.

Sorry for the post spam there. I was trying to forward this to my systems
team as we are seeing something quite similar.

Shane.

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Shane Preater <sh...@googlemail.com>.

Does this sound familiar!

Re: Possible deadlock of jcr-server 1.2.1 (rmi)

Posted by Tobias Bocanegra <to...@day.com>.

hi,
this seems to involve the version manager and search index - we saw such
problems before but i thought we were able to remove those deadlocks
:-( i don't think that RMI causes this problem - maybe it just alters the
concurrency compared to direct access.

the best would be if you create a new jira issue and attach the thread
dump. if you can reproduce it by a simple test case - that would be
even better.

regards, toby
