You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@jackrabbit.apache.org by Luis Muniz <ne...@gmail.com> on 2010/10/07 11:44:59 UTC

Deadlock in my JCR repository

Hi,

We are trying to figure out a deadlock taht blocks all incoming requests
until we restart the jackrabbit repository.


The repository runs as a standalone server. It is accessed by rmi over http.

This is our rep.properties file (from repository/meta):

#Wed Oct 06 16:10:01 CEST 2010
option.versioning.supported=true
jcr.repository.version=1.5.6
jcr.repository.name=Jackrabbit
option.observation.supported=true
option.locking.supported=true
option.transactions.supported=true
jcr.repository.stats.nodes.count=130824
jcr.specification.name=Content Repository API for Java(TM) Technology
Specification
level.2.supported=true
option.query.sql.supported=true
level.1.supported=true
jcr.repository.stats.properties.count=1072194
jcr.repository.vendor=Apache Software Foundation
jcr.specification.version=1.0
query.xpath.doc.order=false
query.xpath.pos.index=true
jcr.repository.vendor.url=http\://jackrabbit.apache.org/


In jackrabbit.log when the deadlock happened, we can see that the server was
performing some index merges:

06.10.2010 15:20:22 *INFO * LRUNodeIdCache: num=0/10240 hits=0 miss=80000
06.10.2010 15:20:22 *INFO * BundleCache: num=312 mem=8163k max=8192k
avg=26793 hits=71672 miss=8328
06.10.2010 15:20:24 *INFO * IndexMerger: merged 165 documents in 2323 ms
into _2piv.
06.10.2010 15:22:26 *INFO * IndexMerger: merged 135 documents in 83 ms into
_2pj6.
06.10.2010 15:35:56 *INFO * LRUNodeIdCache: num=0/10240 hits=0 miss=90000
06.10.2010 15:35:56 *INFO * BundleCache: num=312 mem=8181k max=8192k
avg=26852 hits=80527 miss=9473
06.10.2010 16:09:04 *INFO * RepositoryImpl: Shutting down repository...
06.10.2010 16:09:04 *INFO * IndexMerger: IndexMerger terminated
06.10.2010 16:09:04 *INFO * SearchIndex: Index closed:
/SAN/repository/repository/index
06.10.2010 16:09:04 *INFO * RepositoryImpl: shutting down workspace
'default'...
06.10.2010 16:09:04 *INFO * ObservationDispatcher: Notification of
EventListeners stopped.
06.10.2010 16:09:04 *INFO * IndexMerger: IndexMerger terminated
06.10.2010 16:09:04 *INFO * SearchIndex: Index closed:
/SAN/repository/workspaces/default/index
06.10.2010 16:09:05 *INFO * DerbyPersistenceManager: Database
'/SAN/repository/workspaces/default/db' shutdown.
06.10.2010 16:09:05 *INFO * RepositoryImpl: workspace 'default' has been
shutdown
06.10.2010 16:09:05 *INFO * DerbyPersistenceManager: Database
'/SAN/repository/version/db' shutdown.
06.10.2010 16:09:05 *INFO * RepositoryImpl: Repository has been shutdown


If I look further into the log I can see worrying messages (this is 1h
before the deadlock):

06.10.2010 15:09:16 *INFO * MultiIndex: Unable to delete obsolete index:
_2phm
06.10.2010 15:09:16 *INFO * MultiIndex: Unable to delete obsolete index:
_2phn
06.10.2010 15:09:31 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:10:01 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:10:36 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2phg
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2pho
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2phk
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2phl
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2phm
06.10.2010 15:10:36 *INFO * MultiIndex: Unable to delete obsolete index:
_2phn
06.10.2010 15:10:55 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:11:04 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:11:09 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:11:16 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:11:21 *INFO * MultiIndex: updating index with 1 nodes from
indexing queue.
06.10.2010 15:11:32 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:11:53 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:12:35 *WARN * TransactionContext: Transaction rolled back
because timeout expired.
06.10.2010 15:13:17 *INFO * LRUNodeIdCache: num=0/10240 hits=0 miss=70000
06.10.2010 15:13:17 *INFO * BundleCache: num=366 mem=8140k max=8192k
avg=22774 hits=62774 miss=7226
06.10.2010 15:13:22 *INFO * MultiIndex: updating index with 1 nodes from
indexing queue.
06.10.2010 15:13:46 *INFO * MultiIndex: updating index with 1 nodes from
indexing queue.
06.10.2010 15:13:46 *INFO * IndexMerger: merged 66 documents in 23667 ms
into _2pi0.
06.10.2010 15:13:57 *INFO * DocNumberCache: size=15/1024, #accesses=1001,
#hits=962, #misses=39, cacheRatio=97%



 a ton of these logs appear regularly:

06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi8
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi6
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi7
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi4
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi5
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi2
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi3
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi0
06.10.2010 15:16:18 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi1
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi8
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi6
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi7
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi4
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi5
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi2
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi3
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi0
06.10.2010 15:16:37 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi1
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi8
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi6
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi7
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi4
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi5
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi2
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi3
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi0
06.10.2010 15:16:43 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi1
06.10.2010 15:16:55 *INFO * MultiIndex: Unable to delete obsolete index:
_2pi8



Currently I have no thread dump unfortunately, but maybe there is a
configuration issue on our server.

We are using jackrabbit-1.5.6, should we think about upgrading?

I hope someone can help me out with this.

Luis

RE: Deadlock in my JCR repository

Posted by "PREVOSTO, Laurent" <la...@sfr.com>.
Hi Luis,
Of course, we may not have the same usage & configuration but on our side, we have a 40GB repository.
And we upgraded flawlessly from 1.5.6 to 1.6.1 just by changing the jackrabbit jar.

You might want to try on a staging environment if you have one.

Regards,

Laurent


-----Message d'origine-----
De : Luis Muniz [mailto:neur0maner@gmail.com] 
Envoyé : jeudi 7 octobre 2010 15:08
À : users@jackrabbit.apache.org
Objet : Re: Deadlock in my JCR repository

Hi Jukka,

Of course, I do appreciate the time you put in to help out, and possibly
state the obvious from your perspective. We don't update every component of
our architecture if it is stable. Our repository was fairly stable until
lately. We have a 20GB+ repository so the migration, jumping minor versions
and a major version, is not going to be trivial. I guess the first thing
will have to be to create a migration script. I read in the wiki that from
1.6 on there is a migration program, but that does not solve my issue.

The thread dump we forgot to generate it and we will generate one when the
issue gets reproduced.

I know that identifying the deadlock with the information I provided is not
possible unless some of the elements taht appear in my log remind someone of
a similar case, which was what I was hoping for. it was a long shot, but it
was all I could currently do to diagnose the issue without a thread dump.

Anyway, I'll post an update in this thread in case another lost soul is
looking for this information.

Thanks again for your time,

Luis


On Thu, Oct 7, 2010 at 2:42 PM, Jukka Zitting <ju...@gmail.com>wrote:

> Hi,
>
> On Thu, Oct 7, 2010 at 1:19 PM, Luis Muniz <ne...@gmail.com> wrote:
> > On Thu, Oct 7, 2010 at 11:47 AM, Jukka Zitting <jukka.zitting@gmail.com
> >wrote:
> >> On Thu, Oct 7, 2010 at 11:44 AM, Luis Muniz <ne...@gmail.com>
> wrote:
> >> > We are using jackrabbit-1.5.6, should we think about upgrading?
> >> Yes.
> > Why?
>
> You're one major and two minor releases behind the latest Jackrabbit
> version, which means that the likelihood of any issues you encounter
> being already fixed in a more recent release is pretty high.
>
> I'd love to help you track down and fix the problem if it still occurs
> after an upgrade, as doing so will help prevent me or the customers of
> my employer from encountering the same issue in the future. But
> chasing down an old problem that's most likely already fixed in a more
> recent version would probably just be a waste of time from my
> perspective.
>
> There may be other people on this list who may want to help you even
> with 1.5.6 and you can always turn to paid support, but upgrading to a
> more recent version is a cheap way to either get rid of the problem
> automatically or at least notably increase the motivation of other
> people to help you.
>
> That said, you may want to check the Jackrabbit issue tracker for the
> list of known and resolved deadlock issues [1]. Also, without a thread
> dump it's usually quite hard to determine the cause of a deadlock.
>
> [1]
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&query=deadlock&summary=true&description=true&body=true&pid=10591
>
> BR,
>
> Jukka Zitting
>

Re: Deadlock in my JCR repository

Posted by Luis Muniz <ne...@gmail.com>.
Hi Jukka,

Of course, I do appreciate the time you put in to help out, and possibly
state the obvious from your perspective. We don't update every component of
our architecture if it is stable. Our repository was fairly stable until
lately. We have a 20GB+ repository so the migration, jumping minor versions
and a major version, is not going to be trivial. I guess the first thing
will have to be to create a migration script. I read in the wiki that from
1.6 on there is a migration program, but that does not solve my issue.

The thread dump we forgot to generate it and we will generate one when the
issue gets reproduced.

I know that identifying the deadlock with the information I provided is not
possible unless some of the elements taht appear in my log remind someone of
a similar case, which was what I was hoping for. it was a long shot, but it
was all I could currently do to diagnose the issue without a thread dump.

Anyway, I'll post an update in this thread in case another lost soul is
looking for this information.

Thanks again for your time,

Luis


On Thu, Oct 7, 2010 at 2:42 PM, Jukka Zitting <ju...@gmail.com>wrote:

> Hi,
>
> On Thu, Oct 7, 2010 at 1:19 PM, Luis Muniz <ne...@gmail.com> wrote:
> > On Thu, Oct 7, 2010 at 11:47 AM, Jukka Zitting <jukka.zitting@gmail.com
> >wrote:
> >> On Thu, Oct 7, 2010 at 11:44 AM, Luis Muniz <ne...@gmail.com>
> wrote:
> >> > We are using jackrabbit-1.5.6, should we think about upgrading?
> >> Yes.
> > Why?
>
> You're one major and two minor releases behind the latest Jackrabbit
> version, which means that the likelihood of any issues you encounter
> being already fixed in a more recent release is pretty high.
>
> I'd love to help you track down and fix the problem if it still occurs
> after an upgrade, as doing so will help prevent me or the customers of
> my employer from encountering the same issue in the future. But
> chasing down an old problem that's most likely already fixed in a more
> recent version would probably just be a waste of time from my
> perspective.
>
> There may be other people on this list who may want to help you even
> with 1.5.6 and you can always turn to paid support, but upgrading to a
> more recent version is a cheap way to either get rid of the problem
> automatically or at least notably increase the motivation of other
> people to help you.
>
> That said, you may want to check the Jackrabbit issue tracker for the
> list of known and resolved deadlock issues [1]. Also, without a thread
> dump it's usually quite hard to determine the cause of a deadlock.
>
> [1]
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&query=deadlock&summary=true&description=true&body=true&pid=10591
>
> BR,
>
> Jukka Zitting
>

Re: Deadlock in my JCR repository

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Oct 7, 2010 at 1:19 PM, Luis Muniz <ne...@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 11:47 AM, Jukka Zitting <ju...@gmail.com>wrote:
>> On Thu, Oct 7, 2010 at 11:44 AM, Luis Muniz <ne...@gmail.com> wrote:
>> > We are using jackrabbit-1.5.6, should we think about upgrading?
>> Yes.
> Why?

You're one major and two minor releases behind the latest Jackrabbit
version, which means that the likelihood of any issues you encounter
being already fixed in a more recent release is pretty high.

I'd love to help you track down and fix the problem if it still occurs
after an upgrade, as doing so will help prevent me or the customers of
my employer from encountering the same issue in the future. But
chasing down an old problem that's most likely already fixed in a more
recent version would probably just be a waste of time from my
perspective.

There may be other people on this list who may want to help you even
with 1.5.6 and you can always turn to paid support, but upgrading to a
more recent version is a cheap way to either get rid of the problem
automatically or at least notably increase the motivation of other
people to help you.

That said, you may want to check the Jackrabbit issue tracker for the
list of known and resolved deadlock issues [1]. Also, without a thread
dump it's usually quite hard to determine the cause of a deadlock.

[1] https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&query=deadlock&summary=true&description=true&body=true&pid=10591

BR,

Jukka Zitting

Re: Deadlock in my JCR repository

Posted by Luis Muniz <ne...@gmail.com>.
Hi

On Thu, Oct 7, 2010 at 11:47 AM, Jukka Zitting <ju...@gmail.com>wrote:

> Hi,
>
> On Thu, Oct 7, 2010 at 11:44 AM, Luis Muniz <ne...@gmail.com> wrote:
> > We are using jackrabbit-1.5.6, should we think about upgrading?
>
> Yes.
>
> Why?


> BR,
>
> Jukka Zitting
>

Re: Deadlock in my JCR repository

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Oct 7, 2010 at 11:44 AM, Luis Muniz <ne...@gmail.com> wrote:
> We are using jackrabbit-1.5.6, should we think about upgrading?

Yes.

BR,

Jukka Zitting