You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@commons.apache.org by sebb <se...@gmail.com> on 2008/06/27 13:18:06 UTC

[DBCP] how does narrowing the lock help with deadlock in DBCP-270?

On 27/06/2008, psteitz@apache.org <ps...@apache.org> wrote:
> Author: psteitz
>  Date: Thu Jun 26 19:48:06 2008
>  New Revision: 672097
>
>  URL: http://svn.apache.org/viewvc?rev=672097&view=rev
>  Log:
>  Narrowed synchronization in AbandonedTrace to resolve an Evictor deadlock.
>  JIRA: DBCP-270
>  Reported and patched by Filip Hanik.
>

I've had a look at the patch and the JIRA, and I don't understand how
narrowing the lock works here.

Originally, the code locked on "this", whereas now the code uses "this.trace".

However, as far as I can see, no objects were taken out of the locked
blocks, and there is no other synchronization in the class. So what
else was locking on "this"?

So how does the change of lock object actually help here?

Or is it just a byproduct of the synchronisation that was added to
getTrace() as part of the patch?

The lock details provided in the JIRA don't show any locks on the
AbandonedTrace object - but this info could have been omitted.

I'm not saying that the patch is wrong, but I would like to understand
how it helps!

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [DBCP] how does narrowing the lock help with deadlock in DBCP-270?

Posted by Phil Steitz <ph...@steitz.com>.

sebb wrote:
> On 27/06/2008, psteitz@apache.org <ps...@apache.org> wrote:
>   
>> Author: psteitz
>>  Date: Thu Jun 26 19:48:06 2008
>>  New Revision: 672097
>>
>>  URL: http://svn.apache.org/viewvc?rev=672097&view=rev
>>  Log:
>>  Narrowed synchronization in AbandonedTrace to resolve an Evictor deadlock.
>>  JIRA: DBCP-270
>>  Reported and patched by Filip Hanik.
>>
>>     
>
> I've had a look at the patch and the JIRA, and I don't understand how
> narrowing the lock works here.
>
> Originally, the code locked on "this", whereas now the code uses "this.trace".
>
> However, as far as I can see, no objects were taken out of the locked
> blocks, and there is no other synchronization in the class. So what
> else was locking on "this"?
>
> So how does the change of lock object actually help here?
>
> Or is it just a byproduct of the synchronisation that was added to
> getTrace() as part of the patch?
>
> The lock details provided in the JIRA don't show any locks on the
> AbandonedTrace object - but this info could have been omitted.
>
> I'm not saying that the patch is wrong, but I would like to understand
> how it helps!
>   
Good questions and thanks for reviewing.  Even greater thanks for 
jumping in to DBCP and/or POOL :)

Now to the inscrutable world of DBCP...

PoolableConnection extends DelegatingConnection which extends 
AbandonedTrace.   The deadlock sequence in DBCP-270 is

1) DBCP client thread invokes close on PoolableConnection.  This method 
is synchronized, so locks the PC.
2) Close for a PC means return to pool, so the associated 
GenericObjectPool's returnObject is invoked.  This calls 
GOP.addObjectToPool.  Neither of these methods are synchronized, but 
addObjectToPool includes two synchronized blocks, both of which lock the 
pool.  The client thread makes it past the first block, which returns 
the connection to the pool, but before it can acquire the pool lock for 
the second block (to decrement the number of active connections),
3) The (dreaded) Evictor kicks off, locks the pool and attempts to 
validate the connection that was just returned.  Validation involves 
executing a query.  Due to another painful-to-maintain feature of DBCP, 
when a DelegatingConnection creates a statement, it registers itself as 
the source of the statement by passing a reference to itself to the 
DelegatingStatement that its createStatement creates.  The way this 
actually works is that the DelegatingStatement constructor ends up 
adding the DelegatingStatement to the trace property of the 
DelegatingConnection.  This happens via the DelegatingConnection's 
addTrace method (from AbandonedTrace).  Before the patch, addTrace 
protected the trace in a block synchronized on *this, locking the 
DelegatingConnection.  This led to deadlock, since the client thread 
acquired that lock in 1).  Reducing scope to the trace eliminates this 
contention.  A workaround is to turn off testWhileIdle, or avoid using 
the Evictor altogether.

This is a good change for DBCP which makes progress toward addressing 
the basic problem of client and evictor contention for locks.  There are 
others in DBCP-44.  The deadlock reported in DBCP-270 is the same as the 
last one mentioned in DBCP-44, which is new with POOL 1.4, where the 
synchroniztion in addObjectToPool was broken into two blocks.  That 
change was made to a) keep factory methods out of pool-synchronized 
scope (which was killing performance) and b) wait to decrement 
activeCount until objects destroyed on return are destroyed.  The 
general problem of preventing Evictor access to objects being borrowed 
or returned is documented in POOL-125.   I have a patch for this that I 
am testing.

Thanks in advance for any ideas, patches or RM-volunteering to help get 
DBCP 1.3 and/or POOL 1.5 out.  We should cut a DBCP release as soon as 
we can organize it to fix this and the several other issues that have 
been addressed in trunk. 

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org