You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@commons.apache.org by Shawn Heisey <ap...@elyograg.org> on 2015/04/11 15:29:44 UTC

Apparent DBCP deadlock related to getConnection

I seem to be having a problem very similar to DBCP-398, using DBCP 2.1
with MySQL.

https://issues.apache.org/jira/browse/DBCP-398

See the last couple of comments.

I am not familiar with the code, so I can't confirm or refute Phil's
comment that it looks normal, but I can say this:  The problem came to
my attention because the program had been deadlocked for several hours,
with the watchdog thread sending "not updating" alarms about every 15
minutes.

I've been having occasional problems with deadlocks in this program for
quite a while.  Those problems were occurring even before I switched
from dbcp to dbcp2.  Until two days ago, I had not tracked down the
cause of the deadlock.  I was suprised to see that the problem wasn't in
my own code.

What other information can I provide to determine whether it's a bug or
something I've done wrong?

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Phil Steitz <ph...@gmail.com>.

On 4/21/15 10:43 PM, Shawn Heisey wrote:
> On 4/21/2015 6:28 PM, Phil Steitz wrote:
>> Thanks for providing the full stack traces, Shawn.  I assume the
>> pool config is the same that you posted before, right?  Also, the
>> mysql driver seems to be taking initiative to try to clean up
>> abandoned connections itself.  That is holding some locks; but I
>> don't see any deadlocks in the dumps.  I will look more at this
>> later tonight.  I think it is possible there is a pool bug lurking
>> in here. 

Confirmed that what I was looking into was not a pool bug (actually
previously resolved bug).

Phil

>>  One question: are you seeing connections fail validation? 
>> I think you said you were testing on borrow / return.  Is that
>> right?  Are you just letting the pool destroy the bad ones or are
>> you destroying them yourself?
> Thanks for your reply.
>
> I have since learned that this was not an actual deadlock.
>   Somebody
> dropped entries into the config database which asked my program to
> reindex 56 million rows (nearly one third of the entire database) into
> Solr.  That is not the kind of thing I intended that capability to
> accomplish, because it takes WAY too long.
>
> My program was working fine ... but sending watchdog alarms every
> fifteen minutes because a "normal" cycle was taking hours longer than
> expected.  I removed the reindex entries and restarted the program
> instances, now everything is working fine again.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/21/2015 6:28 PM, Phil Steitz wrote:
> Thanks for providing the full stack traces, Shawn.  I assume the
> pool config is the same that you posted before, right?  Also, the
> mysql driver seems to be taking initiative to try to clean up
> abandoned connections itself.  That is holding some locks; but I
> don't see any deadlocks in the dumps.  I will look more at this
> later tonight.  I think it is possible there is a pool bug lurking
> in here.  One question: are you seeing connections fail validation? 
> I think you said you were testing on borrow / return.  Is that
> right?  Are you just letting the pool destroy the bad ones or are
> you destroying them yourself?

Thanks for your reply.

I have since learned that this was not an actual deadlock.  Somebody
dropped entries into the config database which asked my program to
reindex 56 million rows (nearly one third of the entire database) into
Solr.  That is not the kind of thing I intended that capability to
accomplish, because it takes WAY too long.

My program was working fine ... but sending watchdog alarms every
fifteen minutes because a "normal" cycle was taking hours longer than
expected.  I removed the reindex entries and restarted the program
instances, now everything is working fine again.

Thanks,
Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Phil Steitz <ph...@gmail.com>.

On 4/21/15 4:47 PM, Shawn Heisey wrote:
> On 4/14/2015 6:06 AM, James Carman wrote:
>> You may want to consider something like Spring's JdbcTemplate class to
>> avoid a lot of this.
> I don't know anything about Spring, although I didn't know anything
> about DBCP either, before I wrote this code.  I suspect that Spring is
> much larger and more complicated, though.
>
> Another deadlock problem has happened.
>
> Two separate copies of the program that both use the same MySQL database
> appear to have locked up at the same time on different hosts, using
> different JVMs.  It has been over three hours since the first "not
> updating" alarms from the watchdog thread started coming in, so this is
> definitely not a temporary deadlock.  I just barely obtained these
> stacktraces:
>
> https://www.dropbox.com/s/yudxe1cikm0chnd/stacktrace-idxbuild-bigindy5.txt?dl=0
> https://www.dropbox.com/s/g41mldfxpttb824/stacktrace-idxbuild-idxa1.txt?dl=0
>
> Included in each of those files are the three bits of information
> requested from DBCP, which were automatically logged by the watchdog
> thread -- active, idle, and the pool objects.  That information is from
> the first alarm on each host, running a separate copy of the program.
>
> This time, I see no evidence of locks held by my own code.  All the
> locks present are in libraries that I am using.
>
> It wasn't immediately apparently to me, looking at the stacktraces,
> where the deadlock is.  I think this is different than the deadlock that
> started this thread.

Thanks for providing the full stack traces, Shawn.  I assume the
pool config is the same that you posted before, right?  Also, the
mysql driver seems to be taking initiative to try to clean up
abandoned connections itself.  That is holding some locks; but I
don't see any deadlocks in the dumps.  I will look more at this
later tonight.  I think it is possible there is a pool bug lurking
in here.  One question: are you seeing connections fail validation? 
I think you said you were testing on borrow / return.  Is that
right?  Are you just letting the pool destroy the bad ones or are
you destroying them yourself?

Phil
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/14/2015 6:06 AM, James Carman wrote:
> You may want to consider something like Spring's JdbcTemplate class to
> avoid a lot of this.

I don't know anything about Spring, although I didn't know anything
about DBCP either, before I wrote this code.  I suspect that Spring is
much larger and more complicated, though.

Another deadlock problem has happened.

Two separate copies of the program that both use the same MySQL database
appear to have locked up at the same time on different hosts, using
different JVMs.  It has been over three hours since the first "not
updating" alarms from the watchdog thread started coming in, so this is
definitely not a temporary deadlock.  I just barely obtained these
stacktraces:

https://www.dropbox.com/s/yudxe1cikm0chnd/stacktrace-idxbuild-bigindy5.txt?dl=0
https://www.dropbox.com/s/g41mldfxpttb824/stacktrace-idxbuild-idxa1.txt?dl=0

Included in each of those files are the three bits of information
requested from DBCP, which were automatically logged by the watchdog
thread -- active, idle, and the pool objects.  That information is from
the first alarm on each host, running a separate copy of the program.

This time, I see no evidence of locks held by my own code.  All the
locks present are in libraries that I am using.

It wasn't immediately apparently to me, looking at the stacktraces,
where the deadlock is.  I think this is different than the deadlock that
started this thread.

Thanks,
Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by James Carman <ja...@carmanconsulting.com>.

You may want to consider something like Spring's JdbcTemplate class to
avoid a lot of this.

On Sunday, April 12, 2015, Shawn Heisey <ap...@elyograg.org> wrote:

> On 4/12/2015 12:43 PM, Phil Steitz wrote:
> > On 4/12/15 10:31 AM, Shawn Heisey wrote:
> >> http://apaste.info/6pX
> >
> > Several methods, getSingleObject, getResultSet, closeResultSet,
> > doUpdate can abandon connections on exception paths.  You should
> > insert finally blocks in these methods to ensure that connections
> > are closed.
>
> I have modified the code to put close operations in finally blocks in
> getSingleObject, doUpdate, and getColumnSet, and I have done the same
> thing on all uses of the other two methods, with the close method called
> in a finally block.
>
> The updated version of the Database class is here:
>
> http://apaste.info/6XU
>
> I've engaged in some renaming.  The getResultSet method is now called
> createConnectionAndResultSet, and the closeResultSet method is now
> closeAllFromResultSet.
>
> > They don't have to be slow to deadlock, if you hit the right timing
> > and the following kind of scenario:
> >
> > Thread 1 grab the lock and hold it while waiting to get a connection
> > (what the thread you showed in the ticket is doing)
> > Threads 2, ..., n try to get the lock before returning connections
> > they hold
> >
> > Key is to get a full thread dump and see which threads are waiting
> > on / holding which locks.
>
> If the "finally" changes I've made do not fix the problem, I will gather
> a new full thread dump for analysis.  You've given me an idea of some
> additional bad coding that I can look for, so thank you.  I've tried to
> make all synchronized code use the database in a self-contained manner,
> but I might have made mistakes.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org <javascript:;>
> For additional commands, e-mail: user-help@commons.apache.org
> <javascript:;>
>
>

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/12/2015 12:43 PM, Phil Steitz wrote:
> On 4/12/15 10:31 AM, Shawn Heisey wrote:
>> http://apaste.info/6pX
> 
> Several methods, getSingleObject, getResultSet, closeResultSet,
> doUpdate can abandon connections on exception paths.  You should
> insert finally blocks in these methods to ensure that connections
> are closed. 

I have modified the code to put close operations in finally blocks in
getSingleObject, doUpdate, and getColumnSet, and I have done the same
thing on all uses of the other two methods, with the close method called
in a finally block.

The updated version of the Database class is here:

http://apaste.info/6XU

I've engaged in some renaming.  The getResultSet method is now called
createConnectionAndResultSet, and the closeResultSet method is now
closeAllFromResultSet.

> They don't have to be slow to deadlock, if you hit the right timing
> and the following kind of scenario:
> 
> Thread 1 grab the lock and hold it while waiting to get a connection
> (what the thread you showed in the ticket is doing)
> Threads 2, ..., n try to get the lock before returning connections
> they hold
> 
> Key is to get a full thread dump and see which threads are waiting
> on / holding which locks.

If the "finally" changes I've made do not fix the problem, I will gather
a new full thread dump for analysis.  You've given me an idea of some
additional bad coding that I can look for, so thank you.  I've tried to
make all synchronized code use the database in a self-contained manner,
but I might have made mistakes.

Thanks,
Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Phil Steitz <ph...@gmail.com>.

On 4/12/15 10:31 AM, Shawn Heisey wrote:
> On 4/12/2015 9:16 AM, Phil Steitz wrote:
>> Can you post your full pool config and info on versions of DBCP and
>> POOL?
>>
>> Do you call isClosed() before closing a connection (not recommended
>> - that is really the issue in the ticket you commented)?
>>
>> To make sure there is not an actual deadlock somewhere, you should
>> examine a full thread dump.  The thread you posted in the ticket is
>> waiting on the pool, not holding any DBCP or POOL related locks; but
>> it does hold this application lock:
>> at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
>>         - locked <0x00000000db0011c8> (a java.lang.Object)
>> Make sure no other threads in process of doing things that will
>> return connections to the pool aren't waiting on this lock.
>>
>> Given that you are using DBCP 2.1, if you have access to a JMX
>> console it would also be good to look at what is happening in the
>> pool when your application hangs.  In particular, numActive, numIdle
>> and listAllObjects would be good to look at.  Looks like you are
>> using PoolingDataSource directly, so you need to look at the
>> underlying GenericObject's JMX properties.
> My code does not contain the string "isClosed" at all.
>
> I thought I had saved the full thread dump, but the stdout logfile was
> wiped when I restarted the program, and I didn't think to save a copy.
> I will be sure to save it next time there's a problem.

Full thread dump will be helpful.
>
> Here is the entire java file for my "Database" class, edited to redact
> the company top-level domain:
>
> http://apaste.info/6pX

Several methods, getSingleObject, getResultSet, closeResultSet,
doUpdate can abandon connections on exception paths.  You should
insert finally blocks in these methods to ensure that connections
are closed. 

Have a look at the PoolingDataSource example referenced [1] on the
DBCP web page for a model to use with try-catch-finally.   Not
closing connections on exception paths will cause leaks if
exceptions are thrown.  If you are seeing BuildExceptions generated
by this code, each of them is leaking a connection.
>
> The object that I am locking on (0x00000000db0011c8) is a global lock
> for my whole program which I synchronize on for a handful of
> normally-short-lived db-related operations just to be sure that they
> aren't happening simultaneously, and to help make sure that the outcome
> of certain operations is visible across threads.  I'm reasonably sure
> that there are no operations in those synchronization blocks that are
> expected to be slow, though I haven't verified that to 100% certainty.

They don't have to be slow to deadlock, if you hit the right timing
and the following kind of scenario:

Thread 1 grab the lock and hold it while waiting to get a connection
(what the thread you showed in the ticket is doing)
Threads 2, ..., n try to get the lock before returning connections
they hold

Key is to get a full thread dump and see which threads are waiting
on / holding which locks.

Phil
[1] http://s.apache.org/0Nl

> One of the operations handled in that kind of synchronization lock is
> the optimization of certain DB tables, which only happens if the size of
> those tables is below a certain threshold -- part of making sure that
> the operation completes quickly and doesn't deadlock the program.
>
> Here are all the jar files in my program's lib directory.  My program
> uses javamail, jrobin, dbcp, and solrj:
>
> commons-dbcp2-2.1.jar
> commons-io-2.4.jar
> commons-pool2-2.3.jar
> httpclient-4.4.1.jar
> httpcore-4.4.1.jar
> httpmime-4.4.1.jar
> javax.mail.jar
> jcl-over-slf4j-1.7.12.jar
> jrobin-1.5.14.jar
> jul-to-slf4j-1.7.12.jar
> log4j-1.2.17.jar
> mysql-connector-java-5.1.35-bin.jar
> noggit-0.7.jar
> slf4j-api-1.7.12.jar
> slf4j-log4j12-1.7.12.jar
> solr-solrj-5.0.0.jar
> stax2-api-3.1.1.jar
> woodstox-core-asl-4.2.0.jar
> zookeeper-3.4.6.jar
>
> I do not have remote JMX enabled for this application, and the server
> where it runs does not have a gui.  I will add some code that spits out
> the three requested pool stats every time the watchdog thread detects a
> problem.  I cannot predict when there will be a problem ... this code
> can run for weeks with nothing going wrong, or the deadlock might happen
> only a few minutes after startup.

>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/12/2015 9:16 AM, Phil Steitz wrote:
> Can you post your full pool config and info on versions of DBCP and
> POOL?
> 
> Do you call isClosed() before closing a connection (not recommended
> - that is really the issue in the ticket you commented)?
> 
> To make sure there is not an actual deadlock somewhere, you should
> examine a full thread dump.  The thread you posted in the ticket is
> waiting on the pool, not holding any DBCP or POOL related locks; but
> it does hold this application lock:
> at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
>         - locked <0x00000000db0011c8> (a java.lang.Object)
> Make sure no other threads in process of doing things that will
> return connections to the pool aren't waiting on this lock.
> 
> Given that you are using DBCP 2.1, if you have access to a JMX
> console it would also be good to look at what is happening in the
> pool when your application hangs.  In particular, numActive, numIdle
> and listAllObjects would be good to look at.  Looks like you are
> using PoolingDataSource directly, so you need to look at the
> underlying GenericObject's JMX properties.

My code does not contain the string "isClosed" at all.

I thought I had saved the full thread dump, but the stdout logfile was
wiped when I restarted the program, and I didn't think to save a copy.
I will be sure to save it next time there's a problem.

Here is the entire java file for my "Database" class, edited to redact
the company top-level domain:

http://apaste.info/6pX

The object that I am locking on (0x00000000db0011c8) is a global lock
for my whole program which I synchronize on for a handful of
normally-short-lived db-related operations just to be sure that they
aren't happening simultaneously, and to help make sure that the outcome
of certain operations is visible across threads.  I'm reasonably sure
that there are no operations in those synchronization blocks that are
expected to be slow, though I haven't verified that to 100% certainty.
One of the operations handled in that kind of synchronization lock is
the optimization of certain DB tables, which only happens if the size of
those tables is below a certain threshold -- part of making sure that
the operation completes quickly and doesn't deadlock the program.

Here are all the jar files in my program's lib directory.  My program
uses javamail, jrobin, dbcp, and solrj:

commons-dbcp2-2.1.jar
commons-io-2.4.jar
commons-pool2-2.3.jar
httpclient-4.4.1.jar
httpcore-4.4.1.jar
httpmime-4.4.1.jar
javax.mail.jar
jcl-over-slf4j-1.7.12.jar
jrobin-1.5.14.jar
jul-to-slf4j-1.7.12.jar
log4j-1.2.17.jar
mysql-connector-java-5.1.35-bin.jar
noggit-0.7.jar
slf4j-api-1.7.12.jar
slf4j-log4j12-1.7.12.jar
solr-solrj-5.0.0.jar
stax2-api-3.1.1.jar
woodstox-core-asl-4.2.0.jar
zookeeper-3.4.6.jar

I do not have remote JMX enabled for this application, and the server
where it runs does not have a gui.  I will add some code that spits out
the three requested pool stats every time the watchdog thread detects a
problem.  I cannot predict when there will be a problem ... this code
can run for weeks with nothing going wrong, or the deadlock might happen
only a few minutes after startup.

Thanks,
Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Phil Steitz <ph...@gmail.com>.

On 4/12/15 2:58 AM, Shawn Heisey wrote:
> On 4/11/2015 8:48 AM, Phil Steitz wrote:
>> What is going on in that thread is that it is blocked waiting for a
>> connection to become available in the pool.  There is no evidence of
>> deadlock there - just pool exhaustion.
>>
>> The most likely culprit is application code that is failing to close
>> connections that have been borrowed from the pool.  If you don't
>> close the connections on all execution paths, they will "leak" as
>> the pool has no way of knowing that your code is finished with them
>> unless and until you close them.
> I've done a review of all my database code for leaks in the past, and
> everything looked good at that time.  I can repeat that check.  The
> database portions of my code have not had any changes for quite some time.

Also try turning on removeAbandoned and logAbandoned.
>
> Would I see evidence of a connection leak in the "show full processlist"
> output at the MySQL server?  I don't see that happening.  There are only
> a handful of open connections when I check there, even when the program
> has been running for several days or weeks, running through its cycle
> once a minute.  Once all parts of the program have been run at least
> once, the connection count is stable.

Can you post your full pool config and info on versions of DBCP and
POOL?

Do you call isClosed() before closing a connection (not recommended
- that is really the issue in the ticket you commented)?

To make sure there is not an actual deadlock somewhere, you should
examine a full thread dump.  The thread you posted in the ticket is
waiting on the pool, not holding any DBCP or POOL related locks; but
it does hold this application lock:
at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
        - locked <0x00000000db0011c8> (a java.lang.Object)
Make sure no other threads in process of doing things that will
return connections to the pool aren't waiting on this lock.

Given that you are using DBCP 2.1, if you have access to a JMX
console it would also be good to look at what is happening in the
pool when your application hangs.  In particular, numActive, numIdle
and listAllObjects would be good to look at.  Looks like you are
using PoolingDataSource directly, so you need to look at the
underlying GenericObject's JMX properties.

Phil
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/12/2015 5:18 AM, Kristian Rosenvold wrote:
> I once had a library that allocated 2 connections to complete a
> transaction. Every now and then we'd deadlock the pool if number of
> concurrent transactions > pool size, which could happen in load
> bursts. If you do a thread dump of your process and see "all" threads
> are waiting for a connection this is most probably happening. The
> non-solution solution is to increase pool size, the real solution was
> never to allocate 2 connections to complete a transaction.

It will take a code review to be sure, but I'm pretty sure I would
remember if there was anything odd like that in the code.

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Kristian Rosenvold <kr...@apache.org>.

I once had a library that allocated 2 connections to complete a
transaction. Every now and then we'd deadlock the pool if number of
concurrent transactions > pool size, which could happen in load
bursts. If you do a thread dump of your process and see "all" threads
are waiting for a connection this is most probably happening. The
non-solution solution is to increase pool size, the real solution was
never to allocate 2 connections to complete a transaction.

Kristian


2015-04-12 11:58 GMT+02:00 Shawn Heisey <ap...@elyograg.org>:
> On 4/11/2015 8:48 AM, Phil Steitz wrote:
>> What is going on in that thread is that it is blocked waiting for a
>> connection to become available in the pool.  There is no evidence of
>> deadlock there - just pool exhaustion.
>>
>> The most likely culprit is application code that is failing to close
>> connections that have been borrowed from the pool.  If you don't
>> close the connections on all execution paths, they will "leak" as
>> the pool has no way of knowing that your code is finished with them
>> unless and until you close them.
>
> I've done a review of all my database code for leaks in the past, and
> everything looked good at that time.  I can repeat that check.  The
> database portions of my code have not had any changes for quite some time.
>
> Would I see evidence of a connection leak in the "show full processlist"
> output at the MySQL server?  I don't see that happening.  There are only
> a handful of open connections when I check there, even when the program
> has been running for several days or weeks, running through its cycle
> once a minute.  Once all parts of the program have been run at least
> once, the connection count is stable.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: [DBCP] Apparent deadlock related to getConnection

Posted by Shawn Heisey <ap...@elyograg.org>.

On 4/11/2015 8:48 AM, Phil Steitz wrote:
> What is going on in that thread is that it is blocked waiting for a
> connection to become available in the pool.  There is no evidence of
> deadlock there - just pool exhaustion.
> 
> The most likely culprit is application code that is failing to close
> connections that have been borrowed from the pool.  If you don't
> close the connections on all execution paths, they will "leak" as
> the pool has no way of knowing that your code is finished with them
> unless and until you close them.

I've done a review of all my database code for leaks in the past, and
everything looked good at that time.  I can repeat that check.  The
database portions of my code have not had any changes for quite some time.

Would I see evidence of a connection leak in the "show full processlist"
output at the MySQL server?  I don't see that happening.  There are only
a handful of open connections when I check there, even when the program
has been running for several days or weeks, running through its cycle
once a minute.  Once all parts of the program have been run at least
once, the connection count is stable.

Thanks,
Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org

Re: Apparent DBCP deadlock related to getConnection

Posted by Phil Steitz <ph...@gmail.com>.

On 4/11/15 6:29 AM, Shawn Heisey wrote:
> I seem to be having a problem very similar to DBCP-398, using DBCP 2.1
> with MySQL.
>
> https://issues.apache.org/jira/browse/DBCP-398
>
> See the last couple of comments.
>
> I am not familiar with the code, so I can't confirm or refute Phil's
> comment that it looks normal, but I can say this:  The problem came to
> my attention because the program had been deadlocked for several hours,
> with the watchdog thread sending "not updating" alarms about every 15
> minutes.
>
> I've been having occasional problems with deadlocks in this program for
> quite a while.  Those problems were occurring even before I switched
> from dbcp to dbcp2.  Until two days ago, I had not tracked down the
> cause of the deadlock.  I was suprised to see that the problem wasn't in
> my own code.
>
> What other information can I provide to determine whether it's a bug or
> something I've done wrong?

Most likely what is happening is that your application is exhausting
the pool of connections.  The relevant lines in the thread dump you
posted on the ticket are:

at
org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:583)
        at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:442)

What is going on in that thread is that it is blocked waiting for a
connection to become available in the pool.  There is no evidence of
deadlock there - just pool exhaustion.

The most likely culprit is application code that is failing to close
connections that have been borrowed from the pool.  If you don't
close the connections on all execution paths, they will "leak" as
the pool has no way of knowing that your code is finished with them
unless and until you close them.  What I would do first is carefully
review the code that borrows, uses and returns connections.  Make
sure it always closes them (including exception paths).  If you
can't find the leak, then consider turning on abandoned connection
cleanup (see the DBCP web page for how to configure that).

Phil
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org