You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Andreas Fredriksson <an...@digitalroute.com> on 2005/09/13 09:54:46 UTC

Class loading deadlock

Hey list,
we've re-targeted our system to Derby and now we're seeing some class
loading-related deadlocks.

To give some background, our system loads code from the database and
other various sources (through our own set of class loaders). This has
worked fine with Oracle, MS SQL Server and many other databases, but
with Derby there is an issue with the interaction pattern of class
loaders and the database helper objects.

The deadlock occurs when one thread has locked the classloader to find a
certain class which then triggers a database query to find the right
data in one thread. At the same time a daily query is running against
the database, which triggers class loader operations against the already
locked class loader instance.

Short and sweet:

Thread A (loading a class/resource via a ResourceBundle)
Lock ClassLoader (implicit through ClassLoader.loadClass())
Lock DB helper object (when issuing a query)

Thread B (executing some SQL query)
Lock DB helper object (to generate code?)
Lock ClassLoader (implicit via loadClass())

This is the ABC of deadlocks, but it's pretty hard to resolve because of
the synchronized ClassLoader.loadClass() method. The only option I see
is to remove the ReflectClassesJava2 lock around the classloader
operations and handle failures by retrying them. Would that work?

I've appended the full call stack information at the bottom of this
email.

What can we do to resolve this bug?

Best regards,
Andreas

-----------------------------------------------------------------------

Found one Java-level deadlock:
=============================
"Thread-97":
  waiting to lock monitor 0x080c0fcc (object 0xec053590, a
com.digitalroute.picostart.PlatformClassLoader),
  which is held by "Thread-98"
"Thread-98":
  waiting to lock monitor 0x080c1004 (object 0xebfef7c0, a
org.apache.derby.impl.services.reflect.ReflectClassesJava2),
  which is held by "Thread-97"

Java stack information for the threads listed above:
===================================================
"Thread-97":
        at java.lang.ClassLoader.loadClass(ClassLoader.java:278)
        - waiting to lock <0xec053590> (a
com.digitalroute.picostart.PlatformClassLoader)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
        at
org.apache.derby.impl.services.reflect.ReflectClassesJava2.loadClassNotInDatabaseJar(Unknown Source)
        - locked <0xebfef7c0> (a
org.apache.derby.impl.services.reflect.ReflectClassesJava2)
        at
org.apache.derby.impl.services.reflect.DatabaseClasses.loadApplicationClass(Unknown Source)
        at
org.apache.derby.iapi.services.loader.ClassInspector.getClass(Unknown
Source)
        at
org.apache.derby.iapi.services.loader.ClassInspector.accessible(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.QueryTreeNode.verifyClassExist(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.AggregateNode.checkAggregatorClassName(Unknown Source)
        at
org.apache.derby.impl.sql.compile.AggregateNode.bindExpression(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.SQLToJavaValueNode.bindExpression(Unknown Source)
        at
org.apache.derby.impl.sql.compile.MethodCallNode.bindParameters(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.StaticMethodCallNode.bindExpression(Unknown Source)
        at
org.apache.derby.impl.sql.compile.JavaToSQLValueNode.bindExpression(Unknown Source)
        at
org.apache.derby.impl.sql.compile.ResultColumn.bindExpression(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.ResultColumnList.bindExpressions(Unknown Source)
        at
org.apache.derby.impl.sql.compile.SelectNode.bindExpressions(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.FromSubquery.bindExpressions(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.FromList.bindExpressions(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.SelectNode.bindExpressions(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.DMLStatementNode.bindExpressions(Unknown Source)
        at
org.apache.derby.impl.sql.compile.DMLStatementNode.bind(Unknown Source)
        at org.apache.derby.impl.sql.compile.ReadCursorNode.bind(Unknown
Source)
        at org.apache.derby.impl.sql.compile.CursorNode.bind(Unknown
Source)
        at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown
Source)
        at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown
Source)
        at
org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement.<init>(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement20.<init>(Unknown
Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement30.<init>(Unknown
Source)
        at
org.apache.derby.jdbc.Driver30.newEmbedPreparedStatement(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown
Source)
        - locked <0xeb392740> (a
org.apache.derby.impl.jdbc.EmbedConnection30)
        at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown
Source)
        at
com.digitalroute.jdbc.generic_sm_get_statistics.getCompressedStat(generic_sm_get_statistics.java:141)
        at
com.digitalroute.jdbc.generic_sm_get_statistics.invoke(generic_sm_get_statistics.java:120)
        at
com.digitalroute.misc.jdbc.Dispatcher.invokeOperation(Dispatcher.java:84)
        at
com.digitalroute.statistics.StatManagerImpl.compStat(StatManagerImpl.java:774)
        at
com.digitalroute.statistics.StatManagerImpl.activate(StatManagerImpl.java:758)
        at com.digitalroute.xplan.activation.Activator
$ExecutionThread.run(Activator.java:48)


"Thread-98":
        at
org.apache.derby.impl.services.reflect.ReflectClassesJava2.loadGeneratedClassFromData(Unknown Source)
        - waiting to lock <0xebfef7c0> (a
org.apache.derby.impl.services.reflect.ReflectClassesJava2)
        at
org.apache.derby.impl.services.reflect.DatabaseClasses.loadGeneratedClass(Unknown Source)
        at
org.apache.derby.impl.services.bytecode.GClass.getGeneratedClass(Unknown
Source)
        at
org.apache.derby.impl.sql.compile.ExpressionClassBuilder.getGeneratedClass(Unknown Source)
        at
org.apache.derby.impl.sql.compile.StatementNode.generate(Unknown Source)
        at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown
Source)
        at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown
Source)
        at
org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement.<init>(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement20.<init>(Unknown
Source)
        at
org.apache.derby.impl.jdbc.EmbedPreparedStatement30.<init>(Unknown
Source)
        at
org.apache.derby.jdbc.Driver30.newEmbedPreparedStatement(Unknown Source)
        at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown
Source)
        - locked <0xeb3bad30> (a
org.apache.derby.impl.jdbc.EmbedConnection30)
        at
org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown
Source)
        at
com.digitalroute.codeserver.GenericDBStorage.lookupArchiveByType(GenericDBStorage.java:739)
        at
com.digitalroute.codeserver.CodeServerImpl.lookupArchiveByType(CodeServerImpl.java:438)
        - locked <0xebe300e0> (a
com.digitalroute.codeserver.CodeServerImpl)
        at
com.digitalroute.picostart.CSMethodInvoker.lookupArchiveByType(CSMethodInvoker.java:58)
        at
com.digitalroute.picostart.CacheManager.putArchive(CacheManager.java:448)
        - locked <0xec049e40> (a
com.digitalroute.picostart.CacheManager)
        at
com.digitalroute.picostart.CacheManager.resolveCacheEntry(CacheManager.java:231)
        - locked <0xec049e40> (a
com.digitalroute.picostart.CacheManager)
        at
com.digitalroute.picostart.CacheManager.getCacheEntry(CacheManager.java:264)
        at
com.digitalroute.picostart.PicoClassLoader.tryCacheLoad(PicoClassLoader.java:89)
        at
com.digitalroute.picostart.PicoClassLoader.findClass(PicoClassLoader.java:73)
        - locked <0xec053590> (a
com.digitalroute.picostart.PlatformClassLoader)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
        - locked <0xec053590> (a
com.digitalroute.picostart.PlatformClassLoader)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
        at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1058)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:928)
        at
java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:765)
        at java.util.ResourceBundle.getBundle(ResourceBundle.java:702)
        at
com.digitalroute.devkit.misc.DRTextCode.getMessage(DRTextCode.java:140)
        at
com.digitalroute.devkit.misc.DRTextCode.<init>(DRTextCode.java:78)
        at
com.digitalroute.misc.MZTreeTextCode.TC_NAMING_LOOKUP_FAILURE(MZTreeTextCode.java:1800)
        at
com.digitalroute.misc.RMINamingHelper.lookup(RMINamingHelper.java:88)
        at
com.digitalroute.tap.manager.cleanup.CleanUpTap.initialize(CleanUpTap.java:62)
        at
com.digitalroute.devkit.task.DRTaskExecutor.initialize(DRTaskExecutor.java:38)
        at com.digitalroute.taskmanager.Task.execute(Task.java:99)
        - locked <0xec79d018> (a com.digitalroute.taskmanager.Task)
        at
com.digitalroute.taskmanager.TaskManagerImpl.activate(TaskManagerImpl.java:593)
        at com.digitalroute.xplan.activation.Activator
$ExecutionThread.run(Activator.java:48)

Found 1 deadlock.




Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Tue, 2005-09-20 at 10:20 -0700, Daniel John Debrunner wrote:

> The changes Knut pointed out have been merged to 10.1. If you can build
> the 10.1 branch and try them out that would be great, otherwise you
> could wait for a snapshot and try that out.

Daniel,
the version on the 10.1 branch doesn't deadlock with the testcase, I've
had it running for about half an hour now. Thanks for the info.

Best regards,
Andreas


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Daniel John Debrunner wrote:

> Andreas Fredriksson wrote:
> 
> 
>>On Thu, 2005-09-15 at 09:59 +0200, Knut Anders Hatlen wrote:
>>
>>
>>
>>>Your patch is against Derby 10.1.1.0, but I see that the development
>>>sources have changed since that version. Some of the changes are
>>>similar to those you proposed. Could you try to run your test using
>>>the latest development sources? Thanks!
>>
>>
>>Knut Anders,
>>
>>I can confirm that the version on the SVN trunk doesn't deadlock with
>>the testcase I posted previously.
> 
> 
> Thanks for testing it, I couldn't reproduce it on the trunk but assumed
> it was due to running on a single cpu machine. ! 2cpu machine was my
> next step!
> 
> I will merge the associated changes into 10.1 from the trunk and
> hopefully they will fix the problem and become part of the upcoming
> 10.1.2 release.

The changes Knut pointed out have been merged to 10.1. If you can build
the 10.1 branch and try them out that would be great, otherwise you
could wait for a snapshot and try that out.

Thanks,
Dan.


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:

> On Thu, 2005-09-15 at 09:59 +0200, Knut Anders Hatlen wrote:
> 
> 
>>Your patch is against Derby 10.1.1.0, but I see that the development
>>sources have changed since that version. Some of the changes are
>>similar to those you proposed. Could you try to run your test using
>>the latest development sources? Thanks!
> 
> 
> Knut Anders,
> 
> I can confirm that the version on the SVN trunk doesn't deadlock with
> the testcase I posted previously.

Thanks for testing it, I couldn't reproduce it on the trunk but assumed
it was due to running on a single cpu machine. ! 2cpu machine was my
next step!

I will merge the associated changes into 10.1 from the trunk and
hopefully they will fix the problem and become part of the upcoming
10.1.2 release.

Dan.


Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Thu, 2005-09-15 at 09:59 +0200, Knut Anders Hatlen wrote:

> Your patch is against Derby 10.1.1.0, but I see that the development
> sources have changed since that version. Some of the changes are
> similar to those you proposed. Could you try to run your test using
> the latest development sources? Thanks!

Knut Anders,

I can confirm that the version on the SVN trunk doesn't deadlock with
the testcase I posted previously.

Thanks for the information.

Best regards,
Andreas


Re: Class loading deadlock

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
Andreas Fredriksson <an...@digitalroute.com> writes:

> On Wed, 2005-09-14 at 08:51 +0200, Andreas Fredriksson wrote:
>
>> I think the only fix is to rewrite the class definition code in Derby to
>> avoid taking a lock on the helper objects, but I don't know the source
>> code well enough to write a proper fix.
>
> Maybe I should spend more time tweaking before sending email in the
> future.. :-)
>
> The attached patch fixes the problem for me. I'm sure there's more than
> this to it, but please feel free to use this a reference. We'll use it
> locally and see if anything else creeps up.

Your patch is against Derby 10.1.1.0, but I see that the development
sources have changed since that version. Some of the changes are
similar to those you proposed. Could you try to run your test using
the latest development sources? Thanks!

-- 
Knut Anders


Re: Class loading deadlock

Posted by "Michael J. Segel" <ms...@segel.com>.
On Thursday 15 September 2005 09:52, Andreas Fredriksson wrote:
[SNIP]
> The point I was trying to make is that if there are no locks involved at
> all, and only a single write of a field (as appears to be the case
> here), a reader might theoretically never see the written value, ever,
> as the view of memory only passes when the writer threads enters a
> monitor.
>
> So in this context, assuming there were no other monitors entered after
> the change to the action field, another thread might still see the
> default value, and take the wrong action. By locking around _something_,
> (the own object in this case), we at least give other threads a fair
> chance to see the value we're written by taking any required flush
> actions to satisfy the memory view model.
>
> I agree that if read/write ordering is an issue in the design, locking
> must be done on a much higher level which covers the event sequencing,
> but I assume this is already accounted for in the class we're talking
> about.
>
> Regards,
> Andreas

Silly question... 

It looks like you have two issues.
Deadlock contention and Transaction level isolation.

I mean, if you have a single write, by default its atomic and you don't need 
locking. To your point above, if another thread is reading the database, then 
the isolation level of the transaction will take care of the issue. (Dirty 
Reads for example. ;-) [Its what you're describing.]

If you have multiple writes that comprise an atomic action, then you need to 
have locking. (Which is what we are all agreeing with.)

If you have n writes and you don't know if n=1 or n>1 then you could argue 
that a simple solution would be to lock for all transactions, thus keeping 
the code simple, at a performance hit. (Locking has a cost, but we don't know 
how much of a performance hit we will incur.)  Balance that against the cost 
to check and see the value of n, if known...

Does that make sense, or am I still suffering from the DTs?

-- 
Michael Segel
Principal
MSCC
(312) 952-8175

Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:

> On Thu, 2005-09-15 at 07:39 -0700, Daniel John Debrunner wrote:
> 
> 
>>I'm more talking from a functional level. Even with the synchronization
>>the caller of the method is just seeing a value from the past, no
>>guarantees over it being the "latest" once the method returns. As soon
>>as the synchronization is released, ten other threads could execute and
>>change the value and our original caller already has a stale value
>>before they operate on it.
> 
> 
> The point I was trying to make is that if there are no locks involved at
> all, and only a single write of a field (as appears to be the case
> here), a reader might theoretically never see the written value, ever,
> as the view of memory only passes when the writer threads enters a
> monitor.

I think we are in agreement and are just pointing out different, but
correct, aspects of synchronization.

In the case of the action variable, it can be set multiple times, and
the setting of it is syncnronized with the reading of it. Ie. the caller
sets it, the calls the doPriv method and in the resulting call to the
run method the action variable is read to determine which action to
take. All of this happens under a single synchronized block, which your
patch changed.


Dan.


Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Thu, 2005-09-15 at 07:39 -0700, Daniel John Debrunner wrote:

> I'm more talking from a functional level. Even with the synchronization
> the caller of the method is just seeing a value from the past, no
> guarantees over it being the "latest" once the method returns. As soon
> as the synchronization is released, ten other threads could execute and
> change the value and our original caller already has a stale value
> before they operate on it.

The point I was trying to make is that if there are no locks involved at
all, and only a single write of a field (as appears to be the case
here), a reader might theoretically never see the written value, ever,
as the view of memory only passes when the writer threads enters a
monitor.

So in this context, assuming there were no other monitors entered after
the change to the action field, another thread might still see the
default value, and take the wrong action. By locking around _something_,
(the own object in this case), we at least give other threads a fair
chance to see the value we're written by taking any required flush
actions to satisfy the memory view model.

I agree that if read/write ordering is an issue in the design, locking
must be done on a much higher level which covers the event sequencing,
but I assume this is already accounted for in the class we're talking
about.

Regards,
Andreas


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:
> On Wed, 2005-09-14 at 09:18 -0700, Daniel John Debrunner wrote:
> 

>>As another general example, if I ever see code like this I know the
>>coder maybe unclear on synchronization.
>>
>>public synchronized int getCount()
>>{
>>     return count;
>>}
>>
>>The synchronization here doesn't guarantee anything, as soon as the
>>method returns the count could change so the caller is only ever seeing
>>a snapshot of the state, which is exactly what would happen if the
>>method was not synchronized.
> 
> 
> Yes, but see the above note. You're not guaranteed to see the _latest_
> value written to that field, as per the Java Memory Model. If you
> synchronize, you explicitly request to see the most recently visible
> memory in that field, as flushed by other threads taking a lock.

I'm more talking from a functional level. Even with the synchronization
the caller of the method is just seeing a value from the past, no
guarantees over it being the "latest" once the method returns. As soon
as the synchronization is released, ten other threads could execute and
change the value and our original caller already has a stale value
before they operate on it.

If the caller wants to guarantee the latest value and perform some other
action based upon the guaranteed latest value then the synchronization
needs to cover reading the value and the action. This of course is the
common case.

Thus I see such a synchronized single action method as a red-flag
because there's a good chance the coder thought that this simple
synchronization solved some other issue.

> Sorry for the lengthy bantering :-)

No need to apologize, it's good to see various points of view &
discussions etc.

Dan.



Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Wed, 2005-09-14 at 09:18 -0700, Daniel John Debrunner wrote:
> Andreas Fredriksson wrote:

> Thanks for the patch, it may provide the correct approach but from a
> quick look the patch has some incorrect synchronization.
> 
> For example, this extract.
> 
> > +		synchronized (this) {
> > +			action = 2;
> > +		}
> 
> Synchronizing a single atomic action typically makes no sense, because
> as soon as the synchronization is released the state can change, in this
> case action could be modified. Synchronization involves ensuring
> *multiple* atomic actions occur in lock step, not a single atomic action.
> In this case the setting of the action variable needs to be synchronized
> with the priv block call, otherwise another thread could change action
> and thus change the behaviour of the call that the original thread was
> expecting.

I understand the reasoning, but to explain my point of view: if there's
no locking around the reading and writing of a primitive field, it's not
guaranteed that other threads see the value written unless the writer
thread has entered at least one monitor. The behavior depends on the
memory visibility parameters of the target platform whether readers see
the "old" or "new" value.

Since I was unsure about the locking patterns of calling code I
preserved the lock on the action field to avoid these problems--if the
calling pattern is such that it isn't a problem it can of course be
removed.

> As another general example, if I ever see code like this I know the
> coder maybe unclear on synchronization.
> 
> public synchronized int getCount()
> {
>      return count;
> }
> 
> The synchronization here doesn't guarantee anything, as soon as the
> method returns the count could change so the caller is only ever seeing
> a snapshot of the state, which is exactly what would happen if the
> method was not synchronized.

Yes, but see the above note. You're not guaranteed to see the _latest_
value written to that field, as per the Java Memory Model. If you
synchronize, you explicitly request to see the most recently visible
memory in that field, as flushed by other threads taking a lock.

Refer to
http://www-128.ibm.com/developerworks/java/library/j-jtp02244.html [
Java theory and practice: Fixing the Java Memory Model, Part 1] for a
summary of these issues.

What I'm getting at here is a special case; if you write a primitive
field from one thread without synchronization, and read it later on in
another thread without synchronization, there is a theoretical
possibility you will never see the written value in the second thread
because of the current memory visibility rules in Java.

In practice it doesn't happen in most code because a lot of calls are
synchronized and implicitly cause a memory barrier, but at our shop we
program defensively since our product deploys on systems with _many_
CPUs and with even laxer memory visibility and runs with uptime
guarantees.

Sorry for the lengthy bantering :-)

Regards,
Andreas


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:

> On Wed, 2005-09-14 at 08:51 +0200, Andreas Fredriksson wrote:
> 
> 
>>I think the only fix is to rewrite the class definition code in Derby to
>>avoid taking a lock on the helper objects, but I don't know the source
>>code well enough to write a proper fix.
> 
> 
> Maybe I should spend more time tweaking before sending email in the
> future.. :-)
> 
> The attached patch fixes the problem for me. I'm sure there's more than
> this to it, but please feel free to use this a reference. We'll use it
> locally and see if anything else creeps up.

Thanks for the patch, it may provide the correct approach but from a
quick look the patch has some incorrect synchronization.

For example, this extract.

> +		synchronized (this) {
> +			action = 2;
> +		}

Synchronizing a single atomic action typically makes no sense, because
as soon as the synchronization is released the state can change, in this
case action could be modified. Synchronization involves ensuring
*multiple* atomic actions occur in lock step, not a single atomic action.
In this case the setting of the action variable needs to be synchronized
with the priv block call, otherwise another thread could change action
and thus change the behaviour of the call that the original thread was
expecting.

As another general example, if I ever see code like this I know the
coder maybe unclear on synchronization.

public synchronized int getCount()
{
     return count;
}

The synchronization here doesn't guarantee anything, as soon as the
method returns the count could change so the caller is only ever seeing
a snapshot of the state, which is exactly what would happen if the
method was not synchronized.

Dan.


Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Wed, 2005-09-14 at 08:51 +0200, Andreas Fredriksson wrote:

> I think the only fix is to rewrite the class definition code in Derby to
> avoid taking a lock on the helper objects, but I don't know the source
> code well enough to write a proper fix.

Maybe I should spend more time tweaking before sending email in the
future.. :-)

The attached patch fixes the problem for me. I'm sure there's more than
this to it, but please feel free to use this a reference. We'll use it
locally and see if anything else creeps up.

Regards,
Andreas



Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Tue, 2005-09-13 at 09:23 -0700, Daniel John Debrunner wrote:
> > Replying to myself here because I got curious of wheter a testcase would
> > reproduce the problem, so I wrote one.
> > 
> Wow, very cool!
> 
> Maybe next step would be you fixing the bug! ;-)

It's amazing how much fun it can be to provoke bugs :-)

I think the only fix is to rewrite the class definition code in Derby to
avoid taking a lock on the helper objects, but I don't know the source
code well enough to write a proper fix.

Regards,
Andreas


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:
> Replying to myself here because I got curious of wheter a testcase would
> reproduce the problem, so I wrote one.
> 

Wow, very cool!

Maybe next step would be you fixing the bug! ;-)

Thanks,
Dan.



Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
Replying to myself here because I got curious of wheter a testcase would
reproduce the problem, so I wrote one.

This self-contained test class produces the deadlock in a matter of
seconds on my 2-CPU workstation box; you might need to tweak it a bit to
get the bug on other machines. Sending -QUIT yields:

Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x09fbd2c4 (object 0xabf7d448, a deadlocktest
$TestCL),  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x09fbd4f4 (object 0xabf18b08, a
org.apache.derby.impl.services.reflect.ReflectClassesJava2),
  which is held by "Thread-1"

I noticed the key was to change the second query (with a random item in
the query string itself) so that the query cannot be cached.

Enjoy,
Andreas


deadlocktest.java:

import java.sql.*;

public class deadlocktest {

    public static final String URL = "jdbc:derby:deadlockdb";

    private class TestCL extends ClassLoader {
        protected Class findClass(String name) throws
ClassNotFoundException {
            issueFindQuery();
            return super.findClass(name);
        }
    }

    private class ThreadA implements Runnable {
        public void run() {
            for (;;) {
                String pkg = "nonexistent.package.foo_" +
                    System.currentTimeMillis();
                try {

Thread.currentThread().getContextClassLoader().loadClass(pkg);
                } catch (ClassNotFoundException ignored) {}
            }
        }
    }

    private class ThreadB implements Runnable {
        public void run() {
            for (;;) {
                issueSlowQuery();
            }
        }
    }

    public void issueSlowQuery() {
        try {
            PreparedStatement ps = conn_b.prepareStatement
                ("SELECT id, bar FROM foo WHERE id > 50 AND bar
LIKE ?");
            ps.setString(1, "%1%");
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
            }
            rs.close();
            ps.close();
            try { Thread.sleep(1); } catch (InterruptedException
ignored) {}
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }

    public void issueFindQuery() {
        try {
            PreparedStatement ps = conn_a.prepareStatement
                ("SELECT bar||'" + System.currentTimeMillis() + "' FROM
foo WHERE bar LIKE ?");
            ps.setString(1, "%2%");
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                try { Thread.sleep(1); } catch (InterruptedException
ignored) {}
            }
            rs.close();
            ps.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }

    public void executeNoThrow(Statement s, String sql) {
        try {
            s.execute(sql);
        } catch (SQLException ignored) {}
    }

    public void populate(Connection conn) throws Exception {
        System.out.println("populating");
        Statement s = conn.createStatement();

        // create the table
        executeNoThrow(s, "DROP TABLE foo");
        s.execute("CREATE TABLE foo (id int, bar varchar(60))");
        s.close();

        // populate it
        PreparedStatement ps = conn_a.prepareStatement
            ("INSERT INTO foo VALUES (?, ?)");
        for (int i=0; i<100; ++i) {
            ps.setInt(1, i);
            ps.setString(2, "foobar" + System.currentTimeMillis());
            ps.executeUpdate();
        }
        ps.close();
    }

    private Connection conn_a, conn_b;

    public deadlocktest() throws Exception {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        conn_a = DriverManager.getConnection(URL + ";create=true");
        populate(conn_a);
        conn_b = DriverManager.getConnection(URL);
        System.out.println("starting threads");
        Thread a = new Thread(new ThreadA());
        Thread b = new Thread(new ThreadB());
        ClassLoader foo = new TestCL();
        a.setContextClassLoader(foo);
        b.setContextClassLoader(foo);
        a.start();
        try { Thread.sleep(2000); } catch (InterruptedException ignored)
{}
        b.start();
    }

    public static void main(String[] args) {
        try {
            new deadlocktest();
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }

}



Re: Class loading deadlock

Posted by Andreas Fredriksson <an...@digitalroute.com>.
On Tue, 2005-09-13 at 06:52 -0700, Daniel John Debrunner wrote:

> I working in Derby's classloader area at the moment, so I'll look into
> this, do you have a simple reproducible case?

Hi Daniel,
I don't think it's easily reproducible without breaking some rules.

Here's an idea for a testcase (untested, but probably not too far off):

Write a ClassLoader that performs some random query against a database
before calling the parent's findClass() method. Making this query take a
long time is a good idea. Maybe a CREATE FUNCTION that maps to
Thread.sleep() would help. A slow query increases the window for
deadlock, but it might need to vary because of caching schemes inside
Derby I'm not aware of--just include the class name as a query parameter
or so.

public class TestClassLoader {
   public Class findClass(String name, boolean resolve) {
	issueTestQuery(name);
	return super.findClass();
   }
}

Set your TestClassLoader as the context class loader for both threads:

TestClassLoader myClassLoader = new TestClassLoader();
thread_a.setContextClassLoader(myClassLoader);
thread_b.setContextClassLoader(myClassLoader);

Have one thread continually try to load bogus classes from this class
loader so that the findClass() map will always fail and your loadClass()
will be called. This first thread represents the ClassLoader->DB lock
chain.

for (;;) {
   String pkg = "nonexistent.package.foo_" + System.currentTime();
   try {
      Class.forName(pkg);
   } catch (ClassNotFoundException ignored) {}
}

Have one other thread continually run a large query against Derby (with
its context class loader also set to myClassLoader). With luck this
query will trigger the DB->ClassLoader lock chain and you will see the
deadlock eventually.

for (;;) {
   issueLargeQuery();
}

This should produce the deadlock fairly quickly, if you have box with
more than one CPU; with one CPU you will have to wait for the scheduler
to timeslice the threads so that the deadlock can occur (that's why the
sleep inside the query helps).

Please let me know if there's anything else I can assist you with.

Best regards,
Andreas


Re: Class loading deadlock

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Andreas Fredriksson wrote:

> Hey list,
> we've re-targeted our system to Derby and now we're seeing some class
> loading-related deadlocks.
> 
> To give some background, our system loads code from the database and
> other various sources (through our own set of class loaders). This has
> worked fine with Oracle, MS SQL Server and many other databases, but
> with Derby there is an issue with the interaction pattern of class
> loaders and the database helper objects.
> 
> The deadlock occurs when one thread has locked the classloader to find a
> certain class which then triggers a database query to find the right
> data in one thread. At the same time a daily query is running against
> the database, which triggers class loader operations against the already
> locked class loader instance.
> 
> Short and sweet:
> 
> Thread A (loading a class/resource via a ResourceBundle)
> Lock ClassLoader (implicit through ClassLoader.loadClass())
> Lock DB helper object (when issuing a query)
> 
> Thread B (executing some SQL query)
> Lock DB helper object (to generate code?)
> Lock ClassLoader (implicit via loadClass())
> 
> This is the ABC of deadlocks, but it's pretty hard to resolve because of
> the synchronized ClassLoader.loadClass() method. The only option I see
> is to remove the ReflectClassesJava2 lock around the classloader
> operations and handle failures by retrying them. Would that work?
> 
> I've appended the full call stack information at the bottom of this
> email.
> 
> What can we do to resolve this bug?

I working in Derby's classloader area at the moment, so I'll look into
this, do you have a simple reproducible case?

Dan.