Posted to derby-dev@db.apache.org by "Dag H. Wanvik (JIRA)" <ji...@apache.org> on 2007/04/18 02:05:15 UTC
[jira] Updated: (DERBY-1947) OutOfMemoryError after repeated calls
to boot and shutdown a database
[ https://issues.apache.org/jira/browse/DERBY-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dag H. Wanvik updated DERBY-1947:
---------------------------------
Attachment: ProcedureInTriggerTest-threadstacks.txt
I was getting ready to commit the patch for this issue, but then I
updated my trunk and found that the following newly
introduced/converted JUnit test makes Derby deadlock when the patch
(DERBY-1947-2) is active:
ProcedureInTriggerTest
It looks like a deadlock in the VM (see also the enclosed stacks of
the involved threads) caused by the patch's use of finalizer code
which leads to synchronization, and I am presently at a loss as to how
best to resolve this.
What seems to happen is this:
* In ProcedureInTriggerTest:388, a call to Statement#execute is performed:
> s.execute("create trigger select_from_trig_table no cascade before " +
"delete on t1 for each STATEMENT call proc_reads_sql(1)");
This call synchronizes on the current (root) connection object.
* At this time I believe there are still two nested connection objects
from previous statements (the test uses triggers and stored
procedures called from them), which are no longer referenced but
for which the finalizers have yet to run (not yet garbage collected).
* Now, the CreateTriggerConstantAction#executeConstantAction in line
275 uses the Dependency manager to send an invalidate to the trigger
table so that DML statements on this table will be recompiled. (see
stack of thread "main" in attachment
ProcedureInTriggerTest-threadstacks.txt).
* As a result of this, GenericPreparedStatement#prepareToInvalidate is
called, which in turn calls
GenericLanguageConnectionContext#verifyNoOpenResultSets.
* verifyNoOpenResultSets contains this sequence of calls:
// There may be open ResultSet's that are yet to be garbage collected
// let's try and force these out rather than throw an error
System.gc();
System.runFinalization();
* This last call causes the deadlock. This seems to happen because an
attempt is made to run the finalizers of the said nested
connections. With the patch, the finalizer of EmbedConnection
contains a call to EmbedConnection#close, which synchronizes on the
root connection object (already locked by the "main" thread as part
of the execute). The finalizers run in a thread forked by the call
to System.runFinalization(), which then hangs waiting for the
finalizer thread to complete (see Finalizer#forkSecondaryFinalizer's
call to sft.join(), ca. line 115 in Sun JDK 1.6). The finalizers
are stuck waiting for the lock held by "main", so deadlock ensues.
* Now, without the call to System.runFinalization(), the test runs
successfully, presumably because the two connection objects will
then be garbage collected later, after s.execute() has terminated.
Removing only the call to System.gc() also avoids the deadlock,
but breaks the test (verifyNoOpenResultSets throws X0X95.S
- LANG_CANT_INVALIDATE_OPEN_RESULT_SET in line 1733):
"Operation 'CREATE TRIGGER' cannot be performed on object 'T1'
because there is an open ResultSet dependent on that object."
So it would seem that calling System.gc() is sufficient, but
System.runFinalization is harmful, since it will wait for
finalization to finish, which is a dangerous thing if the thread
already holds locks. *Or* one could argue that the finalizers should
never call anything that could require a lock (too strict in my
view).
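For illustration, the deadlock pattern above can be sketched in a
standalone program. The class and field names here
(FinalizerDeadlockSketch, NestedConnection, rootLock) are hypothetical
stand-ins of mine, not Derby code; by default the program takes the
safe path, and only actually triggers the hang if you pass it an
argument. (finalize() and runFinalization() are deprecated in modern
JDKs; this mirrors the JDK 1.6 behavior described above.)

```java
// Sketch of the deadlock: a finalizer that synchronizes on a lock
// already held by the thread that calls System.runFinalization().
// Hypothetical stand-in classes; NOT actual Derby code.
public class FinalizerDeadlockSketch {

    // Stands in for the root EmbedConnection object.
    static final Object rootLock = new Object();

    static class NestedConnection {
        @Override
        protected void finalize() {
            // Like EmbedConnection#close in the patch: takes the root lock.
            synchronized (rootLock) {
                System.out.println("finalizer ran");
            }
        }
    }

    public static void main(String[] args) {
        new NestedConnection(); // immediately unreachable, like the nested connections

        if (args.length > 0) {
            // DANGER: with rootLock held, this is the deadlock from the
            // bug report -- runFinalization() waits for the finalizer
            // thread, which blocks trying to enter rootLock.
            synchronized (rootLock) {
                System.gc();
                System.runFinalization(); // hangs: do not run in production code
            }
        } else {
            // Without holding the lock, finalization completes safely.
            System.gc();
            System.runFinalization();
            System.out.println("no lock held: finalization completed");
        }
    }
}
```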
So, now I am trying to understand what would be the correct approach
to resolve this.
- Without the call to close in the finalizer of EmbedConnection, we
have DERBY-1947, that is, ContextManager objects referenced from
the HashSet 'allContexts' of the ContextService may leak.
- With the call to close in the finalizer of EmbedConnection, we risk
deadlock as long as any Derby code calls System.runFinalization()
anywhere. Grepping, I see this happens in three places in the
engine:
LowMemory.java:95: System.runFinalization();
GenericLanguageConnectionContext.java:1617: System.runFinalization();
GenericLanguageConnectionContext.java:1709: System.runFinalization();
Possible approaches:
1) Maybe we could change the HashSet 'allContexts' of the
ContextService to use weak references, perhaps a WeakHashSet, if
there is such a thing. That way, the ContextManager objects would
eventually be cleaned up.
2) Remove the explicit calls to System.runFinalization. Keeping just
the System.gc() call seems less risky, but may be less effective,
so it could break existing apps. System.gc() is not guaranteed to
yield results anyway...
3) Figure out a way to explicitly close the result sets in question,
so the gc calls would be redundant. I am not sure how easy this
would be; they are probably there for a good reason.
4) Remove the calls and allow the execution to fail as indicated in
the comment of verifyNoOpenResultSets:
// There may be open ResultSet's that are yet to be garbage collected
// let's try and force these out rather than throw an error
System.gc();
System.runFinalization();
This could break existing apps.
5) other ideas..?
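On approach 1): there is no java.util.WeakHashSet in the JDK, but
Collections.newSetFromMap over a WeakHashMap yields a Set whose
elements are only weakly held, so entries disappear once nothing else
references them. A minimal sketch (a plain Object stands in for a
ContextManager; the timing of weak-reference clearing after
System.gc() is a JVM behavior, not a guarantee):

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

// Sketch: a set with weakly referenced elements. An element left
// behind (like the leaked ContextManager cm1) would no longer pin
// memory once all other strong references to it are gone.
public class WeakSetSketch {
    public static void main(String[] args) throws InterruptedException {
        Set<Object> allContexts =
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>());

        Object cm1 = new Object();   // stands in for a ContextManager
        allContexts.add(cm1);
        System.out.println("before: " + allContexts.size());

        cm1 = null;                  // drop the last strong reference
        System.gc();                 // hint: full GC clears the weak ref
        Thread.sleep(100);           // give the reference handler a moment
        System.out.println("after: " + allContexts.size());
    }
}
```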
Finally, can this deadlock be considered a Java run-time error? If
not, what is the contract being broken by Derby app code in the above?
I tried to search a bit on the advisability of explicitly calling
runFinalization while holding Java locks, with finalizers that can
take locks, but I didn't find anything conclusive...
Any advice is appreciated :)
> OutOfMemoryError after repeated calls to boot and shutdown a database
> ---------------------------------------------------------------------
>
> Key: DERBY-1947
> URL: https://issues.apache.org/jira/browse/DERBY-1947
> Project: Derby
> Issue Type: Bug
> Components: Services
> Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.2.1.6
> Environment: Any
> Reporter: Rajesh Kartha
> Assigned To: Dag H. Wanvik
> Attachments: DERBY-1947-1.diff, DERBY-1947-1.stat, DERBY-1947-2.diff, DERBY-1947-2.stat, HeapAnalysis_op.jpg, ProcedureInTriggerTest-threadstacks.txt, testEmbedAndClient.java
>
>
> I came across this OOM issue while running some system tests involving
> backup and restore against Derby. The test is expected to run forever but
> using the default heap space it runs into OOM within 2 days. I mentioned this earlier
> in my reply to the 10.2.1.6 vote - http://www.nabble.com/Re%3A--VOTE--10.2.1.6-release-p6650528.html
> Also there has been some discussion on the list on related topics:
> http://issues.apache.org/jira/browse/DERBY-23 and
> http://www.nabble.com/question-about-shutdown-tf2300504.html
> Basic Analysis:
> --------------------
> Wrote a simple Java app (attached to this issue) that booted and shutdown the same
> database multiple times. Depending on the heapsize the program ran into the
> OOM at some point, as expected. Heap dump analysis using the IBM HeapAnalyzer
> revealed that the HashSet (allContexts) within org.apache.derby.iapi.services.context.ContextService
> seemed to be the location of the leak (a snapshot of the heap analysis is attached).
> A little bit of debugging shows that:
> - for every connection, two ContextManager objects (say, cm1, cm2) are added to the HashSet
> - for every shutdown a new ContextManager object (say, cm3) is added and two objects are removed
> - the objects removed are cm2 and cm3, in that sequence
> - but the object cm1 gets left behind
> This happens for every connect/shutdown sequence and since one of the ContextManager objects added to the
> HashSet is not removed as a part of the cleanup, it contributes to growth in memory usage, hence
> an OOM eventually.
> For example:
> ============
> A 64M heap allowed only about 1107 iterations of connect/shutdown before running into OOM,
> leaving 1108 unused ContextManager objects in memory.
> java -Xmx64M testEmbedAndClient
> ++++Debug: add() Size of allContexts HashSet obj= 1
> ----Debug: remove() Size of allContexts HashSet obj= 0
> ++++Debug: add() Size of allContexts HashSet obj= 1
> ----Debug: remove() Size of allContexts HashSet obj= 0
> ++++Debug: add() Size of allContexts HashSet obj= 1
> ++++Debug: add() Size of allContexts HashSet obj= 2
> ==== Database booted in embedded ====
> ++++Debug: add() Size of allContexts HashSet obj= 3
> ----Debug: remove() Size of allContexts HashSet obj= 2
> ----Debug: remove() Size of allContexts HashSet obj= 1
> ==== Shutdown complete in embedded ====
> ++++Debug: add() Size of allContexts HashSet obj= 2
> ++++Debug: add() Size of allContexts HashSet obj= 3
> ==== Database booted in embedded ====
> ++++Debug: add() Size of allContexts HashSet obj= 4
> ----Debug: remove() Size of allContexts HashSet obj= 3
> ----Debug: remove() Size of allContexts HashSet obj= 2
> ..
> ..
> ..
> ==== Database booted in embedded ====
> ++++Debug: add() Size of allContexts HashSet obj= 1109
> ----Debug: remove() Size of allContexts HashSet obj= 1108
> ----Debug: remove() Size of allContexts HashSet obj= 1107
> ==== Shutdown complete in embedded ====
> ++++Debug: add() Size of allContexts HashSet obj= 1108
> ++++Debug: add() Size of allContexts HashSet obj= 1109
> ----Debug: remove() Size of allContexts HashSet obj= 1108
> java.sql.SQLException: Failed to start database 'testdb', see the next exception
> for details.
> at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
> at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:89)
> at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:95)
> at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:174)
> at org.apache.derby.impl.jdbc.EmbedConnection.newSQLException(EmbedConnection.java:1985)
> at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:1649)
> at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:223)
> at org.apache.derby.impl.jdbc.EmbedConnection30.<init>(EmbedConnection30.java:73)
> at org.apache.derby.jdbc.Driver30.getNewEmbedConnection(Driver30.java:74)
> at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:210)
> ----Debug: remove() Size of allContexts HashSet obj= 1107
> at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:117)
> at java.sql.DriverManager.getConnection(DriverManager.java:525)
> at java.sql.DriverManager.getConnection(DriverManager.java:193)
> at testEmbedAndClient.testInEmbedded(testEmbedAndClient.java:40)
> at testEmbedAndClient.main(testEmbedAndClient.java:19)
> java.lang.OutOfMemoryError: Java heap space
> OOM happened after 1107 iterations
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.