Posted to dev@uima.apache.org by "Richard Eckart de Castilho (JIRA)" <de...@uima.apache.org> on 2014/04/16 15:24:15 UTC
[jira] [Comment Edited] (UIMA-3747) Memory management problem with
compressed binary deserialization
[ https://issues.apache.org/jira/browse/UIMA-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971394#comment-13971394 ]
Richard Eckart de Castilho edited comment on UIMA-3747 at 4/16/14 1:23 PM:
---------------------------------------------------------------------------
Not that I am aware of.
You can reproduce the problem with the following simple test. Before running the test, remove the "private" modifier from TypeSystemImpl.typeSystemMappers.
{noformat}
public void testCasReuseWithDifferentTypeSystems() throws Exception
{
  // Create a CAS
  CAS cas = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
  cas.setDocumentLanguage("latin");
  cas.setDocumentText("test");

  // Serialize it
  ByteArrayOutputStream baos = new ByteArrayOutputStream(1024);
  Serialization.serializeWithCompression(cas, baos, cas.getTypeSystem());

  // Create a new CAS
  long min = Long.MAX_VALUE;
  long max = 0;
  CAS cas2 = CasCreationUtils.createCas((TypeSystemDescription) null, null, null);
  for (int i = 0; i < 100000; i++) {
    // Simulate us reinitializing the CAS with a new type system.
    TypeSystemImpl tgt = new TypeSystemImpl();
    for (int t = 0; t < 1000; t++) {
      tgt.addType("random" + t, tgt.getTopType());
    }
    tgt.commit();

    // Deserialize into the new type system
    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    Serialization.deserializeCAS(cas2, bais, tgt, null);

    long cur = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    max = Math.max(cur, max);
    min = Math.min(cur, min);
    if (i % 100 == 0) {
      System.out.printf("Cached: %d Max: %d Room left: %d %n",
          ((TypeSystemImpl) cas2.getTypeSystem()).typeSystemMappers.size(), max,
          Runtime.getRuntime().maxMemory() - max);
    }
  }
}
{noformat}
Eventually, the output screeches to a halt:
{noformat}
...
Cached: 2301 Max: 1466865472 Room left: 442067136
Cached: 2401 Max: 1529083736 Room left: 379848872
Cached: 2501 Max: 1583309160 Room left: 325623448
Cached: 2601 Max: 1618738616 Room left: 290193992
Cached: 2701 Max: 1661499672 Room left: 247432936
Cached: 2801 Max: 1717535904 Room left: 191396704
Cached: 2901 Max: 1717535904 Room left: 191396704
<hanging>
{noformat}
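As an alternative to removing the "private" modifier for the test, the size of the cache could probably be read via reflection. The following is only a sketch under assumptions: the Holder class is a hypothetical stand-in so that the example runs without UIMA on the classpath, and it assumes the field is a java.util.Map (the test above only shows that it has a size() method).

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class PrivateFieldPeek {

    // Hypothetical stand-in for a class with a private cache field.
    // It is NOT the UIMA TypeSystemImpl; it only mimics the field name.
    static class Holder {
        private final Map<Object, Object> typeSystemMappers = new HashMap<>();

        void remember(Object key) {
            typeSystemMappers.put(key, new Object());
        }
    }

    // Reads the named private field from the target object and returns its
    // size, assuming (this is an assumption) that the field holds a Map.
    static int sizeOfPrivateMap(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            return ((Map<?, ?>) field.get(target)).size();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        holder.remember("a");
        holder.remember("b");
        // Prints "Cached: 2"
        System.out.println("Cached: " + sizeOfPrivateMap(holder, "typeSystemMappers"));
    }
}
```

In the test, the cast-and-field-access expression in the printf call would then be replaced by a call to such a helper, so that the UIMA sources do not need to be patched.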
> Memory management problem with compressed binary deserialization
> ----------------------------------------------------------------
>
> Key: UIMA-3747
> URL: https://issues.apache.org/jira/browse/UIMA-3747
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.4.2SDK
> Reporter: Richard Eckart de Castilho
> Assignee: Marshall Schor
> Fix For: 2.6.0SDK
>
>
> We think we stumbled across a memory management problem with the new compressed binary serialization when a CAS is reset/reused in a loop, e.g. in the uimaFIT SimplePipeline. When we use form 6, we consistently run into out-of-memory situations. Finally, we took the time to do a heap dump analysis.
> We found a huge TypeSystemImpl instance in the heap (~450MB). What makes it huge is the field "typeSystemMappers", which in our case contains 1000+ entries, each of them apparently using a TypeSystemImpl as key.
> It looks like typeSystemMappers is never reset when a CAS is reused. My current theory is that it should be reset when CAS.reset() is called; otherwise, type systems accumulate there when binary deserialization is used to repeatedly load data into a CAS in a loop that resets and reuses the CAS.
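The accumulation pattern suspected in the description can be illustrated in isolation. This is a minimal sketch and not the UIMA implementation: TargetTypeSystem and ReusableContainer are hypothetical stand-ins, and the cache is modelled as a plain HashMap keyed by the target object, which is an assumption about how such a cache might behave.

```java
import java.util.HashMap;
import java.util.Map;

public class MapperCacheSketch {

    // Hypothetical stand-in for a target type system (identity-based equality).
    static final class TargetTypeSystem { }

    // Hypothetical stand-in for a reusable container with a per-target mapper cache.
    static final class ReusableContainer {
        private final Map<TargetTypeSystem, Object> mappers = new HashMap<>();

        // Simulates deserializing against a target: a mapper is created and cached.
        void deserializeInto(TargetTypeSystem target) {
            mappers.computeIfAbsent(target, t -> new Object());
        }

        // Reset that leaves the cache untouched (the suspected current behaviour).
        void resetKeepingCache() { }

        // Reset that also drops the cached mappers (the theory from the report).
        void resetClearingCache() {
            mappers.clear();
        }

        int cachedMappers() {
            return mappers.size();
        }
    }

    // Runs the reset/reuse loop with a fresh target per iteration and
    // returns the largest cache size observed.
    static int runLoop(int iterations, boolean clearOnReset) {
        ReusableContainer container = new ReusableContainer();
        int peak = 0;
        for (int i = 0; i < iterations; i++) {
            container.deserializeInto(new TargetTypeSystem());
            peak = Math.max(peak, container.cachedMappers());
            if (clearOnReset) {
                container.resetClearingCache();
            } else {
                container.resetKeepingCache();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("Peak without clearing: " + runLoop(1000, false)); // 1000
        System.out.println("Peak with clearing:    " + runLoop(1000, true));  // 1
    }
}
```

Without clearing, every fresh target stays strongly reachable through the map, so the cache grows by one entry per iteration. Clearing on reset keeps it at a single entry. A weak-keyed map would be another option, but whether that fits the actual code is not something this sketch can answer.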
--
This message was sent by Atlassian JIRA
(v6.2#6252)