Posted to j-dev@xerces.apache.org by ji...@apache.org on 2004/05/27 11:44:02 UTC
[jira] Commented: (XERCESJ-724) StringBuffer idiom in DeferredDocumentImpl causes large memory usage
The following comment has been added to this issue:
Author: Tony Butterfield
Created: Thu, 27 May 2004 2:42 AM
Body:
>Also, org/apache/xerces/parsers/AbstractDOMParser uses the same idiom
>which may be a problem, but I did not take the time to test it.
This is a problem too, with the same consequences: text nodes are created that are backed by large (200 KB) StringBuffers. A fix is
to change the two references to fStringBuffer.setLength(0) to fStringBuffer = new StringBuffer();
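A minimal sketch of that change, using a hypothetical TextCollector class rather than the real AbstractDOMParser code. Only the field name fStringBuffer comes from the comment above; the class, its methods, and the values list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser's text handling; not Xerces source.
public class TextCollector {
    private StringBuffer fStringBuffer = new StringBuffer();
    private final List<String> values = new ArrayList<String>();

    /** Accumulates character data for the current text node. */
    public void characters(String text) {
        fStringBuffer.append(text);
    }

    /** Finishes the current text node and prepares for the next one. */
    public void endTextNode() {
        values.add(fStringBuffer.toString());
        // Was: fStringBuffer.setLength(0);
        // A fresh buffer starts at the default capacity, so a later String
        // cannot inherit a large array grown by an earlier text node.
        fStringBuffer = new StringBuffer();
    }

    public List<String> getValues() {
        return values;
    }

    public static void main(String[] args) {
        TextCollector collector = new TextCollector();
        collector.characters("a long first text node");
        collector.endTextNode();
        collector.characters("ab");
        collector.endTextNode();
        System.out.println(collector.getValues());
    }
}
```

The trade-off is one extra small allocation per text node in exchange for never pinning a large array behind a short string.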
This issue still exists in XercesJ 2.6.2. However, a fix has gone into JDK 1.4.2 that changes the behaviour of StringBuffer and resolves the problem on that platform; see:
http://developer.java.sun.com/developer/bugParade/bugs/4724129.html
---------------------------------------------------------------------
View this comment:
http://issues.apache.org/jira/browse/XERCESJ-724?page=comments#action_35743
---------------------------------------------------------------------
View the issue:
http://issues.apache.org/jira/browse/XERCESJ-724
Here is an overview of the issue:
---------------------------------------------------------------------
Key: XERCESJ-724
Summary: StringBuffer idiom in DeferredDocumentImpl causes large memory usage
Type: Bug
Status: Open
Project: Xerces2-J
Components: DOM
Versions: 2.4.0
Assignee: Xerces-J Developers Mailing List
Reporter: Scott Nygren
Created: Mon, 5 May 2003 3:59 PM
Updated: Thu, 27 May 2004 2:42 AM
Environment: Operating System: Windows NT/2K
Platform: PC
Description:
We have a 3 MB document that uses over 1.5 GB of memory to parse and causes an
OutOfMemoryError on our web server. I traced it to the fact that the document
has a 16K text node at the beginning, followed by 93,000 much shorter text
nodes. Each of those text nodes is allocated 16K of memory even though it may
hold only a few characters. The document that causes this is too big to
include here, but the code below shows the problem in the abstract.
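As a rough check on those figures, the sketch below multiplies the node count by the per-node allocation. It assumes "16K" means 16 * 1024 bytes retained per text node (the report does not say whether the figure is bytes or characters), and the class and method names are hypothetical.

```java
// Back-of-the-envelope estimate; names and the bytes-per-node figure are assumptions.
public class MemoryEstimate {
    /** Bytes retained if every text node pins a buffer of the given size. */
    static long retainedBytes(long nodeCount, long bytesPerNode) {
        return nodeCount * bytesPerNode;
    }

    public static void main(String[] args) {
        long total = retainedBytes(93000L, 16L * 1024L);
        System.out.println(total + " bytes"); // prints "1523712000 bytes"
    }
}
```

Under that assumption the total is about 1.52 * 10^9 bytes, which is in line with the "over 1.5 GB" observed.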
This problem is due to the way the Sun Windows JDK 1.4.1 handles memory shared
between Strings and StringBuffers (likely other versions too, but I haven't
tested them). When StringBuffer.toString is called, a String is created that
shares the StringBuffer's internal char array, which in my problem case is 16K.
Then, when the next StringBuffer method that modifies the object (such as
setLength) is called, a new char array is allocated for the StringBuffer at the
full capacity (another 16K).
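The sharing behaviour described above can be pictured with a toy model. This is not the JDK source: SharedBufferModel and its methods are hypothetical, and char arrays stand in for the Strings that toString and substring would return.

```java
// Toy model of the copy-on-write sharing described above; NOT the JDK source.
public class SharedBufferModel {
    private char[] value;
    private int count;
    private boolean shared; // true once 'value' has been handed out

    public SharedBufferModel(int capacity) {
        value = new char[capacity];
    }

    public void append(char c) {
        copyIfShared();
        value[count++] = c; // capacity growth omitted for brevity
    }

    public void setLength(int newLength) {
        copyIfShared();
        count = newLength; // assumes newLength <= capacity
    }

    /** Models toString(): the result keeps the full-capacity array alive. */
    public char[] toBackingArray() {
        shared = true;
        return value;
    }

    /** Models substring(0): the result is a right-sized copy. */
    public char[] toTrimmedArray() {
        char[] copy = new char[count];
        System.arraycopy(value, 0, copy, 0, count);
        return copy;
    }

    /** Any mutation after sharing allocates a new full-capacity array. */
    private void copyIfShared() {
        if (shared) {
            char[] fresh = new char[value.length];
            System.arraycopy(value, 0, fresh, 0, count);
            value = fresh;
            shared = false;
        }
    }

    public static void main(String[] args) {
        SharedBufferModel buf = new SharedBufferModel(16 * 1024);
        buf.append('a');
        buf.append('b');
        System.out.println(buf.toBackingArray().length); // 16384
        buf.setLength(0); // triggers a second full-capacity allocation
        buf.append('a');
        System.out.println(buf.toTrimmedArray().length); // 1
    }
}
```

In this model every toBackingArray call followed by a mutation leaves one full-capacity array behind, which is the pattern that multiplies across thousands of short text nodes.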
public class TestStringBuffer {

    // run with java -Xms30m -Xmx30m TestStringBuffer

    /** Main program entry point. */
    public static void main(String argv[]) {
        StringBuffer buf = new StringBuffer(10000);
        String [] ans1 = new String[1000];
        String [] ans2 = new String[1000];
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        long free1 = rt.freeMemory();

        // all strings are allocated 10000
        // uses over 10 Meg to store array
        for (int i=0; i < ans1.length; i++) {
            buf.setLength(0);
            buf.append("a");
            buf.append("b");
            ans1[i] = buf.toString();
        }
        rt.gc();
        long free2 = rt.freeMemory();

        // uses about 60 K to store array
        for (int i=0; i < ans2.length; i++) {
            buf.setLength(0);
            buf.append("a");
            buf.append("b");
            ans2[i] = buf.substring(0);
        }
        rt.gc();
        long free3 = rt.freeMemory();

        System.out.println("Loop 1 used (toString) "+(free1 - free2));
        System.out.println("Loop 2 used (substring) "+(free2 - free3));
    }
}
I was able to fix my problem by changing
org/apache/xerces/dom/DeferredDocumentImpl.getNodeValueString to use
value = fBufferStr.substring(0);
instead of
value = fBufferStr.toString();
wherever it is referenced.
Also, org/apache/xerces/parsers/AbstractDOMParser uses the same idiom which may
be a problem, but I did not take the time to test it.
I am also going to submit a bug to Sun to recommend at least saying something
in the StringBuffer doc that Strings from toString could be very large.