Posted to general@xerces.apache.org by Frédéric Bauchet <fb...@shom.fr> on 2000/01/07 11:11:54 UTC

java.lang.OutOfMemoryError for big XML file (over 1.5Mb)

When using the DOMWriter example with Sun JDK 1.1.8 on Windows NT 4,
I get the following error message with an XML file of 1.5 megabytes.

java.lang.OutOfMemoryError:
	at org.apache.xerces.dom.DeferredDocumentImpl.createChunk(DeferredDocumentImpl.java:1386)
	at org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity(DeferredDocumentImpl.java:1294)
	at org.apache.xerces.dom.DeferredDocumentImpl.createNode(DeferredDocumentImpl.java:1310)
	at org.apache.xerces.dom.DeferredDocumentImpl.createTextNode(DeferredDocumentImpl.java:389)
	at org.apache.xerces.parsers.DOMParser.ignorableWhitespace(DOMParser.java:1117)
	at org.apache.xerces.framework.XMLParser.processWhitespace(XMLParser.java:2088)
	at org.apache.xerces.readers.UTF8Reader.scanContent(UTF8Reader.java:2182)
	at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1134)
	at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
	at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1138)
	at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1177)
	at dom.wrappers.NonValidatingDOMParser.parse(NonValidatingDOMParser.java:102)
	at dom.DOMWriter.print(DOMWriter.java:188)
	at dom.DOMWriter.main(DOMWriter.java:405)


The problem occurs even though I am still under the machine's memory limit.

With JDK 1.2.2 it works fine, but it takes 35 MB of memory to parse
the file, and after that, printing the file through the recursive
"public void print(Node node)" method makes the memory grow by another
40 MB. But there is no object creation in that method!

Does anybody know why the "print" method takes so much memory?

Thanks for any answer.

Frederic Bauchet

Re: java.lang.OutOfMemoryError for big XML file (over 1.5Mb)

Posted by Andy Clark <an...@apache.org>.

Frédéric Bauchet wrote:
> 
> When using the DOMWriter example with Sun JDK 1.1.8 on Windows NT 4,
> I get the following error message with an XML file of 1.5 megabytes.
> 
> java.lang.OutOfMemoryError:

The DOM is a memory-intensive data structure. You must increase your
allowable heap size if you want to be able to read in a large
document.
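
As a rough way to see which limit you are running under, a sketch like
the following prints the JVM's heap figures. It assumes a JVM that
provides Runtime.maxMemory() (Java 1.4 or later, so not the JDK 1.1.8
from the original report), and HeapReport is a made-up name. The limit
itself is raised on the command line: -mx on JDK 1.1, -Xmx on Java 2
(for example -Xmx64m).

```java
// Hypothetical helper: reports heap usage so you can see how close
// a parse gets to the configured limit.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Heap currently in use, in bytes (allocated heap minus free space in it). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("used: " + usedBytes() / MB + " MB");
        System.out.println("max:  " + maxBytes() / MB + " MB");
    }
}
```

Calling usedBytes() before and after a parse, and again after a full
traversal, gives a crude picture of where the memory goes. The numbers
are only approximate because the garbage collector runs when it likes.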

> With JDK 1.2.2 it works fine, but it takes 35 MB of memory to parse

Java 2 has a larger default heap size (32M) than Java 1.1 (16M).

> Does anybody know why the "print" method takes so much memory?

We have a "deferred" DOM implementation that only creates those
DOM nodes that you actually traverse. This allows the parser to
parse and return a document much more quickly than if it had to
create all of the nodes up front. As with most programming
trade-offs, the speed increase comes at the expense of more
memory. Because the deferred implementation is the default, the
node objects are created as you traverse the tree, which is why
memory grows during "print" even though the method itself
allocates nothing.
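
To make the idea concrete, here is a toy model of deferred expansion.
It is only a sketch and not the Xerces code: DeferredTree and all of
its members are invented for illustration. Nodes are first recorded in
compact tables, and a node object is allocated only when that node is
first visited.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of deferred node expansion. NOT the Xerces implementation;
// every name here is hypothetical.
final class DeferredTree {
    // Compact "parse-time" storage: one table entry per node, no node objects.
    private final List<String> names = new ArrayList<>();
    private final List<Integer> parents = new ArrayList<>();
    // Node objects, filled in lazily; null means "not expanded yet".
    private final List<Node> expanded = new ArrayList<>();
    private int expandedCount = 0;

    static final class Node {
        final int index;
        final String name;

        Node(int index, String name) {
            this.index = index;
            this.name = name;
        }
    }

    /** Records a node cheaply and returns its index; use -1 as the root's parent. */
    int add(String name, int parent) {
        names.add(name);
        parents.add(parent);
        expanded.add(null);
        return names.size() - 1;
    }

    /** Returns the node object, allocating it on first access only. */
    Node node(int index) {
        Node n = expanded.get(index);
        if (n == null) {
            n = new Node(index, names.get(index));
            expanded.set(index, n);
            expandedCount++;
        }
        return n;
    }

    /** Visits a node and all of its descendants, expanding each one as a side effect. */
    void traverse(int index) {
        node(index);
        for (int i = 0; i < parents.size(); i++) {
            if (parents.get(i) == index) {
                traverse(i);
            }
        }
    }

    int expandedCount() {
        return expandedCount;
    }

    public static void main(String[] args) {
        DeferredTree tree = new DeferredTree();
        int root = tree.add("root", -1);
        tree.add("a", root);
        tree.add("b", root);
        tree.add("c", root);
        System.out.println("expanded before traversal: " + tree.expandedCount());
        tree.traverse(root);
        System.out.println("expanded after traversal:  " + tree.expandedCount());
    }
}
```

Note that in this sketch the compact tables stay in memory after the
node objects exist, so a full traversal ends up paying for both.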

You can turn off the deferred mode, but the DOM tree will still
take a lot of memory because of all of those node objects. The
following code will turn off the feature:

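  // DOMParser is org.apache.xerces.parsers.DOMParser.
  // setFeature can throw SAXNotRecognizedException or SAXNotSupportedException.
  DOMParser parser = new DOMParser();
  parser.setFeature("http://apache.org/xml/features/dom/defer-node-expansion", false);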

The lesson is to use the DOM only if it's appropriate to your
task. If you can get away with using SAX, then use that
instead.
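
For comparison, a minimal SAX sketch is shown below. It assumes the
JAXP classes bundled with current JDKs (javax.xml.parsers) rather than
the Xerces-specific parser classes, and ElementCounter is a made-up
name. The handler sees each element as it is parsed and keeps only a
counter, so no tree is ever built.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: counts elements with SAX instead of building a DOM.
final class ElementCounter extends DefaultHandler {
    private int elements = 0;

    // Called by the parser once per start tag; nothing is retained afterwards.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements++;
    }

    /** Parses the given XML text and returns the number of elements in it. */
    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), handler);
            return handler.elements;
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><item>a</item><item>b</item></doc>";
        System.out.println(countElements(xml) + " elements");
    }
}
```

Memory use here stays roughly constant regardless of document size,
because only the counter survives from one callback to the next.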

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org