Posted to j-dev@xerces.apache.org by "Sebastian Millies (JIRA)" <xe...@xml.apache.org> on 2015/01/07 16:00:46 UTC

[jira] [Created] (XERCESJ-1653) Memory leak with validating SAX Parser

Sebastian Millies created XERCESJ-1653:
------------------------------------------

             Summary: Memory leak with validating SAX Parser
                 Key: XERCESJ-1653
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1653
             Project: Xerces2-J
          Issue Type: Bug
          Components: SAX
    Affects Versions: 2.11.0
         Environment: Windows 7 Enterprise, JDK 1.8.0_25
            Reporter: Sebastian Millies
            Priority: Critical
         Attachments: SAXMemoryUsage.java

I'm parsing a very large XML file with org.apache.xerces.parsers.SAXParser and validation turned on. The file contains 25 million elements of the form specified in the attached DTDs and is about 7 GB in total.
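
For readers who want to reproduce the setup, the following is a minimal sketch of a validating SAX parse. It is not the reporter's code: the class name ValidatingSaxSketch, the handler, and the tiny inline DTD are assumptions made for illustration. It goes through the JAXP factory so that it runs on a plain JDK; with xercesImpl.jar on the classpath the factory lookup would normally be expected to resolve to Xerces2-J, but that is not verified here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; none of these names come from the report.
class ValidatingSaxSketch {

    /** Counts start tags and rethrows validity errors so invalid input fails fast. */
    static final class CountingHandler extends DefaultHandler {
        long elements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            // DefaultHandler ignores validity errors unless the handler surfaces them.
            throw e;
        }
    }

    static long countElements(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // DTD validation on. When org.apache.xerces.parsers.SAXParser is used
        // directly, the matching switch is the standard SAX feature
        // http://xml.org/sax/features/validation.
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(source, handler);
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        // Assumed stand-in for the real DTDs, which are not part of this message.
        String xml = "<!DOCTYPE items [\n"
            + "<!ELEMENT items (item*)>\n"
            + "<!ELEMENT item (#PCDATA)>\n"
            + "]>\n"
            + "<items><item>a</item><item>b</item></items>";
        System.out.println(countElements(new InputSource(new StringReader(xml)))); // prints 3
    }
}
```

Calling factory.setValidating(false) instead gives the non-validating configuration to compare against.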

Heap monitoring with jvisualvm shows millions of QName instances being cached and not being garbage collected.

Turning off validation makes the problem disappear. 

I have tested a number of other parsers (Crimson, Aelfred2, Resin, Woodstox). With Woodstox, for example, I can process my 7 GB file (including validation) with just 64 MB of heap. With Xerces, 1024 MB of heap is not enough.

I'll attach a small diagnostic program (SAXMemoryUsage.java) that shows Xerces heap consumption growing inordinately.
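
The attachment itself is not reproduced in this message. As a rough idea of what such a measurement could look like, here is a sketch that streams a synthetic document through a validating parser and prints the used heap at intervals. Everything in it is an assumption for illustration: the class HeapGrowthSketch, the generated item elements, the inline DTD, and the reporting interval. It is not SAXMemoryUsage.java.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; not the attached SAXMemoryUsage.java.
class HeapGrowthSketch {

    /** Lazily produces header + itemCount items + footer, so the input never sits in memory whole. */
    static InputStream syntheticDocument(final long itemCount) {
        final byte[] header = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<!DOCTYPE items [<!ELEMENT items (item*)><!ELEMENT item (#PCDATA)>]>"
            + "<items>").getBytes(StandardCharsets.UTF_8);
        final byte[] item = "<item>x</item>".getBytes(StandardCharsets.UTF_8);
        final byte[] footer = "</items>".getBytes(StandardCharsets.UTF_8);
        return new SequenceInputStream(new Enumeration<InputStream>() {
            private long next = -1; // -1 = header, 0..itemCount-1 = items, itemCount = footer

            @Override
            public boolean hasMoreElements() {
                return next <= itemCount;
            }

            @Override
            public InputStream nextElement() {
                byte[] chunk = next < 0 ? header : (next < itemCount ? item : footer);
                next++;
                return new ByteArrayInputStream(chunk);
            }
        });
    }

    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Parses the synthetic document with validation on; returns the number of start tags seen. */
    static long parseAndReport(long itemCount, final long reportEvery) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        final long[] elements = {0};
        factory.newSAXParser().parse(syntheticDocument(itemCount), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (++elements[0] % reportEvery == 0) {
                    System.gc(); // only a hint to the JVM; the figures are approximate
                    System.out.printf("%d elements, about %d MB used%n",
                        elements[0], usedHeapBytes() >> 20);
                }
            }

            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // surface validity errors instead of ignoring them
            }
        });
        return elements[0];
    }

    public static void main(String[] args) throws Exception {
        long items = args.length > 0 ? Long.parseLong(args[0]) : 5_000;
        parseAndReport(items, Math.max(1, items / 5));
    }
}
```

If the parser retains per-element state, the reported figure should keep climbing as the element count grows; a flat figure would suggest the input is being processed in constant memory. Running the same program with validation switched off gives the baseline.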





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
