Posted to j-dev@xerces.apache.org by bu...@apache.org on 2002/05/06 23:15:26 UTC

DO NOT REPLY [Bug 8840] New: - SAX out of memory if external-SchemaLocation not used in instance

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=8840>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

           Summary: SAX out of memory if external-SchemaLocation not used in
                    instance
           Product: Xerces2-J
           Version: 2.0.1
          Platform: All
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: SAX
        AssignedTo: xerces-j-dev@xml.apache.org
        ReportedBy: maxhearn@aol.com


When parsing a document using SAX, Java runs out of memory when all of the
following are true:

1) The document is large (say, 50 MB or greater).
2) The Java code sets the schema validation feature to true.
3) The Java code sets an external-SchemaLocation.
4) The instance document does not reference the schema.

Example:

Configure an XMLReader with the following properties / features:

  xmlReader.setFeature(
    "http://apache.org/xml/features/validation/schema", true);
  xmlReader.setProperty(
    "http://apache.org/xml/properties/schema/external-schemaLocation",
    "publicid:myschema.xsd" + " " + SCHEMA_FILENAME);

Create a large instance document, say "test.xml", that looks like:
  <x>
    <y>Some sample text</y>
    <y>Some sample text</y>
    ... Repeat above line many times.  In my tests,
    ... I've repeated this 5 million times.
  </x>
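
A file of this shape can be generated rather than written by hand. The
following is only a sketch: the class name MakeTestXml, the helper write(),
and the command-line count argument are made up for illustration and are not
part of the original test setup. The default count of 5 million matches the
figure mentioned above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not from the original report.
class MakeTestXml {

    // Writes <x> ... </x> containing `count` copies of the sample <y> line.
    public static void write(Path target, long count) throws IOException {
        try (BufferedWriter out =
                 Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write("<x>\n");
            for (long i = 0; i < count; i++) {
                out.write("  <y>Some sample text</y>\n");
            }
            out.write("</x>\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Optional first argument overrides the number of <y> elements.
        long count = args.length > 0 ? Long.parseLong(args[0]) : 5_000_000L;
        write(Paths.get("test.xml"), count);
    }
}
```

Running it with no arguments writes test.xml into the current directory.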

Attempt to parse the document with XMLReader.parse.  Do not override any of
the DefaultHandler methods, so all parse events are ignored.
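
Put together, the steps so far might look like the sketch below. Several
things in it are assumptions rather than details from my actual test code:
the class name Bug8840Repro, obtaining the XMLReader through JAXP's
SAXParserFactory, turning on namespace awareness, and the file names
"myschema.xsd" and "test.xml" in the current directory. It also assumes the
JAXP parser on the classpath is Xerces (or Xerces-derived); other parsers may
reject the apache.org feature and property URIs.

```java
import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reproduction harness, not the reporter's original code.
class Bug8840Repro {
    static final String SCHEMA_FEATURE =
        "http://apache.org/xml/features/validation/schema";
    static final String EXTERNAL_SCHEMA_LOCATION =
        "http://apache.org/xml/properties/schema/external-schemaLocation";

    // Builds a reader configured as described in the report.
    public static XMLReader newReader(String schemaFilename) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // assumption: needed for schema lookup
        XMLReader reader = factory.newSAXParser().getXMLReader();

        reader.setFeature(SCHEMA_FEATURE, true);
        // Value is "<namespace> <location>", the same pair format as
        // xsi:schemaLocation in an instance document.
        reader.setProperty(EXTERNAL_SCHEMA_LOCATION,
            "publicid:myschema.xsd" + " " + schemaFilename);

        // Plain DefaultHandler: no methods overridden, so events are ignored.
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        return reader;
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader("myschema.xsd");
        reader.parse(new File("test.xml").toURI().toString());
    }
}
```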

The parse eventually fails with an Out-of-memory error.

Interestingly, if the instance document actually references the schema,
no out of memory error occurs.

For example, if the root element's start tag in the instance is:
  <x xmlns="publicid:myschema.xsd"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="publicid:myschema.xsd myschema.xsd">

And the myschema.xsd schema looks like this:
  <?xml version="1.0" encoding="UTF-8"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns="publicid:myschema.xsd" targetNamespace="publicid:myschema.xsd"
      elementFormDefault="qualified">
 
      <xsd:import schemaLocation="http://www.w3.org/2001/xml.xsd"
          namespace="http://www.w3.org/XML/1998/namespace"/>
      <xsd:element name="x">
          <xsd:complexType>
              <xsd:sequence>
                  <xsd:element name="y" minOccurs="0" maxOccurs="unbounded"
                      type="xsd:string"/>
              </xsd:sequence>
          </xsd:complexType>
      </xsd:element>
  </xsd:schema>

Then the parse runs without problems.
