Posted to j-dev@xerces.apache.org by bu...@apache.org on 2001/12/12 17:46:51 UTC
DO NOT REPLY [Bug 5382] New: -
Limitation of the number of namespace declarations
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5382
Summary: Limitation of the number of namespace declarations
Product: Xerces-J
Version: 1.4.4
Platform: All
OS/Version: All
Status: NEW
Severity: Normal
Priority: Other
Component: DOM
AssignedTo: xerces-j-dev@xml.apache.org
ReportedBy: bbeauvoir@yahoo.com
Xerces-J 1.4.4 cannot parse XML documents that contain a very large number of
namespace declarations (i.e. attributes such as xmlns:prefix="uri"). This bug
was encountered in a production system that uses XSLT (Xalan) to process very
large XML documents.
I propose a simple fix for this problem (see below).
HOW TO REPRODUCE THIS BUG:
~~~~~~~~~~~~~~~~~~~~~~~~~~
To reproduce this bug, parse an XML document structured as follows:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <a:para xmlns:a="urn:a"/>
  <a:para xmlns:a="urn:a"/>
  <a:para xmlns:a="urn:a"/>
  ...
  <a:para xmlns:a="urn:a"/>
  <a:para xmlns:a="urn:a"/>
  <a:para xmlns:a="urn:a"/>
  <b:test xmlns:b="urn:b"/>
</root>
The <root> element must have 16360 <a:para xmlns:a="urn:a"/> child elements to
reproduce the bug. The whitespace-only text nodes used for indentation are
important.
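A file of this shape is easier to generate than to write by hand. The following
is a minimal sketch and not part of the original report: the class name
LargeNamespaceDocGenerator, the two-space indentation and the default output
file name large-in.xml are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper that generates the test document described above.
class LargeNamespaceDocGenerator {

    /** Builds a document with `count` <a:para> children followed by one <b:test>. */
    static String build(int count) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<root>\n");
        for (int i = 0; i < count; i++) {
            // Each child redeclares the same prefix; the indentation keeps
            // the whitespace-only text nodes between the elements.
            sb.append("  <a:para xmlns:a=\"urn:a\"/>\n");
        }
        sb.append("  <b:test xmlns:b=\"urn:b\"/>\n");
        sb.append("</root>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The output path is an assumption; pass another one as the first argument.
        String path = args.length > 0 ? args[0] : "large-in.xml";
        Files.write(Paths.get(path), build(16360).getBytes(StandardCharsets.UTF_8));
    }
}
```

Run it once, then point the test program at the generated file.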
If you try to parse this kind of XML document with Xerces-J 1.4.4, the following
NullPointerException is thrown:
java.lang.NullPointerException
    at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeData(DeferredElementNSImpl.java:154)
    at org.apache.xerces.dom.ElementImpl.getNodeName(ElementImpl.java:144)
    at NSLimitationBug.main(NSLimitationBug.java:26)
Here is the source code of my NSLimitationBug class, which produces the
NullPointerException. Change the path of the XML file to match your system.
import java.io.*;
import org.apache.xerces.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;

public class NSLimitationBug
{
    public static void main(String[] args)
    {
        try
        {
            File file = new File("E:\\XercesBug\\large-in.xml");
            Reader reader = new BufferedReader(new FileReader(file));
            DOMParser parser = new DOMParser();
            parser.setFeature("http://xml.org/sax/features/validation", false);
            // Enable deferred node expansion (the failing code is in the deferred DOM classes).
            parser.setFeature("http://apache.org/xml/features/dom/defer-node-expansion", true);
            InputSource source = new InputSource(reader);
            parser.parse(source);
            Document doc = parser.getDocument();
            NodeList children = doc.getDocumentElement().getChildNodes();
            int count = children.getLength();
            // The last child is a text node, so the last element is at count - 2.
            Element lastElem = (Element) children.item(count - 2);
            System.out.println("Name: '" + lastElem.getNodeName() + "'");
            System.out.println("Namespace URI: '" + lastElem.getNamespaceURI() + "'");
        }
        catch (Throwable t)
        {
            t.printStackTrace();
        }
    }
}
PROPOSED FIX:
~~~~~~~~~~~~
This bug appears to be caused by a coding error in the
org.apache.xerces.dom.DeferredDocumentImpl class.
In the method org.apache.xerces.dom.DeferredDocumentImpl#getNodeURI(int
nodeIndex, boolean free), an int is downcast to a short for no reason. For the
last element child of the <root> element, the int to cast is 32768. As the
maximum short value is 32767, the int 32768 is cast to the short -32768.
In fact: (short)32768 == -32768
This later results in the NullPointerException.
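The wrap-around can be checked in isolation. The following is a small
standalone sketch, not Xerces code: the class name ShortNarrowingDemo and the
method narrow are hypothetical and only mimic the pattern of returning an int
index through a short.

```java
// Hypothetical demo of int-to-short narrowing; not taken from Xerces.
class ShortNarrowingDemo {

    /** Mimics the problematic pattern: an int index squeezed into a short. */
    static short narrow(int index) {
        // Narrowing keeps only the low 16 bits, so values above
        // Short.MAX_VALUE wrap around to negative numbers.
        return (short) index;
    }

    public static void main(String[] args) {
        System.out.println(Short.MAX_VALUE); // prints 32767
        System.out.println(narrow(32767));   // prints 32767
        System.out.println(narrow(32768));   // prints -32768
    }
}
```

Any index up to 32767 survives the cast unchanged, which would explain why the
problem only shows up with very large documents.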
To fix this bug, simply remove the downcast to short and change the
return type of the two getNodeURI methods to int.
Here is the code of these two methods after applying this fix:
/** Returns the URI of the given node. */
public int getNodeURI(int nodeIndex) {
    return getNodeURI(nodeIndex, true);
}

/**
 * Returns the URI of the given node.
 * @param free True to free URI index.
 */
public int getNodeURI(int nodeIndex, boolean free) {
    if (nodeIndex == -1) {
        return -1;
    }
    int chunk = nodeIndex >> CHUNK_SHIFT;
    int index = nodeIndex & CHUNK_MASK;
    if (free) {
        return clearChunkIndex(fNodeURI, chunk, index);
    }
    return getChunkIndex(fNodeURI, chunk, index);
} // getNodeURI(int,boolean):int
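For readers unfamiliar with the chunk/index arithmetic used above, here is a
simplified standalone sketch of a chunked int table. It is not the Xerces
implementation: the class ChunkedIntTable, its set and get methods, and the
value CHUNK_SHIFT = 11 are assumptions for illustration, and the real
constants and helpers in DeferredDocumentImpl may differ.

```java
import java.util.Arrays;

// Hypothetical, simplified chunked table; not the Xerces implementation.
class ChunkedIntTable {
    // Assumed values for illustration only.
    static final int CHUNK_SHIFT = 11;
    static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
    static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private int[][] chunks = new int[16][];

    /** Stores an int value for the given node index, allocating chunks lazily. */
    void set(int nodeIndex, int value) {
        int chunk = nodeIndex >> CHUNK_SHIFT; // which chunk holds the slot
        int index = nodeIndex & CHUNK_MASK;   // offset inside that chunk
        if (chunk >= chunks.length) {
            chunks = Arrays.copyOf(chunks, Math.max(chunk + 1, chunks.length * 2));
        }
        if (chunks[chunk] == null) {
            chunks[chunk] = new int[CHUNK_SIZE];
            Arrays.fill(chunks[chunk], -1); // -1 marks an empty slot
        }
        chunks[chunk][index] = value;
    }

    /** Returns the stored value as an int, or -1 if nothing is stored. */
    int get(int nodeIndex) {
        if (nodeIndex < 0) {
            return -1;
        }
        int chunk = nodeIndex >> CHUNK_SHIFT;
        int index = nodeIndex & CHUNK_MASK;
        if (chunk >= chunks.length || chunks[chunk] == null) {
            return -1;
        }
        return chunks[chunk][index];
    }

    public static void main(String[] args) {
        ChunkedIntTable table = new ChunkedIntTable();
        table.set(40000, 32768);
        System.out.println(40000 >> CHUNK_SHIFT); // prints 19
        System.out.println(40000 & CHUNK_MASK);   // prints 1088
        System.out.println(table.get(40000));     // prints 32768
    }
}
```

Because get returns an int, a stored value of 32768 comes back unchanged, which
is the behaviour the proposed fix restores.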
NOTE:
~~~~~
I first noticed this bug in Xerces-J 1.2.0. With that version, no
NullPointerException is thrown. However, the namespace URI of the last element
is 'null' instead of 'urn:b'.