Posted to j-users@xerces.apache.org by Paul Kinnucan <pa...@mathworks.com> on 2002/10/07 22:50:27 UTC

Problem with Xerces-J 2.0.1.01 and/or DOM

Hi,

If I try to use the JAXP version of DOM and Xerces-J 2.0.1.01
to extract the content of an XML document that includes 
a doctype declaration that declares external entities, all
the elements of the parsed document appear to be empty. If I remove
the external entity declaration or the entire doctype declaration,
my program is able to extract the content without any problem.

Consider, for example, the following document:


<?xml version="1.0"  encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//The Mathworks//DTD axdocbook variant//" "" []>
<book>
<title>Using Simulink</title>
<para>hello world</para>
</book>

If I try to extract the content of the title element, using the following
code:

DOMParser parser = new DOMParser();
parser.parse(bookFilePath.getAbsolutePath());
Document doc = parser.getDocument();
NodeList titleElements = doc.getElementsByTagName("title");
Node titleElement = titleElements.item(0);

titleElement is empty, i.e., titleElement.hasChildNodes() returns false. 
However, if I remove the square brackets from the doctype declaration, i.e., 

<!DOCTYPE book PUBLIC "-//The Mathworks//DTD axdocbook variant//" "">

or remove the doctype declaration itself, the above code works as expected, i.e.,
titleElement.hasChildNodes() returns true and the child node is a text node
that contains "Using Simulink".
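
For anyone who wants to check whether a given parser shows this symptom, here is
a minimal, self-contained sketch. It is not the original program: it assumes the
standard JAXP DocumentBuilder API instead of the Xerces-specific DOMParser class,
it parses the sample document from a string rather than a file, and it switches
off loading of the external DTD subset because the sample's system identifier is
empty. The class name TitleContentCheck and the helper firstTitleText are
hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TitleContentCheck {

    // Hypothetical helper: returns the text of the first <title> element,
    // or null if that element is missing or has no child nodes.
    static String firstTitleText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: do not try to fetch the external DTD subset, since the
        // sample document's system identifier is the empty string.
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList titles = doc.getElementsByTagName("title");
        if (titles.getLength() == 0) {
            return null;
        }
        Node title = titles.item(0);
        if (!title.hasChildNodes()) {
            return null; // the symptom described above
        }
        return title.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // The sample document, including the empty internal subset "[]".
        String xml =
            "<?xml version=\"1.0\"  encoding=\"utf-8\"?>\n"
            + "<!DOCTYPE book PUBLIC \"-//The Mathworks//DTD axdocbook variant//\" \"\" []>\n"
            + "<book>\n"
            + "<title>Using Simulink</title>\n"
            + "<para>hello world</para>\n"
            + "</book>\n";
        System.out.println(firstTitleText(xml));
    }
}
```

If the parser on the classpath is affected, firstTitleText returns null for the
document above; otherwise it returns the title text.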

I'd appreciate any help you can give me.

Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Problem with Xerces-J 2.0.1.01 and/or DOM

Posted by Paul Kinnucan <pa...@mathworks.com>.
Hi,

I've discovered that 2.0.1 is not the latest version of
Xerces and that the latest version does not exhibit the
problem described in my previous post.

Paul