Posted to general@xerces.apache.org by "Urobushkin, Gleb" <Gl...@westgroup.com> on 2000/02/17 21:12:58 UTC

Xerces4J/SAX -- internal subset: issues with entity declarations

I am observing inconsistent treatment of internal subset declarations by
the SAX parser in Xerces300ea3 for Java.

sax.SAXWriter from the samples jar was used as the parsing application.

Issues:

1. Scanning or buffering error (absent in older ibm4j) when reading a large
file of character entity declarations. The declarations are referred to from
the internal subset via a parameter entity.

==============================
<!DOCTYPE dummy [ 
	<!ENTITY % entts SYSTEM "allchars.ent">
	%entts;
]>
<test>
text
</test>
==============================

 <<enttest.ent>> 
...
<!-- Entity set.
     Public identifier:
     -//ISO 8879:1986//ENTITIES Added Math Symbols: Relations//EN
-->
...
<!-- take me out and xerces will break WEIRD -->
<!ENTITY ape    "&#38;#38;ape;">    <!-- approximate, equals -->
...
(from attachment enttest.ent)
==============================  output ==

[Fatal Error] enttest.ent:577:3: The markup declaration contained  ...


In the same situation, everything *might* work if the DTD fragment contains
fewer entity declarations (other changes, such as adding a blank line, can
make it work too).

2. Additional declarations in the internal subset cannot override
declarations of the same entity previously read from the external resource
or from the same subset.

<!DOCTYPE dummy [ 
	<!ENTITY % entts SYSTEM "enttest.ent">
<!-- has aacute mapping to itself -->
	%entts;
	<!ENTITY aacute "&#38;#38;xaacute;">
]>
<test>
text &aacute; text
</test>

or

<!DOCTYPE dummy [ 
	<!ENTITY aacute "&#38;#38;aacute;">
	<!ENTITY aacute "&#38;#38;xaacute;">
]>
<test>
text &aacute; text
</test>

============================output ===
<test>
text &amp;aacute; text
</test>


I expected to see the following output:
<test>
text &amp;xaacute; text
</test>
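
For anyone who wants to rerun the second variant of this test without the
samples jar, here is a minimal sketch. It assumes only that a JAXP SAX parser
is available on the classpath. The class name EntityOverrideCheck and the
method collectText are hypothetical names made up for this example; they are
not part of Xerces or of sax.SAXWriter. The document is parsed from a string,
so no external files are needed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Xerces or its samples.
class EntityOverrideCheck {

    // Second test document from case 2: the same general entity is
    // declared twice in the internal subset.
    static final String DOC =
        "<!DOCTYPE dummy [\n"
      + "  <!ENTITY aacute \"&#38;#38;aacute;\">\n"
      + "  <!ENTITY aacute \"&#38;#38;xaacute;\">\n"
      + "]>\n"
      + "<test>\ntext &aacute; text\n</test>\n";

    // Parses the document with whatever SAX parser JAXP selects and
    // returns all character data reported through characters().
    static String collectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints the raw text, i.e. before any escaping a writer would add.
        System.out.println(collectText(DOC));
    }
}
```

The printed text is unescaped, so "&aacute;" here corresponds to
"&amp;aacute;" in the SAXWriter output above. Comparing it against the two
outputs shows which of the two declarations the parser treated as binding.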

 
3. The same applies to the following test, which tries unsuccessfully to
override one of the predefined XML entities:

===============================
<!DOCTYPE dummy [ 
      	<!ENTITY amp "&#38;#38;xxamp;">
]>
<test>
text &amp; text
</test>
=============================== output ===
<test>
text &amp; text
</test>
===============================

BTW, a DOM parser would behave differently.


I think all of these issues reflect internal bugs that need to be fixed.
Thank you,

Gleb