Posted to j-users@xalan.apache.org by "Holliday, Donald B. (LNG-CSP)" <Do...@lexisnexis.com> on 2003/03/29 00:31:05 UTC

Characters missing from XALAN output

We are using XERCES 1.4.2 and XALAN 1.2.2 (and can't update the versions
until we get an approved project).

We have an application that parses a UTF-8 encoded XML document using
XERCES, reading the document through a FileInputStream.  We then have
XALAN transform the document and write the output as ISO-8859-1 (<xsl:output
method="text" encoding="iso-8859-1"/>).

If, after creating the parser, we call
DOMParser.setCreateEntityReferenceNodes(true), then characters represented as
character entity references DO NOT appear in the output.

If, after creating the parser, we call
DOMParser.setCreateEntityReferenceNodes(false), then characters represented
as character entity references DO appear in the output.

Some of these character entity references are defined in the DTD as
	<!ENTITY ast     "&#38;#x002A;">	<!-- ASTERISK OPERATOR -->
	<!ENTITY nbsp    "&#38;#x00A0;">	<!-- NO-BREAK SPACE -->
	<!ENTITY lsqb    "&#38;#x005B;">	<!-- LEFT SQUARE BRACKET -->
	<!ENTITY rsqb    "&#38;#x005D;">	<!-- RIGHT SQUARE BRACKET -->
	<!ENTITY sect    "&#38;#x00A7;">	<!-- SECTION SIGN -->

This behavior is consistent on both Win2K and Solaris.

We speculate that DOMParser.setCreateEntityReferenceNodes(TRUE) causes the
parser to create a special node for each of these character entity references
instead of expanding them inline with the surrounding text.  When XALAN gets a
DOM tree built this way, it doesn't see the contents of the entity reference
nodes, so they don't show up in the output.
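
One way to check the first half of this speculation is to parse a small
document both ways and count the EntityReference nodes in the resulting DOM.
The sketch below is only an illustration under assumptions: it uses the JAXP
DocumentBuilderFactory instead of the Xerces 1.4.2 DOMParser, on the
assumption that setExpandEntityReferences(false) corresponds to
setCreateEntityReferenceNodes(true).  The class name, the helper names and
the sample document are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XERCES or XALAN.
class EntityRefInspect {

    // Tiny stand-in document: one entity declared the same way as in our DTD.
    static final String SAMPLE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "  <!ENTITY sect \"&#38;#x00A7;\">\n"
        + "]>\n"
        + "<doc>a&sect;b</doc>";

    // Parse with entity references either expanded inline (true)
    // or kept as separate EntityReference nodes (false).
    static Document parse(String xml, boolean expandEntityRefs) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setExpandEntityReferences(expandEntityRefs);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Recursively count EntityReference nodes at or below the given node.
    static int countEntityRefs(Node node) {
        int count = node.getNodeType() == Node.ENTITY_REFERENCE_NODE ? 1 : 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countEntityRefs(c);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("kept as nodes:   "
            + countEntityRefs(parse(SAMPLE, false)));
        System.out.println("expanded inline: "
            + countEntityRefs(parse(SAMPLE, true)));
    }
}
```

A non-zero count in the first case and zero in the second would be consistent
with the parser half of the speculation.  It says nothing about what XALAN
does with such nodes.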

Is our speculation correct?

Does anyone know for a fact why XALAN behaves differently depending on the
value we pass to DOMParser.setCreateEntityReferenceNodes( ... )?

Does anyone know how we can get XALAN to write out the value of the entity
reference nodes when we have called
DOMParser.setCreateEntityReferenceNodes(true)?
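
One workaround we could imagine, if nothing on the XALAN side helps, is to
flatten the entity reference nodes ourselves before handing the DOM to the
transformer.  The sketch below is an untested idea for our setup, not a known
fix: it again uses JAXP rather than the Xerces 1.4.2 DOMParser, it relies on
the DOM Level 3 getTextContent() call (which needs a newer DOM than the one we
are on), and it only keeps the text of each entity, which is enough for
character entities like the ones above but would drop any markup inside an
entity.  All class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XERCES or XALAN.
class EntityRefInline {

    static final String SAMPLE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "  <!ENTITY sect \"&#38;#x00A7;\">\n"
        + "]>\n"
        + "<doc>a&sect;b</doc>";

    // Replace every EntityReference below 'parent' with a plain Text node
    // holding the reference's text content.
    static void inlineEntityRefs(Document doc, Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before replacing
            if (child.getNodeType() == Node.ENTITY_REFERENCE_NODE) {
                Node text = doc.createTextNode(child.getTextContent());
                parent.replaceChild(text, child);
            } else {
                inlineEntityRefs(doc, child);
            }
            child = next;
        }
    }

    // Parse with entity reference nodes kept, flatten them, merge adjacent
    // text nodes, and return the single text child of the root element.
    static String expandedRootText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setExpandEntityReferences(false);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            inlineEntityRefs(doc, root);
            root.normalize();
            Node only = root.getFirstChild();
            if (only == null || only.getNextSibling() != null
                    || only.getNodeType() != Node.TEXT_NODE) {
                throw new IllegalStateException("root is not a single text node");
            }
            return only.getNodeValue();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String text = expandedRootText(SAMPLE);
        // Print code points so the result does not depend on console encoding.
        for (int i = 0; i < text.length(); i++) {
            System.out.println("U+" + Integer.toHexString(text.charAt(i)));
        }
    }
}
```

In our application the flattening step would run on the Document returned by
the parser, just before it is passed to XALAN.  Whether that is preferable to
simply calling setCreateEntityReferenceNodes(false) is exactly what we are
unsure about.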


Thanks,

Donald Holliday