Posted to dev@xalan.apache.org by Sc...@lotus.com on 2000/03/16 23:44:36 UTC
Re: [Bug 63] New - Can't use HTML 4.01 character entities
You need to turn on validation (-validate on the command line), and make
sure the HTML DTD is accessible. This is all done in Xerces, not Xalan.
The escaping of entities to the output is done by the Serializer, which is
an entirely different process.
-scott
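
The point about the parser needing to see the entity declarations can be illustrated with a small sketch. It does not use Xerces or Xalan; it uses Python's standard-library parser (expat, non-validating) purely as a stand-in, and the sample documents and the `try_parse` helper are hypothetical. An internal DTD subset stands in for "making the HTML DTD accessible": the same `&agrave;` reference is rejected when undeclared and expanded when declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for faq.xml: references &agrave; with no DTD at all.
UNDECLARED = "<faq>voil&agrave;</faq>"

# Same content, but the entity is declared in an internal DTD subset,
# so the parser knows what &agrave; stands for.
DECLARED = (
    '<!DOCTYPE faq [<!ENTITY agrave "&#224;">]>'
    "<faq>voil&agrave;</faq>"
)


def try_parse(text):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError:
        return None


undeclared_result = try_parse(UNDECLARED)  # parser reports an undefined entity
declared_result = try_parse(DECLARED)      # entity is expanded to U+00E0
```

Either way the failure or success happens while parsing the input, before any stylesheet or serializer is involved, which matches the split between parser and Serializer described above.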
From: bugzilla-daemon@locus.apache.org
To: Scott_Boag@lotus.com
cc:
Date: 03/16/00 05:16 PM
Subject: [Bug 63] New - Can't use HTML 4.01 character entities
http://xml.apache.org/bugs/show_bug.cgi?id=63
*** shadow/63 Thu Mar 16 14:16:47 2000
--- shadow/63.tmp.61366 Thu Mar 16 14:16:47 2000
***************
*** 0 ****
--- 1,33 ----
+ Bug#: 63
+ Product: Xalan-J
+ Version: 0.20.0
+ Platform: Other
+ OS/Version: Linux
+ Status: NEW
+ Resolution:
+ Severity: normal
+ Priority: P2
+ Component: XSLT
+ AssignedTo: Scott_Boag@lotus.com
+ ReportedBy: lee@piclab.com
+ URL:
+ Summary: Can't use HTML 4.01 character entities
+
+ I have an XML source document I'm using Xalan to convert into
+ HTML 4.01 (Specifically, the stylesheet begins with the
+ <xsl:output method="html"
+ doctype-public="-//W3C//DTD HTML 4.01//EN"
+ doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
+ declaration).
+
+ When my XML source file contains some entities like &quot;
+ they get put into the output as expected. But others like
+ &agrave; or &mdash; cause the processor to error:
+
+ XSL Error: Could not parse faq.xml document!
+ XSLT: The entity "agrave" was referenced, but not declared.
+
+ Oddly, if I use a numeric entity like ê it gets replaced
+ with à in the output very nicely! But this doesn't
+ work for others like — which produce bizarre two-byte
+ sequences in the output (probably UTF-8).
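
As a rough check on the "probably UTF-8" guess, the sketch below prints the UTF-8 bytes for two of the characters named in the report. It assumes the output encoding really is UTF-8; the report does not say which numeric reference was used for the dash, so the em dash code point U+2014 here is an assumption, and the `samples` and `utf8_bytes` names are illustrative only.

```python
# Characters named in the report, written as code points.
samples = {
    "a-grave (U+00E0)": "\u00e0",
    "em dash (U+2014)": "\u2014",  # assumed code point for the dash
}

# Encode each character as UTF-8 and keep the raw bytes.
utf8_bytes = {name: ch.encode("utf-8") for name, ch in samples.items()}

for name, raw in utf8_bytes.items():
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    print(f"{name}: {hex_bytes} ({len(raw)} bytes)")
```

Any character above U+007F becomes a multi-byte sequence in UTF-8, so output that looks like stray extra bytes in a viewer expecting a single-byte encoding is consistent with the reporter's guess. U+2014 itself takes three bytes, so a strictly two-byte sequence would point at a code point below U+0800.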