Posted to dev@xalan.apache.org by Sc...@lotus.com on 2000/03/16 23:44:36 UTC

Re: [Bug 63] New - Can't use HTML 4.01 character entities

You need to turn on validation (-validate on the command line), and make
sure the HTML DTD is accessible.  This is all done in Xerces, not Xalan.
The escaping of entities to the output is done by the Serializer, which is
an entirely different process.

-scott
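
A minimal sketch of the point above, that declaring and resolving entities is
the parser's job rather than the stylesheet processor's. It uses the JAXP
parser bundled with a current JDK, not the Xerces/Xalan builds discussed in
this thread, and it declares the entity in an internal DTD subset instead of
loading the HTML DTD. The class name EntityDemo and both sample documents are
made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EntityDemo {
    // Hypothetical document: "agrave" is declared in the internal DTD subset.
    static final String DECLARED =
        "<!DOCTYPE faq [ <!ENTITY agrave \"&#224;\"> ]>"
        + "<faq>d&agrave;</faq>";

    // Hypothetical document: same reference, but no declaration anywhere.
    static final String UNDECLARED = "<faq>d&agrave;</faq>";

    // Returns the root element's text, or null if the parser rejects the input.
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps stderr quiet.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // e.g. an entity was referenced but not declared
        } catch (ParserConfigurationException | IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String ok = parseText(DECLARED);
        System.out.println("declared entity parsed: " + (ok != null));
        System.out.println("undeclared entity parsed: "
            + (parseText(UNDECLARED) != null));
    }
}
```

In both cases the outcome is decided before any stylesheet runs: the parser
either finds a declaration for the entity or reports an error.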




                                                                                                                            
bugzilla-daemon@locus.apache.org
03/16/00 05:16 PM
To:      Scott_Boag@lotus.com
cc:
Subject: [Bug 63] New - Can't use HTML 4.01 character entities
                                                                                                                            
                                                                                                                            




http://xml.apache.org/bugs/show_bug.cgi?id=63

*** shadow/63        Thu Mar 16 14:16:47 2000
--- shadow/63.tmp.61366        Thu Mar 16 14:16:47 2000
***************
*** 0 ****
--- 1,33 ----
+ Bug#: 63
+ Product: Xalan-J
+ Version: 0.20.0
+ Platform: Other
+ OS/Version: Linux
+ Status: NEW
+ Resolution:
+ Severity: normal
+ Priority: P2
+ Component: XSLT
+ AssignedTo: Scott_Boag@lotus.com
+ ReportedBy: lee@piclab.com
+ URL:
+ Summary: Can't use HTML 4.01 character entities
+
+ I have an XML source document I'm using Xalan to convert into
+ HTML 4.01 (Specifically, the stylesheet begins with the
+ <xsl:output method="html"
+ doctype-public="-//W3C//DTD HTML 4.01//EN"
+ doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
+ declaration).
+
+ When my XML source file contains some entities like &quot;
+ they get put into the output as expected.  But others like
+ &agrave; or &mdash; cause the processor to error:
+
+   XSL Error: Could not parse faq.xml document!
+   XSLT: The entity "agrave" was referenced, but not declared.
+
+ Oddly, if I use a numeric entity like &#234; it gets replaced
+ with &agrave; in the output very nicely!  But this doesn't
+ work for others like &mdash; which produce bizarre two-byte
+ sequences in the output (probably UTF-8).
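
The multi-byte sequences described in the report are consistent with the
output being encoded as UTF-8, where any character above U+007F takes more
than one byte. A small sketch to check this for two of the characters
mentioned (the class name Utf8Bytes and the helper hex are made up for
illustration; this does not reproduce the reporter's stylesheet run):

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    // Formats the UTF-8 encoding of s as space-separated uppercase hex bytes.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E0 is the character named by &agrave;
        System.out.println("U+00E0 -> " + hex("\u00E0")); // prints C3 A0
        // U+2014 is the character named by &mdash;
        System.out.println("U+2014 -> " + hex("\u2014")); // prints E2 80 94
    }
}
```

The exact byte count depends on the character: U+00E0 encodes to two bytes
and U+2014 to three.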