Posted to dev@xalan.apache.org by bu...@apache.org on 2001/03/23 19:09:15 UTC

[Bug 1100] New - XML containing UTF-8 is not transformed correctly

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1100

*** shadow/1100	Fri Mar 23 10:09:15 2001
--- shadow/1100.tmp.1624	Fri Mar 23 10:09:15 2001
***************
*** 0 ****
--- 1,34 ----
+ +============================================================================+
+ | XML containing UTF-8 is not transformed correctly                          |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 1100                        Product: XalanJ2                 |
+ |       Status: NEW                         Version: 2.0.1                   |
+ |   Resolution:                            Platform: Sun                     |
+ |     Severity: Normal                   OS/Version: Solaris                 |
+ |     Priority:                           Component: Xalan                   |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: xalan-dev@xml.apache.org                                     |
+ |  Reported By: aamies@access360.com                                         |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ I am having problems transforming UTF-8 documents.  I am inputting
+ double-byte (Chinese) characters in a UTF-8 XML document and
+ transforming it to HTML.  The double-byte characters are all output as
+ '?'.  I am using XalanJ 2.0.1 with JDK 1.2.2.05 on Solaris 8 with WebLogic 5.1,
+ Service Pack 8.
+ 
+ I have checked that the double-byte characters in the input XML document
+ are correct by printing out their Unicode values.  I am
+ also displaying some parts of pages correctly in UTF-8 using JSPs by
+ looking up Chinese text from a ResourceBundle.  I was previously using
+ XalanJ 1.2 with exactly the same XML/XSL combination, and this
+ particular page was displayed correctly.
+ 
+ The xml source, xsl stylesheet, and html output are attached.  
+ 
+ I am setting the ServletResponse contentType to "text/html;charset=UTF-8",
+ the XML encoding to "utf-8", and the xsl:output encoding
+ to "utf-8" using an xsl:output element in the XSL stylesheet.
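
The settings described above can be checked outside the servlet container with a small standalone test. The following is only a sketch under stated assumptions, not the reporter's code: the class name Utf8TransformSketch, the method transformToBytes, and the inline stylesheet and <msg> document are all hypothetical stand-ins for the attachments. It uses the standard JAXP (javax.xml.transform) API and writes the result to a byte stream, so that the serializer itself applies the xsl:output encoding instead of relying on a Writer whose encoding may come from the platform default. Whether a Writer was involved in the reported failure is not stated in this report; this is merely one thing worth ruling out.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical test harness; names and documents are illustrative only.
class Utf8TransformSketch {

    // Minimal stand-in stylesheet: HTML output, encoding declared as utf-8.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" encoding=\"utf-8\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><body><p><xsl:value-of select=\"/msg\"/></p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Transforms the given XML text and returns the serialized bytes.
    // Passing an OutputStream (not a Writer) lets the serializer encode
    // the characters according to xsl:output.
    static byte[] transformToBytes(String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(XSL)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Two Chinese characters, written as Unicode escapes.
        byte[] bytes = transformToBytes("<msg>\u4e2d\u6587</msg>");
        String html = new String(bytes, "UTF-8");
        // A '?' in the decoded output would indicate lossy encoding.
        System.out.println("contains '?': " + (html.indexOf('?') >= 0));
    }
}
```

In a servlet, the analogous choice would be to hand response.getOutputStream() to StreamResult after setting the content type, rather than response.getWriter(); that too is an assumption about the setup, since the servlet code is not included here.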