Posted to dev@xalan.apache.org by bu...@apache.org on 2001/05/15 22:07:48 UTC
[Bug 1757] New - Wrong encoding used when adding attributes in XSLT
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1757
*** shadow/1757 Tue May 15 13:07:48 2001
--- shadow/1757.tmp.220 Tue May 15 13:07:48 2001
***************
*** 0 ****
--- 1,70 ----
+ +============================================================================+
+ | Wrong encoding used when adding attributes in XSLT                         |
+ +----------------------------------------------------------------------------+
+ |       Bug #: 1757                       Product: XalanJ2                   |
+ |      Status: NEW                        Version: 2.0.1                     |
+ |  Resolution:                           Platform: PC                        |
+ |    Severity: Major                   OS/Version: Windows NT/2K             |
+ |    Priority:                          Component: org.apache.xalan.transf   |
+ +----------------------------------------------------------------------------+
+ | Assigned To: xalan-dev@xml.apache.org                                      |
+ | Reported By: ewindes@opentv.com                                            |
+ |     CC list: Cc:                                                           |
+ +----------------------------------------------------------------------------+
+ |         URL:                                                               |
+ +============================================================================+
+ |                                DESCRIPTION                                 |
+ I'm not sure exactly which component is to blame here, so I'll describe the
+ situation...
+
+ org.apache.xalan.templates.ElemAttribute.constructNode() calls
+ org.apache.xalan.transformer.TransformerImpl.transformToString() to turn an
+ attribute node into a string.
+
+ The code called by transformToString() ends up using the default character
+ encoding returned from org.apache.xalan.serialize.Encodings.getMimeEncoding().
+
+ On US NT, getMimeEncoding() returns UTF-8.
+ On Japanese NT, getMimeEncoding() returns "MS932".
+
+ So, on Japanese NT, characters outside the MS932 charset get encoded by
+ transformToString (i.e. "&#27531;") and then again when the whole document is
+ serialized (i.e. to: "&amp;#27531;")
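
The two-pass escaping described above can be illustrated with a small, self-contained Java sketch. It is not Xalan code: the class and method names are hypothetical, and US-ASCII is used only as a stand-in for an output charset that the first pass believes cannot represent the character.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Xalan.
class DoubleEscapeSketch {

    // Pass 1: replace every character the given charset cannot encode
    // with a decimal numeric character reference such as &#27531;.
    public static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (encoder.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    // Pass 2: ordinary attribute-value escaping. If the input already
    // contains character references, their '&' is escaped a second time.
    public static String escapeMarkup(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String original = "\u6B8B"; // the first character of the example string
        // Assumption: US-ASCII plays the role of the wrongly chosen default encoding.
        String once = escapeUnencodable(original, StandardCharsets.US_ASCII);
        String twice = escapeMarkup(once);
        System.out.println(once);
        System.out.println(twice);
    }
}
```

If the first pass is run with the same encoding that the final serializer uses, and the result is kept as raw characters rather than as escaped text, the second pass has nothing to escape twice.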
+
+ Example: ("残高照会" == 4 Shift-JIS characters)
+
+ XML:
+ <?xml version="1.0" ?>
+ <blah>
+ </blah>
+
+ XSL:
+ <?xml version="1.0" encoding="Shift_JIS" ?>
+ <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
+ <xsl:output method="html" encoding="Shift_JIS"/>
+ <xsl:template match="/">
+ <html>
+ <head>
+ <title>Attribute Test</title>
+ </head>
+ <body>
+ <p>
+ <xsl:attribute name="test">残高照会</xsl:attribute>
+ 残高照会
+ </p>
+ </body>
+ </html>
+ </xsl:template>
+ </xsl:stylesheet>
+
+
+ On US NT, this produces:
+ <p test="残高照会">
+ 残高照会
+ </p>
+
+
+ On Japanese NT, this produces:
+ <p test="&#27531;&#39640;&#29031;&#20250;">
+ 残高照会
+ </p>
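
To check how a given processor handles this case, the stylesheet from the report can be run through the standard JAXP API. The following is a minimal sketch under stated assumptions: the class name is hypothetical, the stylesheet is embedded as a Java string (so its XML declaration is omitted), and the code uses whatever `TransformerFactory` is on the classpath, which is not necessarily XalanJ2 2.0.1. Because the reported behaviour depends on the platform default encoding, a run on another system may not reproduce it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical reproduction harness; not part of the original report.
class AttributeEncodingRepro {

    // The four characters from the report, written as Unicode escapes.
    static final String TEXT = "\u6B8B\u9AD8\u7167\u4F1A";

    static final String XML = "<blah>\n</blah>";

    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"html\" encoding=\"Shift_JIS\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><head><title>Attribute Test</title></head><body><p>"
        + "<xsl:attribute name=\"test\">" + TEXT + "</xsl:attribute>"
        + TEXT
        + "</p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the transformation and returns the serialized result as a string.
    public static String transform() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String result = transform();
        System.out.println(result);
        // A doubled escape would show up as "&amp;#" inside the attribute value.
        System.out.println("double-escaped: " + result.contains("&amp;#"));
    }
}
```

Comparing the `test` attribute with the text content of the `p` element in the printed output shows whether the attribute value took a different encoding path from the rest of the document.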