Posted to dev@xalan.apache.org by bu...@apache.org on 2001/10/09 15:05:50 UTC

DO NOT REPLY [Bug 4040] New: - xsltc should output UTF-8 not utf-8 in XML declaration

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4040>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4040

           Summary: xsltc should output UTF-8 not utf-8 in XML declaration
           Product: XalanJ2
           Version: 2.0.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Minor
          Priority: Other
         Component: org.apache.xalan.xsltc
        AssignedTo: xalan-dev@xml.apache.org
        ReportedBy: tom.amiro@sun.com


A careful reading of section 4.3.3 in

    http://www.w3.org/TR/REC-xml#NT-EncName

suggests that XSLTC should output UTF-8 or UTF-16 (upper case) rather than
utf-8 or utf-16 (lower case) in the XML declaration, because the official
designations of these encodings are upper case. We should still treat the
value the user specifies in the encoding attribute of the output method in a
case-insensitive manner.
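
As a rough illustration of the requested behavior, the sketch below accepts the user's encoding value in any case and emits the spelling recommended by the XML spec. The class EncodingNames and its methods are hypothetical names made up for this example; they are not part of XSLTC, and the table covers only a few of the names listed in section 4.3.3.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration only; not XSLTC code. */
final class EncodingNames {
    // Lower-cased name -> spelling recommended in XML 1.0, section 4.3.3.
    private static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        for (String name : new String[] {
                "UTF-8", "UTF-16", "ISO-10646-UCS-2", "ISO-10646-UCS-4",
                "ISO-8859-1", "ISO-2022-JP", "Shift_JIS", "EUC-JP"}) {
            CANONICAL.put(name.toLowerCase(Locale.ROOT), name);
        }
    }

    private EncodingNames() {}

    /** Case-insensitive lookup; names not in the table pass through unchanged. */
    static String canonical(String userValue) {
        String hit = CANONICAL.get(userValue.toLowerCase(Locale.ROOT));
        return hit != null ? hit : userValue;
    }

    /** Builds an XML declaration using the canonical spelling. */
    static String xmlDeclaration(String userValue) {
        return "<?xml version=\"1.0\" encoding=\"" + canonical(userValue) + "\"?>";
    }
}
```

With this sketch, a stylesheet that specifies encoding="utf-8" would still produce a declaration that says encoding="UTF-8", while an unlisted name such as an "x-" encoding is written out exactly as given.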

>In the document entity, the encoding declaration is part of the XML
>declaration. The EncName is the name of the encoding used.

>In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2",
>and "ISO-10646-UCS-4" should be used for the various encodings and
>transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1",
>"ISO-8859-2", ... "ISO-8859-n" (where n is the part number) should be used for
>the parts of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP"
>should be used for the various encoded forms of JIS X-0208-1997. It is
>recommended that character encodings registered (as charsets) with the Internet
>Assigned Numbers Authority [IANA-CHARSETS], other than those just listed,
>be referred to using their registered names; other encodings should use names
>starting with an "x-" prefix. XML processors should match character encoding
>names in a case-insensitive way and should either interpret an IANA-registered
>name as the encoding registered at IANA for that name or treat it as
>unknown (processors are, of course, not required to support all IANA-registered
>encodings).
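
The matching rule in the last quoted sentence can be sketched from the reading side as well. The example below leans on java.nio.charset.Charset, whose name lookup is not case-sensitive, and treats any name the platform does not support as unknown. EncodingLookup and resolve are hypothetical names for this illustration, and using the JDK charset registry as a stand-in for the IANA registry is an assumption, not how any particular XML processor does it.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

/** Hypothetical helper for illustration only. */
final class EncodingLookup {
    private EncodingLookup() {}

    /**
     * Resolves an encoding name without regard to case.
     * Returns an empty Optional when the name is malformed or unsupported,
     * which corresponds to "treat it as unknown" in the quoted text.
     */
    static Optional<Charset> resolve(String encName) {
        try {
            return Optional.of(Charset.forName(encName));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Optional.empty();
        }
    }
}
```

Here resolve("utf-8") and resolve("UTF-8") yield the same charset, so accepting lower-case input and emitting the upper-case name are independent concerns.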