Posted to c-dev@xerces.apache.org by bu...@apache.org on 2003/04/11 12:16:26 UTC

DO NOT REPLY [Bug 18946] New: - MemBufFormatTarget does not honour encoding

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18946>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18946

           Summary: MemBufFormatTarget does not honour encoding
           Product: Xerces-C++
           Version: 2.2.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: DOM
        AssignedTo: xerces-c-dev@xml.apache.org
        ReportedBy: colin@colina.demon.co.uk


When serializing with DOMWriter::writeNode, after calling setEncoding to select
UTF-8, writing to a StdOutFormatTarget produces correct UTF-8.
Writing to a MemBufFormatTarget, however, appears to produce UTF-16.

At least, I think that's what's coming out. The data concerned is the trademark
symbol. In this example data, the character following the letters 3M is the TM
symbol. The input UTF-8 (cut-and-paste from emacs) is:
3M™ (hm. OK - on my screen it looks like: 3M\342\204\242).
The output is:
3Mâ&#132;¢ (which in emacs looks like 3M\303\242\302\204\302\242)
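
A note on the byte sequences quoted above: the octal dump of the output looks
less like UTF-16 and more like UTF-8 that has been encoded twice, i.e. what
results when each byte of the UTF-8 input is treated as a Latin-1 character and
then encoded to UTF-8 again. The C++ sketch below reproduces that mapping using
only the standard library. The helper names latin1BytesToUtf8 and octalDump are
hypothetical, and whether the serializer actually takes this path internally is
an assumption that the report itself does not confirm.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: treat every input byte as a Latin-1 (ISO-8859-1)
// code point and encode it as UTF-8. Bytes below 0x80 pass through
// unchanged; bytes 0x80..0xFF become a two-byte UTF-8 sequence.
std::string latin1BytesToUtf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: render bytes the way emacs shows raw bytes,
// with ASCII printed as-is and everything else as a \ooo octal escape.
std::string octalDump(const std::string& bytes) {
    std::string out;
    char buf[8];
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(b));
            out += buf;
        }
    }
    return out;
}
```

Passing the input bytes from the report, "3M\342\204\242", through
latin1BytesToUtf8 and then octalDump gives 3M\303\242\302\204\302\242, which
matches the output shown above byte for byte.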
