Posted to c-dev@xerces.apache.org by Hans Stoessel <hs...@pm-medici.ch> on 2005/10/12 17:15:41 UTC
Transcoding on Mac OS X
Hi
I parse a UTF-8 XML file on Mac OS X. In my C++ application I use a
standard string (std::string) to save the content of the tags. Now I have
problems with characters > 127. If I use XMLString::transcode there are two
bytes for such a character instead of one byte. But the std::string uses
only chars (1 byte) for storing the data.
How can I transcode the contents from XMLCh (2 bytes) into the right format
for my std::strings?
Thanks for any help.
Regards
Hans
---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org
Re: Transcoding on Mac OS X
Posted by da...@us.ibm.com.
> I parse a UTF-8 XML file on Mac OS X. In my C++ application I use a
> standard string (std::string) to save the content of the tags. Now I have
> problems with characters > 127. If I use XMLString::transcode there are two
> bytes for such a character instead of one byte. But the std::string uses
> only chars (1 byte) for storing the data.
But UTF-8 uses two, three, or four bytes to represent Unicode code points
above 127, so the behavior you're seeing is expected. If you require that
characters must be equal to code units, you cannot use std::string to hold
UTF-8, or any other multi-byte encoding, for that matter. However, I'm
not sure why this is a problem.
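To illustrate the point about code units versus characters, here is a minimal sketch of counting code points in a std::string that holds UTF-8. The function name countCodePoints is hypothetical and not part of Xerces-C++, and the sketch assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces-C++ function).
// Counts Unicode code points in a UTF-8 encoded std::string by counting
// every byte that is NOT a continuation byte (continuation bytes have the
// bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

With this, a string holding the two bytes 0xC3 0xA9 (the UTF-8 form of U+00E9) has size() == 2 but one code point, which is why std::string can store the data as long as the code does not assume one char per character.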
> How can I transcode the contents from XMLCh (2 bytes) into the right format
> for my std::strings?
It's not possible, of course, unless you want to use a single-byte
encoding, like ISO-8859-1, and you're sure it can represent all of the
characters in the XML document. If that's what you want, you'll need to
create a transcoder for that encoding, and use it instead of
XMLString::transcode(), which transcodes only to the local code page.
Dave
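As a rough illustration of the single-byte option described above, the following sketch converts UTF-16 code units to ISO-8859-1 by hand and reports failure when a character does not fit. It does not use the Xerces-C++ transcoder API; the function name utf16ToLatin1 is hypothetical, and std::u16string stands in for a buffer of XMLCh on the assumption that XMLCh is a 16-bit UTF-16 code unit on the platform in question.

```cpp
#include <string>

// Hypothetical helper (not a Xerces-C++ function).
// Converts UTF-16 code units to ISO-8859-1 bytes. ISO-8859-1 covers exactly
// U+0000..U+00FF, and each of those code points maps to the byte with the
// same value. Returns false (leaving 'out' empty) if any code unit is above
// 0xFF, which also rejects surrogate pairs.
bool utf16ToLatin1(const std::u16string& text, std::string& out)
{
    out.clear();
    out.reserve(text.size());
    for (char16_t unit : text) {
        if (unit > 0xFF) {
            out.clear();
            return false;   // not representable in ISO-8859-1
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
    }
    return true;
}
```

The result has exactly one byte per character, but only for documents restricted to that range; anything else, such as the euro sign U+20AC, makes the conversion fail rather than silently lose data.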