Posted to c-dev@xerces.apache.org by hb...@aol.com on 2007/09/12 17:02:14 UTC
UTF-8 Chinese characters
Hi, I am trying to parse an SVG file encoded in UTF-8 containing Chinese characters, and am having some problems.
When traversing through the nodes, I perform the following on a text node that contains Chinese characters:
XMLCh* ptr = (XMLCh*)dom_node->getNodeValue();
char* str_data = XMLString::transcode(dom_node->getNodeValue());
I was originally using xercesc 2.4, and the resulting str_data was NULL.
After upgrading to xercesc 2.8, I now get a char array of the correct length; however, each character is a question mark ('?').
Is there something I am doing incorrectly? I would appreciate any help.
Re: UTF-8 Chinese characters
Posted by Gareth Reakes <ga...@we7.com>.
Hi,
XMLString::transcode transcodes to your local code page. The characters
may not be mapped there. Even if they are, the editor or terminal you
are using to view them probably won't understand them. This topic has
been covered extensively on this mailing list many times. Take a look in
the archives for in-depth coverage.
Cheers,
Gareth
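To illustrate the point about the local code page: one way to keep the Chinese characters intact is to convert the XMLCh data to UTF-8 rather than to the local code page. The following is only a rough sketch, not Xerces code. It assumes XMLCh holds 16-bit UTF-16 code units (the usual case), and utf16ToUtf8 is a hypothetical helper name, not part of the Xerces-C API. In a real program you would more likely ask Xerces for a UTF-8 transcoder (for example through XMLTransService::makeNewTranscoderFor) than hand-roll the conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Xerces-C): converts a null-terminated
// UTF-16 string to UTF-8. Assumes 16-bit UTF-16 code units, which is how
// XMLCh is typically defined.
std::string utf16ToUtf8(const char16_t* src)
{
    assert(src != nullptr);
    std::string out;
    for (std::size_t i = 0; src[i] != 0; ++i) {
        char32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            const char16_t lo = src[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD; // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired low surrogate
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With something like this, each Chinese character in the node value becomes a three-byte UTF-8 sequence instead of a '?'. The result will still only display correctly in a terminal or editor that is itself set to UTF-8.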
hbosche@aol.com wrote:
> Hi, I am trying to parse an SVG file encoded in UTF-8 containing Chinese
> characters, and am having some problems.
> When traversing through the nodes, I perform the following on a text node
> that contains Chinese characters:
>
> XMLCh* ptr = (XMLCh*)dom_node->getNodeValue();
> char* str_data = XMLString::transcode(dom_node->getNodeValue());
>
> I was originally using xercesc 2.4, and the resulting str_data was NULL.
> After upgrading to xercesc 2.8, I now get a char array of the correct
> length; however, each character is a question mark ('?').
> Is there something I am doing incorrectly? I would appreciate any help.
--
Gareth Reakes, CTO WE7
+44-20-7117-0809 http://www.we7.com