Posted to c-dev@xerces.apache.org by hb...@aol.com on 2007/09/12 17:02:14 UTC

UTF-8 Chinese characters

Hi, I am trying to parse an SVG file encoded in UTF-8 that contains Chinese characters, and am having some problems.
When traversing through the nodes, I perform the following on a text node that contains Chinese characters:


XMLCh* ptr = (XMLCh*)dom_node->getNodeValue();

char* str_data = XMLString::transcode(dom_node->getNodeValue());

I was originally using xercesc 2.4, and the resulting str_data was NULL.
After upgrading to xercesc 2.8, I now get a char array of the correct length; however, each character is a question mark ('?').
Is there something I am doing incorrectly? I would appreciate any help.




Re: UTF-8 Chinese characters

Posted by Gareth Reakes <ga...@we7.com>.
Hi,

XMLString::transcode transcodes to your local code page, and the characters
may not be mapped there. Even if they are, the editor or terminal you
are using to view them probably won't understand them. This topic has been
covered extensively on this mailing list. Take a look in the
archives for in-depth coverage.
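
To make the point about code pages concrete: Xerces-C holds text as UTF-16
code units in XMLCh, so the Chinese characters are intact in the DOM and are
only lost when they are converted to a code page that cannot represent them.
The following is a rough sketch, not Xerces code, of producing UTF-8 bytes
from such a buffer instead. The helper name utf16ToUtf8 is hypothetical, and
char16_t stands in for XMLCh on the assumption that XMLCh is a 16-bit type in
your build (a cast may be needed). Xerces-C also has its own transcoders for
named encodings, which would be the more usual route; this sketch only avoids
depending on any particular version's API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Xerces-C): encode a buffer of UTF-16
// code units, such as the contents of an XMLCh string, as UTF-8 bytes.
std::string utf16ToUtf8(const char16_t* src, std::size_t len)
{
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        char32_t cp = src[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len
            && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // High surrogate followed by low surrogate: combine the pair.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD REPLACEMENT CHARACTER.
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            // 1 byte: plain ASCII.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes.
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: most Chinese characters fall in this range.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: code points outside the Basic Multilingual Plane.
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting std::string holds UTF-8 bytes, so it will still look wrong in a
terminal or editor that is not set to UTF-8. Checking the byte values
directly is the more reliable way to confirm the data survived.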

Cheers,

Gareth

hbosche@aol.com wrote:
> Hi, I am trying to parse an SVG file encoded in UTF-8 containing Chinese 
> characters, and am having some problems.
> When traversing though the nodes, I perform the following on a text node 
> that contains chinese characters:
> 
> XMLCh* ptr = (XMLCh*)dom_node->getNodeValue();
> char* str_data = XMLString::transcode(dom_node->getNodeValue());
> 
> I was originally using xercesc 2.4, and the resulting str_data was NULL.
> After upgrading to xercesc 2.8, I now get a char array of the correct 
> length, however each character is a question mark ('?')
> Is there something I am doing incorrectly?  I would appreciate any help

-- 
Gareth Reakes, CTO                                 WE7
+44-20-7117-0809                    http://www.we7.com
