Posted to c-users@xerces.apache.org by Mihai Matei <mi...@yahoo.com> on 2007/05/18 11:40:36 UTC

UTF-8 characters in CDATA section

Hi,

I'm trying to add the attached file's contents to a CDATA section in an XML document. The file contains a few UTF-8 encoded Unicode characters from http://www.columbia.edu/kermit/utf8-t1.html (you can view the file in Firefox with the Character Encoding set to Unicode (UTF-8)).

// The string 'text' holds the file contents.
// If I write it to a file with ofstream, the UTF-8 characters are preserved.

DOMElement* pText = pDoc->createElement( X(tag.c_str()));
DOMCDATASection* pCdata = pDoc->createCDATASection(X(text.c_str()));
pText->appendChild(pCdata);
parent->appendChild(pText);

The resulting XML, however, loses the UTF-8 characters. Is the X() macro to blame, or can I set other XML document properties so that I keep my UTF-8 characters?

Thanks.




Re: UTF-8 characters in CDATA section

Posted by Alberto Massari <am...@datadirect.com>.
The X() macro wraps a helper class that transcodes from the local
encoding (the local code page) to Unicode; since your data is UTF-8,
you need to use the UTF-8 transcoder instead.

Alberto
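
To illustrate why the local code page conversion loses the characters, here is a minimal sketch that uses only the C++ standard library, not Xerces itself. It assumes that XMLCh holds UTF-16 code units (modelled here with char16_t) and that the local code page is a single-byte one such as Latin-1. The function names utf8ToUtf16 and latin1ToUtf16 are hypothetical and exist only for this illustration. In Xerces-C++ itself the equivalent step would go through the library's UTF-8 transcoder, for example a transcoder obtained from XMLTransService::makeNewTranscoderFor("UTF-8", ...) or, in the 3.x releases, the TranscodeFromStr helper; check the documentation of your version for the exact signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Append one Unicode code point as UTF-16, using a surrogate pair
// for code points above U+FFFF.
static void appendUtf16(std::u16string& out, char32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}

// Decode UTF-8 bytes into UTF-16 code units (hypothetical helper).
// Throws std::runtime_error on malformed input. For brevity it does
// not reject overlong encodings.
std::u16string utf8ToUtf16(const std::string& bytes)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid Unicode code point");

        appendUtf16(out, cp);
        i += extra + 1;
    }
    return out;
}

// What a single-byte local code page conversion (Latin-1 assumed here)
// produces: one UTF-16 code unit per input byte, so every multi-byte
// UTF-8 sequence is split into several unrelated characters.
std::u16string latin1ToUtf16(const std::string& bytes)
{
    std::u16string out;
    for (unsigned char c : bytes)
        out.push_back(static_cast<char16_t>(c));
    return out;
}
```

With the two bytes 0xC3 0xA9 (the UTF-8 form of U+00E9), utf8ToUtf16 yields a single code unit, while latin1ToUtf16 yields two code units, which is the kind of corruption described in the question.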

At 02.40 18/05/2007 -0700, Mihai Matei wrote:
>Hi,
>
>I'm trying to add the attached file's contents to a CDATA section in 
>an xml. It contains a few Unicode-UTF8 characters from 
>http://www.columbia.edu/kermit/utf8-t1.html.
>(you can view the file with Firefox, set the Character Encoding to 
>Unicode(UTF8)).
>
>//string 'text' has the contents;
>//if I output it to a file with ofstream, the UTF8 characters are preserved
>
>DOMElement* pText = pDoc->createElement( X(tag.c_str()));
>DOMCDATASection* pCdata = pDoc->createCDATASection(X(text.c_str()));
>pText->appendChild(pCdata);
>parent->appendChild(pText);
>
>the resulting xml however loses the UTF-8 characters. Is it the X() 
>macro that is to blame, or can I set other XML Document properties 
>so I keep my UTF8 chars?
>
>Thanks.