Posted to c-dev@xerces.apache.org by Floroiu John <fl...@fokus.gmd.de> on 2000/06/05 21:55:16 UTC

transcoding &#...; characters


Hello,


I am running the Xerces DOM (built from Xerces-C-src_1_1_0.tar.gz) on a
Linux machine and I am having problems "transcode"-ing strings
_containing_ character references (like &#226;). These strings are
converted to null strings.

I saw some earlier messages on the list addressing a similar problem and
I added

char* currentLocale = setlocale(LC_ALL, "");

into the code, but it did not help. Any suggestions would be greatly
appreciated.


Thanks in advance,
John.

Re: transcoding &#...; characters

Posted by Floroiu John <fl...@fokus.gmd.de>.

Hi Dean,


So, an example would be:

  <color>
         <name>Farben</name>
         <or>
                <string>blau</string>
                <string>wei&#223;</string>
                <string>schwartz</string>
         </or>
  </color>

which basically says that the color of a certain object can be
blue or white or black (...).
The string wei&#223;, which is actually "weiß" (the German word for
white), is transcoded into a null string (containing just a '\0').
Using &#xdf; or &szlig;, which are the ISO 8859-1 equivalents for 'ß',
leads to the same result.
On the other hand, characters like &#x3c; (which is '<') work OK, so
it is definitely a problem with characters whose code is greater than
127.
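
One way to narrow this down outside of Xerces is sketched below. It
assumes (this is not confirmed anywhere in this thread) that the default
transcoder on Linux ends up in the C library's locale-dependent
conversion routines (wcstombs() and relatives), which refuse characters
that the active locale's encoding cannot represent. The helper names
narrowWithLocale and useEnvironmentLocale are made up for illustration
and are not part of the Xerces-C API.

```cpp
#include <clocale>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: select the locale named by the environment
// (LANG, LC_ALL, ...). setlocale() returns a null pointer when the
// request cannot be honored, so the result is worth checking.
bool useEnvironmentLocale()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: convert a wide string to the multibyte encoding
// of the current locale. Returns false (and leaves 'out' empty) if some
// character cannot be represented in that encoding.
bool narrowWithLocale(const wchar_t* wide, std::string& out)
{
    out.clear();

    // First pass: a null destination only asks for the required byte count.
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide;
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return false;   // a character has no representation in this locale

    // Second pass: do the conversion, leaving room for the terminator.
    std::string buf(needed + 1, '\0');
    state = std::mbstate_t();
    src = wide;
    const std::size_t written =
        std::wcsrtombs(&buf[0], &src, buf.size(), &state);
    if (written == static_cast<std::size_t>(-1))
        return false;

    buf.resize(written);   // drop the terminator and any unused space
    out.swap(buf);
    return true;
}
```

If narrowWithLocale(L"wei\u00df", s) fails right after
useEnvironmentLocale() has returned true, the locale actually in effect
is probably still an ASCII-only one (for example because LANG is unset
or names a locale that is not installed), which would be consistent
with '<' working while 'ß' does not.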


Thank you,
John.

Re: transcoding &#...; characters

Posted by Dean Roddey <dr...@charmedquark.com>.
Can you give a specific example of the problem, in a real XML file?

--------------------------
Dean Roddey
The CIDLib Class Libraries
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"Give me immortality, or give me death"
