Posted to c-dev@xerces.apache.org by "Datir, Vinayak" <vi...@siemens.com> on 2019/11/04 06:09:36 UTC

Node to string is corrupting the characters

Hi All,
We are trying to convert a DOM node into a string, but a few characters are being converted into question marks. The characters are all within the ISO 8859-1 code page.
Below is the sample code that converts the DOM to a string:

DOMImplementation *impl = DOMImplementationRegistry::getDOMImplementation(0);
DOMLSSerializer* serializer = ((DOMImplementationLS*)impl)->createLSSerializer();
DOMLSOutput* out = ((DOMImplementationLS*)impl)->createLSOutput();
XMLFormatTarget *target  = new MemBufFormatTarget();
out->setByteStream(target);
out->setEncoding( STRINGTOXMLCHAR("ISO-8859-1"));
serializer->write(node, out);
char* theXMLString_Encoded = (char*)((MemBufFormatTarget*)target)->getRawBuffer();
*buffer = (char *) MEM_alloc(strlen(theXMLString_Encoded) + 1);
tc_strcpy(*buffer, theXMLString_Encoded);
serializer->release();
out->release();
delete target;


Thanks,
Vinayak.

Re: Node to string is corrupting the characters

Posted by Alberto Massari <al...@tiscali.it>.
Hi Vinayak,
please include a full repro case, including the code that creates the 
"node" DOMNode with the specific characters that are not mapped 
correctly. Also, on which platform are you running, and what transcoding 
engine did you compile Xerces with (ICU, etc.)?

Alberto

On 04/11/19 07:09, Datir, Vinayak wrote:
> Hi All,
>
> We are trying to convert a DOM node into a string, but a few characters
> are being converted into question marks. The characters are all within
> the ISO 8859-1 code page.
>
> Below is the sample code that converts the DOM to a string:
>
> DOMImplementation *impl = DOMImplementationRegistry::getDOMImplementation(0);
> DOMLSSerializer* serializer = ((DOMImplementationLS*)impl)->createLSSerializer();
> DOMLSOutput* out = ((DOMImplementationLS*)impl)->createLSOutput();
> XMLFormatTarget *target  = new MemBufFormatTarget();
> out->setByteStream(target);
> out->setEncoding( STRINGTOXMLCHAR("ISO-8859-1"));
> serializer->write(node, out);
> char* theXMLString_Encoded = (char*)((MemBufFormatTarget*)target)->getRawBuffer();
> *buffer = (char *) MEM_alloc(strlen(theXMLString_Encoded) + 1);
> tc_strcpy(*buffer, theXMLString_Encoded);
> serializer->release();
> out->release();
> delete target;
>
> Thanks,
>
> Vinayak.
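
When preparing the repro case requested above, it may help to first confirm that the affected characters really are representable in ISO 8859-1. That encoding covers exactly the code points U+0000 to U+00FF, one byte each. Characters such as the euro sign (U+20AC) or curly quotes exist in Windows-1252 but not in ISO 8859-1, so they are easy to mistake for Latin-1 characters. Whether that is what happens in this thread is not known. The sketch below is a standalone check, not Xerces code: it uses std::u16string as a stand-in for the UTF-16 XMLCh text held by the DOM, and the function name findNonLatin1 is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xerces-C++): returns the indexes of the
// UTF-16 code units in `text` that ISO 8859-1 cannot represent.
// ISO 8859-1 maps code points U+0000..U+00FF directly to single bytes, so any
// code unit above 0x00FF (including surrogates) has no Latin-1 equivalent.
std::vector<std::size_t> findNonLatin1(const std::u16string& text)
{
    std::vector<std::size_t> offenders;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] > 0x00FF) {
            offenders.push_back(i);
        }
    }
    return offenders;
}
```

If a check like this reports no offending positions for the node's text, the characters are valid Latin-1 and the loss is more likely to happen elsewhere, for example where the DOM content is built from narrow strings or in the transcoding engine Alberto asks about.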