Posted to c-dev@xerces.apache.org by Milan Tomic <mi...@setcce.org> on 2005/08/19 10:53:23 UTC

Encoding issues

I'm parsing my XML like this using Xerces 2.5.0:

MemBufInputSource *mbis = new MemBufInputSource(
    (const unsigned char *const)xml, strlen(xml), L"...");
parser->parse(*mbis);

The problem is that the XML declaration in my document carries no encoding information:

<?xml version="1.0"?>

Originally my file was like this:

<?xml version="1.0" encoding="UTF-8"?>

but the encoding information gets lost when I use the MSXML parser in JScript,
because of the UTF-8 -> UTF-16 conversion.

Is there a way to tell Xerces which encoding the XML uses? Something
like this:

MemBufInputSource *mbis = new MemBufInputSource(
    (const unsigned char *const)xml, strlen(xml), L"...");
parser->setEncoding(L"UTF-8");
parser->parse(*mbis);

Thank you in advance,
Milan