Posted to c-users@xalan.apache.org by Graeme Ing <gr...@inclue.com> on 2006/03/09 00:29:58 UTC
FW: Xerces remapping xxx;
Hello all,
I’m using Xerces 2.7 and I’m trying to parse the following snippet from my XML file:
<title>Junk Mail - just how “heavy” a problem is it?</title>
The xml header/encoding on the file is:
<?xml version="1.0" encoding="UTF-8"?>
When I parse this and walk the DOM and extract the contents of this title node, I get back:
Junk Mail - just how “heavy†a problem is it?
Where the special characters are decimal 30,128,100 and 30,128,99
Why is Xerces interpreting the &#xxxx; codes and more importantly, how do I stop it? :-)
Here is my Xerces setup code:
m_parser = new XercesDOMParser();
m_parser->setValidationScheme( XercesDOMParser::Val_Never );
m_parser->setDoNamespaces( false );
m_parser->setDoSchema( false );
m_errorHandler = (ErrorHandler*) new HandlerBase();
m_parser->setErrorHandler( m_errorHandler );
Hope someone can help, thanks a lot!!
Graeme Ing
Re: FW: Xerces remapping xxx;
Posted by David Bertoni <db...@apache.org>.
Graeme Ing wrote:
> Hello all,
>
> I’m using Xerces 2.7 and I’m trying to parse the following snippet from
> my XML file:
>
> <title>Junk Mail - just how “heavy” a problem is it?</title>
>
> The xml header/encoding on the file is:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> When I parse this and walk the DOM and extract the contents of this
> title node, I get back:
>
> Junk Mail - just how “heavy†a problem is it?
>
This is a question for the Xerces-C list, not for the Xalan-C list.
Dave
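
A note on the byte values in the original post, since the thread ends without one. An XML processor is expected to expand numeric character references such as &#8220; into the characters they name, so the DOM holds the actual quotation marks and not the reference text. The values 30,128,100 and 30,128,99 match the UTF-8 encodings of U+201C and U+201D (E2 80 9C and E2 80 9D) when each byte is read as a signed char and the minus signs are dropped, and the garbled "heavy" text is what those bytes look like in a viewer that assumes a single-byte code page. The sketch below is only an illustration under those assumptions: it does not use Xerces, and the helper names encodeUtf8, asSignedBytes and escapeNonAscii are made up for this example. It shows where the byte values come from and one way to turn non-ASCII characters back into numeric references when writing text out.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one Unicode code point (assumed valid,
// at most 0x10FFFF) as a UTF-8 byte sequence.
std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: report each byte as a signed 8-bit value, which is
// how a debugger often displays char on platforms where char is signed.
std::vector<int> asSignedBytes(const std::string& s) {
    std::vector<int> values;
    for (char c : s) {
        int b = static_cast<unsigned char>(c);  // 0..255
        if (b >= 128) b -= 256;                 // map to -128..127
        values.push_back(b);
    }
    return values;
}

// Hypothetical helper: rewrite every non-ASCII character of a well-formed
// UTF-8 string as a decimal numeric character reference. It does not
// validate the input and does not escape markup characters such as & or <.
std::string escapeNonAscii(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size();) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {  // plain ASCII, copy through unchanged
            out += static_cast<char>(lead);
            ++i;
            continue;
        }
        // The lead byte determines the length of the sequence.
        int len = (lead >= 0xF0) ? 4 : (lead >= 0xE0) ? 3 : 2;
        std::uint32_t cp = lead & (0xFF >> (len + 1));
        // Each continuation byte contributes its low six bits.
        for (int k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        out += "&#" + std::to_string(cp) + ";";
        i += len;
    }
    return out;
}
```

Calling asSignedBytes(encodeUtf8(0x201C)) gives -30, -128, -100, and passing the UTF-8 title text through escapeNonAscii gives back &#8220;heavy&#8221;. Whether re-escaping is wanted at all depends on what consumes the output; if the consumer reads UTF-8 correctly, the expanded characters are already the right data.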