Posted to c-users@xerces.apache.org by Olumide <50...@web.de> on 2012/04/11 13:13:21 UTC

XMLString::transcode(XMLCh * , ... ) breaks introduces a newline for every ampersand in the input string

Hi -

Does anyone know why the characters method of the DefaultHandler class 
breaks, and introduces a newline, whenever it encounters the ampersand 
character in the input string? (I'm writing an XML parser.)

For example, when the parser encounters the input string "mergers &amp; 
acquisitions", the method DefaultHandler::characters(), shown below, prints

	Input: mergers
	Input: & acquisitions

Instead of

	Input: mergers & acquisitions

Thanks,

- Olumide


//
void DefaultHandler::characters (const XMLCh *const chars, const XMLSize_t length)
{
     // transcode() allocates the returned string; free it with release()
     char* str = XMLString::transcode( chars );
     cout << "Input: " << str << endl;
     XMLString::release( &str );
}

Re: XMLString::transcode(XMLCh * , ... ) breaks introduces a newline for every ampersand in the input string

Posted by Olumide <50...@web.de>.
On 11/04/2012 12:23, Alberto Massari wrote:
> It could happen when an entity reference is found, or if the text
> fragment is bigger than the internal buffer; in any case you must expect
> multiple invocations of the characters() method, that you must concatenate.

Thanks. I've already started doing just that.

- Olumide

Re: XMLString::transcode(XMLCh * , ... ) breaks introduces a newline for every ampersand in the input string

Posted by Alberto Massari <Al...@progress.com>.
It could happen when an entity reference is found, or if the text 
fragment is bigger than the internal buffer; in any case you must expect 
multiple invocations of the characters() method, that you must concatenate.

From
http://sax.sourceforge.net/apidoc/org/xml/sax/ContentHandler.html#characters%28char[],%20int,%20int%29:
The Parser will call this method to report each chunk of character data. 
SAX parsers may return all contiguous character data in a single chunk, 
or they may split it into several chunks; however, all of the characters 
in any single event must come from the same external entity so that the 
Locator provides useful information.

Alberto
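
For anyone finding this thread later, the concatenation advice above can be sketched roughly as follows. This is only an illustration and not Xerces code: the XMLCh and XMLSize_t aliases are stand-ins so that it compiles without the library, AccumulatingHandler and demo() are hypothetical names, and narrow() is a demo-only ASCII conversion standing in for a real transcoding step. The idea is to append each chunk to a buffer, using the length argument, and to read the joined text only when the element ends.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-ins for the Xerces types so the sketch builds without the library.
// Assumption: XMLCh is a 16-bit code unit, XMLSize_t an unsigned size type.
using XMLCh = char16_t;
using XMLSize_t = std::size_t;

// Hypothetical handler: buffers every characters() chunk and hands back the
// joined text only when the enclosing element ends.
class AccumulatingHandler {
public:
    // Start of an element: discard text left over from a previous element.
    void startElement() { text_.clear(); }

    // May be called several times for one run of text (for example around an
    // entity reference), so append instead of printing. The chunk is copied
    // using `length` rather than relying on a terminating null.
    void characters(const XMLCh* const chars, const XMLSize_t length) {
        assert(chars != nullptr || length == 0);
        if (length > 0) text_.append(chars, length);
    }

    // End of the element: return everything gathered since startElement().
    std::string endElement() const { return narrow(text_); }

private:
    // Demo-only conversion: keeps ASCII, replaces everything else with '?'.
    static std::string narrow(const std::u16string& s) {
        std::string out;
        for (char16_t c : s) out.push_back(c < 0x80 ? static_cast<char>(c) : '?');
        return out;
    }

    std::u16string text_;
};

// Simulates the two calls reported for "mergers &amp; acquisitions".
std::string demo() {
    AccumulatingHandler handler;
    handler.startElement();
    handler.characters(u"mergers ", 8);
    handler.characters(u"& acquisitions", 14);
    return handler.endElement();
}
```

Here demo() returns "mergers & acquisitions" as a single string. In a real handler the same append would go in the characters() override, and the buffer would be read in endElement().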
