Posted to c-dev@xerces.apache.org by Xiaofan Zhou <Xi...@businessobjects.com> on 2004/09/16 01:16:56 UTC

SAXParser with different InputSource

Hi all,
 
I have two questions about using InputSource with the SAX parser in
Xerces for C++.
 
I know I can do something like this:
 
    SAX2XMLReader* pParser = XMLReaderFactory::createXMLReader();
    pParser->parse(input);
 
where input is an instance of InputSource. My first question: given a
file, I can either create a LocalFileInputSource, or read the whole file
into a memory buffer and then create a MemBufInputSource. Which one is
better? My input files are large, around 20 MB.
 
Also, I've tried both the DOM and SAX parsers in Xerces-C++ (1.7). In
Windows Task Manager I can see that my process uses significantly less
memory during the XML parsing phase when the SAX parser is used.
However, the SAX parser does not seem able to handle larger input than
the DOM parser: in my case, when the input reaches 30 MB, both parsers
blow up. Any thoughts?
 
Thanks much in advance.
 
Frank.
 
 


Re: SAXParser with different InputSource

Posted by Alberto Massari <am...@progress.com>.
Hi Frank,

At 16.16 15/09/2004 -0700, Xiaofan Zhou wrote:
>Hi all,
>
>I have two questions about using InputSource with the SAX parser in
>Xerces for C++.
>
>I know I can do something like this:
>
>    SAX2XMLReader* pParser = XMLReaderFactory::createXMLReader();
>    pParser->parse(input);
>
>where input is an instance of InputSource. My first question: given a
>file, I can either create a LocalFileInputSource, or read the whole file
>into a memory buffer and then create a MemBufInputSource. Which one is
>better? My input files are large, around 20 MB.

The better choice is LocalFileInputSource. Every InputSource-derived 
object feeds the data to the parser in small chunks, so 
LocalFileInputSource has the advantage that the file is never completely 
loaded into memory. If you use MemBufInputSource, you have to allocate 
20 MB of memory before the parse phase even starts.
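
To make the memory difference concrete, here is a minimal sketch in plain 
standard C++. It does not use Xerces at all: countMarkupChunked and 
countMarkupWholeFile are hypothetical helpers invented for this 
illustration, and the 4096-byte default chunk size is an arbitrary 
assumption, not a value taken from the library. The first function holds 
at most one chunk at a time, which is roughly how a file-based input 
source behaves; the second holds the whole file, which is what filling a 
memory buffer up front costs.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not a Xerces API): counts '<' characters while
// reading the file in fixed-size chunks. Peak buffer use is chunkSize
// bytes, regardless of how large the file is.
std::size_t countMarkupChunked(const std::string& path,
                               std::size_t chunkSize = 4096)
{
    assert(chunkSize > 0);
    std::ifstream in(path.c_str(), std::ios::binary);
    std::vector<char> buf(chunkSize);
    std::size_t count = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();  // may be a short read at EOF
        for (std::streamsize i = 0; i < got; ++i) {
            if (buf[static_cast<std::size_t>(i)] == '<') ++count;
        }
    }
    return count;
}

// Hypothetical helper (not a Xerces API): same result, but the whole
// file is copied into memory before any of it is examined.
std::size_t countMarkupWholeFile(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    const std::string all((std::istreambuf_iterator<char>(in)),
                          std::istreambuf_iterator<char>());
    std::size_t count = 0;
    for (char c : all) {
        if (c == '<') ++count;
    }
    return count;
}
```

Both functions return the same count for the same file; only the peak 
memory differs, and that difference grows with the file size.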

>
>Also, I've tried both the DOM and SAX parsers in Xerces-C++ (1.7). In
>Windows Task Manager I can see that my process uses significantly less
>memory during the XML parsing phase when the SAX parser is used.
>However, the SAX parser does not seem able to handle larger input than
>the DOM parser: in my case, when the input reaches 30 MB, both parsers
>blow up. Any thoughts?

Building the DOM structure requires significantly more memory than pure 
SAX processing (unless you are allocating a DOM-like structure in the 
SAX callbacks). I have parsed files bigger than 100 MB.
Please try the SAXCount and DOMCount samples on the same files: if they 
succeed, the crash is caused by some extra code of yours rather than by 
the parser.
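
The caveat about SAX callbacks can be sketched with a toy example. This 
is plain standard C++, not Xerces: scanStartTags is a hypothetical, very 
naive stand-in for an event-driven parser (it does not handle comments or 
CDATA sections that contain '<'), and countElements and collectElements 
are made-up handlers. The counting handler keeps a single integer however 
large the input is, while the collecting handler keeps something for 
every element, so its memory grows with the document just as a DOM tree 
would.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a SAX-style parser (NOT Xerces): calls onStart with
// the name of every start tag found in xml.
void scanStartTags(const std::string& xml,
                   const std::function<void(const std::string&)>& onStart)
{
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        ++pos;                                   // step past '<'
        if (pos >= xml.size()) break;
        const char c = xml[pos];
        if (c == '/' || c == '?' || c == '!') continue;  // not a start tag
        const std::size_t end = xml.find_first_of(" \t\r\n/>", pos);
        onStart(xml.substr(pos, end == std::string::npos
                                    ? std::string::npos
                                    : end - pos));
    }
}

// Pure streaming handler: constant memory, in the spirit of SAXCount.
std::size_t countElements(const std::string& xml)
{
    std::size_t n = 0;
    scanStartTags(xml, [&n](const std::string&) { ++n; });
    return n;
}

// DOM-like handler: stores one entry per element, so memory use grows
// with the size of the document.
std::vector<std::string> collectElements(const std::string& xml)
{
    std::vector<std::string> names;
    scanStartTags(xml, [&names](const std::string& name) {
        names.push_back(name);
    });
    return names;
}
```

If a SAX-based program runs out of memory on large input, handlers of the 
second kind are a natural place to look first.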

Alberto 



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


RE: Handling DOM validation errors

Posted by Scott Cantor <ca...@osu.edu>.
> The only way I managed to trap (schema) validation errors while parsing
> using DOMBuilder was to install my own DOMErrorHandler.
> 
> */ Is this the correct way to do this? Is there any other way?

That's how I've been doing it.

> */ How does the 'XercesValidationErrorAsFatal' feature affect this?

I think it just ensures that those errors trigger the fatal error handler
method, and not the other method.

--Scott




Handling DOM validation errors

Posted by Matthew Berry <ma...@apt.com>.
Hi,

The only way I managed to trap (schema) validation errors while parsing
using DOMBuilder was to install my own DOMErrorHandler.

*/ Is this the correct way to do this? Is there any other way?
*/ How does the 'XercesValidationErrorAsFatal' feature affect this?

Thanks for the help

BTW, I think this question should be added to the FAQ.

