Posted to c-users@xerces.apache.org by Patrick Rutkowski <ru...@gmail.com> on 2010/05/18 06:40:16 UTC

Broken parser->setErrorHandler()

Ok, so, I'm new to Xerces as of 48 hours ago, and so far I really like it; but this problem I just ran into is extraordinarily frustrating.

First, let me show you a test case of how it's _supposed_ to work:
http://www.rutski89.com/static/xerces-test.cpp

Running this code produces something like this:

rutski@imac:~$ g++ test.cpp -lxerces-c && ./a.out
fatal:SAXParseException:invalid document structure
SAXException
rutski@imac:~$ 

Just as expected, since the XML data is the empty string "", the fatalError() in MyErrorHandler is triggered, it re-throws its "e", and that is then caught by the "catch(const SAXException& e)" in main(). Of course, SAXParseException is a sub-type of SAXException, so you would expect the catch in main() to happen.
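
For reference, the behaviour expected here can be reproduced without Xerces at all. The sketch below uses hypothetical stand-in types (BaseError, ParseError) in place of SAXException and SAXParseException, and a free function fatalError() in place of the handler method. It only illustrates the standard C++ rule that a handler for a base class catches a derived exception; it is not taken from xerces-test.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for SAXException and SAXParseException.
struct BaseError {
    explicit BaseError(const std::string& m) : msg(m) {}
    virtual ~BaseError() {}
    std::string msg;
};

struct ParseError : BaseError {
    explicit ParseError(const std::string& m) : BaseError(m) {}
};

// Mirrors an error handler whose fatalError() re-throws its argument.
// `throw e;` throws a copy of e using its static type (ParseError).
void fatalError(const ParseError& e) { throw e; }

// Returns a label for the catch clause that handled the exception.
std::string parseEmptyDocument() {
    try {
        ParseError problem("invalid document structure");
        fatalError(problem);
    } catch (const BaseError& e) {
        // A derived exception is matched by a base-class handler.
        return "BaseError: " + e.msg;
    } catch (...) {
        return "unknown";
    }
    return "no error";
}
```

Calling parseEmptyDocument() should take the BaseError branch. If the equivalent Xerces code lands in catch(...) instead, the language rule itself is probably not at fault, which would point toward the build or toolchain.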

But now take a look at this code, which is essentially the same, yet for some reason totally broken. Note that this is no longer a self-contained test case: there's no main(), so you can't run it as-is; it's taken from my actual project:

http://www.rutski89.com/static/xerces-broken.cpp

When I call ParseXML("") for some reason it does NOT function the same as the test case. I have verified with test prints and gdb that fatalError() in ThrowErrorHandler is indeed triggered like it's supposed to be, so we're good so far. But then, for some weird reason, the "catch(const SAXException& e)" is NOT triggered. Instead, the SAXParseException which is re-thrown by fatalError() gets caught by "catch(...)" in ParseXML(). God knows why. 

I need it to catch properly!

ARRRRRRRG! What did I do to deserve this!?

A ver frustrated developer,
-Patrick


Re: Broken parser->setErrorHandler()

Posted by Patrick Rutkowski <ru...@gmail.com>.
*very



Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
okay - I notice one thing, which is weird for me - but someone wiser may know why.

With the following code, the SAXParseException IS caught.

    try {
        parser->parse(xml_src);
    }
    catch (const SAXParseException& e) {
        error = XMLString::transcode(e.getMessage());
    }
    catch (...) {
        // NO, we do NOT get here.
    }

However, with this code, the Exception is NOT caught even though SAXException is SAXParseException's base:

    try {
        parser->parse(xml_src);
    }
    catch (const SAXException& e) {
        error = XMLString::transcode(e.getMessage());
    }
    catch (...) {
        // YES, we get here.
    }

A better c++ coder than I may be able to shed some interesting light on this.
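
One way to narrow this down is to ask the runtime what type actually arrived in catch(...). The sketch below is GCC/Clang specific: it relies on abi::__cxa_current_exception_type() and abi::__cxa_demangle() from <cxxabi.h>. The helper name currentExceptionTypeName(), the Unrelated type and probe() are hypothetical and exist only for this illustration.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <exception>
#include <string>
#include <typeinfo>

// Returns the demangled type name of the exception currently being handled,
// or an empty string when called outside a catch block.
std::string currentExceptionTypeName() {
    std::type_info* t = abi::__cxa_current_exception_type();
    if (!t) return "";
    int status = 0;
    char* demangled = abi::__cxa_demangle(t->name(), 0, 0, &status);
    // Fall back to the mangled name if demangling fails.
    std::string name = (status == 0 && demangled) ? demangled : t->name();
    std::free(demangled);
    return name;
}

struct Unrelated {};  // hypothetical type that no specific handler matches

// Throws something only catch(...) can handle and reports its type name.
std::string probe() {
    try {
        throw Unrelated();
    } catch (const std::exception&) {
        return "std::exception";
    } catch (...) {
        return currentExceptionTypeName();
    }
}
```

Dropping a call to currentExceptionTypeName() into the catch(...) of the failing code would show whether the runtime still considers the object a SAXParseException, which helps separate "wrong type thrown" from "type matching failed".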


Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
Xerces-C 3.0.1 on both Mac OS X (Snow Leopard, GCC 4.2.1) and Debian Linux (GCC 4.1.2).
I use the XCode IDE for debugging, but also use gdb natively.

What I can see happening so far...

.. the SAXParseException copy constructor is called,
but the original exception is being re-thrown...
.. then the original exception is destructed,
and so the catch is 'unknown exception'.
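
On the lifetime question: in standard C++, `throw e;` copy-initializes a separate exception object from e, so the later destruction of the original local should not, by itself, invalidate what the catch clause receives. The sketch below illustrates this with a hypothetical TracedError type standing in for SAXParseException. The names handlerRethrow(), report() and caughtMessage() are made up for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception type that counts how often it is copied.
struct TracedError {
    std::string msg;
    static int copies;
    explicit TracedError(const std::string& m) : msg(m) {}
    TracedError(const TracedError& other) : msg(other.msg) { ++copies; }
    virtual ~TracedError() {}
};
int TracedError::copies = 0;

// Like a fatalError() that re-throws its argument: the exception object
// that propagates is a copy of e, not e itself.
void handlerRethrow(const TracedError& e) { throw e; }

// Like a parser's error-reporting function: it owns a local exception
// object and hands it to the handler by reference.
void report() {
    TracedError toThrow("invalid document structure");
    handlerRethrow(toThrow);
}  // toThrow is destroyed during stack unwinding

// Returns the message seen by the catch clause.
std::string caughtMessage() {
    try {
        report();
    } catch (const TracedError& e) {
        return e.msg;  // refers to the copied exception object
    }
    return "";
}
```

Seeing the copy constructor run and then the original being destructed is therefore consistent with normal behaviour for this pattern, so on its own it may not explain why the base-class handler is skipped.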

-B

On 19 May 2010, at 07:31, Vitaly Prapirny wrote:

> Ben Griffin wrote:
>> If it is a scope issue, then it will depend on just how/when the SAXParseException destructor is called.
>> Why not stick a breakpoint there, and see what happens?
>> 
>> I'm off work now -but it sounds like you are having fun!
> 
> Could you provide more info about your development platform - OS,
> compiler version, Xerces-C++ version?
> 
> Good luck!
> 	Vitaly


Re: Broken parser->setErrorHandler()

Posted by Vitaly Prapirny <ma...@mebius.net>.
Ben Griffin wrote:
> If it is a scope issue, then it will depend on just how/when the SAXParseException destructor is called.
> Why not stick a breakpoint there, and see what happens?
>
> I'm off work now -but it sounds like you are having fun!

Could you provide more info about your development platform - OS,
compiler version, Xerces-C++ version?

Good luck!
	Vitaly

Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
If it is a scope issue, then it will depend on just how/when the SAXParseException destructor is called. 
Why not stick a breakpoint there, and see what happens?

I'm off work now -but it sounds like you are having fun!

On 18 May 2010, at 16:35, Patrick M. Rutkowski wrote:

> Wah, interesting; test.cpp on my machine ends up in the catch for SAXError&.
> 
> /me digs back into research.


Re: Broken parser->setErrorHandler()

Posted by Vitaly Prapirny <ma...@mebius.net>.
Patrick M. Rutkowski wrote:
> Wah, interesting; test.cpp on my machine ends up in the catch for SAXError&.
>
> /me digs back into research.

Could you provide more info about your development platform - OS,
compiler version, Xerces-C++ version?

Good luck!
	Vitaly


Re: Broken parser->setErrorHandler()

Posted by "Patrick M. Rutkowski" <ru...@gmail.com>.
Wah, interesting; test.cpp on my machine ends up in the catch for SAXError&.

/me digs back into research.

-Patrick

On Tue, May 18, 2010 at 11:23 AM, Ben Griffin <be...@redsnapper.net> wrote:
> It behaves exactly the same - dropping out at 'unknown exception'
>
>> I'm looking into it as we speak, but as I do, would you mind running
>> the test.cpp from my first post, to see which of the catch() sections
>> triggers for you? I'm just really curious. Here's the source again for
>> easy reference:
>>
>> http://www.rutski89.com/static/xerces-test.cpp
>>
>> Thanks again for the help thus far,
>> -Patrick
>>
>> On Tue, May 18, 2010 at 10:36 AM, Ben Griffin <be...@redsnapper.net> wrote:
>>> Well now, as I said, I'm not an expert on either xercesc or on exceptions but what I see is
>>> This is what is throwing the error to the handler.
>>> --------------------------------------------------
>>> void XercesDOMParser::error( const   unsigned int
>>>                             , const XMLCh* const
>>>                             , const XMLErrorReporter::ErrTypes  errType
>>>                             , const XMLCh* const                errorText
>>>                             , const XMLCh* const                systemId
>>>                             , const XMLCh* const                publicId
>>>                             , const XMLFileLoc                  lineNum
>>>                             , const XMLFileLoc                  colNum)
>>> {
>>>    SAXParseException toThrow = SAXParseException
>>>        (
>>>        errorText
>>>        , publicId
>>>        , systemId
>>>        , lineNum
>>>        , colNum
>>>        , getMemoryManager()
>>>        );
>>>
>>>    //
>>>    //  If there is an error handler registered, call it, otherwise ignore
>>>    //  all but the fatal errors.
>>>    //
>>>    if (!fErrorHandler)
>>>    {
>>>        if (errType == XMLErrorReporter::ErrType_Fatal)
>>>            throw toThrow;
>>>        return;
>>>    }
>>>
>>>    if (errType == XMLErrorReporter::ErrType_Warning)
>>>        fErrorHandler->warning(toThrow);
>>>    else if (errType >= XMLErrorReporter::ErrType_Fatal)
>>>        fErrorHandler->fatalError(toThrow);
>>>    else
>>>        fErrorHandler->error(toThrow);
>>> }
>>> --------------------------------------------------
>>> As I see it, the 'toThrow' exception is going to go out of scope as soon as your error reporter returns - but then throwing instantiated exceptions this way is one reason why I don't know or understand them - maybe it's fine.
>>>
>>> But my =guess= would be that you want to clone the exception - or treat it as a temporary messenger object - which would indicate why your re-throw is failing - as the memory is being recovered as the exception is passed out of scope again.
>>>
>>> What do you think?
>>>
>>>
>>> On 18 May 2010, at 15:22, Patrick M. Rutkowski wrote:
>>>
>>>> Oh! Oh!
>>>>
>>>> So it does throw to the catch(...), with the "unknown error"!
>>>>
>>>> That's exactly why it's broken!
>>>>
>>>> What's being thrown is a SAXParseException, which SHOULD be caught by
>>>> the catch(SAXException& e)! not catch(...)!
>>>>
>>>> The weird thing is that when I run the test.cpp, which I originally
>>>> posted, it does indeed get caught by the right handler, but then when I
>>>> later run the same code in my actual project (which was the 2nd source
>>>> listing in the original post) it gets caught by a different handler!
>>>> (the ... handler, which is wrong).
>>>>
>>>> I wonder if you'll see the point I'm trying to make there :-)
>>>> -Patrick
>>>>
>>>> P.S.
>>>> I haven't even read the rest of your message yet, I just really felt
>>>> like responding to the first two lines independently.
>>>> I'll read the rest now though.
>>>>
>>>> On Tue, May 18, 2010 at 10:01 AM, Ben Griffin <be...@redsnapper.net> wrote:
>>>>> Yes it throws.
>>>>> It is handled by your fatalError which then re-throws it to .. your "unknown error" catch.
>>>>> I don't know much about exception handling; in fact, I avoid using exceptions when I can.
>>>>>
>>>>> Instead, I install a DOMErrorHandler and then gather messages until after the parse. But that's me.
>>>>> I use a single DOMLSParser which lasts for the duration of the process, and then I let it manage the documents as it likes.
>>>>> So my normal approach is to use my own 'loadDocument' method which deals with the parsing and returns a DOMDocument, or a NULL
>>>>> Here is a simple example, which actually comes from a bug report.
>>>>>
>>>>> #include <iostream>
>>>>> #include <sstream>
>>>>> #include <string>
>>>>> #include <xercesc/dom/DOM.hpp>
>>>>> #include <xercesc/util/PlatformUtils.hpp>
>>>>> #include <xercesc/util/XMLString.hpp>
>>>>> #include <xercesc/util/XMLUni.hpp>
>>>>> #include <xercesc/util/XMLUniDefs.hpp>
>>>>> #include <xercesc/validators/common/Grammar.hpp>
>>>>>
>>>>> using namespace xercesc;
>>>>> using namespace std;
>>>>>
>>>>> // Collects error messages during the parse instead of throwing.
>>>>> class myErrorHandler : public DOMErrorHandler {
>>>>>        ostringstream errors;
>>>>> public:
>>>>>        // Called by the parser for each error; returning true continues parsing.
>>>>>        bool handleError(const xercesc::DOMError& domError) {
>>>>>                char* msg = XMLString::transcode(domError.getMessage());
>>>>>                errors << "[" << msg << "]";
>>>>>                XMLString::release(&msg);
>>>>>                return true;
>>>>>        }
>>>>>        // Hands back the collected messages and clears the buffer.
>>>>>        void errs(string& result) {
>>>>>                result = errors.str();
>>>>>                errors.str("");
>>>>>        }
>>>>>        myErrorHandler() : DOMErrorHandler() {}
>>>>> };
>>>>>
>>>>> int main(int argc, char *argv[]) {
>>>>>        const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};
>>>>>        XMLPlatformUtils::Initialize();
>>>>>        myErrorHandler* errorHandler = new myErrorHandler();
>>>>>        DOMImplementation* impl (DOMImplementationRegistry::getDOMImplementation (ls_id));
>>>>>        DOMLSParser* xmlParser = impl->createLSParser (DOMImplementationLS::MODE_SYNCHRONOUS,NULL);
>>>>>        DOMConfiguration* conf (xmlParser->getDomConfig ());
>>>>>        conf->setParameter(XMLUni::fgDOMErrorHandler,errorHandler);
>>>>>        conf->setParameter(XMLUni::fgXercesCacheGrammarFromParse,true);
>>>>>        conf->setParameter(XMLUni::fgXercesUseCachedGrammarInParse,true);
>>>>>        conf->setParameter(XMLUni::fgXercesSchema,true);
>>>>>        conf->setParameter(XMLUni::fgXercesIgnoreCachedDTD,false);
>>>>>        conf->setParameter(XMLUni::fgDOMValidate,true);
>>>>>        DOMDocument *foo = NULL;
>>>>>        xmlParser->loadGrammar("foo.xsd",Grammar::SchemaGrammarType,true);
>>>>>        foo = xmlParser->parseURI("foo.xml");
>>>>>        string err;
>>>>>        errorHandler->errs(err);
>>>>>        cout << "Errors for foo.xml:" << err << endl;
>>>>>        // Release the parser (and the documents it owns) before Terminate().
>>>>>        xmlParser->release();
>>>>>        delete errorHandler;
>>>>>        XMLPlatformUtils::Terminate();
>>>>>        return 0;
>>>>> }
>>>>>
>>>>> Adding memory inputsource is a bit more tricky - but I do it something like this:
>>>>>
>>>>> // needs #include <xercesc/framework/MemBufInputSource.hpp>
>>>>> const XMLCh mem[] = {chLatin_M, chLatin_E, chLatin_M, chNull};
>>>>> std::string xmlfile = "<?xml version=\"1.0\"?><bla/>";
>>>>> DOMDocument *foo = NULL;
>>>>> DOMLSInput *input = ((DOMImplementationLS*)impl)->createLSInput();
>>>>> const XMLByte *xmlraw = (const XMLByte*)(xmlfile.c_str());
>>>>> MemBufInputSource *mbis = new MemBufInputSource(xmlraw,xmlfile.size(),mem);
>>>>> mbis->setCopyBufToStream(false);
>>>>> input->setByteStream(mbis);
>>>>> input->setEncoding(XMLUni::fgUTF8EncodingString);
>>>>> try {
>>>>>        foo = xmlParser->parse(input);
>>>>> } catch (...) {
>>>>>        // deal with parser throws here!
>>>>> }
>>>>> input->release();
>>>>>
>>>>> I cannot say that any of this is the right way to do things, but maybe it's of some use to you?
>>>>> Lots of people use SAX instead of DOM, which depends upon your purpose of course.
>>>>>
>>>>> Not really sure that any of this helps.
>>>>> I am really STILL LEARNING xercesc after five years. Just ask Alberto how annoying I can be at times :D
>>>>>
>>>>>> Well, yeah, I am just learning the basics :-)
>>>>>>
>>>>>> But I figured that deleting the XercesDOMParser and returning the DOMDocument would be wrong, because the XercesDOMParser owns the DOMDocument, no?
>>>>>>
>>>>>> -Patrick
>>>>>>
>>>>>> P.S.
>>>>>> Did running my "broken" code on your machine trigger the catch(...) or the catch(const SAXException& e)?
>>>>>>
>>>>>> On May 18, 2010, at 7:08 AM, Ben Griffin wrote:
>>>>>>
>>>>>>> I had a look at your 'broken' source.
>>>>>>> There was no main function, so I added one in as follows:
>>>>>>>
>>>>>>> int main() {
>>>>>>>   XMLPlatformUtils::Initialize();
>>>>>>>      XercesDOMParser* parser = ParseXML("input");
>>>>>>>   return 0;
>>>>>>> }
>>>>>>>
>>>>>>> I commented out the QK:: push stuff, because it's not part of the library (that I know of).
>>>>>>>
>>>>>>> I added
>>>>>>> #include <xercesc/parsers/XercesDOMParser.hpp>
>>>>>>>
>>>>>>> I don't have / don't use util.hpp so I added
>>>>>>> namespace {
>>>>>>>      typedef std::basic_string<XMLCh> XercesString;
>>>>>>> }
>>>>>>>
>>>>>>> Regardless, I have sent you the source code directly.
>>>>>>> It compiles and runs.
>>>>>>>
>>>>>>> I don't really know why you would want to return the parser having parsed a file. Normally one wants the document from the parse.
>>>>>>> Maybe you are just getting the basics up first...!
>>>>>>>
>>>>>>>
>>>>>>> On 18 May 2010, at 12:01, Patrick Rutkowski wrote:
>>>>>>>
>>>>>>>> If any of you Xerces-C devs are up for it, I would be willing to go as far as doing a screen-sharing session to debug this, since it seems to be impossible to reproduce. I have my version of xerces already built with -g -O0 even.
>>>>>>>>
>>>>>>>> Of course, that might amount to you helping me find a silly bug in my code somewhere, free of charge, which might seem unfair. Then again, maybe it really is an obscure bug in Xerces, which would make it worth it.
>>>>>>>>
>>>>>>>> I dunno, I just don't know what to do anymore :-/
>>>>>>>>
>>>>>>>> -Patrick
>>>>>>>>
>>>>>>>> On May 18, 2010, at 2:47 AM, Vitaly Prapirny wrote:
>>>>>>>>
>>>>>>>>> Patrick Rutkowski wrote:
>>>>>>>>>> I have verified with test prints and gdb that fatalError()
>>>>>>>>>> in ThrowErrorHandler  is indeed triggered like it's supposed
>>>>>>>>>> to be, so we're good so far.
>>>>>>>>>
>>>>>>>>> So the parser->setErrorHandler() is not broken actually, your subject
>>>>>>>>> line misleads. I could assume your toolchain or project is
>>>>>>>>> somewhat broken. If you could prepare a minimal self-contained test case
>>>>>>>>> it would be very helpful to someone who wished to look at it.
>>>>>>>>>
>>>>>>>>> Good luck!
>>>>>>>>>    Vitaly
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>

Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
It behaves exactly the same - dropping out at 'unknown exception'

> I'm looking into it as we speak, but as I do, would you mind running
> the test.cpp from my first post, to see which of the catch() sections
> triggers for you? I'm just really curious. Here's the source again for
> easy reference:
> 
> http://www.rutski89.com/static/xerces-test.cpp
> 
> Thanks again for the help thus far,
> -Patrick


Re: Broken parser->setErrorHandler()

Posted by "Patrick M. Rutkowski" <ru...@gmail.com>.
I'm looking into it as we speak, but as I do, would you mind running
the test.cpp from my first post, to see which of the catch() sections
triggers for you? I'm just really curious. Here's the source again for
easy reference:

http://www.rutski89.com/static/xerces-test.cpp

Thanks again for the help thus far,
-Patrick

On Tue, May 18, 2010 at 10:36 AM, Ben Griffin <be...@redsnapper.net> wrote:
> Well now, as I said, I'm not an expert on either xercesc or on exceptions but what I see is
> This is what is throwing the error to the handler.
> --------------------------------------------------
> void XercesDOMParser::error( const   unsigned int
>                             , const XMLCh* const
>                             , const XMLErrorReporter::ErrTypes  errType
>                             , const XMLCh* const                errorText
>                             , const XMLCh* const                systemId
>                             , const XMLCh* const                publicId
>                             , const XMLFileLoc                  lineNum
>                             , const XMLFileLoc                  colNum)
> {
>    SAXParseException toThrow = SAXParseException
>        (
>        errorText
>        , publicId
>        , systemId
>        , lineNum
>        , colNum
>        , getMemoryManager()
>        );
>
>    //
>    //  If there is an error handler registered, call it, otherwise ignore
>    //  all but the fatal errors.
>    //
>    if (!fErrorHandler)
>    {
>        if (errType == XMLErrorReporter::ErrType_Fatal)
>            throw toThrow;
>        return;
>    }
>
>    if (errType == XMLErrorReporter::ErrType_Warning)
>        fErrorHandler->warning(toThrow);
>    else if (errType >= XMLErrorReporter::ErrType_Fatal)
>        fErrorHandler->fatalError(toThrow);
>    else
>        fErrorHandler->error(toThrow);
> }
> --------------------------------------------------
> As I see it, the 'toThrow' exception is going to go out of scope as soon as your error reporter returns - but then throwing instantiated exceptions this way is one reason why i don't know or understand them - maybe it's fine.
>
> But my =guess= would be that you want to clone the exception - or treat it as a temporary messenger object - which would indicate why your re-throw is failing - as the memory is being recovered as the exception is passed out of scope again.
>
> What do you think?
>
>
> On 18 May 2010, at 15:22, Patrick M. Rutkowski wrote:
>
>> Oh! Oh!
>>
>> So it does throw to the catch(...), with the "unknown error"!
>>
>> That's exactly why it's broken!
>>
>> What's being thrown is a SAXParserException, which SHOULD be caught by
>> the catch(SAXException& e)! not catch(...)!
>>
>> The weird thing is that when I run the test.cpp, which I originally
>> posted, it does indeed get caught by the right hander, but then when I
>> later run the same code in my actual project (which was the 2nd source
>> listing in the original posted) it gets caught by a different handler!
>> (the ... hander, which is wrong).
>>
>> I wonder if you'll see the point I'm trying to make there :-)
>> -Patrick
>>
>> P.S.
>> I haven't even read the rest of your message yet, I just really felt
>> like responding to the first two line independently.
>> I'll read the rest now though.
>>
>
>

Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
Well now, as I said, I'm not an expert on either xercesc or exceptions, but as far as I can see, this is what throws the error to the handler:
--------------------------------------------------
void XercesDOMParser::error( const   unsigned int
                             , const XMLCh* const
                             , const XMLErrorReporter::ErrTypes  errType
                             , const XMLCh* const                errorText
                             , const XMLCh* const                systemId
                             , const XMLCh* const                publicId
                             , const XMLFileLoc                  lineNum
                             , const XMLFileLoc                  colNum)
{
    SAXParseException toThrow = SAXParseException
        (
        errorText
        , publicId
        , systemId
        , lineNum
        , colNum
        , getMemoryManager()
        );

    //
    //  If there is an error handler registered, call it, otherwise ignore
    //  all but the fatal errors.
    //
    if (!fErrorHandler)
    {
        if (errType == XMLErrorReporter::ErrType_Fatal)
            throw toThrow;
        return;
    }

    if (errType == XMLErrorReporter::ErrType_Warning)
        fErrorHandler->warning(toThrow);
    else if (errType >= XMLErrorReporter::ErrType_Fatal)
        fErrorHandler->fatalError(toThrow);
    else
        fErrorHandler->error(toThrow);
}
--------------------------------------------------
As I see it, the 'toThrow' exception is going to go out of scope as soon as your error handler returns. Then again, throwing instantiated exceptions this way is one reason why I don't really understand them, so maybe it's fine.

But my guess would be that you need to clone the exception, or treat it as a temporary messenger object. That might explain why your re-throw is failing: the memory is being reclaimed as the exception goes out of scope.

What do you think?


On 18 May 2010, at 15:22, Patrick M. Rutkowski wrote:

> Oh! Oh!
> 
> So it does throw to the catch(...), with the "unknown error"!
> 
> That's exactly why it's broken!
> 
> What's being thrown is a SAXParseException, which SHOULD be caught by
> the catch(SAXException& e)! not catch(...)!
> 
> The weird thing is that when I run the test.cpp, which I originally
> posted, it does indeed get caught by the right handler, but then when I
> later run the same code in my actual project (which was the 2nd source
> listing in the original post) it gets caught by a different handler!
> (the ... handler, which is wrong).
> 
> I wonder if you'll see the point I'm trying to make there :-)
> -Patrick
> 
> P.S.
> I haven't even read the rest of your message yet, I just really felt
> like responding to the first two lines independently.
> I'll read the rest now though.
> 


Re: Broken parser->setErrorHandler()

Posted by "Patrick M. Rutkowski" <ru...@gmail.com>.
Oh! Oh!

So it does throw to the catch(...), with the "unknown error"!

That's exactly why it's broken!

What's being thrown is a SAXParseException, which SHOULD be caught by
the catch(SAXException& e)! not catch(...)!

The weird thing is that when I run the test.cpp, which I originally
posted, it does indeed get caught by the right handler, but then when I
later run the same code in my actual project (which was the 2nd source
listing in the original post) it gets caught by a different handler!
(the ... handler, which is wrong).

I wonder if you'll see the point I'm trying to make there :-)
-Patrick

P.S.
I haven't even read the rest of your message yet, I just really felt
like responding to the first two lines independently.
I'll read the rest now though.

On Tue, May 18, 2010 at 10:01 AM, Ben Griffin <be...@redsnapper.net> wrote:
> Yes it throws.
> It is handled by your fatalError which then re-throws it to .. your "unknown error" catch.
> I don't know much about exception handling; in fact I avoid using exceptions when I can.
>
> Instead, I install a DOMErrorHandler and then gather messages until after the parse. But that's me.
> I use a single DOMLSParser which lasts for the duration of the process, and then I let it manage the documents as it likes.
> So my normal approach is to use my own 'loadDocument' method which deals with the parsing and returns a DOMDocument, or NULL.
> Here is a simple example, which actually comes from a bug report.
>
> #include <iostream>
> #include <sstream>
> #include <xercesc/dom/DOM.hpp>
>
> using namespace xercesc;
> using namespace std;
>
> class myErrorHandler : public DOMErrorHandler {
>        ostringstream errors;
> public:
>        bool handleError(const xercesc::DOMError& domError) {
>                char* msg = XMLString::transcode(domError.getMessage());
>                errors << "[" << msg << "]";
>                XMLString::release(&msg);
>                return true;
>        }
>        void errs(string& result) {
>                result = errors.str();
>                errors.str("");
>        }
>        myErrorHandler() : DOMErrorHandler() {}
> };
>
> int main(int argc, char *argv[]) {
>        const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};
>        XMLPlatformUtils::Initialize();
>        myErrorHandler* errorHandler = new myErrorHandler();
>        DOMImplementation* impl (DOMImplementationRegistry::getDOMImplementation (ls_id));
>        DOMLSParser* xmlParser = impl->createLSParser (DOMImplementationLS::MODE_SYNCHRONOUS,NULL);
>        DOMConfiguration* conf (xmlParser->getDomConfig ());
>        conf->setParameter(XMLUni::fgDOMErrorHandler,errorHandler);
>        conf->setParameter(XMLUni::fgXercesCacheGrammarFromParse,true);
>        conf->setParameter(XMLUni::fgXercesUseCachedGrammarInParse,true);
>        conf->setParameter(XMLUni::fgXercesSchema,true);
>        conf->setParameter(XMLUni::fgXercesIgnoreCachedDTD,false);
>        conf->setParameter(XMLUni::fgDOMValidate,true);
>        DOMDocument *foo = NULL;
>        xmlParser->loadGrammar("foo.xsd",Grammar::SchemaGrammarType,true);
>        foo = xmlParser->parseURI("foo.xml");
>        string err;
>        errorHandler->errs(err);
>        cout << "Errors for foo.xml:" << err << endl;
>        XMLPlatformUtils::Terminate();
>        return 0;
> }
>
> Adding memory inputsource is a bit more tricky - but I do it something like this:
>
> const XMLCh mem[] = {chLatin_M, chLatin_E, chLatin_M, chNull};  // buffer id
> std::string xmlfile = "<?xml version=\"1.0\" ?><bla bla bla />";  // placeholder; substitute a real document
> DOMDocument *foo = NULL;
> DOMLSInput *input = ((DOMImplementationLS*)impl)->createLSInput();
> const XMLByte *xmlraw = reinterpret_cast<const XMLByte*>(xmlfile.c_str());
> MemBufInputSource *mbis = new MemBufInputSource(xmlraw,xmlfile.size(),mem);
> mbis->setCopyBufToStream(false);
> input->setByteStream(mbis);
> input->setEncoding(XMLUni::fgUTF8EncodingString);
> try {
>        foo = xmlParser->parse(input);
> } catch (...) {
>        // deal with parser throws here!
> }
> input->release();
>
> I cannot say that any of this is the right way to do things, but maybe it's of some use to you?
> Lots of people use SAX instead of DOM, which depends upon your purpose of course.
>
> Not really sure that any of this helps.
> I am really STILL LEARNING xercesc after five years. Just ask Alberto how annoying I can be at times :D
>

Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
Yes it throws.
It is handled by your fatalError which then re-throws it to .. your "unknown error" catch.
I don't know much about exception handling; in fact I avoid using exceptions when I can.

Instead, I install a DOMErrorHandler and then gather messages until after the parse. But that's me.
I use a single DOMLSParser which lasts for the duration of the process, and then I let it manage the documents as it likes.
So my normal approach is to use my own 'loadDocument' method which deals with the parsing and returns a DOMDocument, or NULL.
Here is a simple example, which actually comes from a bug report.

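#include <iostream>
#include <sstream>
#include <string>
#include <xercesc/dom/DOM.hpp>
#include <xercesc/util/PlatformUtils.hpp>            // XMLPlatformUtils
#include <xercesc/util/XMLString.hpp>                // XMLString::transcode
#include <xercesc/util/XMLUni.hpp>                   // XMLUni::fg... constants
#include <xercesc/util/XMLUniDefs.hpp>               // chLatin_L, chNull, ...
#include <xercesc/validators/common/Grammar.hpp>     // Grammar::SchemaGrammarType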

using namespace xercesc; 
using namespace std; 

class myErrorHandler : public DOMErrorHandler {
	ostringstream errors;
public:
	bool handleError(const xercesc::DOMError& domError) {
		char* msg = XMLString::transcode(domError.getMessage());
		errors << "[" << msg << "]";
		XMLString::release(&msg);
		return true;
	}
	void errs(string& result) {
		result = errors.str();
		errors.str("");
	}
	myErrorHandler() : DOMErrorHandler() {}
};

int main(int argc, char *argv[]) {
	const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};
	XMLPlatformUtils::Initialize();
	myErrorHandler* errorHandler = new myErrorHandler();
	DOMImplementation* impl (DOMImplementationRegistry::getDOMImplementation (ls_id));
	DOMLSParser* xmlParser = impl->createLSParser (DOMImplementationLS::MODE_SYNCHRONOUS,NULL);
	DOMConfiguration* conf (xmlParser->getDomConfig ());
	conf->setParameter(XMLUni::fgDOMErrorHandler,errorHandler);
	conf->setParameter(XMLUni::fgXercesCacheGrammarFromParse,true);		
	conf->setParameter(XMLUni::fgXercesUseCachedGrammarInParse,true);		
	conf->setParameter(XMLUni::fgXercesSchema,true);	
	conf->setParameter(XMLUni::fgXercesIgnoreCachedDTD,false);
	conf->setParameter(XMLUni::fgDOMValidate,true);
	DOMDocument *foo = NULL;
	xmlParser->loadGrammar("foo.xsd",Grammar::SchemaGrammarType,true);
	foo = xmlParser->parseURI("foo.xml");
	string err;
	errorHandler->errs(err);
	cout << "Errors for foo.xml:" << err << endl;
	XMLPlatformUtils::Terminate();
	return 0;
}

Adding memory inputsource is a bit more tricky - but I do it something like this:

const XMLCh mem[] = {chLatin_M, chLatin_E, chLatin_M, chNull};  // buffer id
std::string xmlfile = "<?xml version=\"1.0\" ?><bla bla bla />";  // placeholder; substitute a real document
DOMDocument *foo = NULL;
DOMLSInput *input = ((DOMImplementationLS*)impl)->createLSInput();
const XMLByte *xmlraw = reinterpret_cast<const XMLByte*>(xmlfile.c_str());
MemBufInputSource *mbis = new MemBufInputSource(xmlraw,xmlfile.size(),mem);
mbis->setCopyBufToStream(false);
input->setByteStream(mbis);
input->setEncoding(XMLUni::fgUTF8EncodingString);
try {
	foo = xmlParser->parse(input);
} catch (...) {
	// deal with parser throws here!
}
input->release();

I cannot say that any of this is the right way to do things, but maybe it's of some use to you?
Lots of people use SAX instead of DOM, which depends upon your purpose of course.

Not really sure that any of this helps.
I am really STILL LEARNING xercesc after five years. Just ask Alberto how annoying I can be at times :D

> Well, yeah, I am just learning the basics :-)
> 
> But I figured that deleting the XercesDOMParser and returning the DOMDocument would be wrong, because the XercesDOMParser owns the DOMDocument, no?
> 
> -Patrick
> 
> P.S.
> Did running my "broken" code on your machine trigger the catch(...) or the catch(const SAXException& e)?
> 


Re: Broken parser->setErrorHandler()

Posted by Patrick Rutkowski <ru...@gmail.com>.
Well, yeah, I am just learning the basics :-)

But I figured that deleting the XercesDOMParser and returning the DOMDocument would be wrong, because the XercesDOMParser owns the DOMDocument, no?

-Patrick

P.S.
Did running my "broken" code on your machine trigger the catch(...) or the catch(const SAXException& e)?

On May 18, 2010, at 7:08 AM, Ben Griffin wrote:

> I had a look at your 'broken' source.
> There was no main function, so I added one in as follows:
> 
> int main() {
>    XMLPlatformUtils::Initialize();
>  	XercesDOMParser* parser = ParseXML("input");
>    return 0;
> }
> 
> I commented out the QK:: push stuff, because it's not part of the library (that I know of).
> 
> I added
> #include <xercesc/parsers/XercesDOMParser.hpp>
> 
> I don't have / don't use util.hpp so I added 
> namespace {
> 	typedef std::basic_string<XMLCh> XercesString;
> }	
> 
> Regardless, I have sent you the source code directly.
> It compiles and runs.
> 
> I don't really know why you would want to return the parser having parsed a file. Normally one wants the document from the parse.
> Maybe you are just getting the basics up first...!
> 
> 


Re: Broken parser->setErrorHandler()

Posted by Ben Griffin <be...@redsnapper.net>.
I had a look at your 'broken' source.
There was no main function, so I added one in as follows:

int main() {
    XMLPlatformUtils::Initialize();
    XercesDOMParser* parser = ParseXML("input");
    return 0;
}

I commented out the QK:: push stuff, because it's not part of the library (that I know of).

I added
#include <xercesc/parsers/XercesDOMParser.hpp>

I don't have / don't use util.hpp so I added 
namespace {
	typedef std::basic_string<XMLCh> XercesString;
}	

Regardless, I have sent you the source code directly.
It compiles and runs.

I don't really know why you would want to return the parser having parsed a file. Normally one wants the document from the parse.
Maybe you are just getting the basics up first...!


On 18 May 2010, at 12:01, Patrick Rutkowski wrote:

> If any of you Xerces-C devs are up for it, I would be willing to go as far as doing a screen-sharing session to debug this, since it seems to be impossible to reproduce. I have my version of xerces already built with -g -O0 even.
> 
> Of course, that might amount to you helping me find a silly bug in my code somewhere, free of charge, which might seem unfair. Then again, maybe it really is an obscure bug in Xerces, which would make it worth it.
> 
> I dunno, I just don't know what to do anymore :-/
> 
> -Patrick
> 


Re: Broken parser->setErrorHandler()

Posted by Patrick Rutkowski <ru...@gmail.com>.
If any of you Xerces-C devs are up for it, I would be willing to go as far as doing a screen-sharing session to debug this, since it seems to be impossible to reproduce. I have my version of xerces already built with -g -O0 even.

Of course, that might amount to you helping me find a silly bug in my code somewhere, free of charge, which might seem unfair. Then again, maybe it really is an obscure bug in Xerces, which would make it worth it.

I dunno, I just don't know what to do anymore :-/

-Patrick

On May 18, 2010, at 2:47 AM, Vitaly Prapirny wrote:

> Patrick Rutkowski wrote:
>> I have verified with test prints and gdb that fatalError()
>> in ThrowErrorHandler  is indeed triggered like it's supposed
>> to be, so we're good so far.
> 
> So parser->setErrorHandler() is not actually broken; your subject
> line is misleading. I would guess that your toolchain or project is
> somewhat broken. If you could prepare a minimal self-contained test case,
> it would be very helpful to anyone who wishes to look at it.
> 
> Good luck!
> 	Vitaly


Re: Broken parser->setErrorHandler()

Posted by Patrick Rutkowski <ru...@gmail.com>.
Perhaps the subject line wasn't the best, that's true.

I tried to prepare a test case, as in the previous post, but in the test case it works!

I don't know what else to do :-(

-Patrick

On May 18, 2010, at 2:47 AM, Vitaly Prapirny wrote:

> Patrick Rutkowski wrote:
>> I have verified with test prints and gdb that fatalError()
>> in ThrowErrorHandler  is indeed triggered like it's supposed
>> to be, so we're good so far.
> 
> So parser->setErrorHandler() is not actually broken; your subject
> line is misleading. I would guess that your toolchain or project is
> somewhat broken. If you could prepare a minimal self-contained test case,
> it would be very helpful to anyone who wishes to look at it.
> 
> Good luck!
> 	Vitaly


Re: Broken parser->setErrorHandler()

Posted by Vitaly Prapirny <ma...@mebius.net>.
Patrick Rutkowski wrote:
> I have verified with test prints and gdb that fatalError()
> in ThrowErrorHandler  is indeed triggered like it's supposed
> to be, so we're good so far.

So parser->setErrorHandler() is not actually broken; your subject
line is misleading. I would guess that your toolchain or project is
somewhat broken. If you could prepare a minimal self-contained test case,
it would be very helpful to anyone who wishes to look at it.

Good luck!
	Vitaly