Posted to c-dev@xerces.apache.org by "Peter Burns (JIRA)" <xe...@xml.apache.org> on 2011/04/05 18:33:06 UTC

[jira] [Created] (XERCESC-1961) Invalid IGXMLScanner::fDTDGrammar, causing segfault

Invalid IGXMLScanner::fDTDGrammar, causing segfault
---------------------------------------------------

                 Key: XERCESC-1961
                 URL: https://issues.apache.org/jira/browse/XERCESC-1961
             Project: Xerces-C++
          Issue Type: Bug
          Components: SAX/SAX2, Utilities, Validating Parser (DTD)
    Affects Versions: 2.6.0, 2.8.0
         Environment: Linux, OpenVXI (http://sourceforge.net/projects/openvxi/)
            Reporter: Peter Burns


The problem occurs while OpenVXI is initialising, when it parses a couple of (hard-coded) DTDs and then a (hard-coded) XSD. During the DTD parsing (SAX2XMLReader::parse()), a DTDGrammar is created and stored in two places: GrammarResolver::fGrammarBucket and IGXMLScanner::fDTDGrammar. At the start of the XSD parsing (SAX2XMLReader::loadGrammar()), the grammar bucket is cleared, deleting the DTDGrammar but leaving IGXMLScanner::fDTDGrammar still pointing to it. During that parsing, IGXMLScanner::getEntityDeclPool() is called and in turn calls fDTDGrammar->getEntityDeclPool() on the deleted object. This sometimes causes a segfault (though usually only after our app, which performs these operations over and over, has been running for a few hours).

I have some code which reproduces the problem - I'll attach it to this case as soon as I can work out how. Since the code rarely segfaults, I've been demonstrating it by adding printf()s to the DTDGrammar constructor/destructor and IGXMLScanner::getEntityDeclPool(). So my test code currently generates this:

[peter@ultra1 xerces_bug]$ ./test
DTDGrammar::DTDGrammar() this = 0x95691508
Warning in file vxml 1.0 defaults at line 2 column 51
Reason: Element 'metadata' was referenced in a content model but never declared
DTDGrammar::~DTDGrammar() this = 0x95691508
DTDGrammar::DTDGrammar() this = 0x956fa908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
IGXMLScanner::getEntityDeclPool() fDTDGrammar = 0x95691508
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x956fa908
[peter@ultra1 xerces_bug]$

showing the DTDGrammar this=0x95691508 being created, deleted and then used by IGXMLScanner.
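
The tracing approach described above can be sketched roughly as follows. This is not the attached test code or the Xerces-C++ source: TracedGrammar, isLive() and staleAfterDelete() are hypothetical names, and the live-address set is an addition that the printf()s in the report did not have.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <set>

// Hypothetical stand-in for DTDGrammar; not the Xerces-C++ class.
class TracedGrammar {
public:
    TracedGrammar() {
        liveSet().insert(address());
        std::printf("TracedGrammar::TracedGrammar() this = %p\n",
                    static_cast<const void*>(this));
    }
    ~TracedGrammar() {
        liveSet().erase(address());
        std::printf("TracedGrammar::~TracedGrammar() this = %p\n",
                    static_cast<const void*>(this));
    }
    TracedGrammar(const TracedGrammar&) = delete;
    TracedGrammar& operator=(const TracedGrammar&) = delete;

    // The address as an integer, so it can be remembered and compared
    // after the object has been destroyed without touching the pointer.
    std::uintptr_t address() const {
        return reinterpret_cast<std::uintptr_t>(this);
    }

    // True if an instance currently lives at the given address.
    static bool isLive(std::uintptr_t addr) {
        return liveSet().count(addr) != 0;
    }

private:
    static std::set<std::uintptr_t>& liveSet() {
        static std::set<std::uintptr_t> live;
        return live;
    }
};

// Remember an address while the object is alive, destroy the object,
// then ask whether the remembered address is still live.
bool staleAfterDelete() {
    TracedGrammar* g = new TracedGrammar;
    const std::uintptr_t addr = g->address();
    assert(TracedGrammar::isLive(addr));
    delete g;
    return !TracedGrammar::isLive(addr);
}
```

One caveat with this kind of check: the allocator may hand the same address to a later object (0x95ac4908 is reused many times in the output above), so an address that tests as live is not proof that it is the same object.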

Our fix is to set fDTDGrammar to 0 after the bucket-clearing operation

fGrammarResolver->useCachedGrammarInParse(toCache);

at the start of IGXMLScanner::loadGrammar(), and this solves our problem.
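
As a rough model of this fix (not a patch against Xerces-C++), the sketch below uses simplified, hypothetical classes. Only the names fDTDGrammar, loadGrammar() and getEntityDeclPool() come from the report. Resolver, Scanner, parseDTD(), hasDTDGrammar() and the null check in getEntityDeclPool() are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins; these are not the Xerces-C++ classes.
struct EntityDeclPool {};

class Grammar {
public:
    EntityDeclPool* getEntityDeclPool() { return &fPool; }
private:
    EntityDeclPool fPool;
};

// Owns the grammars, playing the role of the grammar bucket.
class Resolver {
public:
    Grammar* put(const std::string& key) {
        std::unique_ptr<Grammar>& slot = fBucket[key];
        slot.reset(new Grammar);
        return slot.get();
    }
    void clear() { fBucket.clear(); }  // deletes every grammar it owns
private:
    std::map<std::string, std::unique_ptr<Grammar>> fBucket;
};

class Scanner {
public:
    // DTD parsing: the grammar ends up in the bucket and in fDTDGrammar.
    void parseDTD() { fDTDGrammar = fResolver.put("dtd"); }

    // Grammar loading: the bucket is cleared first.
    void loadGrammar() {
        fResolver.clear();
        // The fix: forget the non-owning pointer as soon as its owner
        // has deleted the object it pointed to.
        fDTDGrammar = nullptr;
    }

    // Assumption for this sketch: report "no pool" when no grammar is set.
    EntityDeclPool* getEntityDeclPool() {
        return fDTDGrammar ? fDTDGrammar->getEntityDeclPool() : nullptr;
    }

    bool hasDTDGrammar() const { return fDTDGrammar != nullptr; }

private:
    Resolver fResolver;
    Grammar* fDTDGrammar = nullptr;  // non-owning copy of the bucket entry
};
```

Calling parseDTD(), then loadGrammar(), then getEntityDeclPool() on a Scanner returns a null pool in this model. Without the reset in loadGrammar(), the last call would dereference a deleted Grammar, which is the failure described in this report.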

We've reproduced the problem in v2.6.0 and v2.7.0, but v3.1.1 doesn't call IGXMLScanner::getEntityDeclPool() in our test code. However, tracing it in gdb I can see that v3.1.1 does potentially have the same problem, i.e. IGXMLScanner::fDTDGrammar is pointing to a deleted DTDGrammar after IGXMLScanner::loadGrammar() has cleared the cache.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


[jira] [Updated] (XERCESC-1961) Invalid IGXMLScanner::fDTDGrammar, causing segfault

Posted by "Peter Burns (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESC-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Burns updated XERCESC-1961:
---------------------------------

    Attachment: 2011-04-05_xerces_bug.tar.gz

Here's our test code: Schema.hpp is verbatim from the OpenVXI code; DTDResolver is from OpenVXI's DocumentParser.cpp.
