You are viewing a plain text version of this content. The canonical link for it is here.
Posted to c-dev@xerces.apache.org by "boris-kolpackov (via GitHub)" <gi...@apache.org> on 2023/12/05 13:47:23 UTC

[PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

boris-kolpackov opened a new pull request, #54:
URL: https://github.com/apache/xerces-c/pull/54

   These are the instructions for observing the bug (before this commit):
   
   ````
   $ git clone https://github.com/apache/xerces-c.git $ cd xerces-c
   $ mkdir build
   $ cd build
   $ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug .. $ make -j8
   $ cp ../samples/data/personal.xml .
   
   $ cat <<EOF >personal.dtd
   <?xml encoding="ISO-8859-1"?>
   <!ENTITY % nonExistentEntity SYSTEM "non-existent.ent"> %nonExistentEntity;
   EOF
   
   $ gdb samples/StdInParse
   (gdb) b IGXMLScanner.cpp:1544
   (gdb) run <personal.xml
   1544	            fReaderMgr.pushReader(reader, declDTD);
   (gdb) p declDTD
   $1 = (xercesc_4_0::DTDEntityDecl *) 0x49ac68
   (gdb) n
   1547	            dtdScanner.scanExtSubsetDecl(false, true);
   (gdb) n
   1548	        }
   (gdb) s
   ...
   (gdb) s                     # The Janitor is about to delete the above declDTD.
   90	        delete fData;
   (gdb) p fData
   $1 = (xercesc_4_0::DTDEntityDecl *) 0x49ac68
   (gdb) b ReaderMgr.cpp:1024
   (gdb) n
   ...
   (gdb) n                     # Now we about to dereference the deleted declDTD.
   1024	    if (curEntity && !curEntity->isExternal())
   (gdb) p curEntity
   $2 = (const xercesc_4_0::XMLEntityDecl *) 0x49ac68
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: [PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

Posted by "boris-kolpackov (via GitHub)" <gi...@apache.org>.
boris-kolpackov commented on PR #54:
URL: https://github.com/apache/xerces-c/pull/54#issuecomment-1840890853

   This fix follows the same overall idea as https://github.com/apache/xerces-c/pull/47 with the following key differences:
   
   1. It addresses the lifetime issue when throwing `EndOfEntityException` (mentioned in a review comment to that PR).
   
   2. It is binary backwards-compatible so can be used for a patch release.
   
   Besides the instructions for observing the bug under the debugger (and confirming that it is no longer observed after the fix), we've also added a direct test for `ReaderMgr` to our package of Xerces-C++ that can can be used to reproduce the issues/confirm the fix: https://github.com/build2-packaging/xerces-c/tree/3.2.5/libxerces-c/tests/reader-mgr
   
   So the fix is reasonably well testes and we haven't observed any regressions. We've also run our CI which covers all the major platforms/compilers (but not in C++98): https://ci.stage.build2.org/@2177ad08-5621-4300-807f-8861b54c54c0
   
   I've also reviewed this patch and it looks good to me.
   
   Please review and/or test and let us know if there are any issues. Note that while this commit is against the `master` branch, it can be cleanly cherry-picked to the `xerces-3.2` branch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: [PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

Posted by "boris-kolpackov (via GitHub)" <gi...@apache.org>.
boris-kolpackov commented on PR #54:
URL: https://github.com/apache/xerces-c/pull/54#issuecomment-1853490361

   This PR has been merged (with whitespace issues addressed):
   
   `master`: https://github.com/apache/xerces-c/commit/b38ab79e934b9c27de191ee7af6926c7af42069d
   `xerces-3.2`: https://github.com/apache/xerces-c/commit/e0024267504188e42ace4dd9031d936786914835
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: [PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

Posted by "boris-kolpackov (via GitHub)" <gi...@apache.org>.
boris-kolpackov closed pull request #54: XERCESC-2188 - Use-after-free on external DTD scan
URL: https://github.com/apache/xerces-c/pull/54


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: [PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

Posted by "boris-kolpackov (via GitHub)" <gi...@apache.org>.
boris-kolpackov commented on code in PR #54:
URL: https://github.com/apache/xerces-c/pull/54#discussion_r1417168833


##########
src/xercesc/internal/ReaderMgr.cpp:
##########
@@ -873,33 +921,50 @@ bool ReaderMgr::isScanningPERefOutOfLiteral() const
     return false;
 }
 
-
 bool ReaderMgr::pushReader(         XMLReader* const        reader
                             ,       XMLEntityDecl* const    entity)
+{
+    return pushReaderAdoptEntity(reader, entity, false);
+}
+
+bool ReaderMgr::pushReaderAdoptEntity(     XMLReader* const        reader
+                                       ,   XMLEntityDecl* const    entity
+                                       ,   const bool              adoptEntity)
 {
     //
     //  First, if an entity was passed, we have to confirm that this entity
-    //  is not already on the entity stack. If so, then this is a recursive
+    //  is not already on the reader stack. If so, then this is a recursive
     //  entity expansion, so we issue an error and refuse to put the reader
     //  on the stack.
     //
     //  If there is no entity passed, then its not an entity being pushed, so
     //  nothing to do. If there is no entity stack yet, then of coures it
     //  cannot already be there.
     //
-    if (entity && fEntityStack)
+    if (entity && fReaderStack)
     {
-        const XMLSize_t count = fEntityStack->size();
+        // @@ Strangely, we don't check the entity at the top of the stack
+        //    (fCurReaderData). Is it a bug?
+        //
+        const XMLSize_t count = fReaderStack->size();
         const XMLCh* const theName = entity->getName();
         for (XMLSize_t index = 0; index < count; index++)
         {
-            const XMLEntityDecl* curDecl = fEntityStack->elementAt(index);
+            const XMLEntityDecl* curDecl =
+              fReaderStack->elementAt(index)->getEntity();
+
             if (curDecl)
             {
                 if (XMLString::equals(theName, curDecl->getName()))
                 {
-                    // Oops, already there so delete reader and return
+                    // Oops, already there so delete reader and entity and
+                    // return.
+                    //
                     delete reader;
+
+                    if (adoptEntity)
+                      delete entity;

Review Comment:
   Ok, thanks, we will fixup the final version of the commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: [PR] XERCESC-2188 - Use-after-free on external DTD scan [xerces-c]

Posted by "theta682 (via GitHub)" <gi...@apache.org>.
theta682 commented on code in PR #54:
URL: https://github.com/apache/xerces-c/pull/54#discussion_r1416808653


##########
src/xercesc/internal/ReaderMgr.cpp:
##########
@@ -1020,7 +1070,9 @@ ReaderMgr::getLastExtEntity(const XMLEntityDecl*& itsEntity) const
     //  search the stack; else, keep the reader that we've got since its
     //  either an external entity reader or the main file reader.
     //
-    const XMLEntityDecl* curEntity = fCurEntity;
+    const XMLEntityDecl* curEntity =
+      fCurReaderData? fCurReaderData->getEntity() : 0;

Review Comment:
   ```suggestion
           fCurReaderData? fCurReaderData->getEntity() : 0;
   ```



##########
src/xercesc/internal/ReaderMgr.hpp:
##########
@@ -208,36 +214,96 @@ private :
     ReaderMgr(const ReaderMgr&);
     ReaderMgr& operator=(const ReaderMgr&);
 
+    // -----------------------------------------------------------------------
+    //  Private data types
+    // -----------------------------------------------------------------------
+    class ReaderData : public XMemory
+    {
+    public  :
+      // ---------------------------------------------------------------------
+      //  Constructors and Destructor
+      // ---------------------------------------------------------------------
+      ReaderData
+      (    XMLReader* const     reader
+         , XMLEntityDecl* const entity
+         , const bool           adoptEntity
+      );
+
+      ~ReaderData();
+
+      // ----------------------------------------------------------------------
+      //  Getter methods
+      // ----------------------------------------------------------------------
+      XMLReader* getReader() const;
+      XMLEntityDecl* getEntity() const;
+      bool getEntityAdopted() const;
+
+      XMLEntityDecl* releaseEntity();

Review Comment:
   ```suggestion
           // ---------------------------------------------------------------------
           //  Constructors and Destructor
           // ---------------------------------------------------------------------
           ReaderData
           (    XMLReader* const     reader
              , XMLEntityDecl* const entity
              , const bool           adoptEntity
           );
   
           ~ReaderData();
   
           // ----------------------------------------------------------------------
           //  Getter methods
           // ----------------------------------------------------------------------
           XMLReader* getReader() const;
           XMLEntityDecl* getEntity() const;
           bool getEntityAdopted() const;
   
           XMLEntityDecl* releaseEntity();
   ```



##########
src/xercesc/internal/ReaderMgr.hpp:
##########
@@ -208,36 +214,96 @@ private :
     ReaderMgr(const ReaderMgr&);
     ReaderMgr& operator=(const ReaderMgr&);
 
+    // -----------------------------------------------------------------------
+    //  Private data types
+    // -----------------------------------------------------------------------
+    class ReaderData : public XMemory
+    {
+    public  :
+      // ---------------------------------------------------------------------
+      //  Constructors and Destructor
+      // ---------------------------------------------------------------------
+      ReaderData
+      (    XMLReader* const     reader
+         , XMLEntityDecl* const entity
+         , const bool           adoptEntity
+      );
+
+      ~ReaderData();
+
+      // ----------------------------------------------------------------------
+      //  Getter methods
+      // ----------------------------------------------------------------------
+      XMLReader* getReader() const;
+      XMLEntityDecl* getEntity() const;
+      bool getEntityAdopted() const;
+
+      XMLEntityDecl* releaseEntity();
+
+    private :
+      // ---------------------------------------------------------------------
+      //  Unimplemented constructors and operators
+      // ---------------------------------------------------------------------
+      ReaderData();
+      ReaderData(const ReaderData&);
+      ReaderData& operator=(const ReaderData&);
+
+      // ---------------------------------------------------------------------
+      //  Private data members
+      //
+      //  fReader
+      //      This is the pointer to the reader object that must be destroyed
+      //      when this object is destroyed.
+      //
+      //  fEntity
+      //  fEntityAdopted
+      //      This is the pointer to the entity object that, if adopted, must
+      //      be destroyed when this object is destroyed.
+      //
+      //      Note that we need to keep up with which of the pushed readers
+      //      are pushed entity values that are being spooled. This is done
+      //      to avoid the problem of recursive definitions.
+      // ---------------------------------------------------------------------
+      XMLReader*     fReader;
+      XMLEntityDecl* fEntity;
+      bool           fEntityAdopted;

Review Comment:
   ```suggestion
           // ---------------------------------------------------------------------
           //  Unimplemented constructors and operators
           // ---------------------------------------------------------------------
           ReaderData();
           ReaderData(const ReaderData&);
           ReaderData& operator=(const ReaderData&);
   
           // ---------------------------------------------------------------------
           //  Private data members
           //
           //  Reader
           //      This is the pointer to the reader object that must be destroyed
           //      when this object is destroyed.
           //
           //  fEntity
           //  fEntityAdopted
           //      This is the pointer to the entity object that, if adopted, must
           //      be destroyed when this object is destroyed.
           //
           //      Note that we need to keep up with which of the pushed readers
           //      are pushed entity values that are being spooled. This is done
           //      to avoid the problem of recursive definitions.
           // ---------------------------------------------------------------------
           XMLReader*     Reader;
           XMLEntityDecl* fEntity;
           bool           fEntityAdopted;
   ```



##########
src/xercesc/internal/ReaderMgr.cpp:
##########
@@ -873,33 +921,50 @@ bool ReaderMgr::isScanningPERefOutOfLiteral() const
     return false;
 }
 
-
 bool ReaderMgr::pushReader(         XMLReader* const        reader
                             ,       XMLEntityDecl* const    entity)
+{
+    return pushReaderAdoptEntity(reader, entity, false);
+}
+
+bool ReaderMgr::pushReaderAdoptEntity(     XMLReader* const        reader
+                                       ,   XMLEntityDecl* const    entity
+                                       ,   const bool              adoptEntity)
 {
     //
     //  First, if an entity was passed, we have to confirm that this entity
-    //  is not already on the entity stack. If so, then this is a recursive
+    //  is not already on the reader stack. If so, then this is a recursive
     //  entity expansion, so we issue an error and refuse to put the reader
     //  on the stack.
     //
     //  If there is no entity passed, then its not an entity being pushed, so
     //  nothing to do. If there is no entity stack yet, then of coures it
     //  cannot already be there.
     //
-    if (entity && fEntityStack)
+    if (entity && fReaderStack)
     {
-        const XMLSize_t count = fEntityStack->size();
+        // @@ Strangely, we don't check the entity at the top of the stack
+        //    (fCurReaderData). Is it a bug?
+        //
+        const XMLSize_t count = fReaderStack->size();
         const XMLCh* const theName = entity->getName();
         for (XMLSize_t index = 0; index < count; index++)
         {
-            const XMLEntityDecl* curDecl = fEntityStack->elementAt(index);
+            const XMLEntityDecl* curDecl =
+              fReaderStack->elementAt(index)->getEntity();
+
             if (curDecl)
             {
                 if (XMLString::equals(theName, curDecl->getName()))
                 {
-                    // Oops, already there so delete reader and return
+                    // Oops, already there so delete reader and entity and
+                    // return.
+                    //
                     delete reader;
+
+                    if (adoptEntity)
+                      delete entity;

Review Comment:
   ```suggestion
                           delete entity;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org