Posted to c-dev@xerces.apache.org by "Brodeur, Stephen (CT)" <Br...@diebold.com> on 2001/08/31 21:53:52 UTC

Bug fixes for Validating Parser (Schema)...

Greetings,

	I've discovered some memory leaks in the Validating Parser package, and have implemented some fixes.  There are some remaining leaks, but the torrential leak is now a trickle.  Bugzilla is down, so I'll simply submit this fix here.  If you find that this fix is valid, as I believe it is, I'd love to have my name on the contributors' list.  It'd really "spin my propeller"...

Xerces C++ Version Number: 1.5.1

Platform: Win32

OS: Windows 2000  Version 5.0 (build 2195: Service pack 2)

XML document that failed:  No failure -- just memory leaks.  Any reasonably complex schema with user-defined data types will do.

C++ Application Code that failed:

	xerces-c-src1_5_1\src\validators\common\DFAContentModel.cpp
	xerces-c-src1_5_1\src\validators\datatype\DecimalDatatypeValidator.cpp
	xerces-c-src1_5_1\src\validators\schema\TraverseSchema.cpp

Fixes are attached.   Overview for each fix:


validators\common\DFAContentModel.cpp :

	The majority of the fix lies in DFAContentModel::buildDFA(), which contains the "algorithm from hell".  I don't pretend to thoroughly understand the algorithm (yet), but the problem lies in how the nodes of the syntax tree are released.  The fLeafList attribute holds references to the leaves of the syntax tree only, and it was what the cleanup used to release resources, so the intermediate nodes leaked into the bit bucket.  The "fix" is to delete the root node and merely delete the fLeafList attribute itself.  There's still a potential leak for allocations that occur in the postTreeBuildInit() method, but fixing that requires a deeper understanding of the "algorithm from hell".  For our use, this is a complete fix, since these types of allocation do not occur for our files.
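
	The ownership problem described above can be illustrated with a small sketch.  It does not use the real Xerces classes: Node, Leaf, BinaryOp, buildTree() and the liveNodes counter are hypothetical stand-ins, and the sketch assumes that an interior node deletes its children in its destructor.  Under those assumptions, releasing only the leaves leaves the interior nodes behind, while deleting the root releases everything.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for a content-model syntax tree; these are NOT the
// real Xerces node classes.
static int liveNodes = 0;  // number of nodes currently allocated

struct Node {
    Node() { ++liveNodes; }
    virtual ~Node() { --liveNodes; }
};

struct Leaf : Node {};

struct BinaryOp : Node {
    Node* left;
    Node* right;
    BinaryOp(Node* l, Node* r) : left(l), right(r) {}
    // Assumption: an interior node owns its children, so deleting the root
    // releases the whole tree.
    ~BinaryOp() override { delete left; delete right; }
};

// Builds (a , b) | c and records the leaves in a side list, loosely
// mirroring the role the message describes for fLeafList.
static BinaryOp* buildTree(std::vector<Leaf*>& leafList) {
    Leaf* a = new Leaf;
    Leaf* b = new Leaf;
    Leaf* c = new Leaf;
    leafList = {a, b, c};
    return new BinaryOp(new BinaryOp(a, b), c);
}

// Old strategy: release only what the leaf list references.
// Returns how many nodes were still alive afterwards.
int leakedAfterLeafOnlyRelease() {
    std::vector<Leaf*> leafList;
    BinaryOp* root = buildTree(leafList);
    for (Leaf* leaf : leafList) delete leaf;
    const int leaked = liveNodes;  // the two interior nodes remain

    // Tidy up so the demo itself does not leak: detach the already-freed
    // leaves, then delete the interior nodes through the root.
    BinaryOp* inner = static_cast<BinaryOp*>(root->left);
    inner->left = inner->right = nullptr;
    root->right = nullptr;
    delete root;
    assert(liveNodes == 0);
    return leaked;
}

// New strategy: delete the root, then just drop the list of references.
int leakedAfterRootRelease() {
    std::vector<Leaf*> leafList;
    BinaryOp* root = buildTree(leafList);
    delete root;       // releases interior nodes and leaves alike
    leafList.clear();  // the list only held references; nothing left to free
    return liveNodes;
}
```

	In this sketch leakedAfterLeafOnlyRelease() reports the two interior nodes, and leakedAfterRootRelease() reports none.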

	Other, easier fixes included the stack allocation of QName objects fed to CMLeaf ctors, since a copy is made in the ctor.
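
	A sketch of the stack-allocation point, again with hypothetical types rather than the real QName and CMLeaf classes: Name and Leaf below are stand-ins, and the only property borrowed from the message is that the leaf constructor makes its own copy of the name it receives.

```cpp
#include <cassert>
#include <string>

static int liveNames = 0;  // number of Name objects currently alive

// Hypothetical stand-in for a qualified name.
struct Name {
    std::string text;
    explicit Name(const std::string& t) : text(t) { ++liveNames; }
    Name(const Name& other) : text(other.text) { ++liveNames; }
    ~Name() { --liveNames; }
};

// Hypothetical leaf that stores its own copy of the name it is given.
struct Leaf {
    Name name;
    explicit Leaf(const Name& n) : name(n) {}
};

// Heap-allocated temporary: because the leaf copies the name, the original
// is still alive after the leaf is gone unless someone remembers to delete it.
int liveNamesAfterHeapTemporary() {
    Name* temp = new Name("item");
    {
        Leaf leaf(*temp);  // leaf holds a copy, not temp itself
    }
    const int stillAlive = liveNames;  // temp has outlived the leaf
    delete temp;                       // tidy up so the demo does not leak
    assert(liveNames == 0);
    return stillAlive;
}

// Stack-allocated temporary: destroyed automatically at end of scope.
int liveNamesAfterStackTemporary() {
    {
        Name temp("item");
        Leaf leaf(temp);
    }
    return liveNames;
}
```

	With the heap temporary one Name survives the leaf; with the stack temporary nothing survives the enclosing scope.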

validators\datatype\DecimalDatatypeValidator.cpp :

	The init() routine was leaking XMLCh* strings when checking min/max inclusive/exclusive values.  Fixed -- ArrayJanitor to the rescue.
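
	The idea behind a janitor-style guard can be sketched without Xerces itself.  BufferGuard, allocBuffer(), releaseBuffer() and the Ch alias below are hypothetical stand-ins (Ch plays the role of XMLCh); this is not the library's ArrayJanitor, only the same scope-bound release pattern, and the failing check is simulated with an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

using Ch = char16_t;         // hypothetical stand-in for XMLCh
static int liveBuffers = 0;  // number of buffers currently allocated

Ch* allocBuffer(std::size_t n) {
    ++liveBuffers;
    return new Ch[n]();
}

void releaseBuffer(Ch* p) {
    if (p) {
        --liveBuffers;
        delete[] p;
    }
}

// Minimal scope guard: releases the array when the guard goes out of scope,
// whichever way the scope is left.
class BufferGuard {
public:
    explicit BufferGuard(Ch* p) : ptr_(p) {}
    ~BufferGuard() { releaseBuffer(ptr_); }
    BufferGuard(const BufferGuard&) = delete;
    BufferGuard& operator=(const BufferGuard&) = delete;
private:
    Ch* ptr_;
};

// Simulates a bounds check that works on a temporary string and may fail.
// Returns the number of buffers still alive once the check is over.
int liveBuffersAfterGuardedCheck(bool failCheck) {
    try {
        Ch* text = allocBuffer(16);
        BufferGuard guard(text);
        assert(liveBuffers == 1);  // buffer is alive inside the scope
        if (failCheck) {
            throw std::runtime_error("value out of range");
        }
    } catch (const std::runtime_error&) {
        // The guard has already released the buffer during unwinding.
    }
    return liveBuffers;
}
```

	Both the normal path and the failing path end with no live buffers, which is the property the guard is there to provide.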

validators\schema\TraverseSchema.cpp :

	It may be debatable whether this is a "fix", but our application has over a dozen separate XML configuration files, and it can be "reconfigured" dynamically, causing a reparse of the files.  The lazy initialization of the XMLStringPool attribute caused leaks that accumulated with each reparse, so I merely changed it to a dynamic allocation within doTraverseSchema().  Since the change is purely implementation, it's very safe.
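
	The message does not show the changed code, so the following is only one plausible reading of how a lazily created pool can pile up across reparses.  Pool, LazyTraverser and ScopedTraverser are hypothetical; they are not XMLStringPool or TraverseSchema, and the assumption that a fresh traverser is created for every reparse is mine.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

static int livePools = 0;  // number of Pool objects currently alive

// Hypothetical stand-in for a string pool.
struct Pool {
    std::vector<std::string> strings;
    Pool() { ++livePools; }
    ~Pool() { --livePools; }
    void add(const std::string& s) { strings.push_back(s); }
};

// Lazy variant: the pool is created on first use and nothing releases it.
struct LazyTraverser {
    Pool* pool = nullptr;
    void traverse() {
        if (!pool) pool = new Pool;
        pool->add("xsd:decimal");
    }
    // Deliberately no destructor: each instance leaves its pool behind.
};

// Scoped variant: the pool is allocated for one traversal and released
// when that traversal ends.
struct ScopedTraverser {
    void traverse() {
        std::unique_ptr<Pool> pool(new Pool);
        pool->add("xsd:decimal");
    }
};

// Assumption: one traverser per reparse. Returns pools alive afterwards.
int livePoolsAfterLazyReparses(int reparses) {
    std::vector<Pool*> strays;
    for (int i = 0; i < reparses; ++i) {
        LazyTraverser t;
        t.traverse();
        strays.push_back(t.pool);  // remembered only so the demo can tidy up
    }
    const int alive = livePools;
    for (Pool* p : strays) delete p;
    assert(livePools == 0);
    return alive;
}

int livePoolsAfterScopedReparses(int reparses) {
    for (int i = 0; i < reparses; ++i) {
        ScopedTraverser t;
        t.traverse();
    }
    return livePools;
}
```

	Under these assumptions the lazy variant leaves one pool behind per reparse, while the scoped variant leaves none.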

	I've already attempted to send you the above files as attachments, and was rejected due to oversized e-mail filtering.  I've got the fixes ready for your inspection -- let me know the process...

Regards,

Steve Brodeur
brodeus@diebold.com

alternate e-mail: sbrodeur@4techwork.com


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Bug fixes for Validating Parser (Schema)...

Posted by Dean Roddey <dr...@charmedquark.com>.
I wrote the "algorithm from hell". It's described in a book affectionately
known as "The Dragon Book", "Compilers: Principles, Techniques, and Tools".
This may not help you much, since even working from the descriptions in the
book it still took me quite a while to write that algorithm, which differs
slightly from what would be used by a compiler. But if you want a high level
explanation of what it's doing, then that's where to look.

--------------------------
Dean Roddey
The Charmed Quark Controller
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"If it don't have a control port, don't buy it!"


