Posted to c-dev@xerces.apache.org by Erik Schroeder <ES...@Wausaufs.com> on 2000/07/13 15:26:20 UTC

Progressive parse "bug"

I sent this to xml4c@us.ibm.com yesterday, but this list seems to respond
more quickly, and since the same behavior also occurs with Xerces 1.2.0, I
am posting it here as well.  I apologize for any duplication.  Comments and
similar experiences are appreciated.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

	XML4C Version:			3.1.0
	OS Release Number:		Microsoft(R) Windows NT(TM) Workstation Version 4.0
					(Build 1381: Service Pack 6, RC 1.5)
	Compiler Version Number:	Microsoft(R) Visual C++ 6.0, Service Pack 3
	XML Document:
	--------------------------
	<!--This is a comment-->
	<ROOT>
	  <Tag1>
	  <!--This is another comment-->
	  </Tag1>
	</ROOT>
	--------------------------
	Scenario:
	The samples were built from the xml4csrc3_1_0.zip download. The
same problem occurs with Xerces-C-src_1_2_0a.zip from
http://xml.apache.org/, as one might expect.

	When a debug build of PParse (the progressive parsing sample) is
run against the XML file listed above, it produces the following output:
	Did not get the required 16 elements
	This is the expected output.

	When a release build of PParse is run against the same XML file, it
produces the following output:
	Fatal Error at file c:\temp\test.xml, line 5, char 3
	  Message: Expected comment or processing instruction
	Did not get the required 16 elements
	This differs from the output of the debug build.

	When a debug build of SaxCount (the "complete parse" sample) is run
against the same XML file, it produces the following output:
	c:\temp\test.xml: 0 ms (2 elems, 0 attrs, 0 spaces, 10 chars)
	A release build of SaxCount produces the same output.

Forgive me for not having investigated this further before submitting it,
but it seems that in XMLScanner::scanNext (XMLScanner.cpp), bool gotData is
declared without being initialized.  In the debug build, it just happened to
have a non-zero value.  In the release build, its value was zero.  This led
to a call to scanMiscellaneous, which eventually executes:
                    // This can't be possible, so just give up
                    emitError(XMLErrs::ExpectedCommentOrPI);
                    fReaderMgr.skipPastChar(chCloseAngle);
In scanNext, it seems there are a few paths through the switch(curToken)
statement that never assign gotData, even though its value is checked after
the switch statement.
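
The pattern described above can be illustrated with a minimal, self-contained
sketch. It is not the Xerces source: the Token enum, the moreContentAfter
function, and the rootClosed parameter are hypothetical names chosen for
illustration. It shows a flag that only some switch branches assign, and how
initializing it at the declaration keeps the later check well defined
regardless of build type.

```cpp
// Hypothetical token kinds, loosely modeled on the names quoted in this thread.
enum class Token { CData, Comment, EndTag, PI, StartTag, Unknown };

// Simplified stand-in for the control flow described above; not Xerces code.
// Returns true when more document content is expected after this token.
// rootClosed is a made-up input meaning "this end tag closed the root element".
bool moreContentAfter(Token tok, bool rootClosed)
{
    // Initialize up front so branches that never assign the flag
    // still leave it with a defined value.
    bool gotData = true;

    switch (tok)
    {
        case Token::EndTag:
            gotData = !rootClosed;  // only this branch can report end of content
            break;
        case Token::StartTag:
            gotData = true;
            break;
        case Token::CData:
        case Token::Comment:
        case Token::PI:
        default:
            break;                  // no assignment here; relies on the initializer
    }

    // The flag is read after the switch, so every path must have set it.
    return gotData;
}
```

Without the initializer, the Comment, PI, CData, and default paths would read
an indeterminate value, which is why a debug and a release build can disagree.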
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
=-=-=-=-=-=-=-=-


Thought for the day/week/month/year:*
Disclaimer: The entire physical universe, including this item, may one day
collapse back into an infinitesimally small space. Should another universe
subsequently re-emerge, the existence of this item in that universe cannot
be guaranteed.


RE: Progressive parse "bug"

Posted by Jim Reitz <je...@home.com>.
Thanks Joe.  Looks good.

-----Original Message-----
From: Joe Polastre [mailto:jpolast@apache.org]
Sent: Monday, July 17, 2000 2:12 PM
To: xerces-c-dev@xml.apache.org; Dean Roddey; Erik Schroeder
Cc: jereitz@home.com
Subject: Re: Progressive parse "bug"


I've been following this thread and agree with Jim's patch.  I've made the
changes that Jim proposed and did a bit of testing with some different XML
documents.  I think that everything should be fine now.  Please grab the
latest version from CVS and try it out.

-Joe Polastre  (jpolast@apache.org)
IBM Cupertino, XML Technology Group

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Progressive parse "bug"

Posted by Joe Polastre <jp...@apache.org>.
I've been following this thread and agree with Jim's patch.  I've made the
changes that Jim proposed and did a bit of testing with some different XML
documents.  I think that everything should be fine now.  Please grab the
latest version from CVS and try it out.

-Joe Polastre  (jpolast@apache.org)
IBM Cupertino, XML Technology Group

----- Original Message -----
From: "Jim Reitz" <je...@home.com>
To: <xe...@xml.apache.org>; "Dean Roddey" <dr...@charmedquark.com>
Cc: <je...@home.com>
Sent: Friday, July 14, 2000 7:49 PM
Subject: RE: Progressive parse "bug"


> >>It almost sounds like the 'mixing C++ runtimes' problem, but in reverse.
>
> I don't think so; I think Erik hit it on the head.
>
> In XMLScanner::scanNext() (XMLScanner.cpp), the gotData variable is declared
> like this:
>    bool gotData;
>
> i.e., gotData is NOT initialized to anything.
>
> In the switch/case statement immediately following gotData's declaration, the
> following cases do not set "gotData":
>    case Token_CData :
>    case Token_Comment :
>    case Token_PI :
>    default :
>
> And then gotData is checked inside an if() statement.
>
> In addition, XMLScanner::scanNext() never calls the document handler's
> endDocument() callback, as I believe it should.
>
> Here's a diff for how I think XMLScanner.cpp should look to fix these two
> problems, but I'm no expert and I haven't verified this works yet...
>
> ============================================================================
> 714c714
> <             bool gotData;
> ---
> >             bool gotData = true;
> 752a753
> >             {
> 753a755,758
> >                 // If we have a document handler, then call the end document
> >                 if (fDocHandler)
> >                     fDocHandler->endDocument();
> >             }
> ============================================================================
>
> Jim Reitz
> jereitz@home.com


RE: Progressive parse "bug"

Posted by Jim Reitz <je...@home.com>.
>>It almost sounds like the 'mixing C++ runtimes' problem, but in reverse.

I don't think so; I think Erik hit it on the head.

In XMLScanner::scanNext() (XMLScanner.cpp), the gotData variable is declared
like this:
   bool gotData;

i.e., gotData is NOT initialized to anything.

In the switch/case statement immediately following gotData's declaration, the
following cases do not set "gotData":
   case Token_CData :
   case Token_Comment :
   case Token_PI :
   default :

And then gotData is checked inside an if() statement.

In addition, XMLScanner::scanNext() never calls the document handler's
endDocument() callback, as I believe it should.

Here's a diff for how I think XMLScanner.cpp should look to fix these two
problems, but I'm no expert and I haven't verified this works yet...
============================================================================
714c714
<             bool gotData;
---
>             bool gotData = true;
752a753
>             {
753a755,758
>                 // If we have a document handler, then call the end document
>                 if (fDocHandler)
>                     fDocHandler->endDocument();
>             }
============================================================================
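
To make the second part of the diff concrete, here is a rough sketch of
notifying a handler once a progressive scan step finds no more content. It is
a model built on assumptions, not the patched XMLScanner.cpp: DocHandler,
CountingHandler, and finishScanStep are hypothetical names, and only the null
check before the callback mirrors the diff.

```cpp
// Hypothetical handler interface; only the callback discussed here is modeled.
struct DocHandler
{
    virtual ~DocHandler() = default;
    virtual void endDocument() = 0;
};

// Hypothetical handler that counts how often endDocument() fires.
struct CountingHandler : DocHandler
{
    int endCalls = 0;
    void endDocument() override { ++endCalls; }
};

// Simplified stand-in for the tail of a progressive scan step; not Xerces code.
// gotData == false means the root element has closed and no content remains.
// Returns true while the caller should keep asking for more.
bool finishScanStep(bool gotData, DocHandler* handler)
{
    if (!gotData)
    {
        // A real scanner would consume trailing comments and PIs here.

        // Guard against a missing handler before notifying it.
        if (handler)
            handler->endDocument();
        return false;
    }
    return true;
}
```

A caller would loop while finishScanStep returns true; with a CountingHandler
attached, endCalls stays at zero until the step that reports no more data.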

Jim Reitz
jereitz@home.com


Re: Progressive parse "bug"

Posted by Dean Roddey <dr...@charmedquark.com>.
It almost sounds like the 'mixing C++ runtimes' problem, but in reverse. The
drop from Apache should have been built with the "Multithreaded DLL", so it
should work with the production build of the sample, but not the debug
build. But it sounds like you are getting the opposite problem. I would hope
that the released versions weren't debug builds?? I think it's happened
before, but I can't imagine that that kind of thing would happen these days
(or that it would happen with both the Xerces and XML4C releases). You
could always run Chkmod32 on the sample and see which runtime DLLs you are
getting.

I assume you didn't change any project settings for the samples, right?

--------------------------
Dean Roddey
The CIDLib C++ Frameworks
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"You young, and you gotcha health. Whatchoo wanna job fer?"

