Posted to c-dev@xerces.apache.org by "Bryan, Daniel T" <da...@lmco.com> on 2000/07/11 13:47:55 UTC

Possible defect in Xerces-C library (xerces-c_1_2.dll)

I am new to Xerces and XML, and am having a problem with the Xerces-C
library version 1.2.0.  I am running on Windows NT 4, SP6.  The compiler I'm
using is Microsoft Visual C++ 6.0, although that should make no difference
as I am experiencing the same problem with both your compiled binary and my
own debug version of the binary (xerces-c_1_2D.dll).  My problem is as
follows:

When attempting to retrieve text associated with a particular node (in this
case the TEXT node), the following two input documents behave differently:

EXAMPLE DOCUMENT 1:

<?xml version="1.0"?>
<DOC>
<TEXTREGION>
	<TEXT>
		<TIMESTAMP>
			<TIME>03:45:00</TIME>
			<TIMEZONE>EST</TIMEZONE>
		</TIMESTAMP>
		This is the text I would like returned.
	</TEXT>
</TEXTREGION>
</DOC>


EXAMPLE DOCUMENT 2:

<?xml version="1.0"?>
<DOC>
<TEXTREGION>
	<TEXT>
		This is the text I would like returned.
	</TEXT>
</TEXTREGION>
</DOC>


The call myNode.getNodeValue() (where myNode is of type DOM_Node) returns
only the whitespace between <TEXT> and <TIMESTAMP> in the first example, and
returns the desired text in the second example.  Because I am new to XML and
Xerces, I do not know if this is the behavior dictated by the XML standard, or
if this is a defect in the Xerces library that needs to be fixed.  If Xerces
is behaving correctly, and is supposed to stop adding character data when it
encounters another tag, I would greatly appreciate any insight you could
provide as to how I can modify Xerces for my desired functionality.

Daniel Bryan



Re: Possible defect in Xerces-C library (xerces-c_1_2.dll)

Posted by Craig Noah <Cr...@ca.com>.
Daniel,

Something to keep in mind: the text that you see as the "value" of an element
is actually stored in the DOM tree as a separate node, which is a child of the
element you are focused on.  To get this "value", first find the element node
you are interested in (you are already able to do this).  Next, step through
that node's child nodes and look for nodes of type DOM_Node::TEXT_NODE.  Those
are the nodes that hold your text.  In your first example, <TEXT> has three
children: a whitespace-only text node, the <TIMESTAMP> element, and a second
text node that contains the text you want (plus surrounding whitespace).
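
The child-node walk described above can be sketched roughly as follows.  This
is a minimal, self-contained illustration and not Xerces code: Node, NodeType,
text(), element(), directText(), trim() and makeExample1Text() are all
hypothetical stand-ins invented for this sketch.  With the real library the
same loop would presumably be written against DOM_Node (getFirstChild(),
getNextSibling(), getNodeType(), getNodeValue()); the DOMPrint sample is the
place to check the exact usage.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node. This is NOT the Xerces-C API.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;            // tag name for elements, empty for text nodes
    std::string value;           // character data for text nodes, empty for elements
    std::vector<Node> children;  // child nodes in document order
};

// Convenience constructors for the two node kinds used here.
Node text(std::string data) {
    return Node{NodeType::Text, "", std::move(data), {}};
}

Node element(std::string name, std::vector<Node> kids) {
    return Node{NodeType::Element, std::move(name), "", std::move(kids)};
}

// Concatenate the character data of every text node that is a direct
// child of `elem`. The element itself carries no text of its own.
std::string directText(const Node& elem) {
    assert(elem.type == NodeType::Element);
    std::string out;
    for (const Node& child : elem.children) {
        if (child.type == NodeType::Text) {
            out += child.value;
        }
    }
    return out;
}

// Strip leading and trailing whitespace (spaces, tabs, line breaks).
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return "";
    }
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// The <TEXT> element of Example Document 1, built by hand with the
// whitespace between tags kept as text nodes.
Node makeExample1Text() {
    Node timestamp = element("TIMESTAMP", {
        text("\n\t\t\t"),
        element("TIME", {text("03:45:00")}),
        text("\n\t\t\t"),
        element("TIMEZONE", {text("EST")}),
        text("\n\t\t"),
    });
    return element("TEXT", {
        text("\n\t\t"),  // whitespace between <TEXT> and <TIMESTAMP>
        timestamp,
        text("\n\t\tThis is the text I would like returned.\n\t"),
    });
}
```

With these definitions, the first child of makeExample1Text() holds only
whitespace, which matches the behaviour reported for Example Document 1, while
trim(directText(makeExample1Text())) gives back the sentence itself.  Text
inside <TIMESTAMP> is deliberately left out because only direct children are
visited.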

For a more specific example, take a look at the DOMPrint sample.

Craig

"Bryan, Daniel T" wrote:

> I am new to Xerces and XML, and am having a problem with the Xerces-C
> library version 1.2.0.  I am running on Windows NT 4, SP6.  The compiler I'm
> using is Microsoft Visual C++ 6.0, although that should make no difference
> as I am experiencing the same problem with both your compiled binary and my
> own debug version of the binary (xerces-c_1_2D.dll).  My problem is as
> follows:
>
> When attempting to retrieve text associated with a particular node (in this
> case the TEXT node), the following two input documents behave differently:
>
> EXAMPLE DOCUMENT 1:
>
> <?xml version="1.0"?>
> <DOC>
> <TEXTREGION>
>         <TEXT>
>                 <TIMESTAMP>
>                         <TIME>03:45:00</TIME>
>                         <TIMEZONE>EST</TIMEZONE>
>                 </TIMESTAMP>
>                 This is the text I would like returned.
>         </TEXT>
> </TEXTREGION>
> </DOC>
>
> EXAMPLE DOCUMENT 2:
>
> <?xml version="1.0"?>
> <DOC>
> <TEXTREGION>
>         <TEXT>
>                 This is the text I would like returned.
>         </TEXT>
> </TEXTREGION>
> </DOC>
>
> The call myNode.getNodeValue() (where myNode is of type DOM_Node) returns
> only the whitespace between <TEXT> and <TIMESTAMP> in the first example, and
> returns the desired text in the second example.  Because I am new to XML and
> Xerces, I do not know if this is the behavior dictated by the XML standard, or
> if this is a defect in the Xerces library that needs to be fixed.  If Xerces
> is behaving correctly, and is supposed to stop adding character data when it
> encounters another tag, I would greatly appreciate any insight you could
> provide as to how I can modify Xerces for my desired functionality.
>
> Daniel Bryan

--
Craig Noah                  INTERNET: Craig.Noah@ca.com
Software Engineer

Computer Associates
1404 Fort Crook Road South     Phone:    (402) 291-8300 x 284
Bellevue,  NE   68005-2969     FAX:      (402) 291-4362