Posted to c-dev@xerces.apache.org by Ulrich Arnold <Ul...@t-online.de> on 2002/04/29 15:13:55 UTC

BUG 6875

Hi !

I posted a bug to Bugzilla on 2002-03-05 and confirmed on 2002-03-14 that
the problem persists in 1.7.0. As of today the bug is still marked as
new, although it can easily be reproduced. Does anybody have an idea of the
typical time frame for a response in xerces-c?

The problem is a wrong error line reported for an element which occurs
at the wrong position in a sequence. I did some testing, which shows
that the reported error position is the closing tag of the enclosing
(parent) element. Some debugging in XMLScanner.cpp, line 1851, shows that
the index of the offending element is known (variable res), but the
reported line position is taken from the last pushed reader.
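
To illustrate the point about the known index: if the start position of each child were recorded while scanning, that index could select the position to report. The following is only a minimal sketch under that assumption. The types Position and Child and the functions firstInvalidChild and errorPosition are hypothetical and are not Xerces-C classes; the content model is simplified to a plain sequence of names, and the variable name res merely mirrors the one mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not part of Xerces-C.
struct Position {
    unsigned line;
    unsigned column;
};

struct Child {
    std::string name;
    Position start;  // position recorded when the child's start tag was scanned
};

// Returns the index of the first child that does not match its slot in the
// sequence, or children.size() if every child present matches.
std::size_t firstInvalidChild(const std::vector<std::string>& sequence,
                              const std::vector<Child>& children) {
    for (std::size_t i = 0; i < children.size(); ++i) {
        if (i >= sequence.size() || children[i].name != sequence[i]) {
            return i;
        }
    }
    return children.size();
}

// Meant to be called once the content is known to be invalid. Reports the
// offending child's own position when there is one; otherwise (for example
// when a trailing element is missing) falls back to the parent's end tag.
Position errorPosition(const std::vector<std::string>& sequence,
                       const std::vector<Child>& children,
                       Position parentEndTag) {
    const std::size_t res = firstInvalidChild(sequence, children);
    assert(res <= children.size());
    return res < children.size() ? children[res].start : parentEndTag;
}
```

With the sequence (A,B,C,D), children A, D, C, D starting at lines 5, 10, 15 and 20 (column 5, as implied by the reported output), and a parent end tag at line 25, column 8, errorPosition selects line 10, column 5 instead of line 25, column 8.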

 
In case somebody would like to have a look at it:

Using the following simple schema

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XML Spy v4.3 U (http://www.xmlspy.com) by Ulrich Arnold (Beratung - Systemanalyse) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
	<xs:element name="Root">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="A"/>
				<xs:element ref="B"/>
				<xs:element ref="C"/>
				<xs:element ref="D"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
	<xs:element name="A"/>
	<xs:element name="B"/>
	<xs:element name="C"/>
	<xs:element name="D"/>
</xs:schema>

and this xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XML Spy v4.3 U (http://www.xmlspy.com) by Ulrich Arnold (Beratung - Systemanalyse) -->
<!--Sample XML file generated by XML Spy v4.3 U (http://www.xmlspy.com)-->
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Bug.xsd">
	<A>
		aa
		aa
		aa
	</A>
	<X>
		bb
		bb
		bb
	</X>
	<C>
		cc
		cc
		cc
	</C>
	<D>
		dd
		dd
		dd
	</D>
</Root>

xerces emits the following messages

Error at file "d:\xml\Bugs\bugx.xml", line 10, column 5
   Message: Unknown element 'X'
Error at file "d:\xml\Bugs\bugx.xml", line 25, column 8
   Message: Element 'X' is not valid for content model: '(A,B,C,D)'

where line 10 pinpoints the problem. But if the element is declared in the
schema and merely not allowed at this position, as in

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XML Spy v4.3 U (http://www.xmlspy.com) by Ulrich Arnold (Beratung - Systemanalyse) -->
<!--Sample XML file generated by XML Spy v4.3 U (http://www.xmlspy.com)-->
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Bug.xsd">
	<A>
		aa
		aa
		aa
	</A>
	<D>
		bb
		bb
		bb
	</D>
	<C>
		cc
		cc
		cc
	</C>
	<D>
		dd
		dd
		dd
	</D>
</Root>

the only message is

Error at file "d:\xml\Bugs\bugd.xml", line 25, column 8
   Message: Element 'D' is not valid for content model: '(A,B,C,D)'

which does not point to the actual problem at line 10.


Thanks Uli


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: AW: BUG 6875

Posted by Khaled Noaman <kn...@ca.ibm.com>.
Hi Ulrich,

I understand your requirements. The parser validates the content of an
element only after it has processed all of its children (no content
validation of the parent takes place as we process each child). Since the
parser is validating the parent element at that point, it generates the
error with the line/column of that parent element's closing tag. I agree
that the real error happens at the first D tag, and it would be easier if
the line/column number in the error message pointed to that tag. That's a
nice feature to have, and it would require either changing the validation
mechanism so that the content of an element is validated each time we
process a new child, or storing the line/column info of those children and
using that info when generating the error messages. We have to look into
that and ensure that we do not impact performance or add processing overhead.
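
As a rough illustration of the first option (validating each time a new child is processed), here is a minimal sketch. It is not how Xerces-C is implemented: the class IncrementalSequenceChecker and its member names are hypothetical, and the content model is reduced to a plain sequence of required element names. The idea is that a mismatch is detected while the reader is still positioned on the offending child, so the reader's current line/column is the right one to report.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; not part of Xerces-C.
// Checks a plain sequence content model such as (A,B,C,D) one child at a time.
class IncrementalSequenceChecker {
public:
    explicit IncrementalSequenceChecker(std::vector<std::string> sequence)
        : sequence_(std::move(sequence)) {}

    // Call when a child's start tag is scanned. Returns false if this child
    // is not the element expected at the current slot; the caller can then
    // report the error at the reader's current position. The expected slot
    // is not advanced on failure.
    bool acceptChild(const std::string& name) {
        assert(next_ <= sequence_.size());
        if (next_ == sequence_.size() || sequence_[next_] != name) {
            return false;
        }
        ++next_;
        return true;
    }

    // Call at the parent's end tag: true only if no required element is missing.
    bool complete() const { return next_ == sequence_.size(); }

private:
    std::vector<std::string> sequence_;
    std::size_t next_ = 0;
};
```

For the second test file, acceptChild("A") succeeds and acceptChild("D") fails right at the first D tag, because B is expected there. The cost is one extra comparison per child, which is the kind of overhead that would have to be measured.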

Khaled



AW: BUG 6875

Posted by Ulrich Arnold <Ul...@t-online.de>.
Hi Khaled !

Thanks for your response. I realize that the line position for
"element invalid for content model" is the same in both cases. The
parser actually generates it not at the end tag of the root, but at the
end tag of the parent element. I understand that the parser cannot be
sure of the error before the enclosing end tag. But the real error
position is still at the D tag (either the opening or the closing tag).
Other parsers, e.g. XML Spy by Altova, are able to point to the real
error location, not to the point where the parser became sure it was an
error. I need the error line to show the error position to the
end user via a GUI, so I need the position of the cause of the error.

Uli



Re: BUG 6875

Posted by Khaled Noaman <kn...@ca.ibm.com>.
The error message that the parser produces about the invalid content
model is generated when the parser validates the content of the Root
element. That's why you are getting the line/column number of that element.
If you check both test cases, the parser generates the invalid content error
with the same line/column number. In the first test case, the parser has no
declaration for element X and so it generates an extra error message. In the
second case, element D is already known, so the parser does not generate
that extra error message, and the only error is generated when validating
the content of Root. So, I would not consider this a bug.

Khaled
