Posted to c-dev@xerces.apache.org by "Boris Kolpackov (JIRA)" <xe...@xml.apache.org> on 2009/11/04 11:23:35 UTC

[jira] Updated: (XERCESC-1881) xsd sequence validation reporting errors too late

     [ https://issues.apache.org/jira/browse/XERCESC-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Kolpackov updated XERCESC-1881:
-------------------------------------

    Component/s:     (was: SAX/SAX2)
                 Validating Parser (XML Schema)

This is due to the way Xerces-C++ validates content. It builds up a model of the element's content and then runs a regex-like grammar (constructed from the schema) on it to make sure it is valid, so a sequence error is only reported once the whole content has been collected. I agree this behavior is wrong, but fixing it will require changing the way validation is performed.
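
To illustrate the difference, here is a toy sketch in Python. It is not Xerces code: every name in it (SEQUENCE, IncrementalChecker, first_error_incremental, first_error_deferred) is hypothetical, and the content model is hard-coded from the schema in the report below as (name?, email*, url*, link?). It assumes a flat sequence of distinct element names, which is all this example needs.

```python
# Toy model of the <person> sequence from the report: (name, minOccurs, maxOccurs).
# maxOccurs=None stands for "unbounded". Hypothetical names, not the Xerces API.
SEQUENCE = [("name", 0, 1), ("email", 0, None), ("url", 0, None), ("link", 0, 1)]


class IncrementalChecker:
    """Consumes one child element name at a time."""

    def __init__(self, particles):
        self.particles = particles
        self.pos = 0    # index of the particle currently being matched
        self.count = 0  # occurrences already matched for that particle

    def feed(self, child):
        """Return True if `child` may appear next, False otherwise."""
        i, count = self.pos, self.count
        while i < len(self.particles):
            name, min_occurs, max_occurs = self.particles[i]
            if name == child and (max_occurs is None or count < max_occurs):
                self.pos, self.count = i, count + 1
                return True
            if count < min_occurs:
                return False  # moving on would skip a required particle
            i, count = i + 1, 0
        return False

    def finish(self):
        """Return True if no required particle is still outstanding."""
        if self.pos < len(self.particles):
            if self.count < self.particles[self.pos][1]:
                return False
        return all(p[1] == 0 for p in self.particles[self.pos + 1:])


def first_error_incremental(children):
    """Index at which a check-as-you-go validator would report, or None."""
    checker = IncrementalChecker(SEQUENCE)
    for index, child in enumerate(children):
        if not checker.feed(child):
            return index
    return None if checker.finish() else len(children)


def first_error_deferred(children):
    """Buffer every child first, then check: the report lands at the end tag."""
    buffered = list(children)  # nothing is checked while buffering
    checker = IncrementalChecker(SEQUENCE)
    valid = all(checker.feed(c) for c in buffered) and checker.finish()
    return None if valid else len(buffered)


children = ["url", "name", "email", "link"]  # child order from the report's foo.xml
print(first_error_incremental(children))  # 1: reported at <name>
print(first_error_deferred(children))     # 4: reported after the last child
```

Both functions find the same problem. The only difference is when the check runs, which is why the deferred variant cannot report before the parent's end tag.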

> xsd sequence validation reporting errors too late
> -------------------------------------------------
>
>                 Key: XERCESC-1881
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1881
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Validating Parser (XML Schema)
>    Affects Versions: 3.0.1
>         Environment: Windows Vista 32, Xerces 3.0.1
>            Reporter: Brian Hoyt
>
> Validating the following xsd and xml gives different results in XercesJ and XercesC++.
> With Java, the sequence error is reported right after the <url> element is processed, because <name>
> cannot appear after <url>. With C++, the error is not reported until the last element within <person> has been
> processed, by which time it is too late to act on it. The Java behavior seems
> correct, because it lets me stop processing the xml file at the point of the error.
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
>    <xs:element name="person">
>        <xs:complexType>
>            <xs:sequence>
>                  <xs:element name="name"  type='xs:string' minOccurs='0' maxOccurs='1'/>
>                  <xs:element name="email"  type='xs:string' minOccurs='0' maxOccurs='unbounded'/>
>                  <xs:element name="url"    type='xs:string' minOccurs='0' maxOccurs='unbounded'/>
>                  <xs:element name="link"   type='xs:string' minOccurs='0' maxOccurs='1'/>
>            </xs:sequence>
>        </xs:complexType>
>     </xs:element>
> </xs:schema>
> <?xml version="1.0" encoding="UTF-8"?>
> <person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>            xsi:noNamespaceSchemaLocation='foo.xsd'>
>     <url>www.foo.com</url>
>     <name>Boss</name>
>     <email>chief@foo.com</email>
>     <link/>
> </person>
> The output from running the XercesJ 2.9.1 Writer sample on the above xsd/xml is:
> <person xsi:noNamespaceSchemaLocation="foo.xsd">
>     <url>www.foo.com</url>
>     [Error] foo.xml:5:11: cvc-complex-type.2.4.a: Invalid content was found starting with element 'name'. One of '{url, link}' is expected.
> <name>Boss</name>
>     <email>chief@foo.com</email>
>     <link></link>
> </person>
> The output from running XercesC++ 3.0.1 is:
> <?xml version="1.0" encoding="LATIN1"?>
> <person xsi:noNamespaceSchemaLocation="foo.xsd">
>     <url>www.foo.com</url>
>     <name>Boss</name>
>     <email>chief@foo.com</email>
>     <link></link>
> Error at file C:\xerces-3_0_1\bin/foo.xml, line 8, char 10
>   Message: element 'name' is not allowed for content model '((name,email,url),link)'
> </person>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

