Posted to users@daffodil.apache.org by Iulian Mindrila <iu...@deimos-space.com> on 2018/03/01 08:47:28 UTC

RE: daffodil DFDL validation schema

Hi Steve,

Loading grammars is working now.   Thanks a lot for your help!

I have another question for you:
I’ve tried to validate a very simple DFDL schema test file with some errors in it:
<xs:element name="x" type="xs:string" dfdl:lengthKind___a="delimited"/>
<xs:element name="y" type="xs:string__a" dfdl:lengthKind="delimited"/>

Xerces does not detect them when the schema is parsed as an XML file with validation enabled against the grammar pool.
Do you know how these kinds of errors can be detected during validation?


Thanks,
Iulian.

From: Steve Lawrence
Sent: Wednesday, February 28, 2018 4:41 PM
To: users@daffodil.apache.org; Iulian Mindrila
Subject: Re: daffodil DFDL validation schema

Iulian,

Thanks for the sample. I've managed to modify the necessary schema files
to allow xerces++ to load the schemas and validate a DFDL schema with them.

Surprisingly, these modifications do not work when loaded using
Daffodil, so there appear to be some inconsistencies between xerces++ and
what Daffodil uses to load schemas. We'll need to do some more
investigation to figure out which one is correct, or what changes can be
made so both parsers are happy. But until we figure that out, I've
attached the modified schemas that should allow you to move forward
using xerces++.

Note that you only need to load DFDL_part3_model.xsd and
XMLSchema_for_DFDL.xsd (in that order) to get your code to load the
grammar pool. The includes/imports in those two files will handle
getting the rest of the files.

- Steve

On 02/28/2018 01:52 AM, Iulian Mindrila wrote:
> Hi Steve,
> 
> Please find attached the cpp file with the loading schema code. You'll need to 
> link the cpp file with xercesc++ library (I've used the latest version 3.2.0).
> Also, you'll need to provide the schema files list as arguments for the executable.
> 
> Thanks,
> Iulian.
> 
> 
> On 27 February 2018 at 14:40, Steve Lawrence <slawrence@apache.org> wrote:
> 
>     Iulian,
> 
>     These errors are surprising to me. It looks like Xerces is unable
>     to find certain simple types in the "sub" prefix namespace, but all
>     those types are defined in XMLSchema_for_DFDL.xsd, so it's unclear to me
>     why it can't find them.
> 
>     Would it be possible to provide your C++ code that loads these schemas
>     and I can see if I can reproduce the issue?
> 
>     - Steve
> 
>     On 02/23/2018 04:28 AM, Iulian Mindrila wrote:
>     > Hi,
>     >
>     > I’m trying to load the Daffodil dtd and xsd files into a XercesC++
>     > grammar pool. The purpose is to validate a DFDL schema file against it. I
>     > load the grammars in the order indicated by Steve:
>     >
>     > "datatypes.dtd"
>     >
>     > "XMLSchema.dtd"
>     >
>     > "dafint.xsd"
>     >
>     > "DFDL_part1_simpletypes.xsd"
>     >
>     > "DFDL_part2_attributes.xsd"
>     >
>     > "DFDL_part3_model.xsd"
>     >
>     > "XMLSchema_for_DFDL.xsd"
>     >
>     > Everything works fine until the last xsd, XMLSchema_for_DFDL.xsd.
>     >
>     > As you can see below, the errors start at line 153, which refers to the “sub”
>     > namespace: <xsd:restriction base="sub:derivationControl">
>     >
>     > Can you please advise whether those errors can be fixed?
>     >
>     > Thanks,
>     >
>     > Iulian.
>     >
>     > loading  XMLSchema_for_DFDL.xsd
>     >
>     > XMLSchema_for_DFDL.xsd:153:51 error: type 'http://www.w3.org/2001/XMLSchema:derivationControl' not found
>     > XMLSchema_for_DFDL.xsd:175:61 error: type 'http://www.w3.org/2001/XMLSchema:reducedDerivationControl' not found
>     > XMLSchema_for_DFDL.xsd:174:23 error: unknown simpleType '{null}'
>     > XMLSchema_for_DFDL.xsd:186:51 error: type 'http://www.w3.org/2001/XMLSchema:derivationControl' not found
>     > XMLSchema_for_DFDL.xsd:210:58 error: type 'http://www.w3.org/2001/XMLSchema:typeDerivationControl' not found
>     > XMLSchema_for_DFDL.xsd:209:23 error: unknown simpleType '{null}'
>     > XMLSchema_for_DFDL.xsd:244:52 error: simpleType 'http://www.w3.org/2001/XMLSchema:formChoice' for attribute 'attributeFormDefault' not found
>     > XMLSchema_for_DFDL.xsd:246:52 error: simpleType 'http://www.w3.org/2001/XMLSchema:formChoice' for attribute 'elementFormDefault' not found
>     > XMLSchema_for_DFDL.xsd:316:21 error: simpleType 'http://www.w3.org/2001/XMLSchema:allNNI' for attribute 'maxOccurs' not found
>     > XMLSchema_for_DFDL.xsd:471:30 error: simpleType 'http://www.w3.org/2001/XMLSchema:derivationSet' for attribute 'final' not found
>     > XMLSchema_for_DFDL.xsd:473:30 error: simpleType 'http://www.w3.org/2001/XMLSchema:derivationSet' for attribute 'block' not found
>     > XMLSchema_for_DFDL.xsd:721:59 error: type 'http://www.w3.org/2001/XMLSchema:derivationControl' not found
>     > XMLSchema_for_DFDL.xsd:720:27 error: unknown simpleType '{null}'
>     > XMLSchema_for_DFDL.xsd:718:23 error: unknown simpleType '{null}'
>     > XMLSchema_for_DFDL.xsd:1477:70 error: simpleType 'http://www.w3.org/2001/XMLSchema:simpleDerivationSet' for attribute 'final' not found
>     > XMLSchema_for_DFDL.xsd:786:43 error: attribute 'form' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:809:43 error: attribute 'substitutionGroup' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:809:43 error: attribute 'final' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:809:43 error: attribute 'abstract' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:869:43 error: attribute 'ref' is already defined in base
>     > XMLSchema_for_DFDL.xsd:921:49 error: attribute 'minOccurs' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:921:49 error: attribute 'maxOccurs' has invalid target namespace with respect to the base wildcard constraint or base has no wildcard
>     > XMLSchema_for_DFDL.xsd:977:48 error: type 'http://www.w3.org/2001/XMLSchema:allNNI' not found
>     > XMLSchema_for_DFDL.xsd:955:48 error: type of attribute 'maxOccurs' must be derived by restriction from type of the corresponding attribute in the base
>     > XMLSchema_for_DFDL.xsd:1462:59 error: type 'http://www.w3.org/2001/XMLSchema:derivationControl' not found
>     > XMLSchema_for_DFDL.xsd:1461:27 error: unknown simpleType '{null}'
>     > XMLSchema_for_DFDL.xsd:1459:23 error: unknown simpleType '{null}'
>     >
> 
> 



Re: daffodil DFDL validation schema

Posted by Steve Lawrence <sl...@apache.org>.
Regarding the invalid dfdl attribute, Daffodil modified this schema so
that it would not error on unknown attributes. The reason is that we
sometimes get schemas containing attributes that are unknown to us. For
example, IBM's DFDL implementation adds IBM-specific attributes that
Daffodil does not recognize. If we enabled strict validation of all
attributes, these unknown attributes would cause a failure. We wanted to
support schemas with unknown attributes, so we made this validation more
lax, and we plan to validate dfdl attributes ourselves in Daffodil (hasn't
been done yet, though).

If you would like to enable strict checking, you can modify the anyOther
attributeGroup in XMLSchema_for_DFDL.xsd (line 550): comment out the
existing xsd:anyAttribute line that has the ##other value, and uncomment
the anyAttribute element that looks like this:

     <xsd:anyAttribute
    namespace="http://www.w3.org/XML/1998/namespace
    urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:int
    urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext
    http://www.ibm.com/schema/extensions"
      processContents="lax"
    />

That will enable strict validation for DFDL attributes and will also
allow some other attributes (e.g. XSD, Daffodil internal stuff, and IBM
extensions). Any other unknown attributes should cause a validation error.
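
To make the effect of the stricter wildcard concrete, here is a rough Python sketch that imitates the check outside of any schema validator. It is only an illustration under stated assumptions: the function name unknown_attributes and the KNOWN_DFDL_ATTRIBUTES set are made up for this example (the set is a tiny subset of real DFDL property names), and the code is not how xerces or Daffodil performs validation. The wildcard namespace list is copied from the anyAttribute element above.

```python
import xml.etree.ElementTree as ET

DFDL_NS = "http://www.ogf.org/dfdl/dfdl-1.0/"

# Namespaces admitted by the strict anyAttribute wildcard quoted above.
WILDCARD_NAMESPACES = {
    "http://www.w3.org/XML/1998/namespace",
    "urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:int",
    "urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext",
    "http://www.ibm.com/schema/extensions",
}

# Assumption: a tiny illustrative subset of DFDL property names,
# standing in for the attributes declared by the real DFDL schemas.
KNOWN_DFDL_ATTRIBUTES = {"lengthKind", "length", "encoding", "representation"}


def unknown_attributes(schema_text):
    """Return (element name, attribute local name) pairs that are neither
    known DFDL attributes nor covered by the wildcard namespaces."""
    root = ET.fromstring(schema_text)
    problems = []
    for elem in root.iter():
        for name in elem.attrib:
            # ElementTree spells qualified names as "{namespace}local".
            if not name.startswith("{"):
                continue  # unqualified attributes are not the wildcard's concern
            namespace, local = name[1:].split("}", 1)
            if namespace in WILDCARD_NAMESPACES:
                continue
            if namespace == DFDL_NS and local in KNOWN_DFDL_ATTRIBUTES:
                continue
            problems.append((elem.get("name"), local))
    return problems


SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="x" type="xs:string" dfdl:lengthKind___a="delimited"/>
  <xs:element name="y" type="xs:string__a" dfdl:lengthKind="delimited"/>
</xs:schema>"""

print(unknown_attributes(SAMPLE))
```

On the two test elements from the question, only the misspelled dfdl:lengthKind___a on element "x" is reported; the bad type on element "y" is a separate matter.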

Regarding the invalid type, I think this is actually expected behavior.
The XMLSchema_for_DFDL.xsd only requires that the "type" attribute have
a value that is an xsd:QName (line 757). The value "xs:string__a" is a
syntactically valid xs:QName, so xerces won't complain. Whether or not
that QName refers to a type that actually exists is not checked by a
schema validator. It is up to whatever reads the XML (e.g. Daffodil) to
resolve that xs:QName and determine its validity.
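
The difference between the two checks can be sketched in a few lines of Python. This is an illustration only: the regular expression is an ASCII-only approximation of the QName lexical form, BUILTIN_TYPES is a small subset of the XSD built-in types, and the helper names is_lexical_qname and resolves are invented for this example rather than taken from xerces or Daffodil.

```python
import re

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Assumption: ASCII-only approximation of an NCName; the real
# production also allows many non-ASCII characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
QNAME_RE = re.compile(rf"(?:({NCNAME}):)?({NCNAME})")

# Assumption: a small subset of the XSD built-in simple types.
BUILTIN_TYPES = {"string", "int", "long", "boolean", "hexBinary", "dateTime"}


def is_lexical_qname(value):
    """What a schema validator checks for an xs:QName value: syntax only."""
    return QNAME_RE.fullmatch(value) is not None


def resolves(value, prefixes, declared_types=frozenset()):
    """What the schema's consumer must check: does the name refer to a type?

    prefixes maps a prefix ("" for the default namespace) to a namespace URI;
    declared_types holds (namespace, local name) pairs for user-defined types.
    """
    match = QNAME_RE.fullmatch(value)
    if match is None:
        return False
    prefix, local = match.groups()
    namespace = prefixes.get(prefix or "")
    if namespace == XSD_NS:
        return local in BUILTIN_TYPES
    return (namespace, local) in declared_types


prefixes = {"xs": XSD_NS}
print(is_lexical_qname("xs:string__a"))    # syntax check
print(resolves("xs:string__a", prefixes))  # resolution check
print(resolves("xs:string", prefixes))
```

The first check accepts "xs:string__a" because it is well-formed, while the second rejects it because no such type exists in the XSD namespace; only the second kind of check would catch the error in element "y".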

- Steve
