Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/07/10 17:20:02 UTC

I learned this about DFDL's ability to express complex types with nil value

Hello DFDL community,

I learned that DFDL supports in-band nil values. Recall what an in-band nil is:

In-band nil: a symbol inserted into the region indicates nil. A part of the region's value space is reserved for indicating nil.

I learned that, if the region is to hold an atomic value, then the symbol can be anything. For example, I used N/A to represent a region with a nil value:

<xs:element name="make" type="xs:string" nillable="true" dfdl:nilValue="N/A" />

I learned that, if the region is to hold a complex value, then the symbol is restricted to %ES; (empty string). For example:

<xs:element name="person" nillable="true" dfdl:nilValue="%ES;">
    <xs:complexType>
        <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
            <xs:element name="name" type="xs:string" />
            <xs:element name="age" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

Is that restriction true of real world data formats?

Recall that there are some data formats out in the real world which specify out-of-band that a region represents a nil value. How are out-of-band nils expressed in DFDL?

/Roger

Re: I learned this about DFDL's ability to express complex types with nil value

Posted by "Beckerle, Mike" <mb...@tresys.com>.
The restriction to %ES; as the only nilValue for nillable complex types exists to avoid the following problem.


If other kinds of nil indicators were allowed, DFDL would need properties to characterize and isolate the representation of those indicators. For example, if nilValue="nil" were allowed for a complex type, you would suddenly need a lengthKind for the complex type so that you could isolate that "nil" and recognize that it is the nil indicator and not just the first 3 letters of real data such as "nilsson, harold s.".


But the structure part of the complex type needs a lengthKind also, and that may not match what you need for the nil value.


We could have gone down the path of proliferating yet more properties needed only in these obscure cases e.g., dfdl:nilLength. There are a lot of properties in DFDL already as you know. There are many ways to model data in a DFDL schema, so we decided not to go in this direction.


The only nilValue that doesn't cause this sort of problem is %ES;. Length is 0 given the regular length kind of the complex type element. No other properties are needed to characterize it that aren't already applicable to the element. So we allow nillable complex type elements, but the only indicator allowed is %ES;.


re: out-of-band indicators for nil


I suppose out-of-band nil indicators aren't really that well-baked an idea. It's really out-of-band indicators, period.


In DFDL v1.0 you can use indicator flag elements to guide choice branch selection, or to determine presence/absence of an optional element.
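As a rough sketch of the presence/absence case, the fragment below uses a flag element to decide whether an optional element occurs at all. The element name hasPerson, its one-character "Y" convention, and the explicit length are assumptions made up for illustration. The fragment also assumes an enclosing schema that declares the xs and dfdl prefixes and supplies the remaining required format properties through a default dfdl:format.

```xml
<xs:sequence>
    <!-- Hypothetical out-of-band flag: "Y" means a person record follows. -->
    <xs:element name="hasPerson" type="xs:string"
                dfdl:lengthKind="explicit" dfdl:length="1" />

    <!-- person is not nillable here; it is simply absent when the flag is not "Y". -->
    <xs:element name="person" minOccurs="0" maxOccurs="1"
                dfdl:occursCountKind="expression"
                dfdl:occursCount="{ if (../hasPerson eq 'Y') then 1 else 0 }">
        <xs:complexType>
            <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
                <xs:element name="name" type="xs:string" />
                <xs:element name="age" type="xs:string" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:sequence>
```

Note that in this sketch the infoset contains no person element at all when the flag is not "Y", which is different from a person element that is present but nilled.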


In DFDL v1.0 I am pretty sure you can't use a separate indicator element to decide if a nillable element should be nilled or not. The whole DFDL nillable system of properties is about in-band behaviors for nil indicators.


So if you have an external flag in DFDL v1.0 you have to represent the meaning of that flag in terms of other things than whether a logical element is nilled or not.

