Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/11/18 11:52:41 UTC

Is there such a thing as "in-scope delimiters"?

Hi Folks,

Is there such a thing as in-scope delimiters?

At the field element in the below DFDL schema, what are the in-scope delimiters? Comma and newline?

Notice that the field element references a block escapeScheme, which specifies that the double quote symbol is used to escape a block of text. If a field's value is escaped (via double quotes), then what delimiters are escaped? All in-scope delimiters - comma and newline?  /Roger

<xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:defineEscapeScheme name='Quotes'>
            <dfdl:escapeScheme escapeKind='escapeBlock'
                escapeBlockStart='"'
                escapeBlockEnd='"'
                escapeEscapeCharacter='"'
                extraEscapedCharacters=''
                generateEscapeBlock='whenNeeded'/>
        </dfdl:defineEscapeScheme>
        <dfdl:format ref="default-dfdl-properties"/>
    </xs:appinfo>
</xs:annotation>

<xs:element name="csv">
    <xs:complexType>
        <xs:sequence>
            <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
                <xs:element name="record" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
                            <xs:element name="field" maxOccurs="unbounded" type="xs:string"
                                dfdl:escapeSchemeRef="Quotes"
                                dfdl:occursCountKind="implicit">
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:sequence>
    </xs:complexType>
</xs:element>



Re: Is there such a thing as "in-scope delimiters"?

Posted by "Beckerle, Mike" <mb...@tresys.com>.
Yes. "In-scope delimiters" refers to the delimiters, mostly terminating ones (terminators and separators), that surround and relate to a given element or model group in DFDL. Which delimiters are in scope follows from DFDL's scoping rules for properties and from the nesting of elements and model groups in the schema.

In your schema, for the "field" element, both the comma and the newline are in-scope delimiters.
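
As a rough illustration of what that means for the "Quotes" escape scheme, here is a sketch using made-up sample data (the data and the expected infoset are assumptions for illustration, not output from an actual Daffodil run). Inside a quoted block, both the comma and the newline should be treated as ordinary content rather than as delimiters:

```xml
<!-- Hypothetical input data, two records. The first quoted field contains a comma,
     the second quoted field contains a newline:

"Smith, John",42
"line one
line two",ok

     A parse with the schema above would be expected to yield roughly: -->
<csv>
    <record>
        <field>Smith, John</field>
        <field>42</field>
    </record>
    <record>
        <field>line one
line two</field>
        <field>ok</field>
    </record>
</csv>
```

Without the quotes, the comma in the first field would end the field and the newline in the third would end the record.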

Some people call this the "in-scope terminating markup", but I just searched the DFDL spec and did not find the term "markup" used in this way, which is good. I've never liked referring to delimiters as "markup".

One clarification, perhaps: if an element has a terminator and lengthKind "delimited", the surrounding group's separator is still considered to be in scope and must be escaped. DFDL didn't have to be defined this way; we could have gone with a rule where the element's terminator, if specified, is the only in-scope delimiter, but that was not the decision. Even if an element has a terminator, the enclosing model group's separator and terminator are still considered to be in scope.

E.g., consider this unusual example:

<xs:sequence dfdl:terminator="#" dfdl:separator="$" dfdl:separatorPosition="postfix">
   <!-- we have both a terminator above, AND a postfix separator -->
   <xs:element name="foo" type="xs:string" dfdl:terminator="%"/> <!-- and another terminator -->
</xs:sequence>

For the "foo" element, the in-scope terminating delimiters include %, $, and #. DFDL specifies that the "foo" element must be terminated by a "%", but the escape-scheme rules mean that if the "foo" content contains any of %, $, or #, those characters must be protected via an escape scheme.
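
To make that concrete, here is a minimal sketch of the same example with an escape scheme attached. The scheme name "Backslash", the choice of "\" as the escape character, and the sample value are all assumptions added for illustration; other required format properties are assumed to come from a default format, as in the CSV schema.

```xml
<xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- Hypothetical escape scheme: a single escape character, no escape block -->
        <dfdl:defineEscapeScheme name="Backslash">
            <dfdl:escapeScheme escapeKind="escapeCharacter"
                escapeCharacter="\"
                escapeEscapeCharacter=""
                extraEscapedCharacters=""/>
        </dfdl:defineEscapeScheme>
    </xs:appinfo>
</xs:annotation>

<xs:sequence dfdl:terminator="#" dfdl:separator="$" dfdl:separatorPosition="postfix">
    <xs:element name="foo" type="xs:string" dfdl:terminator="%"
        dfdl:lengthKind="delimited" dfdl:escapeSchemeRef="Backslash"/>
</xs:sequence>

<!-- Value of foo:          a%b$c#d
     Expected data stream:  a\%b\$c\#d%$#
     All three in-scope delimiters are escaped inside the content; the unescaped
     %, $ and # at the end are the element terminator, the postfix separator,
     and the sequence terminator, in that order. -->
```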





