Posted to users@daffodil.apache.org by "Costello, Roger L." <co...@mitre.org> on 2019/07/16 14:51:50 UTC

RE: [EXT] Re: Unable to parse an in-band nil on complex type

I am using the latest release, 2.4.0



From: Beckerle, Mike <mb...@tresys.com>
Sent: Tuesday, July 16, 2019 10:47 AM
To: users@daffodil.apache.org
Subject: [EXT] Re: Unable to parse an in-band nil on complex type


What release of Daffodil?

2.4.0 or earlier?

________________________________
From: Costello, Roger L. <co...@mitre.org>
Sent: Tuesday, July 16, 2019 10:43:38 AM
To: users@daffodil.apache.org
Subject: Re: Unable to parse an in-band nil on complex type


Thanks Mike. I made the changes you described. Now parsing works perfectly, but unparsing gives the same error message. Here is the XML that is output:



<input xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <person>
    <name>John Doe</name>
    <age>29</age>
  </person>
  <person>
    <name>Sally Smith</name>
    <age>34</age>
  </person>
  <person xsi:nil="true"></person>
  <person>
    <name>Bob Jones</name>
    <age>51</age>
  </person>
</input>



Here is my updated DFDL schema:



<xs:element name="input">
    <xs:complexType>
        <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
            <xs:element name="person" maxOccurs="unbounded" dfdl:occursKind="implicit" dfdl:initiator="Person:" nillable="true" dfdl:nilValue="%ES;" dfdl:nilValueDelimiterPolicy="initiator">
                <xs:complexType>
                    <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
                        <xs:element name="name" type="xs:string" />
                        <xs:element name="age" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>



From: Beckerle, Mike <mb...@tresys.com>
Sent: Tuesday, July 16, 2019 10:22 AM
To: users@daffodil.apache.org
Subject: [EXT] Re: Unable to parse an in-band nil on complex type



The initiator must be on the person element, not on the interior sequence, because the DFDL processor never even attempts to parse that interior sequence once it finds the nil representation of the surrounding complex element.



You have dfdl:nilValueDelimiterPolicy="initiator" on the complex element, but there is no initiator defined there, so that property has no effect.



Here's the principle to keep in mind. If the complex element matches its nil representation, nothing inside it is ever even considered.
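
The principle above can be illustrated with a small Python sketch. This is a toy model and not Daffodil code: the function name parse_person_line and the dictionary shapes are hypothetical, and it assumes the "Person:" initiator and "," separator used in this thread. It shows the initiator being consumed first, the nil check happening before any child is looked at, and the interior sequence being parsed only when the nil check fails.

```python
INITIATOR = "Person:"  # assumed initiator, taken from the schema in this thread


def parse_person_line(line):
    """Toy parse of one person line; returns a dict describing the element."""
    # The initiator belongs to the person element itself, so it is
    # consumed before the nil check and before any child is examined.
    if not line.startswith(INITIATOR):
        raise ValueError("initiator not found")
    content = line[len(INITIATOR):]

    # Nil representation is checked first: empty content (%ES;) after the
    # initiator means the element is nil and the children are never parsed.
    if content == "":
        return {"nil": True}

    # Only now is the interior sequence (name, age) considered.
    # A line without a comma raises ValueError in this toy model.
    name, age = content.split(",", 1)
    return {"name": name, "age": age}
```

Because the initiator is consumed even for a nilled person, each occurrence in this model advances through the data, which is what lets an unbounded array of such elements terminate.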



I am still uncertain about the unparse error. That seems like the unparser may have a bug having to do with nilled complex elements, or empty string elements. Not sure.





________________________________

From: Costello, Roger L. <co...@mitre.org>
Sent: Tuesday, July 16, 2019 7:47:53 AM
To: users@daffodil.apache.org
Subject: Re: Unable to parse an in-band nil on complex type



Thanks Mike.



Okay, I modified the input so that the data for each person is initiated by "Person:"



Person:John Doe,29
Person:Sally Smith,34

Person:Bob Jones,51



This DFDL schema parses and unparses that input perfectly:



<xs:element name="input">
    <xs:complexType>
        <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
            <xs:element name="person" maxOccurs="unbounded" dfdl:occursKind="implicit" >
                <xs:complexType>
                    <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix" dfdl:initiator="Person:">
                        <xs:element name="name" type="xs:string" />
                        <xs:element name="age" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>



Then I added the nillable stuff to the person element declaration:



<xs:element name="person" maxOccurs="unbounded" dfdl:occursKind="implicit" nillable="true" dfdl:nilValue="%ES;" dfdl:nilValueDelimiterPolicy="initiator" >



Now the above input parses perfectly but unparsing produces no output and generates this error message:



Unparse Error: Element {}person does not have a value.



Huh? What does that mean?



Next, I added a line in my input file for a person with a null value:



Person:John Doe,29
Person:Sally Smith,34
Person:
Person:Bob Jones,51



Parsing results in consuming the first two lines and discarding the rest.



Eek! What am I doing wrong, please?



/Roger

From: Beckerle, Mike <mb...@tresys.com>
Sent: Monday, July 15, 2019 11:14 AM
To: users@daffodil.apache.org
Subject: Re: Unable to parse an in-band nil on complex type





The nil representation is preferred to any other representation, which is to say that the parser checks for the nil representation first before attempting to parse any other representation.

So if %ES; is the nil value, and there is no initiator nor terminator in the nil representation (based on dfdl:nilValueDelimiterPolicy) then DFDL will always create a nilled element and consume nothing.



If that happens in an array element where the array is of unbounded length, and if the array is not in a separated sequence, then you have created the DFDL equivalent of an infinite loop. You are able to positively parse and produce an element (in this case a nilled element) while not consuming any bits from the data stream at all, and you can do this an unbounded number of times. The DFDL specification requires implementations to check for this case and raise a parse error.



This check applies not specifically to nillable elements, but to any element representation. If zero bits are consumed, in an unbounded array context, then it's a parse error.



A requirement in DFDL is that you make forward progress through the data stream when parsing an unbounded array. You can consume nothing only in a scalar or a bounded array, which ensures that the array eventually ends even if nothing from the data stream is consumed for each array element.
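
The forward-progress rule described above can be sketched in a few lines of Python. This is a simplified toy model of the rule only, not Daffodil's implementation, and it does not try to reproduce the exact byte position in the reported error. The names ParseError, parse_person, and parse_unbounded_array are hypothetical, and the input is modeled as a list of lines rather than a bit stream.

```python
class ParseError(Exception):
    """Raised when the toy parser detects a fatal condition."""


def parse_person(lines, pos, nillable):
    """Try to parse one person at index pos.

    Returns (value, new_pos) on success, or None if nothing can be parsed.
    """
    if nillable:
        # Toy model of nilValue="%ES;" with no initiator or terminator:
        # the empty string always matches, so nil wins and consumes nothing.
        return {"nil": True}, pos
    if pos + 2 <= len(lines):
        # A name line followed by an age line.
        return {"name": lines[pos], "age": lines[pos + 1]}, pos + 2
    return None


def parse_unbounded_array(lines, nillable):
    """Parse persons until none is left, enforcing forward progress."""
    people, pos = [], 0
    while True:
        result = parse_person(lines, pos, nillable)
        if result is None:
            break  # no further occurrence; the array ends normally
        value, new_pos = result
        if new_pos == pos:
            # Success without consuming data would repeat forever.
            raise ParseError("No forward progress at position %d" % pos)
        people.append(value)
        pos = new_pos
    return people
```

With nillable set to False, the four-line input from the original question yields two person entries. With nillable set to True, the first attempt succeeds without consuming anything, so the loop raises ParseError instead of spinning forever.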



-mike beckerle



________________________________

From: Costello, Roger L. <co...@mitre.org>
Sent: Monday, July 15, 2019 9:13:39 AM
To: users@daffodil.apache.org
Subject: Unable to parse an in-band nil on complex type



Hello DFDL community,



My input consists of a series of name, age pairs (on different lines):



John Doe
29
Sally Smith
34



This DFDL schema parses the input perfectly:



<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="person" maxOccurs="unbounded" dfdl:occursKind="implicit">
                <xs:complexType>
                    <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
                        <xs:element name="name" type="xs:string" />
                        <xs:element name="age" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>





I want to test in-band nil on complex type. So, I added nillable="true" dfdl:nilValue="%ES;" to the person element declaration:



<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="person" maxOccurs="unbounded" dfdl:occursKind="implicit" nillable="true" dfdl:nilValue="%ES;">
                <xs:complexType>
                    <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix">
                        <xs:element name="name" type="xs:string" />
                        <xs:element name="age" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>



Now, when I parse the same input file, I get this error message:



Parse Error: Repeating or Optional Element - No forward progress at byte 29. Attempt to parse person succeeded but consumed no data.



Why am I getting this error? How do I fix it?



/Roger

Re: [EXT] Re: Unable to parse an in-band nil on complex type

Posted by "Beckerle, Mike" <mb...@tresys.com>.
https://issues.apache.org/jira/browse/DAFFODIL-2183


I created this ticket for the unparser bug.
