Posted to dev@daffodil.apache.org by Steve Lawrence <sl...@apache.org> on 2020/08/12 17:00:29 UTC

Points of Uncertainty and Unordered Sequences

I'm working on fixing how points of uncertainty are handled so that they
behave correctly and allow memory cleanup (DAFFODIL-2371). I'm making
good progress, though I think I need to write some additional tests. So
far only one test is failing, and it involves unordered sequences with
initiated content. The schema looks like this:

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="ex:GeneralFormat"
        lengthUnits="characters"
        lengthKind="delimited"
        occursCountKind="parsed" />
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="R" dfdl:terminator="END">
    <xs:complexType>
      <xs:sequence dfdl:separator="|" dfdl:separatorPosition="infix"
        dfdl:sequenceKind="unordered" dfdl:initiatedContent="yes">
        <xs:element name="X" type="xs:string" maxOccurs="unbounded"
          dfdl:initiator="X:">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert message="the expected message">
                { . eq 'expected' }
              </dfdl:assert>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <xs:element name="Y" type="xs:string" maxOccurs="unbounded"
          dfdl:initiator="Y:"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

So we have an unordered sequence with initiated content. The test data
looks like this:

  X:not expected|Y:something else

I'm having trouble determining whether the test is failing because
there's a bug in my changes, or whether the test itself isn't quite
right and my changes to points of uncertainty are revealing that. I'm
honestly having trouble thinking through what the correct behavior of
this test is.

So, my questions are: where are the points of uncertainty, how does the
initiated content resolve those PoUs, and what is the expected result?

The test expects a parse error carrying the assertion message "the
expected message", but I currently get a missing END terminator error.
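
For reference, the expectation above could be written as a TDML parser
test case roughly like the following. This is only a sketch: the test
name and the model name are placeholders rather than the names used in
the actual test suite, and the tdml: prefix is assumed to be bound to
the TDML namespace by the enclosing test suite.

```xml
<!-- Hypothetical names: "unordered_initiated_assert" and "exampleSchema"
     are placeholders, not the real test or schema names. -->
<tdml:parserTestCase name="unordered_initiated_assert"
  root="R" model="exampleSchema">
  <!-- The test data quoted above, unchanged. -->
  <tdml:document><![CDATA[X:not expected|Y:something else]]></tdml:document>
  <!-- The test passes only if the parse fails with a diagnostic
       containing the assert's message. -->
  <tdml:errors>
    <tdml:error>the expected message</tdml:error>
  </tdml:errors>
</tdml:parserTestCase>
```

With a test written this way, a missing END terminator diagnostic would
not contain the expected string, so the test would be reported as
failing.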


Re: Points of Uncertainty and Unordered Sequences

Posted by "Sloane, Brandon" <bs...@tresys.com>.
Since occursCountKind=parsed, there is a point of uncertainty at the element X.

However, because the sequence has initiatedContent=yes, the parser should not be able to backtrack once it has matched the "X:" initiator.

From the spec:

> If the child is optional then it is deemed to have been found when its initiator has been found. Any subsequent error parsing the child will not cause the parser to backtrack to try other alternatives.


Re: Points of Uncertainty and Unordered Sequences

Posted by "Beckerle, Mike" <mb...@tresys.com>.
There are two PoUs here, one for each recurring element.

Once the initiator is found, it should discriminate the element. The assert for X will then fail, which should cause the X array to fail entirely, and so the whole parse should fail.

I think you should get the message "the expected message" from the failure.

So it seems discrimination is not taking place.


