Posted to dev@daffodil.apache.org by "Sloane, Brandon" <bs...@tresys.com> on 2019/04/10 16:45:00 UTC

Understanding outputValueCalc elements

I am a bit confused by the behaviour of one of my test cases. I have the schema:


    <xs:simpleType name="AbstractMulitiply2FromInt" dfdl:repType="xs:int" dfdl:outputTypeCalc="{ dfdl:logicalTypeValueInt() * 2 }">
      <xs:restriction base="xs:int"/>
    </xs:simpleType>


    <xs:element name="outputTypeCalcNextSiblingInt_01">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="raw" type="tns:uint8" dfdl:outputValueCalc="{ dfdl:outputTypeCalcNextSiblingInt() }"/>
          <xs:element name="logic" type="tns:AbstractMulitiply2FromInt" dfdl:length="1" dfdl:inputValueCalc="{ 0 }"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

With the test case:


  <tdml:unparserTestCase name="outputTypeCalcNextSiblingInt_01"
    root="outputTypeCalcNextSiblingInt_01" model="inputTypeCalc-Embedded.dfdl.xsd" description="Extensions - inputTypeCalc keysetValue transform">

    <tdml:document>
    <tdml:documentPart type="byte">
    0E
    </tdml:documentPart>
    </tdml:document>
    <tdml:infoset>
      <tdml:dfdlInfoset xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <outputTypeCalcNextSiblingInt_01>
          <logic>7</logic>
        </outputTypeCalcNextSiblingInt_01>
      </tdml:dfdlInfoset>
    </tdml:infoset>
  </tdml:unparserTestCase>

As I understand it, this should result in an error, since the infoset is missing the required <raw> element and is thus not XSD-valid against the schema (this was an oversight on my part when writing the test, and if that were what I was seeing I would simply fix it and move on).


However, that is not what I am observing. Instead, the unparser attempts to unparse it. The trace functionality reveals that it is constructing the infoset:

<ex:outputTypeCalcNextSiblingInt_01 xmlns:ex="http://example.com">
  <ex:logic>7</ex:logic>
  <ex:raw></ex:raw>
</ex:outputTypeCalcNextSiblingInt_01>

This is almost what I would expect if <raw> were a hidden element. However, note that the ordering <logic><raw> does not match the ordering <raw><logic> defined by the schema.

I have confirmed in the debugger that the contents array of outputTypeCalcNextSiblingInt_01 has <logic> as the first element and <raw> as the second.

Apart from my not understanding how the above infoset is accepted in the first place, the actual issue I am running into is the ordering in the internal infoset, because the implementation of dfdl:outputTypeCalcNextSiblingInt requires that the ordering information be preserved.

Thoughts?


Brandon T. Sloane

Associate, Services

bsloane@tresys.com | tresys.com

Re: Understanding outputValueCalc elements

Posted by "Sloane, Brandon" <bs...@tresys.com>.

I think I see the issue.


OVCStartEndStrategy.unparseBegin calls:


val eventMaybe = state.inspectMaybe


as part of determining whether it needs to consume any events (since non-hidden OVC elements are optional).


The issue is that the logic in inspect is a bit wonky. In particular, Cursor.scala has this logic for CursorImplMixin.inspect:


    accessor = inspectAccessor

    val res = priorOpKind match {
      case Advance => {
        priorOpKind = Inspect
        doAdvance(true)
      }
      case Inspect => true // inspect again does nothing.
      case Unsuccessful => return false
    }


Which is to say that "inspect" actually consumes and processes the input. It just makes a note that the "next" element has already been consumed so that future calls to inspect and advance know to behave differently.


As far as I can tell, the issue is that creating the DINode to return from inspect/advance has the side effect of inserting it into the infoset. The above logic works around that side effect by effectively caching the results, so you only ever actually construct each element once.


Ignoring side effects, this approach is fine. However, it means that the side effect is realized on the first call to inspect rather than on the call to advance, which is counterintuitive and is what causes the problem here.
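
To make the ordering problem concrete, here is a minimal sketch of the behaviour as I understand it. It is a toy model, not Daffodil's code: ToyCursor, the string "nodes", and InspectDemo are all hypothetical names, and the only thing borrowed from Cursor.scala is the idea of tracking the prior operation kind.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins; none of these names come from Daffodil.
sealed trait OpKind
case object Advance extends OpKind
case object Inspect extends OpKind

// A toy cursor over element names. Building a "node" appends it to `infoset`
// as a side effect, mimicking how creating a DINode inserts it into the infoset.
class ToyCursor(events: Iterator[String], infoset: ArrayBuffer[String]) {
  private var priorOpKind: OpKind = Advance
  private var cached: Option[String] = None

  private def doAdvance(): Unit =
    if (events.hasNext) {
      val node = events.next()
      infoset += node // the side effect happens here
      cached = Some(node)
    } else {
      cached = None
    }

  // Looks at the next element, but has to build it (and insert it) to do so.
  def inspect: Option[String] = {
    priorOpKind match {
      case Advance =>
        priorOpKind = Inspect
        doAdvance()
      case Inspect => () // inspecting again does nothing
    }
    cached
  }

  // Consumes the next element, reusing the cached one after an inspect.
  def advance: Option[String] = {
    priorOpKind match {
      case Inspect => priorOpKind = Advance
      case Advance => doAdvance()
    }
    cached
  }
}

object InspectDemo {
  def run(): List[String] = {
    val infoset = ArrayBuffer.empty[String]
    val cursor = new ToyCursor(Iterator("logic"), infoset)
    cursor.inspect   // look ahead: "logic" is inserted already
    infoset += "raw" // the missing OVC element is only created afterwards
    cursor.advance   // no further insertion, the cached node is reused
    infoset.toList
  }
}
```

In this model InspectDemo.run() yields the children in the order logic, raw, which is the same inversion the trace shows.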


I think the correct solution is to better document this behaviour and to create a side-effect-free peekERD() function that returns the ElementRuntimeData of the next element.
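
A rough sketch of what I have in mind, again as a toy model rather than a proposal for the actual signatures. ToyERD, PeekableCursor and PeekDemo are hypothetical; only the name peekERD and the idea that it returns metadata without building a node come from the suggestion above.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for ElementRuntimeData; only the name matters here.
final case class ToyERD(name: String)

class PeekableCursor(events: Seq[ToyERD], infoset: ArrayBuffer[String]) {
  private var pos = 0

  // Side-effect free: reports the metadata of the next element
  // without building a node or touching the infoset.
  def peekERD(): Option[ToyERD] = events.lift(pos)

  // Builds the node, inserts it into the infoset, and moves forward.
  def advance(): Option[ToyERD] = {
    val next = events.lift(pos)
    next.foreach { erd =>
      infoset += erd.name
      pos += 1
    }
    next
  }
}

object PeekDemo {
  def run(): List[String] = {
    val infoset = ArrayBuffer.empty[String]
    val cursor = new PeekableCursor(Seq(ToyERD("logic")), infoset)
    // The incoming events do not start with <raw>, so create it first.
    if (!cursor.peekERD().exists(_.name == "raw")) infoset += "raw"
    cursor.advance() // only now is "logic" inserted
    infoset.toList
  }
}
```

Because peeking no longer inserts anything, the missing OVC element can be created before its sibling is added, and the children end up in schema order (raw, then logic).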

________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Wednesday, April 10, 2019 12:57:09 PM
To: dev@daffodil.apache.org; Sloane, Brandon
Subject: Re: Understanding outputValueCalc elements

I'm not sure how the OVC element could appear after where it is supposed
to appear, but we do actually allow OVC elements to be missing from the
infoset. Whether or not this is correct might be up for debate.

The logic for creating OVC elements that are missing is in the
OVCStartEndStrategy trait in ElementUnparser.scala. I'd guess there is a
logic bug in there that's causing things to get added out of order.
Nothing obvious is jumping out at me though.

- Steve

Re: Understanding outputValueCalc elements

Posted by Steve Lawrence <sl...@apache.org>.

I'm not sure how the OVC element could appear after where it is supposed
to appear, but we do actually allow OVC elements to be missing from the
infoset. Whether or not this is correct might be up for debate.

The logic for creating OVC elements that are missing is in the
OVCStartEndStrategy trait in ElementUnparser.scala. I'd guess there is a
logic bug in there that's causing things to get added out of order.
Nothing obvious is jumping out at me though.

- Steve
