Posted to dev@daffodil.apache.org by "Sloane, Brandon" <bs...@tresys.com> on 2019/04/10 16:45:00 UTC
Understanding outputValueCalc elements
I am a bit confused by the behaviour of one of my test cases. I have the schema:
<xs:simpleType name="AbstractMulitiply2FromInt" dfdl:repType="xs:int" dfdl:outputTypeCalc="{ dfdl:logicalTypeValueInt() * 2 }">
  <xs:restriction base="xs:int"/>
</xs:simpleType>
<xs:element name="outputTypeCalcNextSiblingInt_01">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="raw" type="tns:uint8" dfdl:outputValueCalc="{ dfdl:outputTypeCalcNextSiblingInt() }"/>
      <xs:element name="logic" type="tns:AbstractMulitiply2FromInt" dfdl:length="1" dfdl:inputValueCalc="{ 0 }"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
With the test case:
<tdml:unparserTestCase name="outputTypeCalcNextSiblingInt_01"
    root="outputTypeCalcNextSiblingInt_01" model="inputTypeCalc-Embedded.dfdl.xsd" description="Extensions - inputTypeCalc keysetValue transform">
  <tdml:document>
    <tdml:documentPart type="byte">
      0E
    </tdml:documentPart>
  </tdml:document>
  <tdml:infoset>
    <tdml:dfdlInfoset xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <outputTypeCalcNextSiblingInt_01>
        <logic>7</logic>
      </outputTypeCalcNextSiblingInt_01>
    </tdml:dfdlInfoset>
  </tdml:infoset>
</tdml:unparserTestCase>
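For reference, the expected byte 0E in this test follows from the outputTypeCalc expression in the schema: the logical value 7 is multiplied by 2, giving 14, which is 0E as a single hex byte. The sketch below only reproduces that arithmetic; the object and function names are made up for illustration and are not part of Daffodil.

```scala
object OutputTypeCalcArithmetic {
  // Mirrors the schema's outputTypeCalc expression: logicalTypeValueInt() * 2.
  // (Hypothetical helper, not a Daffodil API.)
  def outputTypeCalc(logicalValue: Int): Int = logicalValue * 2

  // Formats a value as one unsigned byte in upper-case hex,
  // the way it is written in a TDML byte documentPart.
  def toHexByte(value: Int): String = {
    require(value >= 0 && value <= 255, s"$value does not fit in a uint8")
    f"$value%02X"
  }

  def main(args: Array[String]): Unit = {
    val raw = outputTypeCalc(7) // <logic>7</logic> from the test infoset
    println(toHexByte(raw))     // prints "0E"
  }
}
```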
As I understand it, this should result in an error, since the infoset is missing the required <raw> element and is thus not XSD-valid against the schema. (This was an oversight on my part when writing the test; if that were what I was seeing, I would simply fix it and move on.)
However, that is not what I am observing. Instead, the unparser attempts to unparse it. The trace functionality reveals that it is constructing the infoset:
<ex:outputTypeCalcNextSiblingInt_01 xmlns:ex="http://example.com">
  <ex:logic>7</ex:logic>
  <ex:raw></ex:raw>
</ex:outputTypeCalcNextSiblingInt_01>
This is almost what I would expect if <raw> were a hidden element. However, note that the ordering <logic><raw> does not match the ordering <raw><logic> defined by the schema.
I have confirmed in the debugger that the contents array of outputTypeCalcNextSiblingInt_01 has <logic> as the first element and <raw> as the second.
Apart from my not understanding how the above infoset is accepted in the first place, the actual issue I am running into is the ordering in the internal infoset: the implementation of dfdl:outputTypeCalcNextSiblingInt requires that the ordering information be preserved.
Thoughts?
Brandon T. Sloane
Associate, Services
bsloane@tresys.com | tresys.com
Re: Understanding outputValueCalc elements
Posted by "Sloane, Brandon" <bs...@tresys.com>.
I think I see the issue.
OVCStartEndStrategy.unparseBegin calls:
val eventMaybe = state.inspectMaybe
as part of determining whether it needs to consume any events (since non-hidden OVC elements are optional).
The issue is that the logic in inspect is a bit wonky. In particular, Cursor.scala has this logic for CursorImplMixin.inspect:
accessor = inspectAccessor
val res = priorOpKind match {
  case Advance => {
    priorOpKind = Inspect
    doAdvance(true)
  }
  case Inspect => true // inspect again does nothing.
  case Unsuccessful => return false
}
Which is to say that "inspect" actually consumes and processes the input. It just makes a note that the "next" element has already been consumed so that future calls to inspect and advance know to behave differently.
As far as I can tell, the issue is that creating the DINode to return from inspect/advance has the side effect of inserting it into the infoset. The above logic works around that side effect by effectively caching the results, so you only ever actually construct each element once.
Ignoring side effects, this approach is fine. However, it means that the side effect is realized on the first call to inspect rather than on the call to advance, which is counterintuitive and is what causes the problem here.
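To make the ordering problem concrete, here is a toy model of the behaviour described above. It is only a sketch: ToyCursor, Infoset, and the string-based events are invented for illustration, and the assumption that the OVC step creates the missing element right after inspecting is my reading of the description, not a copy of the Daffodil code.

```scala
import scala.collection.mutable.ArrayBuffer

object InspectSideEffectSketch {
  // Hypothetical stand-in for the parent element's contents array.
  final class Infoset { val contents = ArrayBuffer.empty[String] }

  sealed trait OpKind
  case object Advance extends OpKind
  case object Inspect extends OpKind
  case object Unsuccessful extends OpKind

  // Toy cursor: building a node from an event appends it to the infoset,
  // and inspect performs that work eagerly, caching the result for advance.
  final class ToyCursor(events: Iterator[String], infoset: Infoset) {
    private var priorOpKind: OpKind = Advance
    private var current: Option[String] = None

    private def doAdvance(): Boolean =
      if (events.hasNext) {
        val name = events.next()
        infoset.contents += name // side effect: inserted as soon as it is built
        current = Some(name)
        true
      } else {
        priorOpKind = Unsuccessful
        false
      }

    def inspect: Option[String] = priorOpKind match {
      case Advance =>
        priorOpKind = Inspect
        if (doAdvance()) current else None
      case Inspect => current // inspect again does nothing
      case Unsuccessful => None
    }

    def advance: Option[String] = priorOpKind match {
      case Inspect =>
        priorOpKind = Advance
        current // already built by the earlier inspect
      case Advance => if (doAdvance()) current else None
      case Unsuccessful => None
    }
  }

  // Schema order is (raw, logic), but the incoming events contain only "logic".
  def unparse(): List[String] = {
    val infoset = new Infoset
    val cursor = new ToyCursor(Iterator("logic"), infoset)
    val next = cursor.inspect // look ahead; this already appends "logic"
    if (!next.contains("raw")) infoset.contents += "raw" // create the missing OVC element
    cursor.advance // consume "logic"; nothing more is inserted
    infoset.contents.toList
  }

  def main(args: Array[String]): Unit =
    println(unparse().mkString(", ")) // prints "logic, raw"
}
```

In this model the look-ahead alone is enough to put logic ahead of raw, which matches the order seen in the debugger.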
I think the correct solution is to better document this behaviour and to create a side-effect-free peekERD() function that returns the ElementRuntimeData of the next element.
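A rough sketch of what such a peek could look like, under heavy assumptions: Erd is a made-up stand-in for ElementRuntimeData, PeekingCursor is hypothetical, and the one-element look-ahead is done with the standard library's BufferedIterator rather than anything in Cursor.scala. The point is only that the look-ahead reports metadata without building a node, so insertion happens when the event is actually consumed.

```scala
import scala.collection.mutable.ArrayBuffer

object PeekSketch {
  // Hypothetical stand-in for ElementRuntimeData: just the element name.
  final case class Erd(name: String)

  final class PeekingCursor(events: Iterator[Erd], contents: ArrayBuffer[String]) {
    private val buffered = events.buffered

    // Side-effect-free look-ahead: returns the next element's metadata
    // without constructing a node or touching the infoset.
    def peekERD(): Option[Erd] =
      if (buffered.hasNext) Some(buffered.head) else None

    // The node is built (here: the name is appended) only on consumption.
    def advance(): Option[Erd] =
      if (buffered.hasNext) {
        val erd = buffered.next()
        contents += erd.name
        Some(erd)
      } else None
  }

  // Same scenario as the test case: schema order (raw, logic), events contain only "logic".
  def unparse(): List[String] = {
    val contents = ArrayBuffer.empty[String]
    val cursor = new PeekingCursor(Iterator(Erd("logic")), contents)
    if (!cursor.peekERD().exists(_.name == "raw")) contents += "raw" // missing OVC element goes in first
    cursor.advance() // now "logic" is inserted, after "raw"
    contents.toList
  }

  def main(args: Array[String]): Unit =
    println(unparse().mkString(", ")) // prints "raw, logic"
}
```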
________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Wednesday, April 10, 2019 12:57:09 PM
To: dev@daffodil.apache.org; Sloane, Brandon
Subject: Re: Understanding outputValueCalc elements
I'm not sure how the OVC element could appear after where it is supposed
to appear, but we do actually allow OVC elements to be missing from the
infoset. Whether or not this is correct might be up for debate.
The logic for creating OVC elements that are missing is in the
OVCStartEndStrategy trait in ElementUnparser.scala. I'd guess there is a
logic bug in there that's causing things to get added out of order.
Nothing obvious is jumping out at me though.
- Steve