You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2020/12/09 14:31:57 UTC

[GitHub] [incubator-daffodil] stevedlawrence commented on a change in pull request #450: Improve newVariableInstace and defineVariable defaultValue expressions

stevedlawrence commented on a change in pull request #450:
URL: https://github.com/apache/incubator-daffodil/pull/450#discussion_r539280896



##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/util/MStack.scala
##########
@@ -182,7 +182,7 @@ object MStackOfAnyRef {
  * things.
  */
 protected abstract class MStack[@specialized T] private[util] (
-  arrayAllocator: (Int) => Array[T], nullValue: T) extends Serializable {
+  arrayAllocator: (Int) => Array[T], nullValue: T) {

Review comment:
       Is this intentional? Is there a problem with serializing an MStack?

##########
File path: daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/processors/unparsers/ExpressionEvaluatingUnparsers.scala
##########
@@ -87,7 +87,8 @@ class NewVariableInstanceStartUnparser(override val context: RuntimeData)
   override lazy val childProcessors = Vector()
 
   override def unparse(state: UState) = {
-    state.newVariableInstance(context.asInstanceOf[VariableRuntimeData])
+    val vrd = context.asInstanceOf[VariableRuntimeData]
+    state.newVariableInstance(vrd, vrd, state)

Review comment:
       It looks like in all cases where we call newVariableInstance, we pass in the same thing as the first and second paramter, and the same state as the third parameter. Can newVariableInstance just only accept one parameter and always use vrd when it needs VariableRuntimeData, and always use ``this`` when it needs state?

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala
##########
@@ -435,6 +435,11 @@ class DataProcessor private (
       state.notifyDebugging(true)
     }
     state.dataProc.get.init(ssrd.parser)
+
+    // Force the evaluation of any defineVariable's with non-constant default
+    // value expressions
+    state.variableMap.forceExpressionEvaluations(state)
+

Review comment:
       You mention there's an issue with the TDML runner not seeing this diagnostic ([DAFFODIL-2443](https://issues.apache.org/jira/browse/DAFFODIL-2443)). I think this change might be the issue.
   
   When you force the expression evaluate for defaults, that could fail and throw an SDE exception. There's nothing here to catch it, so it exception will bubble up to the TDML Runner. But the TDML Runner does (and shouldn't) handle SDE exceptions. Instead SDE exceptions are supposed to be caught by Daffodil and turned into a failure diagnostic. That catch happens in the doParse method (along with a bunch of other exceptions that might be thrown that implied some sort of error that should become a diagnostic).
   
   So I think the fix is to just move this forceExpressionEvaluations call into the try-block of the doParse method. That way if this function does throw an SDE, then it will be caught and correctly turned into a diagnostic. I think it can be moved to right before the startElement call and be funcionally equivalent to what you have now, just with the SDE handled correctly.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -188,24 +283,42 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    * VariableInstances are mutable and cannot safely be shared across threads
    */
   def copy(): VariableMap = {
-    val table = Map[GlobalQName, MStackOf[VariableInstance]]()
-    vTable.foreach { case (k: GlobalQName, s: MStackOf[VariableInstance]) => {
-      val newStack= new MStackOf[VariableInstance]()
+    val table = Map[GlobalQName, ArrayBuffer[VariableInstance]]()
+    vTable.foreach { case (k: GlobalQName, abuf: ArrayBuffer[VariableInstance]) => {
+      val newBuf = new ArrayBuffer[VariableInstance]()
 
       // toList provides a list in LIFO order, so we need to reverse it to
       // maintain order
-      s.toList.reverse.foreach { case v: VariableInstance => newStack.push(v.copy()) }
-      table(k) = newStack
+      abuf.foreach { case v: VariableInstance => { newBuf += v.copy() }}

Review comment:
       The comment above about toList seems to no longer apply. I think it can be removed. Also, I think the foreach's can be replaced with maps. Should make things ab it cleaner, e.g.:
   ```scala
   val table = vTable.map { case (k: GlobalQName, abuf: ArrayBuffer[VariableInstance]) =>
     val newBuf = abuf.map { _.copy() }
     (k, newBuf)
   }
   new VariableMap(table)
   ```

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -188,24 +283,42 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    * VariableInstances are mutable and cannot safely be shared across threads
    */
   def copy(): VariableMap = {
-    val table = Map[GlobalQName, MStackOf[VariableInstance]]()
-    vTable.foreach { case (k: GlobalQName, s: MStackOf[VariableInstance]) => {
-      val newStack= new MStackOf[VariableInstance]()
+    val table = Map[GlobalQName, ArrayBuffer[VariableInstance]]()
+    vTable.foreach { case (k: GlobalQName, abuf: ArrayBuffer[VariableInstance]) => {
+      val newBuf = new ArrayBuffer[VariableInstance]()
 
       // toList provides a list in LIFO order, so we need to reverse it to
       // maintain order
-      s.toList.reverse.foreach { case v: VariableInstance => newStack.push(v.copy()) }
-      table(k) = newStack
+      abuf.foreach { case v: VariableInstance => { newBuf += v.copy() }}
+      table(k) = newBuf
     }}
 
     new VariableMap(table)
   }
 
+  // For defineVariable's with non-constant expressions for default values, it
+  // is necessary to force the evaluation of the expressions after the
+  // VariableMap has been created and initialized, but before parsing begins. We
+  // must also ensure that the expressions only reference other variables that
+  // have default value expressions or are defined externally.
+  def forceExpressionEvaluations(state: ParseOrUnparseState): Unit = {
+    vTable.map { case (_, abuf) => { abuf.foreach { inst => {

Review comment:
       This map can be a foreach. No need to allocate a new Map since we're mutate the abuf's inside it.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -232,19 +345,33 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    */
   def readVariable(vrd: VariableRuntimeData, referringContext: ThrowsSDE, maybePstate: Maybe[ParseOrUnparseState]): DataValuePrimitive = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    if (stack.isDefined) {
-      val variable = stack.get.top
-      variable.state match {
-        case VariableRead if (variable.value.isDefined) => variable.value.getNonNullable
-        case VariableDefined | VariableSet if (variable.value.isDefined) => {
+    val abuf = vTable.get(varQName)
+    if (abuf.isDefined) {
+      val variable = abuf.get.last
+      (variable.state, variable.value.isDefined, variable.defaultValueExpr.isDefined) match {
+        case (VariableRead, true, _) => variable.value.getNonNullable
+        case (VariableDefined | VariableSet, true, _) => {
           if (maybePstate.isDefined && maybePstate.get.isInstanceOf[PState])
             maybePstate.get.asInstanceOf[PState].markVariableRead(vrd)
 
           variable.setState(VariableRead)
           variable.value.getNonNullable
         }
-        case _ => throw new VariableHasNoValue(varQName, vrd)
+        case (VariableDefined, false, true) => {
+          Assert.invariant(maybePstate.isDefined)
+          variable.setState(VariableInProcess)
+          val pstate = maybePstate.get
+          val res = DataValue.unsafeFromAnyRef(variable.defaultValueExpr.get.evaluate(pstate))
+
+          // Need to update the variable's value with the result of the
+          // expression
+          variable.setState(VariableRead)
+          variable.setValue(res)
+
+          res

Review comment:
       Is this correct? This is evaluating the defaultValue and setting the state when the variable is read. But the spec says this about newVariableInstance:
   
   > If a default value is specified this initial value of the instance will be created when the instance is created
   
   So it seems this evaluation should not happen during read, but when the instance of the variable is created. And then that default could be overridden with a setVariableCal.
   
   Looks like you do evaluate the default expression below in newVariableInstance, so maybe this just needs to use what that sets?

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -286,23 +413,28 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
   /**
    * Creates a new instance of a variable
    */
-  def newVariableInstance(vrd: VariableRuntimeData) = {
+  def newVariableInstance(vrd: VariableRuntimeData, referringContext: ThrowsSDE, state: ParseOrUnparseState) = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    Assert.invariant(stack.isDefined)
-    stack.get.push(vrd.createVariableInstance)
+    val abuf = vTable.get(varQName)
+    Assert.invariant(abuf.isDefined)
+
+    if (vrd.maybeDefaultValueExpr.isDefined) {
+      val defaultValue = DataValue.unsafeFromAnyRef(vrd.maybeDefaultValueExpr.get.evaluate(state))
+      abuf.get += vrd.createVariableInstance(VariableUtils.convert(defaultValue.getAnyRef.toString, vrd, referringContext))

Review comment:
       This ``VariableUtils.convert`` seems incorrect to me. I would expect that evaluating the defaultValue expression, or any setVariable expression as well, would know the target type and the expression evaluation would handle the conversion, just like it does with all other expression evaluations. Looks like this isn't something you added, so probably out of scope for this change. But we should create a bug to fix this.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -232,19 +345,33 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    */
   def readVariable(vrd: VariableRuntimeData, referringContext: ThrowsSDE, maybePstate: Maybe[ParseOrUnparseState]): DataValuePrimitive = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    if (stack.isDefined) {
-      val variable = stack.get.top
-      variable.state match {
-        case VariableRead if (variable.value.isDefined) => variable.value.getNonNullable
-        case VariableDefined | VariableSet if (variable.value.isDefined) => {
+    val abuf = vTable.get(varQName)
+    if (abuf.isDefined) {
+      val variable = abuf.get.last
+      (variable.state, variable.value.isDefined, variable.defaultValueExpr.isDefined) match {
+        case (VariableRead, true, _) => variable.value.getNonNullable
+        case (VariableDefined | VariableSet, true, _) => {
           if (maybePstate.isDefined && maybePstate.get.isInstanceOf[PState])
             maybePstate.get.asInstanceOf[PState].markVariableRead(vrd)
 
           variable.setState(VariableRead)
           variable.value.getNonNullable
         }
-        case _ => throw new VariableHasNoValue(varQName, vrd)
+        case (VariableDefined, false, true) => {
+          Assert.invariant(maybePstate.isDefined)
+          variable.setState(VariableInProcess)
+          val pstate = maybePstate.get
+          val res = DataValue.unsafeFromAnyRef(variable.defaultValueExpr.get.evaluate(pstate))
+
+          // Need to update the variable's value with the result of the
+          // expression
+          variable.setState(VariableRead)
+          variable.setValue(res)
+
+          res
+        }
+        case (VariableInProcess, _, _) => throw new VariableCircularDefinition(varQName, vrd)
+        case (_, _, _) => throw new VariableHasNoValue(varQName, vrd)

Review comment:
       Should the test we have add coverage for this, but it's currently failing due to DAFFODIL-2444? This seems like a really important path to cover.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -166,17 +257,21 @@ final class VariableBox(initialVMap: VariableMap) {
  *
  * The DPath implementation must be made to implement the
  * no-set-after-default-value-has-been-read behavior. This requires that reading the variables causes a state transition.
+ *
+ * The implementaiton of the VariabeMap uses ArrayBuffers essentially as a

Review comment:
       Typo: implementation

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -232,19 +345,33 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    */
   def readVariable(vrd: VariableRuntimeData, referringContext: ThrowsSDE, maybePstate: Maybe[ParseOrUnparseState]): DataValuePrimitive = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    if (stack.isDefined) {
-      val variable = stack.get.top
-      variable.state match {
-        case VariableRead if (variable.value.isDefined) => variable.value.getNonNullable
-        case VariableDefined | VariableSet if (variable.value.isDefined) => {
+    val abuf = vTable.get(varQName)
+    if (abuf.isDefined) {
+      val variable = abuf.get.last
+      (variable.state, variable.value.isDefined, variable.defaultValueExpr.isDefined) match {
+        case (VariableRead, true, _) => variable.value.getNonNullable
+        case (VariableDefined | VariableSet, true, _) => {
           if (maybePstate.isDefined && maybePstate.get.isInstanceOf[PState])

Review comment:
       While we're renaming variables, can we change maybePstate to just maybeState, since it seems it can be either parse or unparse? I thought this was specifically for parse, and was confused about what happens during unparse, but that doesn't seem to be the case. Or should this always be PState, and maybe the type of maybePState should be Maybe[PState] in which case you only need th isDefined check?

##########
File path: daffodil-test/src/test/resources/org/apache/daffodil/section07/variables/variables.tdml
##########
@@ -1884,4 +1976,63 @@
     </tdml:infoset>
   </tdml:parserTestCase>
 
+  <tdml:defineSchema name="circular_defineVariable">
+    <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"/>
+    <dfdl:format ref="ex:GeneralFormat" />
+
+    <dfdl:defineVariable name="circular1" type="xs:int"
+      defaultValue="{ $ex:circular2 }" />
+    <dfdl:defineVariable name="circular2" type="xs:int"
+      defaultValue="{ $ex:circular1 }" />
+    <xs:element name="root" type="xs:int" dfdl:inputValueCalc="{ $ex:circular1 }" />
+
+  </tdml:defineSchema>
+
+  <tdml:parserTestCase name="circular_defineVariable_err" root="root"
+    model="circular_defineVariable" description="Section 7 - ">
+    <tdml:document/>
+    <tdml:errors>
+      <tdml:error>Runtime Schema Definition Error</tdml:error>
+      <tdml:error>part of circular definition</tdml:error>
+    </tdml:errors>
+  </tdml:parserTestCase>
+
+  <tdml:defineSchema name="defineVariable_ref_infoset">
+    <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"/>
+    <dfdl:format ref="ex:GeneralFormat" />
+
+    <dfdl:defineVariable name="badRef" type="xs:int"
+      defaultValue="{ ex:root }" />
+    <xs:element name="root" type="xs:int" dfdl:inputValueCalc="{ 42 }" />
+
+  </tdml:defineSchema>
+
+  <tdml:parserTestCase name="defineVariable_ref_infoset_err" root="root"
+    model="defineVariable_ref_infoset" description="Section 7 - ">
+    <tdml:document/>
+    <tdml:errors>
+      <tdml:error>????</tdml:error>
+    </tdml:errors>
+  </tdml:parserTestCase>
+
+  <tdml:defineSchema name="defineVariable_ref_noDefault">
+    <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"/>
+    <dfdl:format ref="ex:GeneralFormat" />
+
+    <dfdl:defineVariable name="noDefault" type="xs:int" />
+    <dfdl:defineVariable name="badRef" type="xs:int"
+      defaultValue="{ $ex:noDefault }" />
+    <xs:element name="root" type="xs:int" dfdl:inputValueCalc="{ $ex:badRef }" />
+
+  </tdml:defineSchema>
+
+  <tdml:parserTestCase name="defineVariable_ref_noDefault_err" root="root"
+    model="defineVariable_ref_noDefault" description="Section 7 - ">
+    <tdml:document/>
+    <tdml:errors>
+      <tdml:error>Runtime Schema Definition Error</tdml:error>
+      <tdml:error>has no value. It was not set</tdml:error>
+    </tdml:errors>
+  </tdml:parserTestCase>
+

Review comment:
       An intersting test case that I'm not sure is covered but would be worth having: What if you have two variables where they both have defaults, and one refers to the other, like what you have above:
   ```xml
       <dfdl:defineVariable name="var1" type="xs:int" defaultValue="5" />
       <dfdl:defineVariable name="var2" type="xs:int" defaultValue="{ $var1 + 1 }" />
   ```
   Now what if during parse we have a dfdl:setVariable that sets the value of both of these variables? The spec says this:
   > If a defaultValue expression references another variable then the single-assignment nature of variables prevents the referenced variable's value from ever changing, that is, it is considered to be a read of the variable's value, and once read, a variable's value cannot be changed.
   
   So we shoudl be able to setVariable var2, but not setVariable var1. I think having a test that checks for this would be useful.
   
   Another test I think would be useful, if it doesn't already exist, would be to allow var1 and var2 to be externally set, and make sure that those values override the default values. With the change with the forceEvaluations and delay evaluates of things, I'm not 100% confident that works as expected.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -232,19 +345,33 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    */
   def readVariable(vrd: VariableRuntimeData, referringContext: ThrowsSDE, maybePstate: Maybe[ParseOrUnparseState]): DataValuePrimitive = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    if (stack.isDefined) {
-      val variable = stack.get.top
-      variable.state match {
-        case VariableRead if (variable.value.isDefined) => variable.value.getNonNullable
-        case VariableDefined | VariableSet if (variable.value.isDefined) => {
+    val abuf = vTable.get(varQName)
+    if (abuf.isDefined) {
+      val variable = abuf.get.last
+      (variable.state, variable.value.isDefined, variable.defaultValueExpr.isDefined) match {
+        case (VariableRead, true, _) => variable.value.getNonNullable
+        case (VariableDefined | VariableSet, true, _) => {
           if (maybePstate.isDefined && maybePstate.get.isInstanceOf[PState])
             maybePstate.get.asInstanceOf[PState].markVariableRead(vrd)
 
           variable.setState(VariableRead)
           variable.value.getNonNullable
         }
-        case _ => throw new VariableHasNoValue(varQName, vrd)
+        case (VariableDefined, false, true) => {
+          Assert.invariant(maybePstate.isDefined)
+          variable.setState(VariableInProcess)
+          val pstate = maybePstate.get
+          val res = DataValue.unsafeFromAnyRef(variable.defaultValueExpr.get.evaluate(pstate))

Review comment:
       I guess this means maybePState really could be a UState or a PState?

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/VariableMap1.scala
##########
@@ -232,19 +345,33 @@ class VariableMap private(vTable: Map[GlobalQName, MStackOf[VariableInstance]])
    */
   def readVariable(vrd: VariableRuntimeData, referringContext: ThrowsSDE, maybePstate: Maybe[ParseOrUnparseState]): DataValuePrimitive = {
     val varQName = vrd.globalQName
-    val stack = vTable.get(varQName)
-    if (stack.isDefined) {
-      val variable = stack.get.top
-      variable.state match {
-        case VariableRead if (variable.value.isDefined) => variable.value.getNonNullable
-        case VariableDefined | VariableSet if (variable.value.isDefined) => {
+    val abuf = vTable.get(varQName)
+    if (abuf.isDefined) {
+      val variable = abuf.get.last
+      (variable.state, variable.value.isDefined, variable.defaultValueExpr.isDefined) match {

Review comment:
       This is maybe personal preference, but I prefer the previous code where we had ``case Foo if ...``. It's easier to see what a particular case is checking for when you can see the condition, rather than just seeing true/false and having to refer back to the match statement. There also might be some overhead with having to allocate a tuple (not sure if Scala will do that behind the scenes).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org