You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2021/07/02 18:49:04 UTC

[GitHub] [daffodil] mbeckerle commented on a change in pull request #600: Remove by-name and transient params for runtime data objects

mbeckerle commented on a change in pull request #600:
URL: https://github.com/apache/daffodil/pull/600#discussion_r662348844



##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/dsom/CompiledExpression1.scala
##########
@@ -201,26 +189,16 @@ final case class ConstantExpression[+T <: AnyRef](
  * into "passes".
  */
 class DPathCompileInfo(
+  // parentsArg is only a transient because of the issue below about
+  // serialization ordering. There is no delayed/lazy/by-name evaluation
+  // involved.
   @TransientParam parentsArg: Seq[DPathCompileInfo],

Review comment:
       The Arg suffix just denotes that the argument gets special treatment and isn't what you should access. 
   
   I use Arg suffix when an argument is just being passed to a base class constructor, which itself defines the same parameter as a val, so that you only have one declaration of that name as a val. Derived classes the parameter is not declared val, and has Arg suffix to avoid ambiguity. 

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/dpath/NodeInfo.scala
##########
@@ -61,16 +63,26 @@ import org.apache.daffodil.xml.XMLUtils
  * can answer questions about how two types are related. It can find the
  * least general supertype, or most general subtype of two types.
  */
-sealed abstract class TypeNode(parentsArg: => Seq[NodeInfo.Kind], childrenArg: => Seq[NodeInfo.Kind])
-  extends Serializable
+sealed abstract class TypeNode private (
+  val parents:  Seq[NodeInfo.Kind],
+  @TransientParam childrenArg: Delay[Seq[NodeInfo.Kind]])
+  extends PreSerialization
   with NodeInfo.Kind {
 
-  def this(parentArg: NodeInfo.Kind, childrenArg: => Seq[NodeInfo.Kind]) = this(Seq(parentArg), childrenArg)
-  def this(parentArg: NodeInfo.Kind) = this(Seq(parentArg), Seq(NodeInfo.Nothing))
+  def this(parents: Seq[NodeInfo.Kind], children: => Seq[NodeInfo.Kind]) = this(parents, Delay(children))
+  def this(parentArg: NodeInfo.Kind, childrenArg: => Seq[NodeInfo.Kind]) = this(Seq(parentArg), Delay(childrenArg))
+  def this(parentArg: NodeInfo.Kind) = this(Seq(parentArg), Delay(Seq[NodeInfo.Kind](NodeInfo.Nothing)))
 
   // Eliminated a var here. Doing functional graph construction now below.
-  final override lazy val parents = parentsArg
-  final override lazy val children = childrenArg
+  final override lazy val children = childrenArg.value
+
+  override def preSerialization = {
+    super.preSerialization
+    children

Review comment:
       Ah, nevermind that. The Import/Includes system creates many Delay objects that are not forced - once it detects that it already has the target namespace for the file, or that it has already imported the file. E.g., if 35 files all import a common file, we only read it in once, but all 35 have an import statment which creates an Import object, which has a Delay object associated with the _potential need_ to read in that file. 

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/dpath/NodeInfo.scala
##########
@@ -61,16 +63,26 @@ import org.apache.daffodil.xml.XMLUtils
  * can answer questions about how two types are related. It can find the
  * least general supertype, or most general subtype of two types.
  */
-sealed abstract class TypeNode(parentsArg: => Seq[NodeInfo.Kind], childrenArg: => Seq[NodeInfo.Kind])
-  extends Serializable
+sealed abstract class TypeNode private (
+  val parents:  Seq[NodeInfo.Kind],
+  @TransientParam childrenArg: Delay[Seq[NodeInfo.Kind]])
+  extends PreSerialization
   with NodeInfo.Kind {
 
-  def this(parentArg: NodeInfo.Kind, childrenArg: => Seq[NodeInfo.Kind]) = this(Seq(parentArg), childrenArg)
-  def this(parentArg: NodeInfo.Kind) = this(Seq(parentArg), Seq(NodeInfo.Nothing))
+  def this(parents: Seq[NodeInfo.Kind], children: => Seq[NodeInfo.Kind]) = this(parents, Delay(children))
+  def this(parentArg: NodeInfo.Kind, childrenArg: => Seq[NodeInfo.Kind]) = this(Seq(parentArg), Delay(childrenArg))
+  def this(parentArg: NodeInfo.Kind) = this(Seq(parentArg), Delay(Seq[NodeInfo.Kind](NodeInfo.Nothing)))
 
   // Eliminated a var here. Doing functional graph construction now below.
-  final override lazy val parents = parentsArg
-  final override lazy val children = childrenArg
+  final override lazy val children = childrenArg.value
+
+  override def preSerialization = {
+    super.preSerialization
+    children

Review comment:
       I am working on a small example so I can chase back using jvisualVM to find exactly what is holding on to everything from compilation. 
   
   I can see for a large schema that even just using the API (no TDML runner involved), even after everything has been compiled, and the application is *only* holding onto the DataProcessor object, yet still there are tons of compiler-oriented structures still present.

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/dsom/CompiledExpression1.scala
##########
@@ -201,26 +189,16 @@ final case class ConstantExpression[+T <: AnyRef](
  * into "passes".
  */
 class DPathCompileInfo(
+  // parentsArg is only a transient because of the issue below about
+  // serialization ordering. There is no delayed/lazy/by-name evaluation
+  // involved.
   @TransientParam parentsArg: Seq[DPathCompileInfo],
-  @TransientParam variableMapArg: => VariableMap,
+  val variableMap: VariableMap,
   val namespaces: scala.xml.NamespaceBinding,
   val path: String,
   override val schemaFileLocation: SchemaFileLocation,
   val unqualifiedPathStepPolicy: UnqualifiedPathStepPolicy,
-  typeCalcMapArg: TypeCalcMap,
-  //
-  // lexicalContextRuntimeData is used to get the partialNextElementResolver which is used
-  // to get the next sibling info to support the outputTypeCalcNextSibling function.
-  //
-  // TODO: DAFFODIL-2326 do not pass runtime data as argment to DPathCompileInfo. The point
-  // is for DPathCompileInfo to NOT have all of the runtime data for the term, but just what
-  // is needed to compile DPath. If functions need info about next sibling, then we should
-  // compute what it needs and pass that here (e.g, sequence of possible next sibling DPathCompileInfos,
-  // or if this applies only to elements, then it should be on that object.
-  //
-  // We should not hook in the whole runtime data object here. This should be runtime independent code.
-  //
-  val lexicalContextRuntimeData: RuntimeData)
+  typeCalcMapArg: TypeCalcMap)

Review comment:
       I think so. There's a comment where this is used in defining a typeCalcMap member in terms of typeCalcMapArg. 

##########
File path: daffodil-core/src/main/scala/org/apache/daffodil/runtime1/ElementBaseRuntime1Mixin.scala
##########
@@ -163,7 +164,7 @@ trait ElementBaseRuntime1Mixin { self: ElementBase =>
       position,
       childrenERDs,
       schemaSet.variableMap,
-      partialNextElementResolver,
+      Delay(partialNextElementResolver),

Review comment:
       Ah. I was incorrect. It's not in both places. 

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/dsom/CompiledExpression1.scala
##########
@@ -341,9 +311,18 @@ class DPathCompileInfo(
  * structures are created which reference these.
  */
 class DPathElementCompileInfo(
+  // parentsArg is a transient due to serialization order issues,
+  // there is no delay/lazy/by-name involvement here.
   @TransientParam parentsArg: Seq[DPathElementCompileInfo],

Review comment:
       Well, this one is debatable because later the val parents is defined as just
   ```
   @transient val parents = parentsArg
   ```
   But there's a bunch of comments about why and why there's special serialize/deserialize for this. 

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/RuntimeData.scala
##########
@@ -862,88 +786,68 @@ sealed abstract class ModelGroupRuntimeData(
   final override def isArray = false
   override def namespaces: NamespaceBinding = ci.namespaces
 
-  lazy val variableMap = variableMapArg
-  lazy val encInfo = encInfoArg
-  lazy val ci = ciArg
-  lazy val groupMembers = groupMembersArg
-
-  override def preSerialization: Unit = {
-    super.preSerialization
-    variableMap
-    encInfo
-    ci
-    groupMembers
-  }
   @throws(classOf[java.io.IOException])

Review comment:
       Ok. I'm not going to try to take these out, unless there are no @transient or @TransientParam annotations at all on an object, in which case it can be just plain Serializable. (There is one case of this. With latest changes (next commits ) EncodingRuntimeData no longer needs to mixin PreSerialization at all.)

##########
File path: daffodil-core/src/main/scala/org/apache/daffodil/runtime1/SequenceTermRuntime1Mixin.scala
##########
@@ -30,7 +30,7 @@ trait SequenceTermRuntime1Mixin { self: SequenceTermBase =>
   lazy val sequenceRuntimeData = {
     new SequenceRuntimeData(
       position,
-      partialNextElementResolver,
+      Delay(partialNextElementResolver),

Review comment:
       It is used.  A stack of these is maintained by the runtime. What ERD an incoming unparse-time element corresponds to is context dependent, and model-groups change this, not just elements. 
   
   Note that we may want to figure out some language restriction for Runtime2 so that this is far less tricky. We don't want this sort of lookup in Runtime2 given a C structure that we're trying to unparse, we'd like that C-structure to unambiguously specify exactly which ERD the element corresponds to. In the regular Daffodil Infoset and Runtime1, a DFDL Infoset is created (somehow, perhaps programmatically, perhaps by parsing XML or JSON, perhaps by a NiFi adapter), but a near identically named child with same name and namespace can correspond to a different ERD depending on what is on the stack. 

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/EncodingRuntimeData.scala
##########
@@ -121,9 +113,9 @@ trait KnownEncodingMixin { self: ThrowsSDE =>
  */
 
 final class EncodingRuntimeData(
-  @TransientParam charsetEvArg: => CharsetEv,
+  val charsetEv: CharsetEv,
   override val schemaFileLocation: SchemaFileLocation,
-  optionUTF16WidthArg: Option[UTF16Width],
+  @TransientParam optionUTF16WidthArg: Option[UTF16Width],

Review comment:
       Good question. Serialization only affects val members. So I'm not sure why we need @TransientParam for an "-Arg" that's not a val at all.
   @stevedlawrence  can you comment on this? Do we need @TransientParam on non vals at all?

##########
File path: daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/RuntimeData.scala
##########
@@ -968,7 +872,7 @@ final class VariableRuntimeData(
     unqualifiedPathStepPolicyArg)
   with Serializable {
 
-  lazy val maybeDefaultValueExpr = maybeDefaultValueExprArg
+  lazy val maybeDefaultValueExpr = maybeDefaultValueExprArg.value

Review comment:
       I like this suggestion. Suffix "-Delay" makes a lot of sense, as marking where cyclic structures are being created. 
   I have to see how many places use Delay types to decide whether to do this next, or wait for a subsequent commit. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org