You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2022/12/06 18:09:52 UTC

[GitHub] [daffodil] tuxji opened a new pull request, #886: Update scaladocs mentioning deprecated API (also fix CI)

tuxji opened a new pull request, #886:
URL: https://github.com/apache/daffodil/pull/886

   Ensure nothing mentions the deprecated API removed in previous commits for DAFFODIL-2743.  Also fix some IDEA nits and bump actions/setup-java from 3.7.0 to 3.8.0 because GitHub has [removed](https://github.com/actions/setup-java/issues/422) 3.7.0.
   
   main.yml: Bump actions/setup-java from 3.7.0 to 3.8.0 because GitHub has removed 3.7.0 (which broke my fork's CI for this PR).
   
   SchemaSet.scala: Simplify scaladoc to stop mentioning deprecated API. Fix some IDEA nits in the same code block with IDEA quick fixes.
   
   DataProcessor.scala: Rewrite scaladoc mentioning compilerExternalVars to say this constructor is for reconstructing a DataProcessor from a SerializedDataProcessor. Remove another scaladoc since it's no longer relevant.
   
   DAFFODIL-2743


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
tuxji commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1043766047


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   I ended up omitting variableMap because variableMap already has a default value (variableMap.copy()) which is copied if not passed explicitly.  
   
   Mike @mbeckerle , will you please review all the changes so far?  I want to get this PR done and out of the way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
tuxji commented on PR #886:
URL: https://github.com/apache/daffodil/pull/886#issuecomment-1341385983

   Rebased and updated pull request, now needs a second reviewer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #886: Update scaladocs mentioning deprecated API (also fix CI)

Posted by GitBox <gi...@apache.org>.
tuxji commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042394559


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -154,27 +154,23 @@ class DataProcessor private (
    * That means when we save for reloading, we must explicitly clobber validationMode and externalVars to
    * initialized values.
    *
-   * @throws java.io.ObjectStreamException
+   * @throws java.io.ObjectStreamException Must be part of writeReplace's API
    * @return the serializable object
    */
   @throws(classOf[java.io.ObjectStreamException])
   private def writeReplace() : Object =
     new SerializableDataProcessor(ssrd, tunables, externalVars, validationMode)
 
   /**
-   * The compilerExternalVars argument supports the deprecated feature to assign external var bindings
-   * on the compiler object.
-   *
-   * These are just incorporated into the initial variable map of the data processor.
+   * This constructor reconstructs a DataProcessor from a SerializedDataProcessor.
    */
-
   def this(
     ssrd: SchemaSetRuntimeData,
     tunables:DaffodilTunables,
-    compilerExternalVars: Queue[Binding] = Queue.empty,
+    externalVars: Queue[Binding] = Queue.empty,
     validationMode: ValidationMode.Type = ValidationMode.Off) =
-    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(compilerExternalVars, ssrd, ssrd.originalVariables),
-      false, None, validationMode, compilerExternalVars)
+    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(externalVars, ssrd, ssrd.originalVariables),
+      false, None, validationMode, externalVars)

Review Comment:
   I had initially thought SchemaSetRuntime1Mixin.scala was the only caller of this constructor too and removed the externalVars parameter from this constructor, but I got a compilation error which showed that there was a second call of this constructor from SerializableDataProcessor's constructor in DataProcessor.scala at line 107 that I didn't know about.  SerializableDataProcessor extends DataProcessor(data, tunable, externalVars, validationModeArg) in order to preserve the externalVars and validationMode in case they may be needed by other serializations besides Daffodil save/reload such as Apache Spark which serializes in order to move objects for remote execution.  
   
   We can't remove the externalVars field from DataProcessor because we inform DataProcessor about external variable bindings in that field.  DataProcessor's primary constructor is private (probably because of the mix of parameters some of which should be serialized all the time, some which should be serialized only sometimes, and some which should never be serialized).  DataProcessor has both a class and an object, but the object DataProcessor has no apply method, only a private class SerializableDataProcessor.  Therefore, this particular (public) constructor is the only way which callers outside DataProcessor.scala can create a DataProcessor.  Our options are to keep this public constructor or add an apply method to the object DataProcessor which calls the private constructor.
   
   At the very least, I need to fix this constructor's scaladoc.  I had changed the scaladoc to say it was needed by SerializableDataProcessor, but I misstated the reason.  I can make the scaladoc state that this constructor exists to construct a SerializableDataProcessor, not the other way around.  Or I can remove this constructor and add an apply method to object DataProcessor, which would put that apply method next to the class SerializableDataProcessor and make the relationship between them more obvious.  I would also add a second apply method with only 2 parameters and its own scaladoc to let external callers like SchemaSetRuntime1Mixin.scala create a DataProcessor.
   
   Thoughts?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042784059


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -84,27 +83,25 @@ class InvalidUsageException(msg: String, cause: Throwable = null) extends Except
 object DataProcessor {
 
   /**
-   * This is the DataProcessor constructed from a saved processor.
+   * This is the SerializableDataProcessor constructed from a saved processor.
    *
    * It enables us to implement restrictions on what you can/cannot do with a reloaded
    * processor versus an original one.
    *
-   * When we create one of these, it will have default values for everything
-   * settable like debuggers, debug mode.
+   * When we reload a processor, it will have default values for everything settable
+   * like validation mode, debug mode, and debugger.
    *
-   * Note that this does preserve the externalVars and validationMode. That is because
-   * those may be needed by serializations other than our own save/reload (e.g., Apache Spark which
-   * serializes to move things for remote execution).
+   * Note that this class does preserve validationMode. That is because validationMode
+   * may be needed by serializations other than our own save/reload (e.g., Apache Spark
+   * which serializes to move objects for remote execution).
    *
-   * Hence, we're depending on the save method to explicitly reset validationMode and
-   * externalVars to initial values.
+   * Hence, we're depending on the save method to explicitly turn off validationMode.
    */
   private class SerializableDataProcessor(
     val data: SchemaSetRuntimeData,
     tunable: DaffodilTunables,
-    externalVars: Queue[Binding], // must be explicitly set to empty by save method
-    validationModeArg: ValidationMode.Type) // must be explicitly set from Full to Limited by save method.
-    extends DataProcessor(data, tunable, externalVars, validationModeArg) {
+    validationModeArg: ValidationMode.Type) // must be explicitly turned off by save method
+    extends DataProcessor(data, tunable, data.originalVariables, validationModeArg) {

Review Comment:
   >  Are you suggesting that SerializableDataProcessor should have 4 (instead of 3) arguments with VariableMap becoming the 3rd parameter and validationModeArg becoming the 4th parameter? All so that our writeReplace method can pass the DataProcessor's own VariableMap to the SerializableDataProcessor constructor instead of ssrd's (data's) VariableMap getting passed?
   
   Yep. This way a SerializableDataProcessor is exactly the same as the associated DataProcessor, with the one difference being Full validation is not allowed. This is needed for things like Spark, which serializes objects and then sends them to different threads/machines to do the processing.
   
   But if we use `save()` instead of built-in serialization, then we don't actually want all that state saved since it's a bit more confusing, so the save function resets that state before doing any serialization.
   
   Would definitely like to hear @mbeckerle's view though. I think he implemented that logic?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji merged pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
tuxji merged PR #886:
URL: https://github.com/apache/daffodil/pull/886


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on PR #886:
URL: https://github.com/apache/daffodil/pull/886#issuecomment-1341772598

   When you restore a saved compiled schema, you should get the initial values of any DFDL external variables, regardless of whether they had any values set before the schema was pre-compiled. 
   
   Saving the parser does NOT lock-down/freeze or even utilize any variable values.  Saving the parser is supposed to be transparent, i.e., only faster, than using the uncompiled schema. 
   
   For example, expressions that refer to variables, even if the variable has been externally set, cannot be constant-folded by the schema compiler to remove the variable reference. Rather, any attempt to reference a variable at schema compile time causes the expression folding to stop. 
   
   There are, however, some tunables you can set which only affect schema compilation. Those are only used by the schema compiler so clearly one cannot subsequently modify them after restoring a saved processor. (You can, but it won't do anything because the compilation is already over. )  Example of this is a tunable to suppress a particular compile-time warning, or to specify a particular interpretation  (UnqualifiedPathStepPolicy is an example.) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1043812166


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   From Mikes other comment:
   
   > Saving the parser does NOT lock-down/freeze or even utilize any variable values. Saving the parser is supposed to be transparent, i.e., only faster, than using the uncompiled schema.
   
   So it seems like we do want `variableMap = ssrd.originalVariables` so that calling `save()` essentially erases any calls to `withVariableMap`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API (also fix CI)

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042429478


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -154,27 +154,23 @@ class DataProcessor private (
    * That means when we save for reloading, we must explicitly clobber validationMode and externalVars to
    * initialized values.
    *
-   * @throws java.io.ObjectStreamException
+   * @throws java.io.ObjectStreamException Must be part of writeReplace's API
    * @return the serializable object
    */
   @throws(classOf[java.io.ObjectStreamException])
   private def writeReplace() : Object =
     new SerializableDataProcessor(ssrd, tunables, externalVars, validationMode)
 
   /**
-   * The compilerExternalVars argument supports the deprecated feature to assign external var bindings
-   * on the compiler object.
-   *
-   * These are just incorporated into the initial variable map of the data processor.
+   * This constructor reconstructs a DataProcessor from a SerializedDataProcessor.
    */
-
   def this(
     ssrd: SchemaSetRuntimeData,
     tunables:DaffodilTunables,
-    compilerExternalVars: Queue[Binding] = Queue.empty,
+    externalVars: Queue[Binding] = Queue.empty,
     validationMode: ValidationMode.Type = ValidationMode.Off) =
-    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(compilerExternalVars, ssrd, ssrd.originalVariables),
-      false, None, validationMode, compilerExternalVars)
+    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(externalVars, ssrd, ssrd.originalVariables),
+      false, None, validationMode, externalVars)

Review Comment:
   Hmm, good points.
   
   Digging in some more based on what you've found, the only differences between the public and private DataProcessor constructors are the ability to pass in `areDebugging` and `optDebugger`. I'm not sure how important it is to restrict setting debugging when creating a DataProcessor (especially since this DataProcessor is internal only), so we could potentially just make the current private constructor public and remove the public one. And then the `SerializableDataProcessor` could just extend the current private one? And the current private constructor could set default values of the debugging parameters if we want the to be optional?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042697688


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   I think we might still want this line if we add back the VariableMap to the SerializableDataProcessor. Or i guess really we want this to be `externalVars = ssrd.originalValues`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API (also fix CI)

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042163302


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -154,27 +154,23 @@ class DataProcessor private (
    * That means when we save for reloading, we must explicitly clobber validationMode and externalVars to
    * initialized values.
    *
-   * @throws java.io.ObjectStreamException
+   * @throws java.io.ObjectStreamException Must be part of writeReplace's API
    * @return the serializable object
    */
   @throws(classOf[java.io.ObjectStreamException])
   private def writeReplace() : Object =
     new SerializableDataProcessor(ssrd, tunables, externalVars, validationMode)
 
   /**
-   * The compilerExternalVars argument supports the deprecated feature to assign external var bindings
-   * on the compiler object.
-   *
-   * These are just incorporated into the initial variable map of the data processor.
+   * This constructor reconstructs a DataProcessor from a SerializedDataProcessor.
    */
-
   def this(
     ssrd: SchemaSetRuntimeData,
     tunables:DaffodilTunables,
-    compilerExternalVars: Queue[Binding] = Queue.empty,
+    externalVars: Queue[Binding] = Queue.empty,
     validationMode: ValidationMode.Type = ValidationMode.Off) =
-    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(compilerExternalVars, ssrd, ssrd.originalVariables),
-      false, None, validationMode, compilerExternalVars)
+    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(externalVars, ssrd, ssrd.originalVariables),
+      false, None, validationMode, externalVars)

Review Comment:
   I wonder if this constructor should even exist? Or if it should be changed to not even accept `externalVars` and `validationMode`? I think those parameters should never be passed in now, with the only way to change them via the `withXYZ` functions on the `DataProcessor`?
   
   I *think* the only use of this constructor is in SchemaSetRuntime1Mixin.scala, and it doesn't pass anything in anything for these parameters. Seems like maybe this constructor can be removed, and the SSRD can be responsible for creating the variable map before it creates the `DataProcessor` and passing in the other variables that it currently doesn't explicitly set? Or maybe just remove these params so they can't be passed in?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1043808211


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   Doesn't this change the behavior? I believe previously if you `save()` and `reload()` you would get the original variables values defined in the schema? With this, doesn't a `reload()` get whatever variables were set on the `DataProcessor` at the time of `save()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042570241


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -154,37 +156,21 @@ class DataProcessor private (
    * That means when we save for reloading, we must explicitly clobber validationMode and externalVars to
    * initialized values.
    *
-   * @throws java.io.ObjectStreamException
+   * @throws java.io.ObjectStreamException Must be part of writeReplace's API
    * @return the serializable object
    */
   @throws(classOf[java.io.ObjectStreamException])
   private def writeReplace() : Object =
     new SerializableDataProcessor(ssrd, tunables, externalVars, validationMode)
 
-  /**
-   * The compilerExternalVars argument supports the deprecated feature to assign external var bindings
-   * on the compiler object.
-   *
-   * These are just incorporated into the initial variable map of the data processor.
-   */
-
-  def this(
-    ssrd: SchemaSetRuntimeData,
-    tunables:DaffodilTunables,
-    compilerExternalVars: Queue[Binding] = Queue.empty,
-    validationMode: ValidationMode.Type = ValidationMode.Off) =
-    this(ssrd, tunables, ExternalVariablesLoader.loadVariables(compilerExternalVars, ssrd, ssrd.originalVariables),
-      false, None, validationMode, compilerExternalVars)
-
-  def copy(
-    ssrd: SchemaSetRuntimeData = ssrd,
-    tunables: DaffodilTunables = tunables,
-    areDebugging : Boolean = areDebugging,
-    optDebugger : Option[Debugger] = optDebugger,
-    validationMode: ValidationMode.Type = validationMode,
-    variableMap : VariableMap = variableMap.copy,
-    externalVars: Queue[Binding] = externalVars) =
-    new DataProcessor(ssrd, tunables, variableMap, areDebugging, optDebugger, validationMode,  externalVars)
+  def copy(ssrd: SchemaSetRuntimeData = ssrd,
+           tunables: DaffodilTunables = tunables,
+           variableMap: VariableMap = variableMap.copy(),
+           externalVars: Queue[Binding] = externalVars,
+           validationMode: ValidationMode.Type = validationMode,
+           areDebugging: Boolean = areDebugging,
+           optDebugger: Option[Debugger] = optDebugger) =

Review Comment:
   I *think* this indentation isn't the scala standard? I think the previous way is right.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
tuxji commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042763601


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -224,14 +203,14 @@ class DataProcessor private (
   }
 
   // TODO Deprecate and replace usages with just tunables.

Review Comment:
   I'd taken a look and backed away when I saw getTunable was used in DFDL.DataProcessor, but I just looked again and I think it's feasible and wouldn't require deprecation since getTunable doesn't seem to be part of the Java or Scala API.  The only places using getTunable seem to be some tests and runtime1 itself.  I'll replace getTunable with tunables.
   
   ```grep
   ./daffodil-core/src/test/scala/org/apache/daffodil/general/TestTunables.scala:66:    val t1 = dp1.getTunables()
   ./daffodil-core/src/test/scala/org/apache/daffodil/general/TestTunables.scala:67:    val t2 = dp2.getTunables()
   ./daffodil-core/src/test/scala/org/apache/daffodil/general/TestTunables.scala:72:    val t3 = dp2.getTunables() // modified tunables at 'run-time'
   ./daffodil-core/src/test/scala/org/apache/daffodil/general/TestTunables.scala:73:    val t4 = dp1.getTunables() // obtain first data processor to see if anything changed
   ./daffodil-core/src/test/scala/org/apache/daffodil/infoset/TestInfosetCursor.scala:94:    infosetInputter.initialize(rootERD, u.getTunables())
   ./daffodil-core/src/test/scala/org/apache/daffodil/infoset/TestInfosetCursor1.scala:52:    ic.initialize(rootERD, u.getTunables())
   ./daffodil-core/src/test/scala/org/apache/daffodil/infoset/TestInfosetCursorFromReader.scala:58:    inputter.initialize(rootERD, u.getTunables())
   ./daffodil-core/src/test/scala/org/apache/daffodil/infoset/TestInfosetCursorFromReader2.scala:66:    inputter.initialize(rootERD, u.getTunables())
   ./daffodil-core/src/test/scala/org/apache/daffodil/util/TestUtils.scala:348:    override def getTunables(): DaffodilTunables = { tunables }
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/api/DFDLParserUnparser.scala:195:    def getTunables(): DaffodilTunables
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilUnparseContentHandler.scala:81:  private lazy val tunablesBatchSize = dp.getTunables().saxUnparseEventBatchSize
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:206:  def getTunables(): DaffodilTunables = tunables
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:476:    inputter.initialize(ssrd.elementRuntimeData, getTunables())
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PState.scala:661:    val tunables = dataProc.getTunables()
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/PState.scala:691:    val tunables = dataProc.getTunables()
   ./daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/unparsers/UState.scala:692:        dataProc.getTunables(),
   ./daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/Runtime2DataProcessor.scala:66:  override def getTunables(): DaffodilTunables = ???
   ```



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   I think you meant `variableMap = ssrd.originalVariables` since we've gotten rid of all uses/mentions of `externalVars`.  Anyway, should a saved processor preserve its own `variableMap` or replace it with `ssrd.originalVariables`?  That is the real question we need to ask.  Mike @mbeckerle , do you want to weigh in?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -84,27 +83,25 @@ class InvalidUsageException(msg: String, cause: Throwable = null) extends Except
 object DataProcessor {
 
   /**
-   * This is the DataProcessor constructed from a saved processor.
+   * This is the SerializableDataProcessor constructed from a saved processor.
    *
    * It enables us to implement restrictions on what you can/cannot do with a reloaded
    * processor versus an original one.
    *
-   * When we create one of these, it will have default values for everything
-   * settable like debuggers, debug mode.
+   * When we reload a processor, it will have default values for everything settable
+   * like validation mode, debug mode, and debugger.
    *
-   * Note that this does preserve the externalVars and validationMode. That is because
-   * those may be needed by serializations other than our own save/reload (e.g., Apache Spark which
-   * serializes to move things for remote execution).
+   * Note that this class does preserve validationMode. That is because validationMode
+   * may be needed by serializations other than our own save/reload (e.g., Apache Spark
+   * which serializes to move objects for remote execution).
    *
-   * Hence, we're depending on the save method to explicitly reset validationMode and
-   * externalVars to initial values.
+   * Hence, we're depending on the save method to explicitly turn off validationMode.
    */
   private class SerializableDataProcessor(
     val data: SchemaSetRuntimeData,
     tunable: DaffodilTunables,
-    externalVars: Queue[Binding], // must be explicitly set to empty by save method
-    validationModeArg: ValidationMode.Type) // must be explicitly set from Full to Limited by save method.
-    extends DataProcessor(data, tunable, externalVars, validationModeArg) {
+    validationModeArg: ValidationMode.Type) // must be explicitly turned off by save method
+    extends DataProcessor(data, tunable, data.originalVariables, validationModeArg) {

Review Comment:
   The `data.originalVariable` is a method which returns a copy of the VariableMap stored in `data` so that we can proceed to serialize the SerializableDataProcessor even if another thread tries to mutate the VariableMap stored in `data` at the same time.  Are you suggesting that SerializableDataProcessor should have 4 (instead of 3) arguments with VariableMap becoming the 3rd parameter and validationModeArg becoming the 4th parameter?  All so that our writeReplace method can pass the DataProcessor's own VariableMap to the SerializableDataProcessor constructor instead of ssrd's (data's) VariableMap getting passed?  
   
   I think that makes sense since the withExternalVariables methods return a copy of the DataProcessor with a new VariableMap and we should serialize that new VariableMap instead of the original VariableMap we started with.  VariableMap says it's not thread-safe, so I'll make writeReplace pass variableMap.copy() just to be as safe as possible.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1043831416


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -311,18 +271,13 @@ class DataProcessor private (
     val oos = new ObjectOutputStream(new GZIPOutputStream(os))
 
     //
-    // Make a copy of this object, so that our state mods below don't side-effect the user's object.
-    // Saving shouldn't have side-effects on the state of the object.
-    //
-    //
-    // Note that the serialization system *does* preserve these two settings. This is for general serialization
-    // that may be required by other software (e.g., Apache Spark)
+    // Make a copy of this object so that we can make its saved state
+    // different than its original state.  Note that the serialization
+    // system *does* preserve validationMode since it may be required by
+    // other software like Apache Spark. But for our save/reload purposes,
+    // we don't want validationMode preserved.
     //
-    // But for our save/reload purposes, we don't want them preserved.
-    //
-
     val dpToSave = this.copy(
-      externalVars = Queue.empty[Binding], // explicitly set these to empty so restored processor won't have them.

Review Comment:
   Ah. Yes, steve is right.  We do want the originalVariables. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042568269


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -129,10 +131,10 @@ class DataProcessor private (
   // The values these will have (since this is a base class) are the correct default values that we want
   // back when the object is re-initialized.
   //
-  protected val areDebugging : Boolean,
-  protected val optDebugger : Option[Debugger],
+  private val externalVars: Queue[Binding],

Review Comment:
   I think maybe `externalVars` can go away completely? The only way to set external variables should be with `withExternalVariables`, which I think should already create a copy of the variable map and mutate it with the new bindings?
   
   I *think* externalVars was only needed because the compiler/processor factory could have variables set on them, so they needed to be carried down this way to the DataProcessor?
   
   So maybe the `SchemaSetRuntimeData` just passes in the `originalVariables` Variable map when it creates the DataProcosser, and that is copied/mutated as needed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
tuxji commented on PR #886:
URL: https://github.com/apache/daffodil/pull/886#issuecomment-1341565694

   I've removed all mentions/uses of `externalVars` in DataProcessor and restored `copy`'s original formatting.  I'm letting CI run the tests for me; if they pass, then this pull request is ready for a second reviewer to review it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042693163


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -84,27 +83,25 @@ class InvalidUsageException(msg: String, cause: Throwable = null) extends Except
 object DataProcessor {
 
   /**
-   * This is the DataProcessor constructed from a saved processor.
+   * This is the SerializableDataProcessor constructed from a saved processor.
    *
    * It enables us to implement restrictions on what you can/cannot do with a reloaded
    * processor versus an original one.
    *
-   * When we create one of these, it will have default values for everything
-   * settable like debuggers, debug mode.
+   * When we reload a processor, it will have default values for everything settable
+   * like validation mode, debug mode, and debugger.
    *
-   * Note that this does preserve the externalVars and validationMode. That is because
-   * those may be needed by serializations other than our own save/reload (e.g., Apache Spark which
-   * serializes to move things for remote execution).
+   * Note that this class does preserve validationMode. That is because validationMode
+   * may be needed by serializations other than our own save/reload (e.g., Apache Spark
+   * which serializes to move objects for remote execution).
    *
-   * Hence, we're depending on the save method to explicitly reset validationMode and
-   * externalVars to initial values.
+   * Hence, we're depending on the save method to explicitly turn off validationMode.
    */
   private class SerializableDataProcessor(
     val data: SchemaSetRuntimeData,
     tunable: DaffodilTunables,
-    externalVars: Queue[Binding], // must be explicitly set to empty by save method
-    validationModeArg: ValidationMode.Type) // must be explicitly set from Full to Limited by save method.
-    extends DataProcessor(data, tunable, externalVars, validationModeArg) {
+    validationModeArg: ValidationMode.Type) // must be explicitly turned off by save method
+    extends DataProcessor(data, tunable, data.originalVariables, validationModeArg) {

Review Comment:
   Should this be the same (or a copy of) the VariableMap from the DataProcessor? That way if something like Spark wants to serialize the existing VariableMap then it can? We just don't do it thorugh bindings anymore but directly serialize the actual VariableMap?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1042694526


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DataProcessor.scala:
##########
@@ -224,14 +203,14 @@ class DataProcessor private (
   }
 
   // TODO Deprecate and replace usages with just tunables.

Review Comment:
   Should we do what this comment suggests and replace `getTunable` with `tunbles`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #886: Update scaladocs mentioning deprecated API

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #886:
URL: https://github.com/apache/daffodil/pull/886#discussion_r1043806531


##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/Runtime2DataProcessor.scala:
##########
@@ -46,28 +46,19 @@ import org.apache.daffodil.util.Maybe.Nope
 class Runtime2DataProcessor(executableFile: os.Path) extends DFDL.DataProcessorBase {
 
   override def withValidationMode(mode: ValidationMode.Type): DFDL.DataProcessor = ???
-
   override def withTunable(name: String, value: String): DFDL.DataProcessor = ???
-
   override def withTunables(tunables: Map[String, String]): DFDL.DataProcessor = ???
-
   override def withExternalVariables(extVars: Map[String, String]): DFDL.DataProcessor = ???
-
   override def withExternalVariables(extVars: File): DFDL.DataProcessor = ???
-
   override def withExternalVariables(extVars: Seq[Binding]): DFDL.DataProcessor = ???
-
   override def withDebugger(dbg:AnyRef): DFDL.DataProcessor = ???
-  
   override def withDebugging(flag: Boolean): DFDL.DataProcessor = ???
 
-  override def validationMode: ValidationMode.Type = ???
-
-  override def getTunables(): DaffodilTunables = ???
-
   override def save(output: DFDL.Output): Unit = ???
 
+  override def tunables: DaffodilTunables = ???

Review Comment:
   Suggest put coverage off/on around these stubs. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org