You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2022/08/19 20:11:06 UTC

[GitHub] [daffodil] mbeckerle opened a new pull request, #831: Added TDML with precompiled binary schema feature.

mbeckerle opened a new pull request, #831:
URL: https://github.com/apache/daffodil/pull/831

   DAFFODIL-2725


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] tuxji commented on a diff in pull request #831: Added TDML with precompiled binary schema feature.

Posted by GitBox <gi...@apache.org>.
tuxji commented on code in PR #831:
URL: https://github.com/apache/daffodil/pull/831#discussion_r950939129


##########
daffodil-tdml-processor/src/main/scala/org/apache/daffodil/tdml/processor/DaffodilTDMLDFDLProcessor.scala:
##########
@@ -160,9 +160,15 @@ final class TDMLDFDLProcessorFactory private (
     optRootName: Option[String],
     optRootNamespace: Option[String],
     tunables: Map[String, String]): TDML.CompileResult = {
-
-    val res = compileProcessor(schemaSource, useSerializedProcessor, optRootName, optRootNamespace)
-    res
+    if (schemaSource.isXSD) {
+      val res = compileProcessor(schemaSource, useSerializedProcessor, optRootName, optRootNamespace)
+      res
+    } else {
+      val dp = compiler.reload(schemaSource)
+      val diags = dp.getDiagnostics
+      Assert.invariant(diags.forall{! _.isError })
+      Right((diags, new DaffodilTDMLDFDLProcessor(dp)))
+    }

Review Comment:
   You've added a test that takes us to line 167 and calls compiler.reload but the test forces that method to throw an InvalidParserException("The saved parser is only compatible with Daffodil " + savedVersion + ". Current version is " + curVersion) and we never get to lines 168 and 170, hence the coverage check warnings on these lines.  
   
   You could write a test `testNonXSDModelPositiveGoodCompiledFile` which creates a parser, saves it to a temporary file (deleted automatically later), interpolates that temporary file's path into the test suite, and then runs the test suite like `testNonXSDModelNegativeBadCompiledFile` does except this test suite reloads the compiled parser successfully.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle merged pull request #831: Added TDML with precompiled binary schema feature.

Posted by GitBox <gi...@apache.org>.
mbeckerle merged PR #831:
URL: https://github.com/apache/daffodil/pull/831


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #831: Added TDML with precompiled binary schema feature.

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on code in PR #831:
URL: https://github.com/apache/daffodil/pull/831#discussion_r950535870


##########
daffodil-tdml-processor/src/main/scala/org/apache/daffodil/tdml/processor/DaffodilTDMLDFDLProcessor.scala:
##########
@@ -139,18 +139,31 @@ final class TDMLDFDLProcessorFactory private (
     }
   }
 
+
   private def compileProcessor(
     schemaSource: DaffodilSchemaSource,
     useSerializedProcessor: Boolean,
     optRootName: Option[String],
     optRootNamespace: Option[String]): TDML.CompileResult = {
-    val pf = compiler.compileSource(schemaSource, optRootName, optRootNamespace)
-    val diags = pf.getDiagnostics
-    if (pf.isError) {
-      Left(diags)
+    //
+    // if the schemaSource is not an XSD file, then we assume it is
+    // a pre-compiled DFDL schema
+    //
+    if (schemaSource.isXSD)
+    {
+      val pf = compiler.compileSource(schemaSource, optRootName, optRootNamespace)
+      val diags = pf.getDiagnostics
+      if (pf.isError) {
+        Left(diags)
+      } else {
+        val res = this.generateProcessor(pf, useSerializedProcessor)
+        res
+      }
     } else {
-      val res = this.generateProcessor(pf, useSerializedProcessor)
-      res
+      val chan = Channels.newChannel(schemaSource.uriForLoading.toURL.openStream())
+      val dp = compiler.reload(chan)
+      val diags = dp.getDiagnostics
+      Right((diags, new DaffodilTDMLDFDLProcessor(dp)))

Review Comment:
   In general, I think it's a not a great idea to store precompiled binaries in a repository along with the schema/tests. Any update to Daffodil (including snapshots) are likely to not be compatible, so it's going to be pretty fragile. And any updates to the schema requires you to rebuild the precompiled binary--keeping those in sync sounds difficult and easy to forget.
   
   Ideally we would add a better caching mechanism than what we have right now (e.g. cache to a file instead of to memory). Or as I think you've suggested in the past, when we build the schema jar we also build a precompiled binary into the jar that the TDML runner could look for. That way we can rely on sbt to track when the schema changes and rebuilds the jar/precompiled binary as needed and you don't have to worry about storing one in the repo. That probably requires a special Daffodil plugin though or special build.sbt logic, so probably not a trivial amount of work.
   
   This feels fine as a temporary solution, but I hope we are able to deprecated it relatively soon and not make this standard practice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #831: Added TDML with precompiled binary schema feature.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #831:
URL: https://github.com/apache/daffodil/pull/831#discussion_r950547707


##########
daffodil-tdml-processor/src/main/scala/org/apache/daffodil/tdml/processor/DaffodilTDMLDFDLProcessor.scala:
##########
@@ -139,18 +139,31 @@ final class TDMLDFDLProcessorFactory private (
     }
   }
 
+
   private def compileProcessor(
     schemaSource: DaffodilSchemaSource,
     useSerializedProcessor: Boolean,
     optRootName: Option[String],
     optRootNamespace: Option[String]): TDML.CompileResult = {
-    val pf = compiler.compileSource(schemaSource, optRootName, optRootNamespace)
-    val diags = pf.getDiagnostics
-    if (pf.isError) {
-      Left(diags)
+    //
+    // if the schemaSource is not an XSD file, then we assume it is
+    // a pre-compiled DFDL schema
+    //
+    if (schemaSource.isXSD)
+    {
+      val pf = compiler.compileSource(schemaSource, optRootName, optRootNamespace)
+      val diags = pf.getDiagnostics
+      if (pf.isError) {
+        Left(diags)
+      } else {
+        val res = this.generateProcessor(pf, useSerializedProcessor)
+        res
+      }
     } else {
-      val res = this.generateProcessor(pf, useSerializedProcessor)
-      res
+      val chan = Channels.newChannel(schemaSource.uriForLoading.toURL.openStream())
+      val dp = compiler.reload(chan)
+      val diags = dp.getDiagnostics
+      Right((diags, new DaffodilTDMLDFDLProcessor(dp)))

Review Comment:
   Completely agree that they shouldn't be stored in repos. My use case for this keeps them in /tmp. 
   
   The only reason there is this fake ".bin" in here is for purposes of having a test here that exercises the code paths.
   
   re: improving this feature further... I did think about that, but punted for now. It's not THAT hard, there's just lots of time pressure now on other things. But I definitely wanted to put this into Daffodil while we can rather than write yet another non-TDML test rig to dance around this problem. 
   
   We could add the method to daffodil to allow one to get latest mod date for any file included/imported by a schema, and use that date to decide whether to recompile or not. Having daffodil do this avoids reinventing include/import and resolver logic again elsewhere. I think this is pretty easy. DFDLSchemaFile class gets a modDateTime member, and rollups doing max on that for an entire SchemaSet. 
   
   Still doesn't deal with what if the schema is the same, but you changed daffodil itself, but it would still be better than entirely manual. 
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [daffodil] mbeckerle commented on a diff in pull request #831: Added TDML with precompiled binary schema feature.

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on code in PR #831:
URL: https://github.com/apache/daffodil/pull/831#discussion_r951438972


##########
daffodil-tdml-processor/src/main/scala/org/apache/daffodil/tdml/processor/DaffodilTDMLDFDLProcessor.scala:
##########
@@ -160,9 +160,15 @@ final class TDMLDFDLProcessorFactory private (
     optRootName: Option[String],
     optRootNamespace: Option[String],
     tunables: Map[String, String]): TDML.CompileResult = {
-
-    val res = compileProcessor(schemaSource, useSerializedProcessor, optRootName, optRootNamespace)
-    res
+    if (schemaSource.isXSD) {
+      val res = compileProcessor(schemaSource, useSerializedProcessor, optRootName, optRootNamespace)
+      res
+    } else {
+      val dp = compiler.reload(schemaSource)
+      val diags = dp.getDiagnostics
+      Assert.invariant(diags.forall{! _.isError })
+      Right((diags, new DaffodilTDMLDFDLProcessor(dp)))
+    }

Review Comment:
   I could do this, but I didn't because I deemed coverage of this particular functionality to be quite thorough in other places. This save/restore is used everywhere. Every TDML test, for example by default serializes and restores the processor before executing. So it's really just these two specific lines, and they are correct by inspection. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org