You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2021/11/12 18:13:00 UTC

[GitHub] [daffodil] tuxji opened a new pull request #681: Fix some code generation corner cases

tuxji opened a new pull request #681:
URL: https://github.com/apache/daffodil/pull/681


   1. Elements used in multiple places within a schema were causing
      redundant code generation
   
   2. Empty complex type elements without any child elements were causing
      compiler warnings
   
   3. Certain choice dispatch key expressions were causing compiler
      errors
   
   Also start keeping generated code examples up to date by automation.
   
   main.yml - Bump mxml from 3.2 to 3.3.
   
   BUILD.md - Bump mxml from 3.2 to 3.3 and Fedora from 34 to 35.
   
   Makefile - Add "format" and "iwyu" targets for maintainers' use.
   
   cli_errors.c - Print program version on stdout, not stderr.  Update includes.
   
   xml_reader.c - Fix a lint warning.  Update includes.
   
   xml_writer.c - Update includes.
   
   infoset.c - Fix some lint warnings.
   
   parsers.h - Update includes.
   
   unparsers.h - Update includes.
   
   CodeGeneratorState.scala - Fix code generation for some corner cases:
   1) elements used in multiple places within a schema causing redundant
   code generation, 2) empty complex type elements without any child
   elements causing compiler warnings, 3) certain choice dispatch key
   expressions causing compiler errors.  Update includes.
   
   ElementParseAndUnspecifiedLengthCodeGenerator.scala - Fix code
   generation for one corner case (elements used in multiple places
   within a schema causing redundant code generation).
   
   NestedUnion/generated_code.[ch] - Update generated code example by
   automation.
   
   ex_nums/generated_code.[ch] - Update generated code example by
   automation.
   
   TestCodeGenerator.scala - Change tempDir from /tmp/NNN... to
   /tmp/daffodil-runtime2-NNN... to make its purpose clearer.  Add new
   test_updateGeneratedCodeExamples to keep our generated code examples
   up to date by automation.
   
   DAFFODIL-2585


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976980753


   Steve, I've integrated all the changes from your branch.  I found one last bug (the caching doesn't work), but I just realized what's causing the example files to be generated every time.  `genExamples` watches all the runtime2 files for changes and calls FileFunction.cached, but the example files are located in a subdirectory under the example schemas, so we need to separate the example files from the example schemas to avoid the example files being watched for changes too.  I think I will move the example c files to `daffodil-runtime2/test/c/examples`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r752618483



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       OK, we are discussing two separate issues now:
   
   1. Should test_updateGeneratedCodeExamples fail every time or only the first time?
   2. Should the C code generator write a) only LFs or b) System.lineSeparator into generated code files regardless of which line endings are used in source files?
   
   For 1), I don't think we can separate the test and the update without adding an extra step (the update command) to the code contributor's workflow.  Instead of simply running "sbt test" until no test fails, the contributor will have to run "sbt genExamples" and then "sbt test" again to verify the tests pass now.  Do we really want to add more friction to the code contributor's workflow?  
   
   I say the convenience of combining the test and the update outweighs the strangeness of the test failing only the first time.  The test may seem like it will fail only once, but it will invariably fail every time when run on a fresh checkout of the repository on a GitHub Actions runner.  The test in CI will prevent anyone from committing a change to the code generator without running "sbt test" first.  
   
   At the same time, I do think we need to expand the assert's message so that any code contributor will understand what is going on, especially if they're merely publishing a new daffodil release (even changing daffodil's version in build.sbt without running "sbt test" will make the CI job fail).  "Examples have been updated!" is too terse for GitHub Actions; we probably should write something like "You need to update the examples!  Run `sbt test` and commit the updated example code files."  Suggestions for better phrasing would be welcome.
   
   For 2), I could make an even smaller change if we want the C code generator to write only LFs or System.lineSeparator regardless of source files' line endings.  I would modify CodeGeneratorState's generatedCodeHeader and generatedCodeFile functions to replace CRLF and LF before returning the final strings to be written out to files.  That's only two new lines of code, but we would have to replace LF with System.lineSeparator before we could remove the new line from .gitattributes.  Unfortunately, if we remove the new line from .gitattributes, the test still could fail in a code contributor's checkout due to the contributor's git config settings.  The test will work in GitHub Actions, but it won't work for all code contributors running it because they could have checked out the example code files on Windows with either LF or CRLF line endings depending on their settings.  The best thing to do would be to leave the new line but change it to  `* text=auto`, which causes git to check 
 out all text-like files with System.lineSeparator line endings regardless of git config settings.  Then the examples and the generated files will match each other - they'll always be System.lineSeparator on GitHub Actions and in anyone's checkout.  Files that are already CRLF in git will remain CRLF on checkout with this new line as well.
   
   John (@jw3) or anyone else, please weigh in too (I've been waiting a long time for a second +1).
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r754598833



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Convert to CRLF on Windows since tests expect System.lineSeparator
+* text=auto

Review comment:
       I would prefer we apply this only to the runtime2 examples directory if possible. I think those files are really the only ones where the new lines *must* match the preferred OS line separator, since that's what the generator outputs now. All other files should just follow the users `core.autocrlf` settings that they think works best with their preferred development tools/OS.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r756286472



##########
File path: build.sbt
##########
@@ -339,3 +339,40 @@ lazy val unidocSettings = Seq(
     }
   },
 )
+
+lazy val genExamplesSettings = Seq(
+  Compile / genExamples := {
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc).toSet

Review comment:
       Does this need to watch more than just the sources and resources? For example, what if a scala file in daffodil-core changes? Couldn't that potentially require the need for a rebuild? Seems any changes to anything on the classpath would need to be watched?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976578694


   I've added a branch to my fork that implements what I mentioned above. Take a look and let me know what you think:
   
   https://github.com/stevedlawrence/daffodil/commit/fb4d4d153b530a11ca0d34b8aab6416dd34e1e48
   
   What this does:
   
   * Move the generated files to the `tutorials/src/main/examples/` directory to avoid any potential cyclic dependency/modification issues
   * Modify build.sbt to fork the CLI to run the generator for the two examples. Some things worth pointing out:
     * To deal with the log4j messages I just used `ForkOptions` and a `CustomOutput` to output the fork stdout/stderr to a buffer and then just ignore that buffer. So the log4j warnings still exist, this just doesn't show them.
     * This is the magic sbt incantation to run `genExamples` when `sbt compile` is run:
       ```
       Compile / compile := {
         val res = (Compile / compile).value
         (Compile / genExamples).value
         res
       }
       ```
       Note that this technically only runs if you run `sbt daffodil-tutorials/compile` but `sbt compile` runs that because the tutorials subproject is part of the subproject aggregation, so it's essentially the same
     * This also includes `FileFunction.cached` so that we only regenerate files if either the schemas or the classpath jars change. Avoids unnecessarily regenerating things 
   * Add a github action to ensure the examples are updated by generating the examples and running git diff. I intentionally didn't commit the changes to the nested generated files (which I can imagine some people might unknowingly o), so you can see what the failure looks like here
     https://github.com/stevedlawrence/daffodil/runs/4299971721?check_suite_focus=true
   * Update the other generators to be consistent and less verbose about logging, similar to the rest of sbt compilation messages
   * Update `genManaged` to list the files, so you if you run `sbt 'show genManaged'` it will actually show the files that are generated
   
   I didn't see your latest changes, so this is slightly different (e.g. running the CLI vs your new CodeGenerator class). Feel free to pick and choose from that what you like.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r756321225



##########
File path: build.sbt
##########
@@ -343,14 +343,38 @@ lazy val unidocSettings = Seq(
 
 lazy val genExamplesSettings = Seq(
   Compile / genExamples := {
-    val cp = (runtime2 / Runtime / dependencyClasspath).value
-    val forkOpts = ForkOptions().withBootJars(cp.files.toVector)
-    val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
-    val args = Seq(mainClass)
-    val ret = Fork.java(forkOpts, args)
-    val stream = streams.value
-    if (ret != 0) {
-      stream.log.error(s"Failed to generate examples")
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Compile / resources).value
+    val inTSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc ++ inTSrc).toSet
+    val cachedFun = FileFunction.cached(stream.cacheDirectory / "genExamples") { _ =>
+      val out = new java.io.ByteArrayOutputStream()
+      val forkOpts = ForkOptions()
+                       .withOutputStrategy(Some(CustomOutput(out)))
+                       .withBootJars(cp.files.toVector)
+      val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
+      val args = Seq(mainClass)
+      val ret = Fork.java(forkOpts, args)
+      if (ret != 0) {
+        stream.log.error(s"failed to generate example files")
+      }
+      val bis = new java.io.ByteArrayInputStream(out.toByteArray)
+      val isr = new java.io.InputStreamReader(bis)
+      val br = new java.io.BufferedReader(isr)
+      val iterator = Iterator.continually(br.readLine()).takeWhile(_ != null).filterNot(_.startsWith("WARN"))
+      val files = iterator.map { f =>
+        new File(f)
+      }.toSet
+      stream.log.info(s"generated ${files.size} runtime2 example files")

Review comment:
       I'll add $outDir to the message, but it needs to be something like `val outDir = (runtime2 / Test / "c" / "examples").value` (which doesn't work) or `val outDir = (runtime2 / Test).value / "c" / "examples"` (which doesn't work either).  How do I define outDir correctly?
   
   I'll pass outDir as a parameter to CodeGenerator.main (I don't want to have to change the examples directory's location in both build.sbt and CodeGenerator.scala), but I'll miss being able to run CodeGenerator.main from the IDE using whatever default runtime configuration it creates without having to edit the runtime configuration to pass the directory.  Currently only Rat.scala and CodeGenerator.scala had to know the examples directory's location; now it'll be build.sbt and Rat.scala. 

##########
File path: .github/workflows/main.yml
##########
@@ -149,6 +149,9 @@ jobs:
       - name: Run Integration Tests
         run: $SBT coverage IntegrationTest/test
 
+      - name: Run Modified Example Files Check
+        run: git diff --color --exit-code

Review comment:
       The check has passed in the last couple of CI jobs even though it's running over the entire checkout, so there's no problem right now.  And the check would alert us if a change introduced such a problem in the future, so I think that's a good thing.  Also, if we specify a directory, then that's another file which we have to remember to update if we move the examples again.

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/CodeGenerator.scala
##########
@@ -198,3 +200,60 @@ class CodeGenerator(root: Root) extends DFDL.CodeGenerator {
   override def getDiagnostics: Seq[Diagnostic] = diagnostics
   override def isError: Boolean = errorStatus
 }
+
+/** Runs from "sbt compile" to keep all example generated code files up to date */
+object CodeGenerator {
+  // Update one set of example generated code files from an example schema
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val tempDir = os.temp.dir(dir = null, prefix = "daffodil-runtime2-")
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+
+    // Replace the example generated files with the newly generated files
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true, createFolders = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true, createFolders = true)
+
+    // Print the example generated files' names so "sbt 'show genExamples'" can list them
+    System.out.println(exampleCodeHeader)
+    System.out.println(exampleCodeFile)
+
+    // tempDir should be removed automatically after main exits; this is just in case
+    os.remove.all(tempDir)
+  }
+
+  // Make sure "sbt compile" calls this main method
+  def main(args: Array[String]): Unit = {

Review comment:
       I don't think it's a big deal that this main method works only in the daffodil root directory or the daffodil-runtime2 directory (see the if expression on line 235).  The only two times I expect anyone to run this main method is from "sbt compile" (where the daffodil root directory is the current directory) or from their IDE as a runtime configuration (where the daffodil-runtime2 directory is the current directory).

##########
File path: build.sbt
##########
@@ -339,3 +339,40 @@ lazy val unidocSettings = Seq(
     }
   },
 )
+
+lazy val genExamplesSettings = Seq(
+  Compile / genExamples := {
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc).toSet

Review comment:
       True, we might want to watch Scala files in daffodil-core and daffodil-runtime1 too although I think the likelihood of a dsom or grammar change making it to the PR check is below 10%, not only because it would have to change generated_code.[ch] without breaking any test, also because a contributor would have to submit a PR without having done a clean build ("sbt clean") at some point.  Their PR will fail the "git diff --exit-code" check, so the change still won't slip past the PR check even then.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-972915240


   Regarding the version issue, I'm not sure of a way around that, but I'm not too familar with IDE's. Might be some IDE setting.
   
   I was curious to see if modifying build.sbt to add a new `sbt genExamples` would be complicated or messy, and I think it actually turned out not too bad:
   
   ```scala
   import sbt.internal.util.Util
   import sbt.io.IO
   import scala.sys.process.Process
   
   lazy val genExamples = taskKey[Unit]("Generate runtime2 example files")
   
   lazy val daffodil = project.in(file("."))...
                                  .settings(..., genExamplesSettings)
   
   ...
   
   lazy val genExamplesSettings = Seq(
     Compile / genExamples := {
       val tests = Seq(
         "ex_nums.dfdl.xsd" -> "ex_nums",
         "nested.dfdl.xsd" -> "NestedUnion",
       )
   
       // require that daffodil is staged
       (cli / Universal / stage).value
   
       // run "daffodil generate c" for each of the test files and copy the results
       // to the examples directory
       val daffodilBinDir = (cli / Universal / stagingDirectory).value / "bin"
       val daffodilBinExt = if (Util.isWindows) ".bat" else ""
       val daffodilBin = daffodilBinDir / ("daffodil" + daffodilBinExt)
       val rootDir = (runtime2 / Test / resourceDirectory).value / "org" / "apache" / "daffodil" / "runtime2"
       val logger = streams.value.log
       tests.foreach { test =>
         IO.withTemporaryDirectory { tmpDir =>
           val res = Process(s"""${daffodilBin} generate c -s ${rootDir / test._1} ${tmpDir}""").!
           if (res != 0) {
             logger.error(s"failed to generate example: ${test._1}")
           }
           val srcDir = tmpDir / "c" / "libruntime"
           val tgtDir = rootDir / "examples" / test._2
           IO.copyFile(srcDir / "generated_code.c", tgtDir / "generated_code.c")
           IO.copyFile(srcDir / "generated_code.h", tgtDir / "generated_code.h")
         }
       }
     },
   )
   ```
   
   It's maybe a bit slower since it requires staging the daffodil CLI, but that isn't awful. And this doesn't have any version issues since it just executes the daffodil CLI, which knows its version.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence edited a comment on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence edited a comment on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-972915240


   Regarding the version issue, I'm not sure of a way around that, but I'm not too familar with IDE's. Might be some IDE setting.
   
   I was curious to see if modifying build.sbt to add a new `sbt genExamples` would be complicated or messy, and I think it actually turned out not too bad:
   
   ```scala
   lazy val genExamples = taskKey[Unit]("Generate runtime2 example files")
   
   lazy val daffodil = project.in(file("."))...
                                  .settings(..., genExamplesSettings)
   
   ...
   
   lazy val genExamplesSettings = Seq(
     Compile / genExamples := {
       val tests = Seq(
         "ex_nums.dfdl.xsd" -> "ex_nums",
         "nested.dfdl.xsd" -> "NestedUnion",
       )
   
       val rootDir = (runtime2 / Test / resourceDirectory).value / "org" / "apache" / "daffodil" / "runtime2"
       val cp = (cli / Runtime / dependencyClasspath).value
       val stream = streams.value
       val mainClass = "org.apache.daffodil.Main"
       val forkOpts = ForkOptions()
                      .withBootJars(cp.files.toVector)
       tests.foreach { test =>
         IO.withTemporaryDirectory { tmpDir =>
           val args = Seq(mainClass, "generate", "c", "-s", (rootDir / test._1).toString, tmpDir.toString)
           val ret = Fork.java(forkOpts, args)
           if (ret != 0) {
             stream.log.error(s"failed to generate example: ${test._1}")
           }
           val srcDir = tmpDir / "c" / "libruntime"
           val tgtDir = rootDir / "examples" / test._2
           IO.copyFile(srcDir / "generated_code.c", tgtDir / "generated_code.c")
           IO.copyFile(srcDir / "generated_code.h", tgtDir / "generated_code.h")
         }
       }
     },
   )
   ```
   
   This depends on the Daffodil jars being built, and then fors java with the correct arguments to the run the CLI C generator to generate code to a temp directory and copy the generated files to the example directory.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r753721069



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       I've implemented the changes I talked about above ("* text=auto" and System.lineSeparator) and the tests still pass, so this is the final version I think should be merged.  Still need a second +1.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji edited a comment on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji edited a comment on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976980753


   Steve, I've integrated all the changes from your branch.  I found one last bug (the caching doesn't work), but I just realized what's causing the example files to be generated every time.  `genExamples` watches all the runtime2 files for changes and calls FileFunction.cached, but the example files are located in a subdirectory under the example schemas, so we need to separate the example files from the example schemas to avoid the example files being watched for changes too.  I think I will move the example c files to `daffodil-runtime2/src/test/c/examples`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-971976115


   Latest push should fix failure running test_updateGeneratedCodeExamples in GitHubs Actions because test_updateGeneratedCodeExamples was previously run from the IDE, not from sbt, and the example generated_code files got their daffodil_program_version set to "null null" instead of "daffodil-runtime2 3.2.0-SNAPSHOT".  The IDE run tests directly from classes, not from jars, which results in these two lines in CodeGeneratorState.generateCodeFile producing "null":
   
   ```
       val program = this.getClass.getPackage.getImplementationTitle
       val version = this.getClass.getPackage.getImplementationVersion
   
            const char *daffodil_program_version = "$program $version";
   ```
   
   If anyone knows how to ensure these two lines produce the same output no matter whether code contributors run "sbt test" or run test_updateGeneratedCodeExamples from their IDE, then that would make test_updateGeneratedCodeExamples more robust.  Any ideas?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r753721069



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       I've implemented the changes I talked about above ("* text=auto" and System.lineSeparator) and the tests still pass, so this is the final version I think should be merged.  Still need a second +1.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-978066543


   I've moved the example files.  Now genExamples won't generate the example files over and over but only when changes are made to the runtime2 Scala sources or example schemas.  This completes my pull request.  Anyone who's around, please review and give me a second +1 so I can rebase and merge it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji merged pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji merged pull request #681:
URL: https://github.com/apache/daffodil/pull/681


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r752461365



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       Another thought on the test approach, if the example files are out of date, the first time you run tests you'll get a failure, but the second time the test will pass, even though the user hasn't changed anything. That feels like potentially confusing behavior. And I think that's because a test is updated the expected results. It feels wrong tests to be updating expected results.
   
   I'm fine doing something other than the `genExamples` approach if you think it's too slow--daffodil-cli/stage does take some time, especially when making updates. But I'm not sure tests are the right place for it.
   
   
   Whitespace issue: If we choose to update .gitattributes to have daffodil-runtime2/ to disable autocrlf, then I think we should just do it for everything. Having some source that is always LF and some that is OS dependent feels wrong. That said, as dumb as I think Git's decision is to enable autocrlf, I also don't want to make development harder for people on windows. Maybe there are some tools that really do need this autocrlf feature, so I'm not sure disabling it for .scala files is safe? I'm not really sure...
   
   One last thought regarding whitespace, we could minimize code changes by adding some implicit classes:
   ```scala
   implicit class StringNewLine(s: String) {
     def stripMarginNL: String = s.stripMargin.replace("\r\n", "\n").replace("\n", System.lineSeparator)
   }
   
   implicit class SeqNewLine[T](s: Seq[T]) {
     def mkStringNL: String = s.mkString(System.lineSeparator)
   }
   ```
   
   Then multiline strings become something like
   ```scala
   val foo =
     """|multi
        |line
        |string""".stripMarginNL
   ```
   and Seqs become
   
   ```scala
   val bar = Seq("multi", "line", "string").mkStringNL
   ```
   There would need to be a big change to switch it over, but the new code would look fairly clean and would ensure consistent newlines.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r755460200



##########
File path: build.sbt
##########
@@ -343,14 +343,38 @@ lazy val unidocSettings = Seq(
 
 lazy val genExamplesSettings = Seq(
   Compile / genExamples := {
-    val cp = (runtime2 / Runtime / dependencyClasspath).value
-    val forkOpts = ForkOptions().withBootJars(cp.files.toVector)
-    val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
-    val args = Seq(mainClass)
-    val ret = Fork.java(forkOpts, args)
-    val stream = streams.value
-    if (ret != 0) {
-      stream.log.error(s"Failed to generate examples")
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Compile / resources).value
+    val inTSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc ++ inTSrc).toSet
+    val cachedFun = FileFunction.cached(stream.cacheDirectory / "genExamples") { _ =>
+      val out = new java.io.ByteArrayOutputStream()
+      val forkOpts = ForkOptions()
+                       .withOutputStrategy(Some(CustomOutput(out)))
+                       .withBootJars(cp.files.toVector)
+      val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
+      val args = Seq(mainClass)
+      val ret = Fork.java(forkOpts, args)
+      if (ret != 0) {
+        stream.log.error(s"failed to generate example files")
+      }
+      val bis = new java.io.ByteArrayInputStream(out.toByteArray)
+      val isr = new java.io.InputStreamReader(bis)
+      val br = new java.io.BufferedReader(isr)
+      val iterator = Iterator.continually(br.readLine()).takeWhile(_ != null).filterNot(_.startsWith("WARN"))
+      val files = iterator.map { f =>
+        new File(f)
+      }.toSet
+      stream.log.info(s"generated ${files.size} runtime2 example files")

Review comment:
       For consistency with Scala logging and potentially easy debugging, can we include an output directory here? Doesn't have to be the exact directory, but at least something so a user knows vaguely where the files are being generated, something like:
   
   ```scala
   val outDir = (runtime2 / Test / resourceDirectory).value
   stream.log.info(s"generated ${files.size} runtime2 example files to ${outDir} ...")
   ```
   Might also be nice if outDir could be passed in as a parameter to the `CodeGenerator` so the code generator doesn't need to hard code where to write things to. This is what the prop gen does

##########
File path: .github/workflows/main.yml
##########
@@ -149,6 +149,9 @@ jobs:
       - name: Run Integration Tests
         run: $SBT coverage IntegrationTest/test
 
+      - name: Run Modified Example Files Check
+        run: git diff --color --exit-code

Review comment:
       Do we want to be more specific about what directory we're checking? I'm concerned that there might be some cruft leftover from some build process that might cause this check to fail.

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/CodeGenerator.scala
##########
@@ -198,3 +200,60 @@ class CodeGenerator(root: Root) extends DFDL.CodeGenerator {
   override def getDiagnostics: Seq[Diagnostic] = diagnostics
   override def isError: Boolean = errorStatus
 }
+
+/** Runs from "sbt compile" to keep all example generated code files up to date */
+object CodeGenerator {
+  // Update one set of example generated code files from an example schema
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val tempDir = os.temp.dir(dir = null, prefix = "daffodil-runtime2-")
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+
+    // Replace the example generated files with the newly generated files
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true, createFolders = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true, createFolders = true)
+
+    // Print the example generated files' names so "sbt 'show genExamples'" can list them
+    System.out.println(exampleCodeHeader)
+    System.out.println(exampleCodeFile)
+
+    // tempDir should be removed automatically after main exits; this is just in case
+    os.remove.all(tempDir)
+  }
+
+  // Make sure "sbt compile" calls this main method
+  def main(args: Array[String]): Unit = {

Review comment:
       I'm not sure this is a big deal, but if anyone ever runts this jar using java -jar it's probably going to fail, since it can only be run in the daffodil root dir. Like, one might expect this to be an alternative to using the CLI, but it's really a test only thing. Maybe it should go in src/test/scala so it's not distributed? Or maybe it doesn't really matter.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-975980179


   Almost there now.  Still need to make two more changes:
   
   1. Make "sbt compile" call genExamples automatically, otherwise we
   have to type genExamples at sbt prompt
   
   2. Add log4j-core to genExamples' classpath so we don't get "ERROR
   StatusLogger Log4j2 could not find a logging implementation. Please
   add log4j-core to the classpath. Using SimpleLogger to log to the
   console..."
   
   Steve, how can we modify build.sbt to do 1) and 2)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] mbeckerle commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r754465889



##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -594,10 +639,10 @@ class CodeGeneratorState {
       s"""// clang-format off
          |#include "generated_code.h"
          |#include <math.h>       // for NAN
-         |#include <stdbool.h>    // for true, false, bool
+         |#include <stdbool.h>    // for false, bool, true
          |#include <stddef.h>     // for NULL, size_t
          |#include <string.h>     // for memset, memcmp
-         |#include "errors.h"     // for Error, PState, UState, ERR_CHOICE_KEY, UNUSED
+         |#include "errors.h"     // for Error, PState, UState, ERR_CHOICE_KEY, Error::(anonymous), UNUSED

Review comment:
       What is "Error::(anonymous)"? I am not familiar with what this is notationally. 
   
   These lines are maintained by some tool I believe. What is that tool trying to tell us with this?

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]

Review comment:
       Why is calling the empty method preferable to construction? These are mutable. You need a new instance. Calling empty seems very odd here. 

##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       So, if I sbt test, and the output has changed, the tests fail, but a side effect of running those tests, is that the comparison files are updated, so if I sbt test again, it passes?
   
   This seems very problematic to me. 
   
   Make 'sbt test' always idempotent. Same behavior every time. Updating output files to be match new stuff *should* be a manual step. You can provide some automation support for this to make it a one-shot action to update them all, but having 'sbt test' be non-idempotent is asking for trouble. 
   

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -157,30 +178,41 @@ class CodeGeneratorState {
    * - we can store the accessed value in an int64_t local variable safely
    */
   private def choiceDispatchField(context: ElementBase): String = {
-    // We want to use SchemaComponent.scPath but it's private so duplicate it here (for now...)
+    // We want to call SchemaComponent.scPath but it's private so duplicate it here for now
     def scPath(sc: SchemaComponent): Seq[SchemaComponent] = sc.optLexicalParent.map { scPath }.getOrElse(Nil) :+ sc
-    val localNames = scPath(context).map {
-      case er: AbstractElementRef => er.refQName.local
-      case e: ElementBase => e.namedQName.local
-      case ed: GlobalElementDecl => ed.namedQName.local
-      case _ => ""
-    }
-    val absoluteSlashPath = localNames.mkString("/")
-    val dispatchSlashPath = context.complexType.modelGroup match {
+
+    // We handle only direct dispatch choices, so ignore other elements
+    context.complexType.modelGroup match {
       case choice: Choice if choice.isDirectDispatch =>
+        // Get parent path against which to perform up paths
+        val parentNames = scPath(context).map {
+          case _: Root => ""
+          case er: AbstractElementRef => er.refQName.local
+          case e: ElementBase => e.namedQName.local
+          case ed: GlobalElementDecl => ed.namedQName.local
+          case _ => ""
+        }
+        val parentPath = parentNames.mkString("/")
+
+        // Convert expression to a relative path (may have up paths)
         val expr = choice.choiceDispatchKeyEv.expr.toBriefXML()
         val before = "'{xs:string("
         val after = ")}'"
         val relativePath = if (expr.startsWith(before) && expr.endsWith(after))
           expr.substring(before.length, expr.length - after.length) else expr
-        val normalizedURI = new URI(absoluteSlashPath + "/" + relativePath).normalize
-        normalizedURI.getPath.substring(1)
-      case _ => ""
+
+        // Remove redundant slashes (//) and up paths (../)
+        val normalizedURI = new URI(parentPath + "/" + relativePath).normalize
+
+        // Strip namespace prefixes since C code uses only local names (for now)
+        val dispatchPath = normalizedURI.getPath.replaceAll("/[^/:]+:", "/")
+
+        // Convert to C struct dot notation without any leading dot
+        val notation = dispatchPath.replace('/', '.').substring(1)
+        notation
+      case _ =>
+        ""

Review comment:
       This line is kind of invisible. Why is returning empty strng ok here? This is the case where the choice is not a direct dispatch choice. I would expect an SDE to be issued here. 
   
   At least add a comment to say why returning "" is ok here. Ie., point out where the error is detected somewhere else. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r749719668



##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       We already have roundtrip parse/unparse tests for a number of schemas that should have sufficient coverage to let us know if something broke in the generated code.  I want to eliminate unnecessary steps to reduce friction for code contributors making changes to the C code generator.    I would rather have the tests update the expected files automatically instead of forcing code contributors to update the expected files manually.  Once the expected files are updated, code contributors will see any changes in `git diff` output and reviewers will see the changes in a pull request if the code contributor follows our code contributor workflow documentation (e.g., always run the tests first).  If a contributor changes some generator code but doesn't run the tests, the worst case is that those changes will appear in someone else's unrelated pull request, which still will alert us that the changes happened (they won't sneak past or avoid our review for too long).
   

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]
+  private val structs = mutable.Stack.empty[ComplexCGState]
+  private val prototypes = mutable.ArrayBuffer.empty[String]
+  private val erds = mutable.ArrayBuffer.empty[String]
+  private val finalStructs = mutable.ArrayBuffer.empty[String]
+  private val finalImplementation = mutable.ArrayBuffer.empty[String]
+
+  // Returns true if the element has not been seen before (checking if
+  // a map already contains the element, otherwise adding it to the map)
+  def elementNotSeenYet(context: ElementBase): Boolean = {
+    val key = context.namedQName.toString
+    val alreadySeen = elementsAlreadySeen.contains(key)
+    if (!alreadySeen) elementsAlreadySeen += (key -> context)
+    !alreadySeen
+  }

Review comment:
       Yes, this code is intended to handle references to global elements (which have unique qnames) from multiple places in a schema (e.g., a common `Header` element which is interpolated into several `....Message` elements with `<xs:element ref="tns:Header"/>`).  Before this change, we were generating redundant C code (duplicate `struct Header` definitions, duplicate ERDs, etc.).  The schema I'm writing is not for public use so I don't want to put it into an example.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r752344644



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       It might also be worth punting this whitespace and checking failing test if generated not updated to another PR, this feels like it could become a big change, and it would be good to get this corner case changes in.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976461723


   > Even better, let's stop comparing the old example files to the new example files and just update them unconditionally during "sbt compile" like we update all the other automatically generated files.
   
   Even if we autocompile, it still doesn't guarantee that those files are committed. Devs still need to `git add` those files. If they use `git add .` or something similar that will happen, but it's dev workflow dependent. I personally use `git add -p` or manually add files I know I've modified--I don't think it would even cross my mind to commit changes I didn't personally make if I was new to this. Since it's still possible that a PR is made without actually updating these files depending on a dev's workflow, I'd prefer that we still have a test somewhere, even if we unconditionally rebuild these when compiling. A one line github action could do this.
   
   > Make "sbt compile" call genExamples automatically, otherwise we have to type genExamples at sbt prompt
   
   This is fine with me. Note that this is a bit different than the existing auto generators. Ideally this would be a `sourceGenerator` or a `resourceGenerator`. But the output of a `sourceGenerator` I think needs to be compilable Scala or Java. And `resourceGenerators` aren't run with sbt compile, they are run some other time during the packaging prcoess. So things are a bit different.
   
   However, things get a bit tricky. We can still trigger a task to run during compilation without using a generator, but we have to be careful about task dependency cycles. The way the genExample needs to work is it essentially just forks a java process and executes the CLI with the right arguments. But this task depends on the CLI (and all of its dependencies being compiled), so this task really can't live in the CLI or any of its dependent subprojects (e.g. `daffodil-runtime2` or the root `daffodil` project), or we'll get circularity issues. That probably also means that what this task generates probably shouldn't be in any of those projects either--even if it works, generating a file inside another subproject feels dangerous and might confuse sbt, or create subtle issues.
   
   So, what if we moved the runtime2 generated example files into the tutorials subproject? The genExamples task would also live in the `tutorials` project, and then the example files are just resources that get generated and committed in the tutorials subproject? 
   
   > Add log4j-core to genExamples' classpath so we don't get "ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console..."
   
   I've got some changes that I think should fix this.
   
   Do you mind if I push the changes to your PR once I get things working?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r754556876



##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]

Review comment:
       The definition of mutable.Map.empty is:
   
   ```scala
   def empty[K, V]: Map[K, V] = new HashMap[K, V]
   ```
   
   The other mutable.\<C\>.empty functions also return new instances.  There is no semantic (only syntactic) difference between empty calls (parentheses are optional), apply calls (parentheses are mandatory), and constructors calls (`new`  is mandatory).  I checked and found all three types of mutable container calls used in Daffodil source code.  Nevertheless, I'll switch back to apply since that idiom is more common and some lines at the end of this same file use apply as well.

##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       Okay, then I want to remove this test and make "sbt compile" update the example code files instead.  We should automatically generate (a.k.a. update) the example code files whenever we build Daffodil, just like the build automatically generates those other files and keeps them up to date after any changes to the source code:
   
   ```scala
   lazy val genManaged = taskKey[Unit]("Generate managed sources and resources")
   lazy val genProps = taskKey[Seq[File]]("Generate properties scala source")
   lazy val genSchemas = taskKey[Seq[File]]("Generated DFDL schemas")
   ```
   
   All I want is to have pull requests include any changes to the example code files automatically.  I don't want any extra manual step.  We have parse and unparse tests which compile and run the new generated files.  These tests will warn us if any code generator backend changes break the generated files, so there is no reason why we need to compare the new generated files to the old files and fail the build even if only one byte differs.  It just would be nice (and helpful for reviewers) to show how code generator backend changes affect the example code files.
   
   Steve (or Mike), what's the easiest way to make "sbt compile" call `updateGeneratedCodeExample` without needing to stage the daffodil CLI?  

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -594,10 +639,10 @@ class CodeGeneratorState {
       s"""// clang-format off
          |#include "generated_code.h"
          |#include <math.h>       // for NAN
-         |#include <stdbool.h>    // for true, false, bool
+         |#include <stdbool.h>    // for false, bool, true
          |#include <stddef.h>     // for NULL, size_t
          |#include <string.h>     // for memset, memcmp
-         |#include "errors.h"     // for Error, PState, UState, ERR_CHOICE_KEY, UNUSED
+         |#include "errors.h"     // for Error, PState, UState, ERR_CHOICE_KEY, Error::(anonymous), UNUSED

Review comment:
       The tool is include-what-you-use, iwyu for short.  It's telling us that we're dereferencing an anonymous union inside the Error struct, which has no union name but does have a member name:
   
   ```c
   typedef struct Error
   {
       uint8_t code;
       union
       {
           int         c;   // for %c
           int64_t     d64; // for %d64
           const char *s;   // for %s
       } arg;
   } Error;
   ```

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -157,30 +178,41 @@ class CodeGeneratorState {
    * - we can store the accessed value in an int64_t local variable safely
    */
   private def choiceDispatchField(context: ElementBase): String = {
-    // We want to use SchemaComponent.scPath but it's private so duplicate it here (for now...)
+    // We want to call SchemaComponent.scPath but it's private so duplicate it here for now
     def scPath(sc: SchemaComponent): Seq[SchemaComponent] = sc.optLexicalParent.map { scPath }.getOrElse(Nil) :+ sc
-    val localNames = scPath(context).map {
-      case er: AbstractElementRef => er.refQName.local
-      case e: ElementBase => e.namedQName.local
-      case ed: GlobalElementDecl => ed.namedQName.local
-      case _ => ""
-    }
-    val absoluteSlashPath = localNames.mkString("/")
-    val dispatchSlashPath = context.complexType.modelGroup match {
+
+    // We handle only direct dispatch choices, so ignore other elements
+    context.complexType.modelGroup match {
       case choice: Choice if choice.isDirectDispatch =>
+        // Get parent path against which to perform up paths
+        val parentNames = scPath(context).map {
+          case _: Root => ""
+          case er: AbstractElementRef => er.refQName.local
+          case e: ElementBase => e.namedQName.local
+          case ed: GlobalElementDecl => ed.namedQName.local
+          case _ => ""
+        }
+        val parentPath = parentNames.mkString("/")
+
+        // Convert expression to a relative path (may have up paths)
         val expr = choice.choiceDispatchKeyEv.expr.toBriefXML()
         val before = "'{xs:string("
         val after = ")}'"
         val relativePath = if (expr.startsWith(before) && expr.endsWith(after))
           expr.substring(before.length, expr.length - after.length) else expr
-        val normalizedURI = new URI(absoluteSlashPath + "/" + relativePath).normalize
-        normalizedURI.getPath.substring(1)
-      case _ => ""
+
+        // Remove redundant slashes (//) and up paths (../)
+        val normalizedURI = new URI(parentPath + "/" + relativePath).normalize
+
+        // Strip namespace prefixes since C code uses only local names (for now)
+        val dispatchPath = normalizedURI.getPath.replaceAll("/[^/:]+:", "/")
+
+        // Convert to C struct dot notation without any leading dot
+        val notation = dispatchPath.replace('/', '.').substring(1)
+        notation
+      case _ =>
+        ""

Review comment:
       Method addBeforeSwitchStatements in this same file calls choiceDispatchField on every complexType (group) element and generates some additional C code only if choiceDispatchField returns a non-blank string:
   
   ```scala
       val dispatchField = choiceDispatchField(context)
       if (dispatchField.nonEmpty) {
   ```
   
   I've moved the empty string to the same line as the case and added a comment explaining why returning "" is OK here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r754605635



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Convert to CRLF on Windows since tests expect System.lineSeparator
+* text=auto

Review comment:
       Even better, let's stop comparing the old example files to the new example files and just update them unconditionally during "sbt compile" like we update all the other automatically generated files.  Then we can either apply this setting only to the examples directory or remove both this setting and the `replace("\n", System.lineSeparator)` calls from CodeGeneratorState's generate methods.  Either way will guarantee that updated example code files are checked in with LFs, not CRLFs.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji edited a comment on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji edited a comment on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-978066543


   I've moved the example files.  Now genExamples won't generate the example files over and over but only when changes are made to the runtime2 Scala sources or example schemas.  This completes my pull request.  Thanks for the second +1!  I'll rebase and merge the PR later today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976697702


   > I don't think anyone would deliberately hold back from committing the modified example files since they must have been making a change to the code generation backend anyway
   
   Maybe not deliberately, but I can definitely imagine accidentally leaving out files. Or someone using an IDE that doesn't use SBT (e.g. eclipse) and so the generator is never even run. If our goal is to have the examples always match what the generator will generate at each commit (which I like from a review perspective), adding a check somewhere to enforce that seems important, especially since there's not other way to know if it was updated or not.
   
   > Do note that incrementing the version number could have to commit the example files too depending on how sbt constructs the classpath.
   
   That's fine with me, it's pretty easy to manually bump version numbers in a couple files, and if we add a check, it will ensure that happens.
   
   Though, another option might be for the generator to generate something like `libcli/version.h` file which defines the `daffodil_program_version` variable. That way the version number is never in a file that is actually committed, so it doesn't matter if things can't find a version or if the version changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r756349629



##########
File path: build.sbt
##########
@@ -339,3 +339,40 @@ lazy val unidocSettings = Seq(
     }
   },
 )
+
+lazy val genExamplesSettings = Seq(
+  Compile / genExamples := {
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc).toSet

Review comment:
       Fair enough, I guess the extra complexity isn't worth it with the unlikeliness of actualy needing a rebuild, and the likelyhood of catching it in the github actions if it was missed. Fine with me. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r752428829



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       Steve, I just saw your comments.  Yes, your sbt code works around the version issue, but so does running the unit test with "sbt test" without needing any changes to build.sbt.  I think having a unit test is nice because it offers a way to quickly test and debug changes to the code generator using a couple of larger example schemas.
   
   And yes, your generator changes look like they would allow the test to pass regardless of which line endings the git checkout uses.  However, there are a lot of multiline strings in runtime2 and we would need to make these `stripMargin.replace("\r\n", "\n")` changes to every place. I think it's a lot easier and simpler to add one line in .gitattributes telling git to check out all of daffodil-runtime as LF regardless of platform or git config settings.  However, I need a second+1 anyway, so Mike (@mbeckerle), will you please tell us which way you prefer?
   
   I also wonder which would give us fewer problems in the long run, telling git to check out all of daffodil as LF with `* text=auto eol=LF` or continuing to call `if (windows)`, `replace("\r\n", "\n")`, `replace("\n", System.lineSeparator)`, `mkString(System.lineSeparator)`, and so on.  I suspect it's probably not worth making major changes to line separators in files or code and we should keep changing as few files and lines of code as possible.  The tests have all passed now, so I think checking out daffodil-runtime2 as LF without changing any other part of daffodil has the least risk and will spare daffodil-runtime2 from needing explicit code to manage line separators in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on pull request #681:
URL: https://github.com/apache/daffodil/pull/681#issuecomment-976606602


   > Even if we autocompile, it still doesn't guarantee that those files are committed. Devs still need to `git add` those files. If they use `git add .` or something similar that will happen, but it's dev workflow dependent. I personally use `git add -p` or manually add files I know I've modified--I don't think it would even cross my mind to commit changes I didn't personally make if I was new to this. Since it's still possible that a PR is made without actually updating these files depending on a dev's workflow, I'd prefer that we still have a test somewhere, even if we unconditionally rebuild these when compiling. A one line github action could do this.
   
   Yes, dev workflows aren't identical.  I don't leave any modified files lying around in my checkout after a push unless I have a very, very good reason for deliberately not committing them.  My own workflow is to run `git diff`, write the commit message in a file while looking at every difference carefully, and then run `git commit -a -F ~/f`.  I use `git add` only if I need to add new files not already in git, and that forces me to `git add .`  everything so `git diff --staged` will show me all the differences. In practice, the generated files will be updates to files already in git and I don't think anyone would deliberately hold back from committing the modified example files since they must have been making a change to the code generation backend anyway.  If we do see contributors leaving out changes to the example files, we can take some action (otherwise, YAGNI).
   
   Do note that incrementing the version number could have to commit the example files too depending on how sbt constructs the classpath.  If it's doable, you could tell sbt to construct the classpath using classes instead of jars so the version number doesn't appear in the example files, but I wouldn't worry about it.
   
   > However, things get a bit tricky. We can still trigger a task to run during compilation without using a generator, but we have to be careful about task dependency cycles. The way the genExample needs to work is it essentially just forks a java process and executes the CLI with the right arguments. But this task depends on the CLI (and all of its dependencies being compiled), so this task really can't live in the CLI or any of its dependent subprojects (e.g. `daffodil-runtime2` or the root `daffodil` project), or we'll get circularity issues. That probably also means that what this task generates probably shouldn't be in any of those projects either--even if it works, generating a file inside another subproject feels dangerous and might confuse sbt, or create subtle issues.
   
   When I was testing object CodeGenerator.main, I had no problem running it manually from IDEA with the runtime configuration offering a dropdown allowing me to use either daffodil-runtime2's classpath (I got the log4j warning) or daffodil-core's classpath (I didn't get the log4j warning but I did get another odd warning).  The dropdown offered me the option of using every daffodil module's classpath including daffodil-cli, but I know for sure that nothing in the runtime2 code generator backend needs anything from daffodil-cli.  Even the version number comes from daffodil-runtime2's own jar.  
   
   > Do you mind if I push the changes to your PR once I get things working?
   
   FYI, I wouldn't have minded, but I've just seen your next comment saying that you've already added a branch to your fork.  I'll pull it, thanks.  I still don't think we need to worry about circular class dependencies, so I might try to keep things contained inside daffodil-runtime2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r750231777



##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       > the worst case is that those changes will appear in someone else's unrelated pull request,
   
   This is what I hope to avoid. I actually find the example updates really helpful when doing code reviews--it's much easier to visualize what's changing compared to looking only at template C-code. Tests failing is a good way to know if the examples were updated to match the code changes and what we are reviewing is accurate.
   
   If the goal is friction reduction, what if we add a new sbt command that runs and updates the examples, e.g. `sbt genExamples`. This sort of matches our existing `genProps` and `genSchemas` sbt commands, the only difference is we expect people to actually commit those files. And then the test to check this could just be a github action that's something like:
   
   ```
   sbt genExamples && git diff --exit-code daffodil-runtime2/
   ``` 
   
   Note that what you have is fine, and I won't block the PR over this. And if you do find this idea ok, it doesn't have to be part of this PR--sbt can be a beast sometimes so this probably won't be trivial to implement.

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]
+  private val structs = mutable.Stack.empty[ComplexCGState]
+  private val prototypes = mutable.ArrayBuffer.empty[String]
+  private val erds = mutable.ArrayBuffer.empty[String]
+  private val finalStructs = mutable.ArrayBuffer.empty[String]
+  private val finalImplementation = mutable.ArrayBuffer.empty[String]
+
+  // Returns true if the element has not been seen before (checking if
+  // a map already contains the element, otherwise adding it to the map)
+  def elementNotSeenYet(context: ElementBase): Boolean = {
+    val key = context.namedQName.toString
+    val alreadySeen = elementsAlreadySeen.contains(key)
+    if (!alreadySeen) elementsAlreadySeen += (key -> context)
+    !alreadySeen
+  }

Review comment:
       Makes sense, maybe renaming something or add a comment just to make it clear that this is only about global elements? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r749649976



##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       Rather than overwriting these files, what are your thoughts on just comparing the generate and expected files and fail the test if they aren't the same? That way if the test fails that means we either broke something or we just forgot to update the expected files.

##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]
+  private val structs = mutable.Stack.empty[ComplexCGState]
+  private val prototypes = mutable.ArrayBuffer.empty[String]
+  private val erds = mutable.ArrayBuffer.empty[String]
+  private val finalStructs = mutable.ArrayBuffer.empty[String]
+  private val finalImplementation = mutable.ArrayBuffer.empty[String]
+
+  // Returns true if the element has not been seen before (checking if
+  // a map already contains the element, otherwise adding it to the map)
+  def elementNotSeenYet(context: ElementBase): Boolean = {
+    val key = context.namedQName.toString
+    val alreadySeen = elementsAlreadySeen.contains(key)
+    if (!alreadySeen) elementsAlreadySeen += (key -> context)
+    !alreadySeen
+  }

Review comment:
       Isn't it possible to have to elements with the same named qname? Or is this about global elements, which I guess must always be unqiue? Or maybe I'm just not understanding this? Maybe an example what what went wrong would be helpful?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] tuxji commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
tuxji commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r750708700



##########
File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/CodeGeneratorState.scala
##########
@@ -36,11 +37,32 @@ import scala.collection.mutable
  * Builds up the state of generated code.
  */
 class CodeGeneratorState {
-  private val structs = mutable.Stack[ComplexCGState]()
-  private val prototypes = mutable.ArrayBuffer[String]()
-  private val erds = mutable.ArrayBuffer[String]()
-  private val finalStructs = mutable.ArrayBuffer[String]()
-  private val finalImplementation = mutable.ArrayBuffer[String]()
+  private val elementsAlreadySeen = mutable.Map.empty[String, ElementBase]
+  private val structs = mutable.Stack.empty[ComplexCGState]
+  private val prototypes = mutable.ArrayBuffer.empty[String]
+  private val erds = mutable.ArrayBuffer.empty[String]
+  private val finalStructs = mutable.ArrayBuffer.empty[String]
+  private val finalImplementation = mutable.ArrayBuffer.empty[String]
+
+  // Returns true if the element has not been seen before (checking if
+  // a map already contains the element, otherwise adding it to the map)
+  def elementNotSeenYet(context: ElementBase): Boolean = {
+    val key = context.namedQName.toString
+    val alreadySeen = elementsAlreadySeen.contains(key)
+    if (!alreadySeen) elementsAlreadySeen += (key -> context)
+    !alreadySeen
+  }

Review comment:
       I've made the comment slightly clearer.

##########
File path: daffodil-runtime2/src/test/scala/org/apache/daffodil/runtime2/TestCodeGenerator.scala
##########
@@ -168,4 +174,40 @@ class TestCodeGenerator {
     assert(ur.isError, "expected ur.isError to be true")
     assert(ur.getDiagnostics.nonEmpty, "expected ur.getDiagnostics to be non-empty")
   }
+
+  private def updateGeneratedCodeExample(schemaFile: os.Path, rootName: Option[String],
+                                         exampleCodeHeader: os.Path, exampleCodeFile: os.Path): Unit = {
+    // Generate code from the example schema file
+    val pf = Compiler().compileFile(schemaFile.toIO, rootName)
+    assert(!pf.isError, pf.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val cg = pf.forLanguage("c")
+    val rootNS = QName.refQNameFromExtendedSyntax(rootName.getOrElse("")).toOption
+    val codeDir = cg.generateCode(rootNS, tempDir.toString)
+    assert(!cg.isError, cg.getDiagnostics.map(_.getMessage()).mkString("\n"))
+    val generatedCodeHeader = codeDir/"libruntime"/"generated_code.h"
+    val generatedCodeFile = codeDir/"libruntime"/"generated_code.c"
+
+    // Replace the example generated code files
+    os.copy(generatedCodeHeader, exampleCodeHeader, replaceExisting = true)
+    os.copy(generatedCodeFile, exampleCodeFile, replaceExisting = true)
+  }

Review comment:
       I've made the test fail if the example generated code files were changed in any way.  The test still updates these files automatically to reduce friction.  All the code contributor has to do is to run "sbt test" before committing changes to a pull request as before, but now their own pull request will fail its tests instead of the changes appearing in someone else's unrelated pull request if they don't remember.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r752337305



##########
File path: .gitattributes
##########
@@ -13,4 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Do not include KEYS in archived source releases
 /KEYS export-ignore 
+
+# Do not convert to CRLF on Windows since tests require LF files
+/**/runtime2/examples/** text=auto eol=lf

Review comment:
       Ugh, git mangling newlines has brought us so many problems.
   
   That said, right now the generator is not consistent about multline strings, since depending on git mangling they could have CR or CRLF. I wonder if our multiline strings need something like this:
   
   ```scala
   val foo = 
     "|some
      |multiline
      |string".stripMargin.replace("\r\n", "\n")
   ```
   
   So that we always output LF regardless of where daffodil is built and what mangling git does?
   
   Also, I wonder if we should consider outputting as the preferred OS line-separator, e.g.
   
   ```scala
   val foo = 
     "|some
      |multiline
      |string".stripMargin.replace("\r\n", "\n").replace("\n", System.lineSeparator)
   val bar = Seq("some", "multline", "string").mkString(System.lineSeparator)
   ```
   If we do this and output the system line separator, I *think* things should just work without needing this .gitattributes line? And it has the benefit that people get code that has newlines matching their system.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [daffodil] stevedlawrence commented on a change in pull request #681: Fix some code generation corner cases

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #681:
URL: https://github.com/apache/daffodil/pull/681#discussion_r756345435



##########
File path: build.sbt
##########
@@ -343,14 +343,38 @@ lazy val unidocSettings = Seq(
 
 lazy val genExamplesSettings = Seq(
   Compile / genExamples := {
-    val cp = (runtime2 / Runtime / dependencyClasspath).value
-    val forkOpts = ForkOptions().withBootJars(cp.files.toVector)
-    val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
-    val args = Seq(mainClass)
-    val ret = Fork.java(forkOpts, args)
-    val stream = streams.value
-    if (ret != 0) {
-      stream.log.error(s"Failed to generate examples")
+    val cp = (runtime2 / Test / dependencyClasspath).value
+    val inSrc = (runtime2 / Compile / sources).value
+    val inRSrc = (runtime2 / Compile / resources).value
+    val inTSrc = (runtime2 / Test / resources).value
+    val stream = (runtime2 / streams).value
+    val filesToWatch = (inSrc ++ inRSrc ++ inTSrc).toSet
+    val cachedFun = FileFunction.cached(stream.cacheDirectory / "genExamples") { _ =>
+      val out = new java.io.ByteArrayOutputStream()
+      val forkOpts = ForkOptions()
+                       .withOutputStrategy(Some(CustomOutput(out)))
+                       .withBootJars(cp.files.toVector)
+      val mainClass = "org.apache.daffodil.runtime2.CodeGenerator"
+      val args = Seq(mainClass)
+      val ret = Fork.java(forkOpts, args)
+      if (ret != 0) {
+        stream.log.error(s"failed to generate example files")
+      }
+      val bis = new java.io.ByteArrayInputStream(out.toByteArray)
+      val isr = new java.io.InputStreamReader(bis)
+      val br = new java.io.BufferedReader(isr)
+      val iterator = Iterator.continually(br.readLine()).takeWhile(_ != null).filterNot(_.startsWith("WARN"))
+      val files = iterator.map { f =>
+        new File(f)
+      }.toSet
+      stream.log.info(s"generated ${files.size} runtime2 example files")

Review comment:
       I think this should get you the right path:
   ```scala
   val outDir = (runtime2 / Test / sourceDirectory).value / "c" / "examples"
   ```
   That gets the value of the `sourceDirectory` setting in the `Test` config (i.e. `src/test`) in the `runtime2` subproject, and then adds the "c" and "examples" sub dirs to that.
   
   If it causes problems with IDE's then it's not a big deal.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org