You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by "stevedlawrence (via GitHub)" <gi...@apache.org> on 2023/02/13 16:54:55 UTC

[GitHub] [daffodil] stevedlawrence commented on a diff in pull request #959: Manually extract runtime2 files out of the jar

stevedlawrence commented on code in PR #959:
URL: https://github.com/apache/daffodil/pull/959#discussion_r1104762150


##########
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/CodeGenerator.scala:
##########
@@ -66,21 +62,23 @@ class CodeGenerator(root: Root) extends DFDL.CodeGenerator {
     os.remove.all(codeDir)
 
     // Copy all the C source files from our resources to our code subdirectory
-    // (using synchronized to avoid calling FileSystems.newFileSystem concurrently)
     val resourceUri = Misc.getRequiredResource(resources)
-    mutex.synchronized {
-      val fileSystem = if (resourceUri.getScheme == "jar") {
-        val env: java.util.Map[String, String] = Collections.emptyMap()
-        FileSystems.newFileSystem(resourceUri, env)
-      } else {
-        null
-      }
-      try {
-        val resourceDir = os.Path(if (fileSystem != null) fileSystem.getPath(resources) else Paths.get(resourceUri))
-        os.copy(resourceDir, codeDir)
-      }
-      finally
-        if (fileSystem != null) fileSystem.close()
+    if (resourceUri.getScheme == "jar") {
+      val jarConnection = resourceUri.toURL.openConnection().asInstanceOf[JarURLConnection]
+      val jarFile = jarConnection.getJarFile()
+      jarFile.entries.asScala
+        .filter { entry => ("/" + entry.getName).startsWith(resources) }
+        .filterNot { entry => entry.isDirectory }
+        .foreach { entry =>
+          val entryPath = "/" + entry.getName
+          val subPath = os.SubPath(entryPath.stripPrefix(resources + "/"))
+          val dstPath = codeDir / subPath
+          val stream = jarFile.getInputStream(entry)
+          os.write(dstPath, stream, createFolders = true)

Review Comment:
   I don't think this solves the problem we are seeing with CI though, where `newFileSystem` throws a `FileSystemAlreadyExistsException`. Here's a recent example:
   
   https://github.com/apache/daffodil/actions/runs/4144366800/jobs/7167330500
   
   Things are synchronized (and everything is currently synchronized vs your suggestion which just synchronizes the `newFileSystem` call), so I wouldn't think this wouldn't be a problem, but this has happened multiple times in GitHub actions. I couldn't reproduce it or figure out the cause though. My best guess is maybe `newFileSystem` is getting called in separate processes where `synchronized` doesn't have an effect, and maybe you can't have the same `newFileSystem` even in separate processes? I don't think our tests run the code generate in separate processes with the recent CLI changes, so that doesn't feel likely, but I'm not sure what else would cause it.
   
   > The advantage of keeping using os.copy is that it will copy the original file's attributes (date/time and permissions).
   
   Can jar files (i.e. zip files) even store permissions? And if it can, do we even want those same permissions? As a user I would expect all generated C files, whether they are copied out of a jar or written by the code generator, to have the same permissions. For example, it would feel odd if I had permissions to delete the generated code files but not the files copied from a jar. Seems files should all be generated with whatever user is calling the generator and with whatever umask is set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org