Posted to github@beam.apache.org by "lostluck (via GitHub)" <gi...@apache.org> on 2023/02/21 22:56:09 UTC

[GitHub] [beam] lostluck opened a new issue, #25582: [Feature Request]: Supply classpaths to java using a pathing jar, to avoid too long command lines.

lostluck opened a new issue, #25582:
URL: https://github.com/apache/beam/issues/25582

   ### What would you like to happen?
   
   Occasionally, a Java SDK worker harness is unable to boot because its command line is extremely long, owing to the list of files to include on the classpath.
   
   https://github.com/apache/beam/blob/master/sdks/java/container/boot.go#L236
   
   This can be difficult for end users to work around, since they are generally unaware of dependent jars and the like, which vary with pipeline construction.
   
   A solution exists: use another jar, known as a pathing jar, which simply refers to the desired contents of the class path. Unfortunately, this is required instead of the more modern solution of passing an @argfile parameter when starting the JVM, because the Beam container needs to support Java versions back to Java 8, and @argfile wasn't introduced until Java 9.
   
   The pathing jar would need to be authored in Go, so that it can be built from the contents of the provisioned manifest. Unfortunately, there doesn't seem to be existing support for this as a Go package: at https://pkg.go.dev/search?q=jar+java&m= the tools are largely about reading jars rather than creating them.
   
   Fortunately, the [Jar specification](https://docs.oracle.com/javase/7/docs/technotes/guides/jar/jar.html) is relatively simple, as a jar is generally a Zip file with a META-INF directory.
   
   Authoring zip files in Go is robustly supported in the Go standard library: https://pkg.go.dev/archive/zip.
   
   The proposal is to build such a pathing jar in memory from the existing local artifacts, write it out, and then use that as the single class path parameter when invoking java.
   
   ------
   For reference:
   
   Gradle itself has support for building pathing jars.
   
   https://github.com/gradle/gradle/pull/10544/files#diff-bda9c25c55281a1f596c7e7892ce79631e74ac6eef32fe06fef664da23759c62R349
   
   https://stackoverflow.com/questions/5434482/how-can-i-create-a-pathing-jar-in-gradle
   
   Java natively can create jars...
   https://docs.oracle.com/javase/7/docs/api/java/util/jar/JarOutputStream.html#JarOutputStream(java.io.OutputStream)
   
   Apparently a "pathing jar" requires the files listed to be relative to the location of the jar. That's not too bad. However, it's a matter of creating the appropriate manifest in the Go boot script (if not on the service side).
   
   
   ### Issue Priority
   
   Priority: 2 (default / most feature requests should be filed as P2)
   
   ### Issue Components
   
   - [ ] Component: Python SDK
   - [X] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck closed issue #25582: [Feature Request]: Supply classpaths to java using a pathing jar, to avoid too long command lines.

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck closed issue #25582: [Feature Request]: Supply classpaths to java using a pathing jar, to avoid too long command lines.
URL: https://github.com/apache/beam/issues/25582




[GitHub] [beam] lostluck commented on issue #25582: [Feature Request]: Supply classpaths to java using a pathing jar, to avoid too long command lines.

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on issue #25582:
URL: https://github.com/apache/beam/issues/25582#issuecomment-1448667249

   I'm on vacation until mid-March, so if this blocks you, the current workaround is to build uber jars and use those as your dependencies.
   
   E.g., from Stack Overflow:
   https://stackoverflow.com/questions/52208667/create-an-uber-jar-for-dataflow-and-apache-beam
   
   I have no idea about uberjars generally. Don't ask me.
   I'm the Go guy and since the containers are in Go, I get the fun task of synthesizing a pathing jar in the harness. 

