You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/27 03:54:00 UTC

[GitHub] [beam] ihji opened a new pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

ihji opened a new pull request #11814:
URL: https://github.com/apache/beam/pull/11814


   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] ihji commented on a change in pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
ihji commented on a change in pull request #11814:
URL: https://github.com/apache/beam/pull/11814#discussion_r432099933



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
##########
@@ -397,10 +397,21 @@ public static PackageAttributes forFileToStage(
             String.format("Non-existent file to stage: %s", file.getAbsolutePath()));
       }
       checkState(!file.isDirectory(), "Source file must not be a directory.");
+      String target;
+      switch (dest) {
+        case "dataflow-worker.jar":

Review comment:
       Put some comments and logging




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635677737






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635674009


   I don't think so. We changed from using "key=value" strings to StagedFile objects in https://github.com/apache/beam/pull/11039/files.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #11814:
URL: https://github.com/apache/beam/pull/11814#discussion_r431387079



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
##########
@@ -397,10 +397,21 @@ public static PackageAttributes forFileToStage(
             String.format("Non-existent file to stage: %s", file.getAbsolutePath()));
       }
       checkState(!file.isDirectory(), "Source file must not be a directory.");
+      String target;
+      switch (dest) {
+        case "dataflow-worker.jar":

Review comment:
       I'm not sure why we need special treatment for Dataflow jars here. Behavior for these jars should be similar for any other jars uploaded for Dataflow IMO.
   
   @lukecwik WDYT ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635615410


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] steveniemitz commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
steveniemitz commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635015560


   I just tested this change out, it seems like things work as they did before.  The only difference I noticed is that my `filesToStage` are slightly different.
   
   Previously they looked like:
   ```
   dataflow-worker.jar=<my dataflow jar>,
   <some other jar>
   ```
   
   however now they're just:
   ```
   <my dataflow jar>,
   <some other jar>
   ```
   
   It seems like the job launches correctly and uses the jar I set, so it seems like it's still working correctly, but I'm not sure if that missing might cause other issues down the road.  Also, the names are deterministic again which is great!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit merged pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #11814:
URL: https://github.com/apache/beam/pull/11814


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-634085087


   Did you mean review https://github.com/apache/beam/pull/11813 and not this one ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635591331






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-634750348


   Can you rebase this since https://github.com/apache/beam/pull/11813 was merged ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] steveniemitz edited a comment on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
steveniemitz edited a comment on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635015560


   I just tested this change out, it seems like things work as they did before in 2.20.  The only difference I noticed is that my `filesToStage` are slightly different.
   
   Previously they looked like:
   ```
   dataflow-worker.jar=<my dataflow jar>,
   <some other jar>
   ```
   
   however now they're just:
   ```
   <my dataflow jar>,
   <some other jar>
   ```
   
   It seems like the job launches correctly and uses the jar I set, so it seems like it's still working correctly, but I'm not sure if that missing might cause other issues down the road.  Also, the names are deterministic again which is great!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635672193


   @chamikaramj, @ihji is the `filesToStage` change a problem?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] ihji commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
ihji commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-634882179


   Rebased. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #11814:
URL: https://github.com/apache/beam/pull/11814#discussion_r432014418



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
##########
@@ -397,10 +397,21 @@ public static PackageAttributes forFileToStage(
             String.format("Non-existent file to stage: %s", file.getAbsolutePath()));
       }
       checkState(!file.isDirectory(), "Source file must not be a directory.");
+      String target;
+      switch (dest) {
+        case "dataflow-worker.jar":

Review comment:
       Can we add a log statement describing why we need special treatment for these jars ? I see these jars being specifically being referenced to in Dataflow boot.go files.May be that's why ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] ihji commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
ihji commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-633786896






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #11814: [BEAM-10078] uniquify Dataflow specific jars when staging

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #11814:
URL: https://github.com/apache/beam/pull/11814#issuecomment-635695779


   Run Java PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org