You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/19 01:10:51 UTC

[GitHub] [beam] lukecwik commented on a change in pull request #11039: [BEAM-9383] Staging Dataflow artifacts from environment

lukecwik commented on a change in pull request #11039:
URL: https://github.com/apache/beam/pull/11039#discussion_r426973945



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -784,7 +877,25 @@ public DataflowPipelineJob run(Pipeline pipeline) {
         "Executing pipeline on the Dataflow Service, which will have billing implications "
             + "related to Google Compute Engine usage and other Google Cloud Services.");
 
-    List<DataflowPackage> packages = options.getStager().stageDefaultFiles();
+    // Capture the sdkComponents for look up during step translations
+    SdkComponents sdkComponents = SdkComponents.create();
+
+    DataflowPipelineOptions dataflowOptions = options.as(DataflowPipelineOptions.class);
+    String workerHarnessContainerImageURL = DataflowRunner.getContainerImageForJob(dataflowOptions);
+    RunnerApi.Environment defaultEnvironmentForDataflow =
+        Environments.createDockerEnvironment(workerHarnessContainerImageURL);
+
+    sdkComponents.registerEnvironment(
+        defaultEnvironmentForDataflow
+            .toBuilder()
+            .addAllDependencies(getDefaultArtifacts())

Review comment:
       We also need to make sure we have the capabilities set, I have this PR: https://github.com/apache/beam/pull/11748




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org