You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/03/31 22:53:38 UTC

[GitHub] [beam] youngoli opened a new pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

youngoli opened a new pull request #14397:
URL: https://github.com/apache/beam/pull/14397


   Contains a variety of small changes needed to enable cross-language tests to work on Dataflow. Here's a list of changes:
   
   1. Allow running pipelines with portable job submission (required for x-lang transforms).
   2. Allow submitting Dataflow jobs with multiple environments.
   3. Add SdkHarnessContainerImageOverrides flag to allow specifying overrides for multiple environments. This implementation works by calling the flag multiple times, once for each override.
   4. Update the ValidatesRunner script to use the above functionality (portable submission and overrides) as well as uploading a Java SDK container for cross-language in addition to a Go SDK container.
   5. Don't namespace environments in external transform expansion. This avoids bugs that occurred because an environment with an empty value was present in the final pipeline.
   6. Skip fake impulse generation in external transform expansion. Something must have changed on the expansion service end, because this is no longer necessary, and was causing problems because the fake impulse wasn't getting properly removed.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Lang</th>
         <th>SDK</th>
         <th>ULR</th>
         <th>Dataflow</th>
         <th>Flink</th>
         <th>Samza</th>
         <th>Spark</th>
         <th>Twister2</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Go</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/">
             <img src="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>Java</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon">
           </a>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/badge/icon">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Python</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>XLang</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>---</th>
         <th>Java</th>
         <th>Python</th>
         <th>Go</th>
         <th>Website</th>
         <th>Whitespace</th>
         <th>Typescript</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Non-portable</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/">
             <img src="https://camo.githubusercontent.com/4565d7b7e907114e6c1d12323408bd903aa252fefed5eeab93701b05c9628a84/68747470733a2f2f63692d6265616d2e6170616368652e6f72672f6a6f622f6265616d5f507265436f6d6d69745f507974686f6e446f636b65725f43726f6e2f62616467652f69636f6e" alt="Build Status" data-canonical-src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/">
             <img src="https://camo.githubusercontent.com/21afb097a5745440598bee5c59a027b140585eec871c3f3b883200610fabf722/68747470733a2f2f63692d6265616d2e6170616368652e6f72672f6a6f622f6265616d5f507265436f6d6d69745f507974686f6e446f63735f43726f6e2f62616467652f69636f6e" alt="Build Status" data-canonical-src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Portable</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606626542



##########
File path: sdks/go/pkg/beam/runners/dataflow/dataflow.go
##########
@@ -181,11 +183,15 @@ func Execute(ctx context.Context, p *beam.Pipeline) (beam.PipelineResult, error)
 	if err != nil {
 		return nil, err
 	}
-	enviroment, err := graphx.CreateEnvironment(ctx, jobopts.GetEnvironmentUrn(ctx), getContainerImage)
+	environment, err := graphx.CreateEnvironment(ctx, jobopts.GetEnvironmentUrn(ctx), getContainerImage)
 	if err != nil {
 		return nil, errors.WithContext(err, "generating model pipeline")
 	}
-	model, err := graphx.Marshal(edges, &graphx.Options{Environment: enviroment})
+	model, err := graphx.Marshal(edges, &graphx.Options{Environment: environment})
+	if err != nil {
+		return nil, errors.WithContext(err, "generating model pipeline")
+	}
+	err = pipelinex.ApplySdkImageOverrides(model, jobopts.GetSdkImageOverrides())
 	if err != nil {
 		return nil, errors.WithContext(err, "generating model pipeline")

Review comment:
       Done, also updated the one a few lines above.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-811517322


   Run Go PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] lostluck commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
lostluck commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-814221664


   There's light discussion around the format of the override flag happening, but that's not a reason to block this change from getting in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606615602



##########
File path: sdks/go/pkg/beam/core/runtime/pipelinex/replace.go
##########
@@ -243,3 +245,33 @@ func findFreeName(seen map[string]bool, name string) string {
 		}
 	}
 }
+
+// ApplySdkImageOverrides takes a pipeline and a map of patterns to overrides,
+// and proceeds to replace matching ContainerImages in any Environments
+// present in the pipeline.
+func ApplySdkImageOverrides(p *pipepb.Pipeline, patterns map[string]string) error {

Review comment:
       Done. Also adding this to the flag description.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606614546



##########
File path: sdks/go/pkg/beam/core/runtime/pipelinex/replace.go
##########
@@ -243,3 +245,33 @@ func findFreeName(seen map[string]bool, name string) string {
 		}
 	}
 }
+
+// ApplySdkImageOverrides takes a pipeline and a map of patterns to overrides,
+// and proceeds to replace matching ContainerImages in any Environments
+// present in the pipeline.
+func ApplySdkImageOverrides(p *pipepb.Pipeline, patterns map[string]string) error {
+	if len(patterns) == 0 {
+		return nil
+	}
+
+	for _, env := range p.GetComponents().GetEnvironments() {
+		var payload pipepb.DockerPayload
+		if err := proto.Unmarshal(env.GetPayload(), &payload); err != nil {
+			return err
+		}
+		oldImg := payload.GetContainerImage()
+		for pattern, replacement := range patterns {
+			re, err := regexp.Compile(pattern)

Review comment:
       Good point. I went ahead and fixed it anyway, and yeah I'll keep that in mind.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-813678215


   Run Go Flink ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-811527585


   Run GoPortable PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-811554577


   R: @lostluck


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606617381



##########
File path: sdks/go/pkg/beam/core/runtime/xlangx/namespace.go
##########
@@ -81,36 +76,18 @@ func addWindowingStrategyID(c *pipepb.Components, idMap map[string]string, wid s
 	return idMap[wid]
 }
 
-func addEnvironmentID(c *pipepb.Components, idMap map[string]string, eid string, newID func(string) string) string {
-	if _, exists := idMap[eid]; exists {
-		return idMap[eid]
-	}
-
-	environment, exists := c.Environments[eid]
-	if !exists {
-		panic(errors.Errorf("attempted to add namespace to missing windowing strategy id: %v not in %v", eid, c.Environments))
-	}
-
-	idMap[eid] = newID(eid)
-
-	// Updating Environments map
-	c.Environments[idMap[eid]] = environment
-	delete(c.Environments, eid)
-
-	return idMap[eid]
-}
-
 func addNamespace(t *pipepb.PTransform, c *pipepb.Components, namespace string) {
 	newID := func(id string) string {
 		return fmt.Sprintf("%v@%v", id, namespace)
 	}
 
 	idMap := make(map[string]string)
 
-	// Update Environment ID of PTransform
-	if t.EnvironmentId != "" {
-		t.EnvironmentId = addEnvironmentID(c, idMap, t.EnvironmentId, newID)
-	}
+	// TODO: Currently environments are not namespaced. This works under the

Review comment:
       Changed it to a "note". That makes more sense than making a JIRA for the vague possibility of multiple Go environments.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli merged pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli merged pull request #14397:
URL: https://github.com/apache/beam/pull/14397


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] lostluck commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
lostluck commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606357992



##########
File path: sdks/go/pkg/beam/core/runtime/pipelinex/replace.go
##########
@@ -243,3 +245,33 @@ func findFreeName(seen map[string]bool, name string) string {
 		}
 	}
 }
+
+// ApplySdkImageOverrides takes a pipeline and a map of patterns to overrides,
+// and proceeds to replace matching ContainerImages in any Environments
+// present in the pipeline.
+func ApplySdkImageOverrides(p *pipepb.Pipeline, patterns map[string]string) error {
+	if len(patterns) == 0 {
+		return nil
+	}
+
+	for _, env := range p.GetComponents().GetEnvironments() {
+		var payload pipepb.DockerPayload
+		if err := proto.Unmarshal(env.GetPayload(), &payload); err != nil {
+			return err
+		}
+		oldImg := payload.GetContainerImage()
+		for pattern, replacement := range patterns {
+			re, err := regexp.Compile(pattern)

Review comment:
       No action required, just a call out to be wary of nested for loops and repeated work.
   
   I think my main concern here is that we're compiling the patterns multiple times (each once per environment at most), which is probably not going to be terrible since this is at construction time, and both patterns and environments are going to be limited.

##########
File path: sdks/go/pkg/beam/core/runtime/xlangx/namespace.go
##########
@@ -81,36 +76,18 @@ func addWindowingStrategyID(c *pipepb.Components, idMap map[string]string, wid s
 	return idMap[wid]
 }
 
-func addEnvironmentID(c *pipepb.Components, idMap map[string]string, eid string, newID func(string) string) string {
-	if _, exists := idMap[eid]; exists {
-		return idMap[eid]
-	}
-
-	environment, exists := c.Environments[eid]
-	if !exists {
-		panic(errors.Errorf("attempted to add namespace to missing windowing strategy id: %v not in %v", eid, c.Environments))
-	}
-
-	idMap[eid] = newID(eid)
-
-	// Updating Environments map
-	c.Environments[idMap[eid]] = environment
-	delete(c.Environments, eid)
-
-	return idMap[eid]
-}
-
 func addNamespace(t *pipepb.PTransform, c *pipepb.Components, namespace string) {
 	newID := func(id string) string {
 		return fmt.Sprintf("%v@%v", id, namespace)
 	}
 
 	idMap := make(map[string]string)
 
-	// Update Environment ID of PTransform
-	if t.EnvironmentId != "" {
-		t.EnvironmentId = addEnvironmentID(c, idMap, t.EnvironmentId, newID)
-	}
+	// TODO: Currently environments are not namespaced. This works under the

Review comment:
       Consider including the JIRA tag for history context, or instead of TODO (which implies work that should likely be done) use Note: or avoid a prefix entirely, since all comments are notes to our future selves.
   
   As for the content, it seems probable that we'd make sure the Go ExpansionService that handles these calls would do the namespacing on the response itself. That way it can build the pipeline graph normally, and we simply run the replacement there, where we are already certain that it's a foreign component, rather than the primary pipeline.

##########
File path: sdks/go/pkg/beam/core/runtime/pipelinex/replace.go
##########
@@ -243,3 +245,33 @@ func findFreeName(seen map[string]bool, name string) string {
 		}
 	}
 }
+
+// ApplySdkImageOverrides takes a pipeline and a map of patterns to overrides,
+// and proceeds to replace matching ContainerImages in any Environments
+// present in the pipeline.
+func ApplySdkImageOverrides(p *pipepb.Pipeline, patterns map[string]string) error {

Review comment:
       Consider documenting the expectation that any given environment is expected to match only a single pattern for replacement, and that if multiple patterns would match, it's arbitrary which will be applied (due to map iteration ordering being random.)
   
   There's no good way to handle such conflict cases for multple matches. I suspect the most we can do is say it's undefined in the flag itself, as well as the commentary change in the previous paragraph.

##########
File path: sdks/go/pkg/beam/options/jobopts/options.go
##########
@@ -31,6 +31,18 @@ import (
 	"github.com/apache/beam/sdks/go/pkg/beam/log"
 )
 
+func init() {
+	flag.Var(&SdkHarnessContainerImageOverrides,
+		"sdk_harness_container_image_override",
+		"Overrides for SDK harness container images. Could be for the "+
+			"local SDK or for a remote SDK that pipeline has to support due "+
+			"to a cross-language transform. Each entry consist of two values "+
+			"separated by a comma where first value gives a regex to "+
+			"identify the container image to override and the second value "+
+			"gives the replacement container image. Multiple entries can be "+
+			"specified by using this flag multiple times.")

Review comment:
       Just confirming, this part matches the semantics of the equivalent flag in the other SDKs? 

##########
File path: sdks/go/pkg/beam/options/jobopts/options.go
##########
@@ -31,6 +31,18 @@ import (
 	"github.com/apache/beam/sdks/go/pkg/beam/log"
 )
 
+func init() {
+	flag.Var(&SdkHarnessContainerImageOverrides,
+		"sdk_harness_container_image_override",
+		"Overrides for SDK harness container images. Could be for the "+
+			"local SDK or for a remote SDK that pipeline has to support due "+
+			"to a cross-language transform. Each entry consist of two values "+

Review comment:
       ```suggestion
   			"to a cross-language transform. Each entry consists of two values "+
   ```

##########
File path: sdks/go/pkg/beam/runners/dataflow/dataflow.go
##########
@@ -181,11 +183,15 @@ func Execute(ctx context.Context, p *beam.Pipeline) (beam.PipelineResult, error)
 	if err != nil {
 		return nil, err
 	}
-	enviroment, err := graphx.CreateEnvironment(ctx, jobopts.GetEnvironmentUrn(ctx), getContainerImage)
+	environment, err := graphx.CreateEnvironment(ctx, jobopts.GetEnvironmentUrn(ctx), getContainerImage)
 	if err != nil {
 		return nil, errors.WithContext(err, "generating model pipeline")
 	}
-	model, err := graphx.Marshal(edges, &graphx.Options{Environment: enviroment})
+	model, err := graphx.Marshal(edges, &graphx.Options{Environment: environment})
+	if err != nil {
+		return nil, errors.WithContext(err, "generating model pipeline")
+	}
+	err = pipelinex.ApplySdkImageOverrides(model, jobopts.GetSdkImageOverrides())
 	if err != nil {
 		return nil, errors.WithContext(err, "generating model pipeline")

Review comment:
       ```suggestion
   		return nil, errors.WithContext(err, "applying container image overrides")
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606616857



##########
File path: sdks/go/pkg/beam/options/jobopts/options.go
##########
@@ -31,6 +31,18 @@ import (
 	"github.com/apache/beam/sdks/go/pkg/beam/log"
 )
 
+func init() {
+	flag.Var(&SdkHarnessContainerImageOverrides,
+		"sdk_harness_container_image_override",
+		"Overrides for SDK harness container images. Could be for the "+
+			"local SDK or for a remote SDK that pipeline has to support due "+
+			"to a cross-language transform. Each entry consist of two values "+
+			"separated by a comma where first value gives a regex to "+
+			"identify the container image to override and the second value "+
+			"gives the replacement container image. Multiple entries can be "+
+			"specified by using this flag multiple times.")

Review comment:
       I haven't been able to find the documentation for the Java version of this flag (despite being relatively sure it exists, based on [this line](https://github.com/apache/beam/blob/76c49ac276fc804ccc058c901dc6cb9e063b2529/runners/google-cloud-dataflow-java/build.gradle#L345)), but the Python version works identically, it requires you to use the flag once per entry.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on a change in pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on a change in pull request #14397:
URL: https://github.com/apache/beam/pull/14397#discussion_r606618575



##########
File path: sdks/go/pkg/beam/options/jobopts/options.go
##########
@@ -31,6 +31,18 @@ import (
 	"github.com/apache/beam/sdks/go/pkg/beam/log"
 )
 
+func init() {
+	flag.Var(&SdkHarnessContainerImageOverrides,
+		"sdk_harness_container_image_override",
+		"Overrides for SDK harness container images. Could be for the "+
+			"local SDK or for a remote SDK that pipeline has to support due "+
+			"to a cross-language transform. Each entry consist of two values "+

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] youngoli commented on pull request #14397: [BEAM-11574] Enable cross-language integration tests on Dataflow

Posted by GitBox <gi...@apache.org>.
youngoli commented on pull request #14397:
URL: https://github.com/apache/beam/pull/14397#issuecomment-811524420


   Run Go PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org