You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/03/18 21:29:31 UTC

[GitHub] [beam] kennknowles opened a new pull request #14276: Only use beam_fn_api experiment, since otherwise UW does not work anyhow

kennknowles opened a new pull request #14276:
URL: https://github.com/apache/beam/pull/14276


   The eliminates the branching logic in the runner v1 overrides. It puts the runner v2 overrides into a separate method and applies them before producing the proto for portable job submission.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [x] Update `CHANGES.md` with noteworthy changes.
    - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | ---
   Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam
 .apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.a
 pache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam
 .apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/) | ---
   XLang | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/) | ---
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website | Whitespace | Typescript
   --- | --- | --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/) <br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/be
 am_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/)
   Portable | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | --- | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #14276:
URL: https://github.com/apache/beam/pull/14276#discussion_r613575491



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -2344,31 +2300,22 @@ static String getContainerImageForJob(DataflowPipelineOptions options) {
   }
 
   static boolean useUnifiedWorker(DataflowPipelineOptions options) {
-    return hasExperiment(options, "beam_fn_api")
-        || hasExperiment(options, "use_runner_v2")
-        || hasExperiment(options, "use_unified_worker");
+    return hasExperiment(options, "beam_fn_api");

Review comment:
       Done. I suppose soon we simply delete this method.

##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -2344,31 +2300,22 @@ static String getContainerImageForJob(DataflowPipelineOptions options) {
   }
 
   static boolean useUnifiedWorker(DataflowPipelineOptions options) {
-    return hasExperiment(options, "beam_fn_api")
-        || hasExperiment(options, "use_runner_v2")
-        || hasExperiment(options, "use_unified_worker");
+    return hasExperiment(options, "beam_fn_api");
   }
 
   static boolean useStreamingEngine(DataflowPipelineOptions options) {
     return hasExperiment(options, GcpOptions.STREAMING_ENGINE_EXPERIMENT)
         || hasExperiment(options, GcpOptions.WINDMILL_SERVICE_EXPERIMENT);
   }
 
-  static void verifyDoFnSupported(
-      DoFn<?, ?> fn, boolean streaming, boolean workerV2, boolean streamingEngine) {
+  static void verifyDoFnSupported(DoFn<?, ?> fn, boolean streaming, boolean streamingEngine) {
     if (streaming && DoFnSignatures.requiresTimeSortedInput(fn)) {
       throw new UnsupportedOperationException(
           String.format(
               "%s does not currently support @RequiresTimeSortedInput in streaming mode.",
               DataflowRunner.class.getSimpleName()));
     }
     if (DoFnSignatures.usesSetState(fn)) {
-      if (workerV2) {

Review comment:
       Done

##########
File path: runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
##########
@@ -1334,57 +1329,6 @@ public void testTransformTranslator() throws IOException {
     assertTrue(transform.translated);
   }
 
-  @Test
-  public void testSdkHarnessConfiguration() throws IOException {

Review comment:
       When I wrote this, I thought this was testing an unused case. Is that not true? I have restored the test, to investigate if needed...

##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -1005,7 +961,7 @@ public DataflowPipelineJob run(Pipeline pipeline) {
         options.getStager().stageToFile(serializedProtoPipeline, PIPELINE_FILE_NAME);
     dataflowOptions.setPipelineUrl(stagedPipeline.getLocation());
     // Now rewrite things to be as needed for v1 (mutates the pipeline)
-    replaceTransforms(pipeline);
+    replaceV1Transforms(pipeline);

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810753436


   run dataflow validatesrunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804356168


   Run Java Examples on Dataflow Runner V2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810610074






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804355565






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #14276:
URL: https://github.com/apache/beam/pull/14276#discussion_r605987941



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -2344,31 +2300,22 @@ static String getContainerImageForJob(DataflowPipelineOptions options) {
   }
 
   static boolean useUnifiedWorker(DataflowPipelineOptions options) {
-    return hasExperiment(options, "beam_fn_api")
-        || hasExperiment(options, "use_runner_v2")
-        || hasExperiment(options, "use_unified_worker");
+    return hasExperiment(options, "beam_fn_api");
   }
 
   static boolean useStreamingEngine(DataflowPipelineOptions options) {
     return hasExperiment(options, GcpOptions.STREAMING_ENGINE_EXPERIMENT)
         || hasExperiment(options, GcpOptions.WINDMILL_SERVICE_EXPERIMENT);
   }
 
-  static void verifyDoFnSupported(
-      DoFn<?, ?> fn, boolean streaming, boolean workerV2, boolean streamingEngine) {
+  static void verifyDoFnSupported(DoFn<?, ?> fn, boolean streaming, boolean streamingEngine) {
     if (streaming && DoFnSignatures.requiresTimeSortedInput(fn)) {
       throw new UnsupportedOperationException(
           String.format(
               "%s does not currently support @RequiresTimeSortedInput in streaming mode.",
               DataflowRunner.class.getSimpleName()));
     }
     if (DoFnSignatures.usesSetState(fn)) {
-      if (workerV2) {

Review comment:
       Why do we want to remove SetState and MapState check here and below? There is no SDK support over fnapi yet so we should fail here. When we have SDK support, we can remove this check even Dataflow doesn't support.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-820723699


   If you know whether that test case still tests live code, that answer would be helpful. But otherwise wait and I will ping when I get it green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810753500


   Run Dataflow Streaming ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #14276: Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-803213400


   It seems like PubSubRead starts passing from this change: https://ci-beam.apache.org/job/beam_PostCommit_Java_PR/623/testReport/junit/org.apache.beam.sdk.io.gcp.pubsub/PubsubReadIT/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles closed pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles closed pull request #14276:
URL: https://github.com/apache/beam/pull/14276


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810753113


   Run PostCommit_Java_Dataflow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-811163543


   The merge conflict is change of gRPC version and doesn't block review. Please take a look while it is green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #14276:
URL: https://github.com/apache/beam/pull/14276#discussion_r617775304



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchStatefulParDoOverrides.java
##########
@@ -172,11 +172,7 @@ private MultiOutputOverrideFactory(boolean isFnApi) {
       verifyFnIsStateful(fn);
       DataflowPipelineOptions options =
           input.getPipeline().getOptions().as(DataflowPipelineOptions.class);
-      DataflowRunner.verifyDoFnSupported(
-          fn,
-          false,
-          DataflowRunner.useUnifiedWorker(options),

Review comment:
       Within this change, we can only fail the job with unsupported stateful dofn after the job has been created.

##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -499,45 +493,34 @@ protected DataflowRunner(DataflowPipelineOptions options) {
           PTransformOverride.of(
               PTransformMatchers.writeWithRunnerDeterminedSharding(),
               new StreamingShardedWriteFactory(options)));
-      if (fnApiEnabled) {
-        overridesBuilder.add(
-            PTransformOverride.of(
-                PTransformMatchers.classEqualTo(Create.Values.class),
-                new StreamingFnApiCreateOverrideFactory()));
-      }
 
       overridesBuilder.add(
           PTransformOverride.of(
               PTransformMatchers.groupWithShardableStates(),
               new GroupIntoBatchesOverride.StreamingGroupIntoBatchesWithShardedKeyOverrideFactory(
                   this)));
 
-      if (!fnApiEnabled) {
-        overridesBuilder
-            .add(
-                // Streaming Bounded Read is implemented in terms of Streaming Unbounded Read, and
-                // must precede it
-                PTransformOverride.of(
-                    PTransformMatchers.classEqualTo(Read.Bounded.class),
-                    new StreamingBoundedReadOverrideFactory()))
-            .add(
-                PTransformOverride.of(
-                    PTransformMatchers.classEqualTo(Read.Unbounded.class),
-                    new StreamingUnboundedReadOverrideFactory()));
-      }
+      overridesBuilder
+          .add(
+              // Streaming Bounded Read is implemented in terms of Streaming Unbounded Read, and
+              // must precede it
+              PTransformOverride.of(
+                  PTransformMatchers.classEqualTo(Read.Bounded.class),
+                  new StreamingBoundedReadOverrideFactory()))
+          .add(
+              PTransformOverride.of(
+                  PTransformMatchers.classEqualTo(Read.Unbounded.class),
+                  new StreamingUnboundedReadOverrideFactory()));
+
+      overridesBuilder.add(
+          PTransformOverride.of(
+              PTransformMatchers.classEqualTo(View.CreatePCollectionView.class),
+              new StreamingCreatePCollectionViewFactory()));
 
-      if (!fnApiEnabled) {
-        overridesBuilder.add(
-            PTransformOverride.of(
-                PTransformMatchers.classEqualTo(View.CreatePCollectionView.class),
-                new StreamingCreatePCollectionViewFactory()));
-      }
       // Dataflow Streaming runner overrides the SPLITTABLE_PROCESS_KEYED transform
       // natively in the Dataflow service.
     } else {
-      if (!fnApiEnabled) {
-        overridesBuilder.add(SplittableParDo.PRIMITIVE_BOUNDED_READ_OVERRIDE);
-      }
+      overridesBuilder.add(SplittableParDo.PRIMITIVE_BOUNDED_READ_OVERRIDE);

Review comment:
       I just noticed that we only swap SDF wrapper out when in batch and we are using SDF wrapper in streaming on runner_v1 now. Though runner_v1 supports Splittable DoFn, should we swap the read for streaming as well?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-802345834


   Run Java_Examples_Dataflow PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804367937


   Run Java Examples on Dataflow Runner V2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-803141810


   run java postcommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804356405


   Yes, they all occur in `visitValue` in the translator, which still had different code paths for runner v1 and runner v2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #14276:
URL: https://github.com/apache/beam/pull/14276#discussion_r615116090



##########
File path: runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
##########
@@ -1334,57 +1329,6 @@ public void testTransformTranslator() throws IOException {
     assertTrue(transform.translated);
   }
 
-  @Test
-  public void testSdkHarnessConfiguration() throws IOException {

Review comment:
       This test seems to check whether the environment id and docker url are populated correctly in beam fnapi proto. I think we want to keep this test. But we can remove `use_runner_v2 ` from the test setup because of the dual job submission.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810644538


   Run PostCommit_Java_DataflowV2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804192542


   Actually perhaps the thing to do here is to not invoke translation at all in the runner v2 path.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-826157346


   Obsoleted by #14614 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810619185


   Run PostCommit_Java_DataflowV2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-820098514


   Do you want another around of review now or you still want to have more changes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804360203


   Run Java_Examples_Dataflow PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-826157346


   Obsoleted by #14614 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804191261


   Yea, but I think the graph is getting corrupted in some ways. I will try to figure it out. To get Pubsub fixed quickly I may just do a binary search of the various changes to see if some of them can go in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #14276:
URL: https://github.com/apache/beam/pull/14276#discussion_r606004515



##########
File path: runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
##########
@@ -1334,57 +1329,6 @@ public void testTransformTranslator() throws IOException {
     assertTrue(transform.translated);
   }
 
-  @Test
-  public void testSdkHarnessConfiguration() throws IOException {

Review comment:
       Any reason that we want to remove this test case?

##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -2344,31 +2300,22 @@ static String getContainerImageForJob(DataflowPipelineOptions options) {
   }
 
   static boolean useUnifiedWorker(DataflowPipelineOptions options) {
-    return hasExperiment(options, "beam_fn_api")
-        || hasExperiment(options, "use_runner_v2")
-        || hasExperiment(options, "use_unified_worker");
+    return hasExperiment(options, "beam_fn_api");

Review comment:
       I think we still want to keep the `use_runner_v2` and `use_unified_worker`?

##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -1005,7 +961,7 @@ public DataflowPipelineJob run(Pipeline pipeline) {
         options.getStager().stageToFile(serializedProtoPipeline, PIPELINE_FILE_NAME);
     dataflowOptions.setPipelineUrl(stagedPipeline.getLocation());
     // Now rewrite things to be as needed for v1 (mutates the pipeline)
-    replaceTransforms(pipeline);
+    replaceV1Transforms(pipeline);

Review comment:
       Maybe more comments on why we still want to generate runner v1 v1b3 proto?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804369601


   Run Java Examples on Dataflow Runner V2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810753321






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-810490513






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles closed pull request #14276: [BEAM-12021] Remove portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
kennknowles closed pull request #14276:
URL: https://github.com/apache/beam/pull/14276


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #14276: [BEAM-12021] Split portability overrides from runner v1 overrides in DataflowRunner

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #14276:
URL: https://github.com/apache/beam/pull/14276#issuecomment-804355821


   I was checking the failures last week. It seems like they are related to side input, but I'm not sure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org