You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/03/25 17:51:51 UTC

[GitHub] [beam] robertwb opened a new pull request #11222: [BEAM-4150] Don't window PCollection coders.

robertwb opened a new pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222
 
 
   Now that no SDKs require it.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398793631
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   lines 398-399 already do this check as part of the `length_prefixed_coder` method.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604619996
 
 
   As forusing the input/output PCollection names, rather than the GRPC ids, that's a good question. Things are structured this way is mostly to follow the existing code, where things are indexed by these PCollections. In particular,
   
       ... -> pcIn -> GBK -> pcOut -> ...
   
   gets replaced by
   
      ... -> pcIn -> GrpcOut
      GrpcIn -> pcOut -> ...
   
   but the original GBK transform itself is kept around (with its references to pcIn and pcOut--it's the object that "connects" the two) to construct the appropriate grouping buffer. This could be refactored, and may be if GBK becomes a full-fledged operator to make it more streamy, but that seemed out of scope for this change. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398677414
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   Note this check is already done within `length_prefix_coder` where it will return the original coder id if its a safe coder so you 
   
   ```suggestion
       channel_coder = self.length_prefixed_coder(channel_coder)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398221421
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##########
 @@ -882,66 +885,13 @@ def _add_residuals_and_channel_splits_to_deferred_inputs(
         prev_stops[
             channel_split.transform_id] = channel_split.last_primary_element
 
-  @staticmethod
 
 Review comment:
   This was an extra copy of the method that wasn't used anywhere.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] pabloem commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
pabloem commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604580447
 
 
   fwiw the precommit breakages are unrelated, and already fixed in master.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604507596
 
 
   Run Python2_PVR_Flink PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398678099
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##########
 @@ -426,7 +423,8 @@ def _collect_written_timers_and_add_to_deferred_inputs(
       pipeline_components,  # type: beam_runner_api_pb2.Components
       stage,  # type: translations.Stage
       bundle_context_manager,  # type: execution.BundleContextManager
-      deferred_inputs  # type: MutableMapping[str, PartitionableBuffer]
+      deferred_inputs,  # type: MutableMapping[str, PartitionableBuffer]
+      data_channel_coders,
 
 Review comment:
   If possible try and keep the typing information on methods.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604788204
 
 
   Run PythonDocker PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398789812
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##########
 @@ -426,7 +423,8 @@ def _collect_written_timers_and_add_to_deferred_inputs(
       pipeline_components,  # type: beam_runner_api_pb2.Components
       stage,  # type: translations.Stage
       bundle_context_manager,  # type: execution.BundleContextManager
-      deferred_inputs  # type: MutableMapping[str, PartitionableBuffer]
+      deferred_inputs,  # type: MutableMapping[str, PartitionableBuffer]
+      data_channel_coders,
 
 Review comment:
   Thanks for calling me out on this. The types themselves aren't very informative as to their meaning, but I added it here and elsewhere. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604646485
 
 
   Task 'pythonDockerBuildPreCommit' not found in root project 'beam'.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398808389
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   Yes, but here `pcoll.coder_id != channel_coder`. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604646548
 
 
   Run Python PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604583109
 
 
   > fwiw the precommit breakages are unrelated, and already fixed in master.
   
   Thats what I assumed when I looked at the failure causes so I kicked them off again.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604507442
 
 
   Run Portable_Python PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604732125
 
 
   Run PythonDocker PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604623844
 
 
   Yeah, I was just surprised about the choice, nothing really wrong with being PCollection vs PTransform centric.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] pabloem commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
pabloem commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604744648
 
 
   may need to rebase to get passing Docker PreCommit?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604507872
 
 
   Run PythonLint PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398677414
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   Note this check is already done within `length_prefix_coder` where it will return the original coder id if its a safe coder so you can always use it.
   
   ```suggestion
       channel_coder = self.length_prefixed_coder(channel_coder)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604582850
 
 
   Run Python PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398790807
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   Here I want to populate the safe_coders mapping for channel_coder iff the value coder has such a mapping. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604507514
 
 
   Run Python PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398806850
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##########
 @@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id
 
+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)
 
 Review comment:
   I now understand the nuance of your statement.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb commented on issue #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604608176
 
 
   Run Python PreCommit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] robertwb merged pull request #11222: [BEAM-4150] Don't window PCollection coders.

Posted by GitBox <gi...@apache.org>.
robertwb merged pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services