You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/06/15 04:50:56 UTC

[GitHub] [beam] saavan-google-intern opened a new pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

saavan-google-intern opened a new pull request #12009:
URL: https://github.com/apache/beam/pull/12009


   This PR provides support for type hint annotations for PCollections in the _expand_ function of PTransforms.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-649145031


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r453938415



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,56 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   {'input_types': None},
+                                   ['apache_beam.pvalue.PBegin'],
+                                   'An input typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   {'output_types': None},
+                                   ['apache_beam.pvalue.PDone'],
+                                   'An output typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PDone. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      kwarg_dict,         # type: Dict[str, any]
+      my_valid_classes,   # type: List[str]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+
+    if not has_my_type() or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    valid_classes = ['apache_beam.pvalue.PCollection'] + my_valid_classes
+
+    if not any(valid_class in str(my_type) for valid_class in valid_classes):

Review comment:
       TBD: Try using `__origin__` instead




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r456109961



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       Sounds good
   
   > The python-type-safety update in this PR already mentions PBegin. Did you want to add more details?
   
   It also mentions PDone, do we want to remove that?
   
   > I would like to split the programming guide update into a separate PR, since it will include examples from multiple languages and possibly some more discussion. This will be tracked in [BEAM-10495](https://issues.apache.org/jira/browse/BEAM-10495), and you can take it on if you wish.
   
   I'll focus on benchmarking the runtime type checking system for now but if time permits afterwards, sure




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r441005641



##########
File path: sdks/python/apache_beam/transforms/ptransform.py
##########
@@ -72,7 +72,7 @@ class and wrapper class that allows lambda functions to be used as
 from apache_beam.transforms.display import HasDisplayData
 from apache_beam.typehints import native_type_compatibility
 from apache_beam.typehints import typehints
-from apache_beam.typehints.decorators import TypeCheckError
+from apache_beam.typehints.decorators import TypeCheckError, IOTypeHints, get_type_hints

Review comment:
       Please stay consistent with existing code: one import per line.

##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -158,6 +159,20 @@ def __ne__(self, other):
   def __hash__(self):
     return hash((self.tag, self.producer))
 
+  class PCollectionTypeConstraint(SequenceTypeConstraint):

Review comment:
       I would really prefer it if all type constraint classes were in the same module (typehints).

##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -141,7 +142,7 @@ def __or__(self, ptransform):
     return self.pipeline.apply(ptransform, self)
 
 
-class PCollection(PValue, Generic[T]):
+class PCollection(PValue, CompositeTypeHint):

Review comment:
       This should stay `Generic[T]`, as it provides the same `[]` functionality but is understood by tools like mypy and is familiar to Python users.

##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       Are these whitespace changes coming from yapf? This particularly seems to have moved the comment to the line above it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r451733706



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,56 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   {'input_types': None},
+                                   ['apache_beam.pvalue.PBegin'],
+                                   'An input typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   {'output_types': None},
+                                   ['apache_beam.pvalue.PDone'],
+                                   'An output typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PDone. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      kwarg_dict,         # type: Dict[str, any]
+      my_valid_classes,   # type: List[str]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+
+    if not has_my_type() or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    valid_classes = ['apache_beam.pvalue.PCollection'] + my_valid_classes
+
+    if not any(valid_class in str(my_type) for valid_class in valid_classes):

Review comment:
       I tried using `isinstance` initially but it doesn't work well with generic types
    
   Another option is to use `__origin__` but I don't know if that's fully backwards compatible
   
   Strings are a wacky solution though.. do you have any other ideas?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-648498242


   Let's see


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r444541219



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -158,6 +159,20 @@ def __ne__(self, other):
   def __hash__(self):
     return hash((self.tag, self.producer))
 
+  class PCollectionTypeConstraint(SequenceTypeConstraint):

Review comment:
       This PR uses the built-in Generic[T] instead of a new TypeConstraint so this is no longer relevant.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim merged pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim merged pull request #12009:
URL: https://github.com/apache/beam/pull/12009


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r441852783



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       Yeah, yapf did that. I'm not sure why. When I run "yapf -ir ." from the root python directory, a lot of files are re-formatted. Is this unexpected behavior? If it's not, I can create an MR with just yapf changes to the entire apache_beam module. Thoughts?

##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       Yeah, yapf did that. I'm not sure why. When I run "yapf -ir ." from the root python directory, a lot of files are re-formatted. Is this unexpected behavior? If it's not, I can create an PR with just yapf changes to the entire apache_beam module. Thoughts?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-666631741


   Run Python2_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim edited a comment on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim edited a comment on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-651395927


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-648460175


   pushed tests
   
   can we test?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-665770463


   Yes, let's accept Pipeline as a special case, as users logically consider it the (root) input to the PTransform. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-667432876


   Run Python2_PVR_Flink PreCommit
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern edited a comment on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern edited a comment on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-643904266


   R: @udim 
   CC: @robertwb 
   
   This PR is not ready for merge, but I'm making it available so we can track and discuss its progress.
   
   What do you think of the current implementation? It works on some trivial pipelines that I've run on my machine but I want to confirm the solution's design before I debug deeper and write real tests.
   
   Also what is the expected behavior for runtime type checking with annotations? For example, if an input PCollection to a PTransform is annotated with the type hint _PCollection[str]_, should the runtime type checking system confirm that each element in the PCollection is indeed a string? Is this how the current runtime type checking system works with decorators and inline type hints?
   
   Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445941065



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or ' \
+                'PBegin. '
+
+    if any(element is None for element in [self.input_types, self.input_types[0], self.input_types[0][0]]):

Review comment:
       This line will raise an exception when self.input_types is None, compared to:
   ```py
   if self.input_types is None or len(self.input_types[0]) != 1:
   ```
   It doesn't take advantage of short-circuit evaluation.
   
   `self.input_types = None` is a valid value. It means nothing was set. (We should get rid of that and always have ((), {}) as the input/output_types value if there are no type hints, but I haven't had the chance to do that.)
   Same goes for having zero type hints: it's valid and there's nothing to do.
   
   Might be useful to reuse: `self._has_input_types`, `self.has_simple_output_type`.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r454713524



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       Good question. Responded on the list. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r455213683



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       The dev list [discussion](https://lists.apache.org/thread.html/r5cad76ea7b5b8404d159dd14c24964242b1d4b49787d6aa6c162d948%40%3Cdev.beam.apache.org%3E) seems to have a consensus:
   - Make PBegin public (add to \_\_all\_\_ list)
   - Deprecate PDone return type in favor of None. We still need to be backwards compatible, but update this PR to support the None return type for expand.
   
   Additional documentation will be tracked in https://issues.apache.org/jira/browse/BEAM-10495




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r446359288



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or ' \
+                'PBegin. '
+
+    if any(element is None for element in [self.input_types, self.input_types[0], self.input_types[0][0]]):

Review comment:
       I thought they were equivalent because ```any()``` supports short-circuiting.
   
   Thanks, just pushed a fix that replaces the messy check with those helper functions. Also just to clarify, in those situations, the typehint is valid so we can just return self right?
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r453938415



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,56 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   {'input_types': None},
+                                   ['apache_beam.pvalue.PBegin'],
+                                   'An input typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   {'output_types': None},
+                                   ['apache_beam.pvalue.PDone'],
+                                   'An output typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PDone. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      kwarg_dict,         # type: Dict[str, any]
+      my_valid_classes,   # type: List[str]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+
+    if not has_my_type() or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    valid_classes = ['apache_beam.pvalue.PCollection'] + my_valid_classes
+
+    if not any(valid_class in str(my_type) for valid_class in valid_classes):

Review comment:
       TBD: Try using `__origin__` instead for subscripted types and non-subscripted types




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-664694485






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r459180008



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       Please also accept `None` wherever `PDone` is accepted, so as to not break existing pipelines.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-649924273


   > The two functions **strip_pcoll_input** and **strip_pcoll_output_** are very similar. Could be refactored into one function. What do you think
   
   Yeah, they do look similar enough to merge. It's your call. If they do get much longer though they probably should be merged or using a shared method.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r441852783



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       Yeah, yapf did that. I'm not sure why. When I run "yapf -ir ." from the root python directory, a lot of files are re-formatted. Is this unexpected behavior? If it's not, I can create an MR with just YAPF changes to the entire apache_beam module. Thoughts?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r446368663



##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,61 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    with self.assertRaises(TypeCheckError) as error:
+      _th = MyPTransform().get_type_hints()
+
+    self.assertEqual(str(error.exception), 'An input typehint to a PTransform must be a single (or nested) type '
+                                           'wrapped by a PCollection.')
+
+  def test_annotations_without_internal_type(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection) -> PCollection:

Review comment:
       Thanks, fixed.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern edited a comment on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern edited a comment on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-643904266


   R: @udim 
   CC: @robertwb 
   
   This PR is not ready for merge, but I'm making it available so we can track and discuss its progress.
   
   What do you think of the current implementation? It works on some trivial pipelines that I've run on my machine but I want to confirm the solution's design before I debug deeper and write real tests.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r441851515



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -158,6 +159,20 @@ def __ne__(self, other):
   def __hash__(self):
     return hash((self.tag, self.producer))
 
+  class PCollectionTypeConstraint(SequenceTypeConstraint):

Review comment:
       Moved it back to the typehints module
   
   Both files import from each other causing a circular import error so the current (hacky) workaround is to put the import inside of PCollectionTypeConstraint's __class_getitem__ function, which works. There are [no performance drawbacks](https://stackoverflow.com/questions/3095071/) to this approach but it's not consistent with the existing styling so I will look for a better solution.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r455230574



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       Sounds good. 
   
   Should we update documentation [here](https://beam.apache.org/documentation/programming-guide/) and [here](https://beam.apache.org/documentation/sdks/python-type-safety/) to note the existence of `PBegin`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-649848772


   The two functions **strip_pcoll_input** and **strip_pcoll_output_** are very similar. Could be refactored into one function. What do you think


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-652079902


   @saavan-google-intern I haven't taken a close look, but the failing tests could be due changes in this PR. PTAL


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r443069576



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       That is unexpected, yes.
   Try running `tox -e py3-yapf`. Perhaps the yapf you have installed is a newer version than tox's (0.29.0).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r451233193



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,56 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   {'input_types': None},
+                                   ['apache_beam.pvalue.PBegin'],
+                                   'An input typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   {'output_types': None},
+                                   ['apache_beam.pvalue.PDone'],
+                                   'An output typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PDone. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      kwarg_dict,         # type: Dict[str, any]
+      my_valid_classes,   # type: List[str]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+
+    if not has_my_type() or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    valid_classes = ['apache_beam.pvalue.PCollection'] + my_valid_classes
+
+    if not any(valid_class in str(my_type) for valid_class in valid_classes):

Review comment:
       Are string representations necessary here? We typically don't compare by string.
   
   You can also write `if not isinstance(my_type, valid_classes)`.

##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or ' \
+                'PBegin. '
+
+    if any(element is None for element in [self.input_types, self.input_types[0], self.input_types[0][0]]):

Review comment:
       It does support short-circuiting, but the list `[self.input_types, self.input_types[0], self.input_types[0][0]]` is first completely evaluated.

##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       @robertwb are PBegin and PDone part of the public API?

##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,56 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   {'input_types': None},
+                                   ['apache_beam.pvalue.PBegin'],
+                                   'An input typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   {'output_types': None},
+                                   ['apache_beam.pvalue.PDone'],
+                                   'An output typehint to a PTransform must be'
+                                   ' a single (or nested) type wrapped by '
+                                   'a PCollection or PDone. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      kwarg_dict,         # type: Dict[str, any]
+      my_valid_classes,   # type: List[str]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+
+    if not has_my_type() or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    valid_classes = ['apache_beam.pvalue.PCollection'] + my_valid_classes
+
+    if not any(valid_class in str(my_type) for valid_class in valid_classes):
+      raise TypeCheckError(error_str)
+
+    if not hasattr(my_type, '__args__'):  # e.g. PCollection
+      kwarg_dict[next(iter(kwarg_dict))] = ((typehints.Any, ), {})

Review comment:
       The `next(iter(kwarg_dict))` call hard to read. Since there is only one item in the dict, you could pass the key instead and create the dictionary below.

##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -222,7 +222,7 @@ class _InvalidUnpickledPCollection(object):
   pass
 
 
-class PBegin(PValue):
+class PBegin(PValue, Generic[T]):

Review comment:
       PBegin and Done do not contain elements like a PCollection. I think of them as placeholders for transforms that don't have inputs or outputs, respectively.

##########
File path: sdks/python/apache_beam/examples/snippets/snippets_test_py3.py
##########
@@ -96,6 +96,18 @@ def my_fn(element: int) -> str:
       ids = numbers | 'to_id' >> beam.Map(my_fn)
       # [END type_hints_map_annotations]
 
+    # Example using an annotated PTransform.
+    with self.assertRaises(typehints.TypeCheckError):
+      # [START type_hints_ptransforms]
+      from apache_beam.pvalue import PCollection
+
+      class IntToStr(beam.PTransform):
+        def expand(pcoll: PCollection[int]) -> PCollection[str]:
+          return pcoll | beam.Map(lambda elem: str(elem))
+
+      ids = numbers | 'convert to str' >> beam.Map(my_fn)

Review comment:
       I think this line was not updated to use `IntToStr`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r451730259



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -222,7 +222,7 @@ class _InvalidUnpickledPCollection(object):
   pass
 
 
-class PBegin(PValue):
+class PBegin(PValue, Generic[T]):

Review comment:
       Oh ok makes sense




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-664632940






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-659714353


   The tests are failing due to a circular import somewhere. Will have to look into a different solution
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-650357620


   > > The two functions **strip_pcoll_input** and **strip_pcoll_output_** are very similar. Could be refactored into one function. What do you think
   > 
   > Yeah, they do look similar enough to merge. It's your call. If they do get much longer though they probably should be merged or using a shared method.
   
   Okay I refactored it & pushed new tests to cover the gaps I missed earlier


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-664605610


   Run Python2_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r455342332



##########
File path: website/www/site/content/en/documentation/sdks/python-type-safety.md
##########
@@ -90,6 +91,15 @@ The following code declares an `int` input and a `str` output type hint on the `
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test_py3.py" type_hints_map_annotations >}}
 {{< /highlight >}}
 
+The following code demonstrates how to use annotations on `PTransform` subclasses. 
+A valid annotation is a `PCollection`, `PBegin`, or `PDone` that wraps an internal (nested) type. 

Review comment:
       I would like to split the programming guide update into a separate PR, since it will include examples from multiple languages and possibly some more discussion. This will be tracked in BEAM-10495, and you can take it on if you wish.
   
   The python-type-safety update in this PR already mentions PBegin. Did you want to add more details?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-666071126


   run python precommit
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445866150



##########
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##########
@@ -40,6 +40,14 @@ def process(self, element: int) -> typehints.Tuple[str]:
     with self.assertRaisesRegex(typehints.TypeCheckError,
                                 r'requires.*int.*got.*str'):
       _ = ['a', 'b', 'c'] | beam.ParDo(MyDoFn())
+  def test_pardo_dofn(self):

Review comment:
       That was an accident, thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r443073000



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -158,6 +158,10 @@ def __ne__(self, other):
   def __hash__(self):
     return hash((self.tag, self.producer))
 
+  def __class_getitem__(cls, type_param):

Review comment:
       I believe this method is unnecessary, since Generic[T] implements something similar. You can can inspect a PCollection to get its arguments.
   For example:
   ```
   >>> import typing
   >>> T = typing.TypeVar('T')
   >>> class A(typing.Generic[T]):
   ...   pass
   ... 
   >>> A[int].__args__
   (<class 'int'>,)
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-667296451


   running internal tests


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r446355108



##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,59 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or PBegin. '
+
+    with self.assertRaisesRegex(TypeCheckError, error_str):

Review comment:
       Thanks, that fixed it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445246250



##########
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##########
@@ -40,6 +40,14 @@ def process(self, element: int) -> typehints.Tuple[str]:
     with self.assertRaisesRegex(typehints.TypeCheckError,
                                 r'requires.*int.*got.*str'):
       _ = ['a', 'b', 'c'] | beam.ParDo(MyDoFn())
+  def test_pardo_dofn(self):

Review comment:
       Did you mean to leave this test here? It looks like a copy of the one in AnnotationsTest.

##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,61 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    with self.assertRaises(TypeCheckError) as error:
+      _th = MyPTransform().get_type_hints()
+
+    self.assertEqual(str(error.exception), 'An input typehint to a PTransform must be a single (or nested) type '
+                                           'wrapped by a PCollection.')
+
+  def test_annotations_without_internal_type(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection) -> PCollection:

Review comment:
       This is valid. The type hint should convert to `Any`.
   
   Quoting from https://docs.python.org/3/library/typing.html:
   > Using a generic class without specifying type parameters assumes Any for each position.

##########
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##########
@@ -257,6 +265,65 @@ def fn2(element: int) -> int:
     result = [1, 2, 3] | beam.FlatMap(fn) | beam.Map(fn2)
     self.assertCountEqual([4, 6], result)
 
+  def test_typed_ptransform_with_no_error(self):
+    class StrToInt(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[str]) -> beam.pvalue.PCollection[int]:
+        return pcoll | beam.Map(lambda x: int(x))
+
+    class IntToStr(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[int]) -> beam.pvalue.PCollection[str]:
+        return pcoll | beam.Map(lambda x: str(x))
+
+    try:
+      _ = ['1', '2', '3'] | StrToInt() | IntToStr()
+    except Exception:
+      self.fail('An unexpected error was raised during a pipeline with correct typehints.')
+
+  def test_typed_ptransform_with_bad_typehints(self):
+    class StrToInt(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[str]) -> beam.pvalue.PCollection[int]:
+        return pcoll | beam.Map(lambda x: int(x))
+
+    class IntToStr(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[str]) -> beam.pvalue.PCollection[str]:
+        return pcoll | beam.Map(lambda x: str(x))
+
+    with self.assertRaises(typehints.TypeCheckError) as error:
+      # raises error because of mismatched typehints between StrToInt and IntToStr
+      _ = ['1', '2', '3'] | StrToInt() | IntToStr()
+
+    self.assertTrue("Input type hint violation at IntToStr: expected <class 'str'>, got <class 'int'>" in str(error.exception))
+
+  def test_typed_ptransform_with_bad_input(self):
+    class StrToInt(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[str]) -> beam.pvalue.PCollection[int]:
+        return pcoll | beam.Map(lambda x: int(x))
+
+    class IntToStr(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[int]) -> beam.pvalue.PCollection[str]:
+        return pcoll | beam.Map(lambda x: str(x))
+
+    with self.assertRaises(typehints.TypeCheckError) as error:
+      # Feed integers to a PTransform that expects strings
+      _ = [1, 2, 3] | StrToInt() | IntToStr()
+
+    self.assertTrue("Input type hint violation at StrToInt: expected <class 'str'>, got <class 'int'>" in str(error.exception))

Review comment:
       Please use `with self.assertRaisesRegex(..)` above instead of separately checking the exception text.

##########
File path: sdks/python/apache_beam/transforms/ptransform.py
##########
@@ -364,6 +366,15 @@ def default_label(self):
     # type: () -> str
     return self.__class__.__name__
 
+  def default_type_hints(self):
+    fn_type_hints = IOTypeHints.from_callable(self.expand)
+    if fn_type_hints is not None:
+      fn_type_hints = fn_type_hints.strip_pcoll_input()
+      fn_type_hints = fn_type_hints.strip_pcoll_output()

Review comment:
       You can chain the 2 function calls.

##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,43 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    input_type = self.input_types[0][0]
+    if isinstance(input_type, typehints.AnyTypeConstraint):

Review comment:
       Also verify that input_type is a PCollection or PBegin.
   PCollection or PDone for output type

##########
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##########
@@ -40,6 +40,14 @@ def process(self, element: int) -> typehints.Tuple[str]:
     with self.assertRaisesRegex(typehints.TypeCheckError,
                                 r'requires.*int.*got.*str'):
       _ = ['a', 'b', 'c'] | beam.ParDo(MyDoFn())
+  def test_pardo_dofn(self):

Review comment:
       Or perhaps this was also a git merge result

##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,61 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    with self.assertRaises(TypeCheckError) as error:
+      _th = MyPTransform().get_type_hints()
+
+    self.assertEqual(str(error.exception), 'An input typehint to a PTransform must be a single (or nested) type '

Review comment:
       Also test when the output typehint is unsupported.

##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,43 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    input_type = self.input_types[0][0]
+    if isinstance(input_type, typehints.AnyTypeConstraint):
+      return self
+
+    try:
+      input_type = input_type.__args__[0]
+    except:

Review comment:
       As a general rule, don't catch all exceptions but only the ones you expect to be raised.

##########
File path: sdks/python/apache_beam/transforms/ptransform.py
##########
@@ -616,24 +627,14 @@ def register_urn(
     # type: (...) -> Callable[[Union[type, Callable[[beam_runner_api_pb2.PTransform, T, PipelineContext], Any]]], Callable[[T, PipelineContext], Any]]
     pass
 
-  @classmethod
-  @overload
-  def register_urn(
-      cls,
-      urn,  # type: str
-      parameter_type,  # type: None
-  ):
-    # type: (...) -> Callable[[Union[type, Callable[[beam_runner_api_pb2.PTransform, bytes, PipelineContext], Any]]], Callable[[bytes, PipelineContext], Any]]
-    pass
-
   @classmethod
   @overload
   def register_urn(cls,
                    urn,  # type: str
                    parameter_type,  # type: Type[T]
                    constructor  # type: Callable[[beam_runner_api_pb2.PTransform, T, PipelineContext], Any]
                   ):
-    # type: (...) -> None
+    # type: (...) -> Callable[[Union[type, Callable[[beam_runner_api_pb2.PTransform, bytes, PipelineContext], Any]]], Callable[[bytes, PipelineContext], Any]]

Review comment:
       Bad merge?

##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +378,43 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    input_type = self.input_types[0][0]

Review comment:
       Please handle cases where self.input_types is None or the number of arguments is not 1.
   

##########
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##########
@@ -257,6 +265,65 @@ def fn2(element: int) -> int:
     result = [1, 2, 3] | beam.FlatMap(fn) | beam.Map(fn2)
     self.assertCountEqual([4, 6], result)
 
+  def test_typed_ptransform_with_no_error(self):
+    class StrToInt(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[str]) -> beam.pvalue.PCollection[int]:
+        return pcoll | beam.Map(lambda x: int(x))
+
+    class IntToStr(beam.PTransform):
+      def expand(self, pcoll: beam.pvalue.PCollection[int]) -> beam.pvalue.PCollection[str]:
+        return pcoll | beam.Map(lambda x: str(x))
+
+    try:
+      _ = ['1', '2', '3'] | StrToInt() | IntToStr()
+    except Exception:
+      self.fail('An unexpected error was raised during a pipeline with correct typehints.')

Review comment:
       There is no need to assert that no exceptions are raised. The test will already fail if exceptions are raised.
   
   If I need to be explicit I usually put a comment above the line that shouldn't fail.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r443808200



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -158,6 +158,10 @@ def __ne__(self, other):
   def __hash__(self):
     return hash((self.tag, self.producer))
 
+  def __class_getitem__(cls, type_param):

Review comment:
       Thanks, fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-654454954


   I just now took a peek at this, and it'll be great to have this supported! @chadrik might have some interest here as well. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-651396234


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-667445932


   Run Python2_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-659703756


   I've updated the instance check to use classes instead of strings. PTAL


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445877050



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll_input(self):
+    # type: () -> IOTypeHints
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or ' \
+                'PBegin. '
+
+    if any(element is None for element in [self.input_types, self.input_types[0], self.input_types[0][0]]):

Review comment:
       Now that I think about it this can be a simple try/catch IndexError instead of manually checking for None




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-643904266


   R: @udim 
   CC: @robertwb 
   
   This PR is not ready for merge, but I'm making it available so we can track and discuss its progress.
   
   What do you think of the current implementation? It works on some trivial pipelines that I've run on my machine but I want to confirm the solution's design before I debug deeper and write real tests.
   
   Also what is the expected behavior for runtime type checking with annotations? For example, if an input PCollection to a PTransform is annotated with the type hint _PCollection[str]_, should the runtime type checking system confirm that each element in the PCollection is indeed a string? Is this how the current runtime type checking system works with decorators and inline type hints?
   
   Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-661677501


   Retest this


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r441847967



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -141,7 +142,7 @@ def __or__(self, ptransform):
     return self.pipeline.apply(ptransform, self)
 
 
-class PCollection(PValue, Generic[T]):
+class PCollection(PValue, CompositeTypeHint):

Review comment:
       Switched it back




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-651395927


   adsfdf


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-662767368


   @udim @robertwb 
   
   PTAL - tests are passing, and it looks ready to merge


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445871780



##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,59 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or PBegin. '
+
+    with self.assertRaisesRegex(TypeCheckError, error_str):

Review comment:
       Any idea why this test is failing? It says the strings don't match in stdout but they appear to match.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r461287903



##########
File path: sdks/python/apache_beam/typehints/decorators.py
##########
@@ -378,6 +379,67 @@ def has_simple_output_type(self):
         self.output_types and len(self.output_types[0]) == 1 and
         not self.output_types[1])
 
+  def strip_pcoll(self):
+    from apache_beam.pvalue import PBegin
+    from apache_beam.pvalue import PDone
+
+    return self.strip_pcoll_helper(self.input_types,
+                                   self._has_input_types,
+                                   'input_types',
+                                    [PBegin],
+                                   'An input typehint to a PTransform must be '
+                                   'a single (or nested) type wrapped by '
+                                   'a PCollection or PBegin. ',
+                                   'strip_pcoll_input()').\
+                strip_pcoll_helper(self.output_types,
+                                   self.has_simple_output_type,
+                                   'output_types',
+                                   [PDone, None],
+                                   'An output typehint to a PTransform must be '
+                                   'a single (or nested) type wrapped by '
+                                   'a PCollection, PDone, or None. ',
+                                   'strip_pcoll_output()')
+
+  def strip_pcoll_helper(
+      self,
+      my_type,            # type: any
+      has_my_type,        # type: Callable[[], bool]
+      my_key,             # type: str
+      special_containers,   # type: List[Union[PBegin, PDone, PCollection]]
+      error_str,          # type: str
+      source_str          # type: str
+      ):
+    # type: (...) -> IOTypeHints
+    from apache_beam.pvalue import PCollection
+
+    if not has_my_type() or not my_type or len(my_type[0]) != 1:
+      return self
+
+    my_type = my_type[0][0]
+
+    if isinstance(my_type, typehints.AnyTypeConstraint):
+      return self
+
+    special_containers += [PCollection]
+
+    if (my_type not in special_containers and
+        getattr(my_type, '__origin__', None) != PCollection):
+      raise TypeCheckError(error_str)

Review comment:
       Nit: This error message doesn't say what the incorrect type was. Ex:
   ```suggestion
         raise TypeCheckError(error_str + ' Got: %s' % my_type)
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445942319



##########
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##########
@@ -46,11 +51,59 @@ class MyDoFn(DoFn):
       def process(self, element: int) -> Iterable[str]:
         pass
 
-    print(MyDoFn().get_type_hints())
     th = MyDoFn().get_type_hints()
     self.assertEqual(th.input_types, ((int, ), {}))
     self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+        return pcoll | Map(lambda num: str(num))
+
+    th = MyPTransform().get_type_hints()
+    self.assertEqual(th.input_types, ((int, ), {}))
+    self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+    class MyPTransform(PTransform):
+      def expand(self, pcoll: int) -> str:
+        return pcoll | Map(lambda num: str(num))
+
+    error_str = 'An input typehint to a PTransform must be a single (or nested) type wrapped by a PCollection or PBegin. '
+
+    with self.assertRaisesRegex(TypeCheckError, error_str):

Review comment:
       Probably because there are unescaped regex control characters in that string: `().`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r444541691



##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -425,8 +441,7 @@ def _from_runtime_iterable(it, options):
 
   def _view_options(self):
     return {
-        'data': self._data,
-        # For non-fn-api runners.
+        'data': self._data,  # For non-fn-api runners.

Review comment:
       Some of the YAPF changes are fixed but a few whitespace changes remain that I'll look into soon.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] saavannanavati commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

Posted by GitBox <gi...@apache.org>.
saavannanavati commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-656446634


   Tests are passing - just need to resolve a few reviews now :D


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org