You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/07 19:11:43 UTC

[GitHub] [beam] robertwb opened a new pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

robertwb opened a new pull request #11632:
URL: https://github.com/apache/beam/pull/11632


   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421871772



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       I actually prefer unconditionally importing them, but was just trying to avoid lint issues (and did see this pattern elsewhere). Changed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421796648



##########
File path: sdks/python/apache_beam/dataframe/transforms.py
##########
@@ -248,7 +261,12 @@ def _dict_union(dicts):
   return result
 
 
-def _flatten(valueish, root=()):
+def _flatten(
+    valueish,  # type: Union[T, Tuple[T, ...], Dict[Any, T]]

Review comment:
       do we want to cover the `List[T]` case for `valueish`?  
   
   Alternately, do we wan to make this more generic:
   
   ```python
   def _flatten(
       valueish,  # type: Union[T, Iterable[T], Mapping[Any, T]]
       root=(),  # type: Tuple[Any, ...]
       ):
     # type: (...) -> Dict[Tuple[Any, ...], T]
     """Given a nested structure of dicts, tuples, and lists, return a flat
     dictionary where the values are the leafs and the keys are the "paths" to
     these leaves.
   
     For example `{a: x, b: (y, z)}` becomes `{(a,): x, (b, 0): y, (b, 1): c}`.
     """
     if isinstance(valueish, typing.Mapping):
       return _dict_union(_flatten(v, root + (k, )) for k, v in valueish.items())
     elif isinstance(valueish, typing.Iterable):
       return _dict_union(
           _flatten(v, root + (ix, )) for ix, v in enumerate(valueish))
     else:
       return {root: valueish}
   ```
   
   Another thought, is it valid / worthwhile to create a relationship between `valueish`, `root`, and the result key?:
   
   ```python
   def _flatten(
       valueish,  # type: Union[T, Iterable[T], Mapping[U, T]]
       root=(),  # type: Tuple[U, ...]
       ):
     # type: (...) -> Dict[Tuple[U, ...], T]
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-636145253


   Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-625892367


   PTAL


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r430715044



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       Oh, yes. Thanks for catching this. Fixed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r429361766



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       This note hasn't been addressed:  
   
   > I had a look at the lint errors, and they are legitimate, but scoping the imports is not the right solution.
   
   see above. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r430848092



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       [sigh] It still didn't like PCollection. `apache_beam/dataframe/transforms.py:30:0: W0611: Unused PCollection imported from apache_beam.pvalue (unused-import)`. But the rest are OK.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r430719974



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       I think it still needs to be fixed for `dataframe.convert`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r431266959



##########
File path: sdks/python/apache_beam/dataframe/transforms.py
##########
@@ -16,13 +16,28 @@
 
 from __future__ import absolute_import
 
+import typing
+from typing import Any
+from typing import Dict
+from typing import List
+from typing import Mapping
+from typing import Tuple
+from typing import TypeVar
+from typing import Union
+
 import pandas as pd
 
 import apache_beam as beam
 from apache_beam import transforms
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frames  # pylint: disable=unused-import
 
+if typing.TYPE_CHECKING:

Review comment:
       The prevailing style for `TYPE_CHECKING` is to import it as `from typing import TYPE_CHECKING`.  I think we should stay consistent.  If we want to change that, it's fine by me, but we can do that in another PR. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421790116



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       I should mention that a solution to the first issue is that we can refer to these types as strings, such as `'Any'`, but that's certainly a lot more awkward, and developers are likely to forget to do so and get confused/frustrated. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421789390



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       Scoping these imports is not a bad idea since it keeps the module namespace cleaner, but there are a couple of issues with this:
   
   - once we get to python 3.x, and move from type comments to annotations, this will fail, as we'll be referencing non-existent objects in our annotations.  
   - it's not consistent with how we've done this throughout the rest of the code
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r422273845



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       What's the lint error?   Is it because of the unused `typing` import?  
   
   I'm confused because unguarded typing imports are used all over the beam codebase without any lint errors.  Check `pipeline`, `pipeline_context`, `pipeline_options`, for starters. 
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r422233278



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       So lint complains about unguarded imports, so I put them back. We'll just to a massive sweep to fix these when we change to use type annotations. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-625444026


   R: @chadrik


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-635673990


   LGTM!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r431264152



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -17,12 +17,20 @@
 from __future__ import absolute_import
 
 import inspect
+from typing import Any

Review comment:
       ```suggestion
   from typing import TYPE_CHECKING
   from typing import Any
   ```

##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -17,12 +17,20 @@
 from __future__ import absolute_import
 
 import inspect
+from typing import Any
+from typing import Dict
+from typing import Tuple
+from typing import Union
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:

Review comment:
       ```suggestion
   if TYPE_CHECKING:
   ```
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r431251712



##########
File path: sdks/python/apache_beam/dataframe/convert.py
##########
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
       And pandas needs to be guarded. Hopefully that should be it. PTAL.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb merged pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb merged pull request #11632:
URL: https://github.com/apache/beam/pull/11632


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421878491



##########
File path: sdks/python/apache_beam/dataframe/transforms.py
##########
@@ -248,7 +261,12 @@ def _dict_union(dicts):
   return result
 
 
-def _flatten(valueish, root=()):
+def _flatten(
+    valueish,  # type: Union[T, Tuple[T, ...], Dict[Any, T]]

Review comment:
       Generally I'd agree that `Iterable`/`Mapping` would be preferable, but here I want to be restrictive about the kinds of values I decompose. 
   
   The keys types of the mapping would be `Union[None, int, U]`, so you'd have to cast anyway, so I think it's simpler to leave as is. 
   
   Still, good food for thought. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-632398759


   Ping.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r432170796



##########
File path: sdks/python/apache_beam/dataframe/transforms.py
##########
@@ -16,13 +16,28 @@
 
 from __future__ import absolute_import
 
+import typing
+from typing import Any
+from typing import Dict
+from typing import List
+from typing import Mapping
+from typing import Tuple
+from typing import TypeVar
+from typing import Union
+
 import pandas as pd
 
 import apache_beam as beam
 from apache_beam import transforms
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frames  # pylint: disable=unused-import
 
+if typing.TYPE_CHECKING:

Review comment:
       +1 for consistency. Changed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataflow.

Posted by GitBox <gi...@apache.org>.
chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r421799103



##########
File path: sdks/python/apache_beam/dataframe/transforms.py
##########
@@ -248,7 +261,12 @@ def _dict_union(dicts):
   return result
 
 
-def _flatten(valueish, root=()):
+def _flatten(
+    valueish,  # type: Union[T, Tuple[T, ...], Dict[Any, T]]

Review comment:
       I should mention that typing provides a lot of opportunity to go down rabbit-holes, so the right answer is often "it would be more accurate, but it's not that valuable".   A common motivator behind investing in accurately typing utility functions is when it allows you to avoid adding manual / explicit types or casts elsewhere in the code.   Imagine a scenario where mypy knows key of the `valueish` map, but after it passes through `_flatten`, you need to re-type the result because it becomes `Any`.
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-634286508






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org