You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/09/21 00:40:00 UTC

[jira] [Work logged] (BEAM-7746) Add type hints to python code

     [ https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=316009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316009 ]

ASF GitHub Bot logged work on BEAM-7746:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Sep/19 00:39
            Start Date: 21/Sep/19 00:39
    Worklog Time Spent: 10m 
      Work Description: chadrik commented on issue #9056: [BEAM-7746] Add python type hints
URL: https://github.com/apache/beam/pull/9056#issuecomment-533752642
 
 
   This PR is getting close.  It represents an enormous amount of effort, and spans a substantial part of the code base, so I'd like to start progressing towards getting this merged.  Rebasing against master has been pretty painless so far, but I'm afraid that could change soon.
   
   I've broken the PR up into bite-sized commits to help tell a story about each set of changes.  I've also dropped the more complex changes -- the mypy plugin and generics -- and I'll deal with those in a future PR.
   
   Other than questions and comments you might have about the content of the PR, the main issues that I need input on are primarily about style.
   
   ## Line length
   
   It's very difficult to keep type comments to 80 characters.  Not only is there more information to describe, the type comments themselves cannot span more than one line.  This will improve when we get to python 3.6 and annotations are an official part of the syntax, since they can be defined over multiple lines just like normal python objects (which they are).  
   
   There are a handful of practices that will minimize our line length, but even in concert they won't work 100% of the time. Here are the main ones:
   
   - Use `from typing import Foo` vs `import typing`.  This is the single most impactful thing we can do.  It also greatly improves legibility.  Compare these two options:  
     - `typing.Optional[typing.Tuple[typing.Dict[str, str], float]]`
     - `Optional[Tuple[Dict[str, str], float]]`
   - Use type aliases.  e.g.  `AwesomeType = Optional[Tuple[Dict[str, str], float]]`.  I prefer to use this sparingly, only when there's a complex type shared in many places.  Here's why:
     - quite often we can use duck-typing to reduce the requirement for certain functions. e.g. there might be a function where `Optional[Tuple[Mapping[str, str], typing.SupportsFloat]]` would do instead of `AwesomeType`.
     - I don't like to have to constantly refer to another location to see the type
   - Change the way that we style functions so that they provide more room for annotations.  e.g. consider these options:
       ```python
       def really_long_function_name(arg1, # type: Optional[Tuple[Dict[str, str], float]]
                                     arg2, # type: int
                                    ):
         # type: (...) -> Tuple[str, float]
         code
       ```
   
       ```python
       def really_long_function_name(
         arg1, # type: Optional[Tuple[Dict[str, str], float]]
         arg2, # type: int
       ):
         # type: (...) -> Tuple[str, float]
         code
       ```
       The beam code seems to favor the former over the latter, though I see both present.  We should decide what our policy will be.  In this PR, I've determined the style on a case-by-case basis, mostly favoring  the former. 
   
   When all else fails we can use `# pylint: disable=line-too-long` *after* the type comment.  I've added these to `apache_beam.pipeline` so you can see an example.
   
   It's a larger conversation, but it might be worth discussing increasing the line length.  Many type-annotated projects have increased their line length to 99 or more characters.  This is a big change, that would involve a lot of debate. 
   
   ## Unused module imports
   
   Pylint is not able to properly track type annotations used within type comments (which is the majority), and so generates spurious errors about unused imports for most of the typing classes.  Newer versions of pylint can track annotations within comments, but only for variable annotations and not for function annotations, so it's not a complete solution.  Even if we think there is a benefit to a partial solution, it will take some work to get to the newer version of pylint because it's python3-only.
   
   The solution I'm proposing for now is to simply ignore the problem, by using the following pattern when importing types:
   
   ```
   # pylint: disable=unused-import
   from typing import TYPE_CHECKING
   from typing import Any
   from typing import Callable
   from typing import Container
   from typing import DefaultDict
   from typing import Dict
   from typing import Iterable
   from typing import Iterator
   from typing import List
   # pylint: enable=unused-import
   ```
   
   That will leave it up to developers to get right for now, and when we get to a pure python3 code-base we can fix anything that's gone astray.
   
   ---
   
   Let me know what you think.  I'm excited to get this into the code base.  If you open this branch up in an IDE that is properly setup for type annotations, you'll see that the annotations are inspectable and greatly improve your ability to comprehend and traverse the code.  They take a bit to get used to, but even the detractors on my team are now converts.   Overall, adding typing to the Beam code base was actually very straight-forward, relatively speaking, because the code is already well structured and implicitly precise about the types that are expected (in other words, I discovered there are very few `Union` types required).  That said, I found a number of docstrings that were wrong about the types expected (I still need to go back and correct a bunch), so that's the beauty of type annotations:  if you get it wrong you'll get an error!
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 316009)
    Time Spent: 7h 20m  (was: 7h 10m)

> Add type hints to python code
> -----------------------------
>
>                 Key: BEAM-7746
>                 URL: https://issues.apache.org/jira/browse/BEAM-7746
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: Major
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484 type hints so that I can clearly see what types are required, get completion in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here:  [https://github.com/apache/beam/pull/9056]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)