Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/08 22:49:08 UTC

[GitHub] [beam] chadrik commented on a change in pull request #11070: [BEAM-8280] Blog post: Python typing changes

chadrik commented on a change in pull request #11070:
URL: https://github.com/apache/beam/pull/11070#discussion_r422409561



##########
File path: website/src/_posts/2020-03-06-python-typing.md
##########
@@ -0,0 +1,117 @@
+---
+layout: post
+title:  "Python SDK Typing Changes"
+date:   2020-03-06 00:00:01 -0800
+excerpt_separator: <!--more-->
+categories: blog python typing
+authors:
+  - chadrik
+  - udim
+
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+TODO excerpt
+
+<!--more-->
+
+Python supports type annotations on functions (PEP 484). Static type checkers,
+such as mypy, are used to verify adherence to these types.
+For example:
+```py
+def f(v: int) -> int:
+  return v[0]
+```
+Running mypy on the above code will give the error:
+`Value of type "int" is not indexable`.
+
+We've recently made changes to Beam in two areas.
+
+First, we've added type hints throughout Beam. TODO expand
+
+Second, we've added support for Python 3 type annotations. This allows SDK
+users to specify a DoFn's type hints in one place. 
+We've also expanded Beam's support of `typing` module types.
+
+For more background see: 
+[Ensuring Python Type Safety](https://beam.apache.org/documentation/sdks/python-type-safety/).
+
+# Beam Is Typed
+
+TODO
+
+# New Ways to Annotate
+
+## Python 3 Syntax Annotations
+
+Coming in Beam 2.21 (BEAM-8280), you will be able to use Python annotation
+syntax to specify input and output types.
+
+For example, this new form:
+```py
+class MyDoFn(beam.DoFn):
+  def process(self, element: int) -> typing.Text:
+    yield str(element)
+```
+is equivalent to this:
+```py
+@beam.typehints.with_input_types(int)
+@beam.typehints.with_output_types(typing.Text)
+class MyDoFn(beam.DoFn):
+  def process(self, element):
+    yield str(element)
+```
+
+One of the advantages of the new form is that you may already be using it
+in tandem with a static type checker such as mypy, thus getting additional
+type checking for free.
+
+This feature will be enabled by default, and there will be 2 mechanisms in
+place to disable it:
+1. Calling `apache_beam.typehints.disable_type_annotations()` before pipeline
+construction will disable the new feature completely.
+1. Decorating a function with `@apache_beam.typehints.no_annotations` will
+tell Beam to ignore annotations for it. 
+ 
+Uses of Beam's `with_input_types` and `with_output_types` methods and decorators
+will still work and will take precedence over annotations.
+
+Sidebar:
+
+> You might ask: couldn't we use mypy to type check Beam pipelines? The main issue
+is that such a tool would have to understand type relations between
+pipeline graph nodes, e.g., that the type of element passed to a transform
+should be consistent with its annotated input type.

Review comment:
       I don't fully understand the issue presented here.  Ignoring pipelines that are dynamically generated at runtime, I think it should be possible for mypy to track the types of many pipelines, as long as A) developers avoid certain pitfalls (like lambdas), and B) we write a mypy plugin to smooth over some gaps.  
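
To make the point above concrete, here is a minimal sketch of how generics alone could let a static checker follow element types through a chain of transforms. The `PCollection`, `PTransform`, and `Map` classes below are hypothetical stand-ins written for illustration, not Beam's real classes, and the sketch assumes named functions rather than lambdas so that the checker has declared types to work with.

```python
from typing import Callable, Generic, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class PCollection(Generic[T]):
    """Hypothetical stand-in for a pipeline node holding elements of type T."""

    def __init__(self, elements: Iterable[T]) -> None:
        self.elements: List[T] = list(elements)

    def __or__(self, transform: "PTransform[T, U]") -> "PCollection[U]":
        # A checker has to match T against the transform's declared input type.
        return transform.expand(self)


class PTransform(Generic[T, U]):
    """Hypothetical transform from elements of type T to elements of type U."""

    def expand(self, pcoll: PCollection[T]) -> PCollection[U]:
        raise NotImplementedError


class Map(PTransform[T, U]):
    """Applies a function to each element; types come from fn's annotations."""

    def __init__(self, fn: Callable[[T], U]) -> None:
        self.fn = fn

    def expand(self, pcoll: PCollection[T]) -> PCollection[U]:
        return PCollection(self.fn(x) for x in pcoll.elements)


def to_text(element: int) -> str:
    return str(element)


def text_length(element: str) -> int:
    return len(element)


numbers = PCollection([1, 22, 333])
texts = numbers | Map(to_text)       # intended to be seen as PCollection[str]
lengths = texts | Map(text_length)   # intended to be seen as PCollection[int]

# The following line is the kind of mismatch a checker such as mypy would be
# expected to flag, since text_length takes str but numbers holds int:
# numbers | Map(text_length)
```

Whether this scales to real pipelines (side inputs, multiple outputs, composite transforms) is exactly where a mypy plugin would likely be needed; the sketch only covers the simple linear case.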
   



