You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Stephan Hoyer (Jira)" <ji...@apache.org> on 2021/04/29 05:44:00 UTC

[jira] [Updated] (BEAM-5431) StarMap transform for Python SDK

     [ https://issues.apache.org/jira/browse/BEAM-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Hoyer updated BEAM-5431:
--------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Open)

beam.MapTuple and beam.FlatMapTuple implement this funtionality.

> StarMap transform for Python SDK
> --------------------------------
>
>                 Key: BEAM-5431
>                 URL: https://issues.apache.org/jira/browse/BEAM-5431
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Stephan Hoyer
>            Priority: P3
>
> I'd like to propose a new high-level transform "StarMap" for the Python SDK. The transform would be syntactic sugar for ParDo like Map, but would would automatically unpack arguments like [itertools.starmap|https://docs.python.org/3/library/itertools.html#itertools.starmap] from Python's standard library.
> The use-case is to handle applying functions to tuples of arguments, which is a common pattern when using Beam's combine and group-by transforms. Right now, it's common to write functions with manual unpacking, e.g., 
> {code:java}
> def my_func(inputs):
>   key, value = inputs
>   ...
> beam.Map(my_func) {code}
> StarMap offers a much more readable alternative: 
> {code:java}
> def my_func(key, value):
>   ...
> beam.StarMap(my_func){code}
>  
> The need for StarMap is especially pressing with the advent of Python 3 support and the eventual wind-down of Python 2. Currently, it's common to achieve this pattern using unpacking in a function definition, e.g., beam.Map(lambda (k, v): my_func(k, v)), but this is invalid syntax in Python 3. My internal search of Google's codebase turns up quite a few matches for "beam\.Map(lambda\ (", none of which would work on Python 3.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)