Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/29 00:46:13 UTC

[GitHub] [beam] TheNeuralBit opened a new issue, #22091: [Bug]: beam.FlatMap() raises an error at pipeline construction time

TheNeuralBit opened a new issue, #22091:
URL: https://github.com/apache/beam/issues/22091

   ### What happened?
   
   This was a regression introduced by 0de98210f4531fbfd88265bc02052b27bd299602, which is in the 2.40.0 release.
   
   When expanding a `FlatMap(<built-in function>)` users will encounter an error:
   ```
   AttributeError: 'builtin_function_or_method' object has no attribute '__func__'
   ```
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: sdk-py-core


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit closed issue #22091: [Bug]: beam.FlatMap() raises an error at pipeline construction time

Posted by GitBox <gi...@apache.org>.
TheNeuralBit closed issue #22091: [Bug]: beam.FlatMap(<built-in function>) raises an error at pipeline construction time
URL: https://github.com/apache/beam/issues/22091




[GitHub] [beam] TheNeuralBit commented on issue #22091: [Bug]: beam.FlatMap() raises an error at pipeline construction time

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22091:
URL: https://github.com/apache/beam/issues/22091#issuecomment-1169421317

   Note the breakage is just for `beam.FlatMap` constructed with a built-in; `beam.Map` is unaffected. Moreover, only built-in _functions_ (e.g. `len`, `sum`, `iter`) trigger the problematic path, not built-in types (e.g. `list`, `bytes`, `set`). The former have a `__self__` attribute, which makes them look like methods to the logic added in 0de9821:
   ```
   In [1]: sum.__self__
   Out[1]: <module 'builtins' (built-in)>
   ```
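
A minimal sketch of the distinction described above, in plain Python without Beam. It assumes the logic in 0de9821 keys on `__self__` and then reads `__func__`, as the comment and the reported error suggest; it is not the actual Beam code. `Greeter` and `describe` are hypothetical names introduced only for illustration.

```python
import builtins


class Greeter:
    """Hypothetical class, used only to obtain a Python bound method."""

    def greet(self, name):
        return ["hello", name]


def describe(fn):
    """Return (has __self__, has __func__) for a callable."""
    return hasattr(fn, "__self__"), hasattr(fn, "__func__")


candidates = {
    "sum": sum,                # built-in function
    "len": len,                # built-in function
    "list": list,              # built-in type
    "bound": Greeter().greet,  # Python bound method
}
report = {name: describe(fn) for name, fn in candidates.items()}

# Built-in functions are bound to the builtins module.
assert sum.__self__ is builtins

# Treating a built-in function like a bound method and reading
# __func__ raises the AttributeError quoted in the issue.
try:
    sum.__func__
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

for name, flags in report.items():
    print(name, flags)
print(error_message)
```

Only the Python bound method has both attributes. Built-in functions have `__self__` but no `__func__`, and built-in types have neither, which is consistent with types being unaffected.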
   
   So to summarize, just `beam.FlatMap(<built-in function>)` will fail. It's actually kind of hard to devise a use-case for this, since the built-in function must produce an iterable. I thought this over and only came up with `beam.FlatMap(sum)` over a PCollection where the elements are lists of numpy arrays. I looked over the list of built-in functions and couldn't come up with another way to use one in a FlatMap.
   
   Given that, I'm not sure this issue has a large enough blast radius to warrant a bugfix release. Note that there is a straightforward workaround for that use-case: wrap `sum` in a helper function or lambda:
   
   ```py
   def _sum(x):
     return sum(x)
     
   pc | beam.FlatMap(_sum)
   ```

