Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 16:20:38 UTC

[GitHub] [beam] kennknowles opened a new issue, #18100: DoFns should be serialized at apply time and deserialized when executing

kennknowles opened a new issue, #18100:
URL: https://github.com/apache/beam/issues/18100

   1. Serializing DoFns at application time ensures that modifications to a DoFn's fields after application do not accidentally pollute the execution. This mirrors the approach taken in Java to approximate a lexical closure (i.e., you only need to know the state of the DoFn at the time it was applied, not afterwards, to understand its behavior).
   
   2. Based on 1, the DirectRunner should also deserialize DoFns before running them. This would also detect other classes of errors, such as using the pipeline object (which is not pickleable) within the DoFn.
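
The two points above can be illustrated with a minimal sketch that uses only the standard-library `pickle` module. It is not Beam code: `AddOffset`, `HoldsLock`, and `AppliedTransform` are hypothetical stand-ins for a DoFn and for whatever the SDK does at apply time, and a `threading.Lock` stands in for an unpicklable object such as the pipeline. The actual SDK may use a different pickler and different hooks.

```python
import pickle
import threading


class AddOffset:
    """Hypothetical stand-in for a DoFn; not the Beam API."""

    def __init__(self, offset):
        self.offset = offset

    def process(self, element):
        return element + self.offset


class HoldsLock:
    """Hypothetical DoFn-like object that captures unpicklable state."""

    def __init__(self):
        # A lock stands in for an unpicklable object such as the pipeline.
        self.lock = threading.Lock()

    def process(self, element):
        return element


class AppliedTransform:
    """Hypothetical wrapper that snapshots the function at apply time."""

    def __init__(self, fn):
        # Serialize immediately: later mutations of `fn` are not captured,
        # and unpicklable state raises here rather than on a worker.
        self._payload = pickle.dumps(fn)

    def run(self, elements):
        # Deserialize a fresh copy before executing, as a remote worker would.
        fn = pickle.loads(self._payload)
        return [fn.process(e) for e in elements]


fn = AddOffset(offset=1)
applied = AppliedTransform(fn)
fn.offset = 100  # mutation after apply time; must not affect execution
result = applied.run([1, 2, 3])

try:
    AppliedTransform(HoldsLock())
    apply_error = None
except TypeError as exc:
    # pickle refuses to serialize the lock, so the error surfaces at apply time.
    apply_error = exc
```

In this sketch `result` reflects the offset the object had when it was applied, and the unpicklable field is reported when the transform is constructed rather than during execution.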
   
   Imported from Jira [BEAM-681](https://issues.apache.org/jira/browse/BEAM-681). Original Jira may contain additional context.
   Reported by: bchambers.

