Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 19:32:17 UTC

[GitHub] [beam] kennknowles opened a new issue, #18634: Key-aware batching function

kennknowles opened a new issue, #18634:
URL: https://github.com/apache/beam/issues/18634

   I have a CombineFn whose add_input carries a very large per-call overhead. To reduce that overhead, I would like to collect the incoming elements into large batches and call add_input once per batch. In other words, I would like to do something like:
   
   `elements | GroupByKey() | BatchElements() | CombineValues(MyCombineFn())`
   
   Unfortunately, BatchElements is not key-aware, so it can't be used after a GroupByKey to batch elements per key. I'm working around this by doing the batching within CombineValues, which makes the CombineFn rather messy. It would be nice if there were a key-aware BatchElements transform which could be used in this context.
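
To illustrate the kind of per-key batching being asked for, here is a minimal sketch in plain Python. It is not an existing Beam transform: the helper name `batch_grouped_values` and the `batch_size` parameter are hypothetical, and it uses a fixed batch size rather than the adaptive sizing BatchElements does. It takes one `(key, values)` pair, as produced by a GroupByKey, and returns the same key with its values split into lists.

```python
from itertools import islice
from typing import Iterable, List, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def batch_grouped_values(
    keyed_values: Tuple[K, Iterable[V]], batch_size: int
) -> Tuple[K, List[List[V]]]:
    """Split one key's grouped values into lists of at most batch_size items.

    Hypothetical helper for illustration; not part of the Beam API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    key, values = keyed_values
    iterator = iter(values)
    batches: List[List[V]] = []
    while True:
        # Pull up to batch_size values; an empty list means we are done.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        batches.append(batch)
    return key, batches


# Assumed wiring into the pipeline from this issue (not executed here):
#   elements
#   | beam.GroupByKey()
#   | beam.Map(batch_grouped_values, batch_size=1000)
#   | beam.CombineValues(MyCombineFn())
# With this shape, add_input would receive one list of values per call.
```

This keeps the batching out of the CombineFn, though add_input still has to accept a list instead of a single element, and the sketch holds all batches for a key in memory at once.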
   
   Imported from Jira [BEAM-3737](https://issues.apache.org/jira/browse/BEAM-3737). Original Jira may contain additional context.
   Reported by: chuanyu.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org