Posted to github@beam.apache.org by "robertwb (via GitHub)" <gi...@apache.org> on 2023/02/06 20:41:44 UTC

[GitHub] [beam] robertwb commented on a diff in pull request #25351: Better batching for higher fixed costs.

robertwb commented on code in PR #25351:
URL: https://github.com/apache/beam/pull/25351#discussion_r1097896596


##########
sdks/python/apache_beam/transforms/util.py:
##########
@@ -502,8 +502,11 @@ def _calculate_next_batch_size(self):
     target = self._max_batch_size
 
     if self._target_batch_duration_secs:
-      # Solution to a + b*x = self._target_batch_duration_secs.
-      target = min(target, (self._target_batch_duration_secs - a) / b)
+      # Solution to b*x = self._target_batch_duration_secs.
+      # We ignore the fixed cost in this computation as it has negligible

Review Comment:
   In that case we still won't batch much, if at all. That's the problem with absolute limits, though removing the limit altogether felt like too risky a backwards-incompatible change. This change does, however, widen the range over which the more general overhead calculation takes precedence.
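
To make the difference concrete, here is a minimal sketch of the two formulas named in the diff comments. It is not the actual code in `util.py`: the function names `old_target` and `new_target` and the sample numbers are hypothetical. `a` is the estimated fixed cost per batch and `b` is the estimated per-element cost, both in seconds.

```python
def old_target(max_batch_size, target_duration_secs, a, b):
    """Solve a + b*x = target_duration_secs for x, capped at max_batch_size."""
    return min(max_batch_size, (target_duration_secs - a) / b)


def new_target(max_batch_size, target_duration_secs, b):
    """Solve b*x = target_duration_secs for x, ignoring the fixed cost a."""
    return min(max_batch_size, target_duration_secs / b)


# Hypothetical numbers: the fixed cost alone (2s) exceeds the 1s duration target.
high_fixed_cost_old = old_target(10000, 1.0, a=2.0, b=0.01)
high_fixed_cost_new = new_target(10000, 1.0, b=0.01)

# Hypothetical numbers: the per-element cost (2s) exceeds the 1s duration target.
high_element_cost_new = new_target(10000, 1.0, b=2.0)

print(high_fixed_cost_old, high_fixed_cost_new, high_element_cost_new)
```

With a fixed cost above the duration target, the old formula yields a non-positive size while the new one still allows a real batch. When the per-element cost itself exceeds the target, the absolute limit still keeps the size below one element, which is one way to read "still won't batch".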


