You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/11/07 16:10:42 UTC

[GitHub] [beam] TheNeuralBit commented on a diff in pull request #23701: Add record_metrics argument to utils.BatchTransform

TheNeuralBit commented on code in PR #23701:
URL: https://github.com/apache/beam/pull/23701#discussion_r1015613004


##########
sdks/python/apache_beam/transforms/util.py:
##########
@@ -374,8 +378,10 @@ def record_time(self, batch_size):
     yield
     elapsed = self._clock() - start
     elapsed_msec = 1e3 * elapsed + self._remainder_msecs
-    self._size_distribution.update(batch_size)
-    self._time_distribution.update(int(elapsed_msec))
+    if self._size_distribution is not None:
+      self._size_distribution.update(batch_size)
+    if self._time_distribution is not None:
+      self._time_distribution.update(int(elapsed_msec))

Review Comment:
   nit: what about storing `record_metrics` and branching on that here, once?



##########
sdks/python/apache_beam/transforms/util_test.py:
##########
@@ -195,6 +195,17 @@ def test_constant_batch(self):
           | beam.Map(len))
       assert_that(res, equal_to([10, 10, 10, 5]))
 
+  def test_constant_batch_no_metrics(self):
+    # Assumes a single bundle...
+    with TestPipeline() as p:
+      res = (
+          p
+          | beam.Create(range(35))
+          | util.BatchElements(min_batch_size=10, max_batch_size=10,
+                               record_metrics=False)
+          | beam.Map(len))
+      assert_that(res, equal_to([10, 10, 10, 5]))

Review Comment:
   Could you make this test verify that the transform did not create metrics? Let me know if you need help with that.



##########
sdks/python/apache_beam/transforms/util.py:
##########
@@ -641,6 +647,8 @@ class BatchElements(PTransform):
         linear interpolation
     clock: (optional) an alternative to time.time for measuring the cost of
         donwstream operations (mostly for testing)
+    record_metrics: (optional) if given, record beam metrics on distributions
+        of the batch size.

Review Comment:
   ```suggestion
       record_metrics: (optional) whether or not to record beam metrics on distributions
           of the batch size. Defaults to True.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org