Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 16:15:48 UTC

[GitHub] [beam] damccorm opened a new issue, #20204: Resolve differences in beam:metric:element_count:v1 implementations

damccorm opened a new issue, #20204:
URL: https://github.com/apache/beam/issues/20204

   The [element count](https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206) metric represents the number of elements within a PCollection, but the Beam SDKs interpret it differently.
   
   In the [Java SDK](https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207) this represents the number of elements and accounts for how many windows those elements are in. The metric is incremented as soon as the element has been output.
   
   In the [Python SDK](https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247) this represents the number of elements and does not account for how many windows those elements are in. The metric is also incremented only after the element has finished processing.
   
   The [Go SDK](https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260) behaves the same way as the Python SDK.
   
   In Dataflow this has always been the exploded window element count, and the counter is incremented as soon as the element is output.
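
The two points of divergence described above (whether windows are exploded, and when the counter is incremented) can be illustrated with a small standalone sketch. This is not code from any Beam SDK: `WindowedElement`, `element_count`, and `fail_on_c` are hypothetical names invented for the example, and the failure handling is a simplification used only to make the timing difference visible.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class WindowedElement:
    """Hypothetical stand-in for a windowed element (not a Beam class)."""
    value: str
    windows: Tuple[str, ...]


def element_count(
    elements: Iterable[WindowedElement],
    process: Callable[[WindowedElement], None],
    explode_windows: bool,
    count_on_output: bool,
) -> int:
    """Count elements under one of the interpretations described above.

    explode_windows: count an element once per window it is in.
    count_on_output: increment before downstream processing, not after.
    """
    count = 0
    for element in elements:
        increment = len(element.windows) if explode_windows else 1
        if count_on_output:
            count += increment
        try:
            process(element)
        except RuntimeError:
            # Simplification: stop at the first failed element.
            break
        if not count_on_output:
            count += increment
    return count


def fail_on_c(element: WindowedElement) -> None:
    """Hypothetical downstream step that fails on one element."""
    if element.value == "c":
        raise RuntimeError("simulated processing failure")


ELEMENTS = [
    WindowedElement("a", ("w1", "w2")),
    WindowedElement("b", ("w1",)),
    WindowedElement("c", ("w1", "w2", "w3")),
]

# Exploded windows, counted on output (the Java SDK and Dataflow behavior
# as described in this issue).
java_style = element_count(ELEMENTS, fail_on_c, True, True)

# One per element, counted after processing (the Python and Go SDK
# behavior as described in this issue).
python_style = element_count(ELEMENTS, fail_on_c, False, False)
```

With this input, `java_style` is 6 (2 + 1 + 3 windows, including the element whose processing failed), while `python_style` is 2 (only the two elements that finished processing), so the same PCollection reports different values for the same metric.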
   
   Imported from Jira [BEAM-9934](https://issues.apache.org/jira/browse/BEAM-9934). Original Jira may contain additional context.
   Reported by: lcwik.

