Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2020/01/21 23:34:00 UTC

[jira] [Created] (BEAM-9167) Reduce overhead of Go SDK side metrics

Robert Burke created BEAM-9167:
----------------------------------

             Summary: Reduce overhead of Go SDK side metrics
                 Key: BEAM-9167
                 URL: https://issues.apache.org/jira/browse/BEAM-9167
             Project: Beam
          Issue Type: Improvement
          Components: sdk-go
            Reporter: Robert Burke


Locking overhead from the global store and local caches of SDK counter data can dominate certain workloads, so there is room to reduce it.

Instead of having a global store of metrics data to extract counters from, we should use per-PTransform (or per-bundle) counter sets, which would avoid requiring a lock on every counter operation. The main drawback compared to the current implementation is that a user would need to add their own locking if they were to spawn multiple goroutines to process a Bundle's work in a DoFn.
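
As a rough illustration of the per-bundle idea (not the SDK's actual code), here is a minimal sketch. The names counterSet, processBundle and processBundleConcurrently are hypothetical and do not correspond to any Beam Go SDK API. The first function assumes a bundle is processed by a single goroutine and so takes no locks; the second shows the extra locking a user would have to supply if their DoFn spawned its own goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// counterSet is a hypothetical per-bundle store (not a Beam type). One
// instance is owned by the goroutine processing a bundle, so it has no lock.
type counterSet struct {
	counters map[string]int64
}

func newCounterSet() *counterSet {
	return &counterSet{counters: make(map[string]int64)}
}

// Inc adds n to the named counter. It is NOT safe for concurrent use.
func (s *counterSet) Inc(name string, n int64) {
	s.counters[name] += n
}

// processBundle simulates a DoFn handling one bundle sequentially:
// no locking is needed because only one goroutine touches the set.
func processBundle(elems []string) *counterSet {
	cs := newCounterSet()
	for _, e := range elems {
		cs.Inc("elements", 1)
		cs.Inc("bytes", int64(len(e)))
	}
	return cs
}

// processBundleConcurrently simulates a DoFn that spawns its own goroutines.
// The user-supplied mutex guards the otherwise unsynchronized counter set.
func processBundleConcurrently(elems []string) *counterSet {
	cs := newCounterSet()
	var mu sync.Mutex
	var wg sync.WaitGroup
	for _, e := range elems {
		wg.Add(1)
		go func(e string) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			cs.Inc("elements", 1)
			cs.Inc("bytes", int64(len(e)))
		}(e)
	}
	wg.Wait()
	return cs
}

func main() {
	elems := []string{"a", "bb", "ccc"}
	seq := processBundle(elems)
	con := processBundleConcurrently(elems)
	fmt.Println(seq.counters["elements"], seq.counters["bytes"]) // 3 6
	fmt.Println(con.counters["elements"], con.counters["bytes"]) // 3 6
}
```

In this sketch the cost of synchronization is paid only by the unusual DoFn that opts into concurrency, rather than by every counter update in every pipeline.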

Given that self-multithreaded DoFns aren't recommended or safe in Java, are largely impossible in Python, and that the other constructs provided by the Beam Go SDK (like Iterators and Emitters) are not thread safe, this is a small concern, provided the documentation is clear on this point.

Removing the locking and switching to atomic ops reduces the overhead significantly in example jobs and in the benchmarks.
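
To make the lock-versus-atomic comparison concrete, here is a small sketch that times the same number of increments through a mutex-guarded counter and through a counter using sync/atomic. The types lockedCounter and atomicCounter and the helper timeIncs are hypothetical names for illustration only; they are not the SDK's implementation, and the timings printed will vary by machine, so treat this as a way to experiment rather than as a benchmark result.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// lockedCounter takes a mutex on every update (hypothetical type).
type lockedCounter struct {
	mu sync.Mutex
	v  int64
}

func (c *lockedCounter) Inc(n int64) {
	c.mu.Lock()
	c.v += n
	c.mu.Unlock()
}

func (c *lockedCounter) Value() int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.v
}

// atomicCounter updates its value with a single atomic add (hypothetical type).
type atomicCounter struct {
	v int64
}

func (c *atomicCounter) Inc(n int64)  { atomic.AddInt64(&c.v, n) }
func (c *atomicCounter) Value() int64 { return atomic.LoadInt64(&c.v) }

// timeIncs calls inc(1) n times and returns the elapsed wall-clock time.
func timeIncs(n int, inc func(int64)) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		inc(1)
	}
	return time.Since(start)
}

func main() {
	const n = 1000000
	var lc lockedCounter
	var ac atomicCounter
	lockedTime := timeIncs(n, lc.Inc)
	atomicTime := timeIncs(n, ac.Inc)
	// Both counters end at n; only the elapsed times differ.
	fmt.Println("locked:", lc.Value(), lockedTime)
	fmt.Println("atomic:", ac.Value(), atomicTime)
}
```

For more careful measurements, the same two Inc methods could be driven from the standard testing package's benchmark support instead of a hand-rolled timer.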

Related: https://issues.apache.org/jira/browse/BEAM-6541 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)