Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2022/01/12 03:50:07 UTC

[jira] [Updated] (BEAM-4224) Go SDK CPU Profiling

     [ https://issues.apache.org/jira/browse/BEAM-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-4224:
----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the pull request resolve the issue? If so, could you please mark it resolved? This will help the project have a clear view of its open issues.

> Go SDK CPU Profiling
> --------------------
>
>                 Key: BEAM-4224
>                 URL: https://issues.apache.org/jira/browse/BEAM-4224
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Robert Burke
>            Priority: P3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Jira tracking work around CPU profiling the Go SDK.
> Prior to this, a hook that enables the Go CPU and trace profiling libraries was added in the following change:
> https://github.com/apache/beam/commit/adb78f6c3055693a053a89bdbaa46ca86685a290
> At present, it's broken on distributed runners.
> https://github.com/apache/beam/blob/410ad7699621e28433d81809f6b9c42fe7bd6a60/sdks/go/pkg/beam/x/hooks/perf/perf.go#L50
> See also: https://stackoverflow.com/questions/67076744/cpu-profiling-not-covering-all-the-vcpu-time-of-apache-beam-pipeline-on-dataflow/67082075?noredirect=1#comment118629835_67082075
> The original intent was to have each bundle profiled individually, but this is at odds with how CPU profiling works in Go, which measures the whole process.
> As it stands, different bundles start and stop each other's profiling, leading to severe undercounting. A better approach would be to start profiling on Init and sample periodically, writing ~30-second chunks (or similar) to a new file each time, per worker. This at least avoids losing most of the profiling information at the end of a worker's life. (Profiles can be merged after the fact, so if profiling is stopped and immediately restarted, little is lost.)
> Optionally, we should add a Teardown trigger to the hooks so we can exit cleanly in this case, but it's not a hard requirement for a first pass.
> Optionally, figure out a clean way to get a job to work with Google Cloud Profiler, likely as a different hook. 
> https://cloud.google.com/profiler/docs/profiling-go



--
This message was sent by Atlassian Jira
(v8.20.1#820001)