Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2021/12/22 18:47:00 UTC
[jira] [Updated] (BEAM-4224) Go SDK CPU Profiling

     [ https://issues.apache.org/jira/browse/BEAM-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Burke updated BEAM-4224:
-------------------------------
    Description: 
Jira tracking work around CPU profiling the Go SDK.

Prior to this, a hook that enables the Go CPU and trace profiling libraries was added in the following PR
https://github.com/apache/beam/commit/adb78f6c3055693a053a89bdbaa46ca86685a290

At present, it's broken on distributed runners.
https://github.com/apache/beam/blob/410ad7699621e28433d81809f6b9c42fe7bd6a60/sdks/go/pkg/beam/x/hooks/perf/perf.go#L50

The original intent was to have each bundle profiled individually, but this is at odds with how CPU profiling works with Go, which measures the whole process.

At this point, different bundles start and stop each other's profiling, which leads to severe undercounting. A better approach would be to start profiling on Init and sample periodically, so that we get ~30 second chunks or similar, writing to a new file each time, per worker. This at least avoids losing most of the profiling information at the end of a worker's life. (Profiles can be "merged" after the fact, so if profiling is stopped and started again right away, little is lost.)

Optionally, we should add a Teardown trigger to the hooks so we can do a clean exit in this case, but it's not a hard requirement for a first pass.

Optionally, figure out a clean way to get a job to work with Google Cloud Profiler, likely as a different hook. 
https://cloud.google.com/profiler/docs/profiling-go

  was:
Umbrella Jira tracking work around CPU profiling the Go SDK.

Prior to this, a hook that enables the Go CPU and trace profiling libraries was added in the following PR
https://github.com/apache/beam/commit/adb78f6c3055693a053a89bdbaa46ca86685a290


> Go SDK CPU Profiling
> --------------------
>
>                 Key: BEAM-4224
>                 URL: https://issues.apache.org/jira/browse/BEAM-4224
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Robert Burke
>            Priority: P3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Jira tracking work around CPU profiling the Go SDK.
> Prior to this, a hook that enables the Go CPU and trace profiling libraries was added in the following PR
> https://github.com/apache/beam/commit/adb78f6c3055693a053a89bdbaa46ca86685a290
> At present, it's broken on distributed runners.
> https://github.com/apache/beam/blob/410ad7699621e28433d81809f6b9c42fe7bd6a60/sdks/go/pkg/beam/x/hooks/perf/perf.go#L50
> The original intent was to have each bundle profiled individually, but this is at odds with how CPU profiling works with Go, which measures the whole process.
> At this point, different bundles start and stop each other's profiling, which leads to severe undercounting. A better approach would be to start profiling on Init and sample periodically, so that we get ~30 second chunks or similar, writing to a new file each time, per worker. This at least avoids losing most of the profiling information at the end of a worker's life. (Profiles can be "merged" after the fact, so if profiling is stopped and started again right away, little is lost.)
> Optionally, we should add a Teardown trigger to the hooks so we can do a clean exit in this case, but it's not a hard requirement for a first pass.
> Optionally, figure out a clean way to get a job to work with Google Cloud Profiler, likely as a different hook. 
> https://cloud.google.com/profiler/docs/profiling-go



--
This message was sent by Atlassian Jira
(v8.20.1#820001)