Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/09/19 17:08:02 UTC
[jira] [Updated] (BEAM-10200) Improve memory profiling for users of Portable Beam Python
[ https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Beam JIRA Bot updated BEAM-10200:
---------------------------------
Labels: stale-P2 stale-assigned starter (was: stale-P2 starter)
> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
> Key: BEAM-10200
> URL: https://issues.apache.org/jira/browse/BEAM-10200
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Valentyn Tymofieiev
> Assignee: Yichi Zhang
> Priority: P2
> Labels: stale-P2, stale-assigned, starter
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> We have a Profiler [1] that is integrated with the SDK worker [1a]; however, it only saves CPU metrics [1b].
> We have a MemoryReporter util [2] that can log heap dumps; however, it is not documented on the Beam website and does not respect the --profile_memory and --profile_location options [3]. The --profile_memory flag currently works only for Dataflow Runner users who run non-portable batch pipelines, and profiles are saved only if memory usage between samples exceeds 1000M.
> We should improve the memory profiling experience for Portable Python users and consider adding a guide to the Beam website on how users can investigate pipelines that run out of memory (OOM).
>
> [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
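
For illustration only, the threshold rule described above (save a profile only when memory usage between samples exceeds a limit) could be sketched with Python's standard tracemalloc module. This is not Beam's MemoryReporter and not the Dataflow implementation; the class name ThresholdMemorySampler, its parameters, and the dump file naming are all hypothetical, and "usage between samples" is read here as growth in traced memory since the previous sample, which is an assumption.

```python
import os
import tempfile
import tracemalloc


class ThresholdMemorySampler:
    """Hypothetical helper (not part of Beam): writes a heap summary only
    when traced memory grew by more than threshold_bytes since the
    previous sample."""

    def __init__(self, profile_location, threshold_bytes):
        self.profile_location = profile_location  # directory for dumps
        self.threshold_bytes = threshold_bytes
        self._last_bytes = 0
        self._dump_count = 0

    def start(self):
        tracemalloc.start()
        self._last_bytes, _ = tracemalloc.get_traced_memory()

    def sample(self):
        """Takes one sample; returns the dump path, or None if growth
        since the previous sample did not exceed the threshold."""
        current, _peak = tracemalloc.get_traced_memory()
        growth = current - self._last_bytes
        self._last_bytes = current
        if growth <= self.threshold_bytes:
            return None
        os.makedirs(self.profile_location, exist_ok=True)
        self._dump_count += 1
        path = os.path.join(
            self.profile_location, 'heap-%d.txt' % self._dump_count)
        snapshot = tracemalloc.take_snapshot()
        with open(path, 'w') as f:
            f.write('traced bytes: %d (grew by %d)\n' % (current, growth))
            # Top allocation sites, grouped by source line.
            for stat in snapshot.statistics('lineno')[:10]:
                f.write('%s\n' % stat)
        return path

    def stop(self):
        tracemalloc.stop()


# Usage: a 1 MiB threshold stands in for the 1000M mentioned above.
sampler = ThresholdMemorySampler(tempfile.mkdtemp(), threshold_bytes=1 << 20)
sampler.start()
first = sampler.sample()  # almost nothing allocated yet, so no dump
data = [bytearray(1024) for _ in range(4096)]  # roughly 4 MiB
second = sampler.sample()  # growth exceeds the threshold, so a dump is written
sampler.stop()
```

Note that tracemalloc only sees allocations made through Python's memory allocators, so memory held by native extensions would not show up in such a sketch.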
--
This message was sent by Atlassian Jira
(v8.3.4#803005)