Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/09/26 17:08:02 UTC

[jira] [Assigned] (BEAM-10200) Improve memory profiling for users of Portable Beam Python

     [ https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Beam JIRA Bot reassigned BEAM-10200:
------------------------------------

    Assignee:     (was: Yichi Zhang)

> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
>                 Key: BEAM-10200
>                 URL: https://issues.apache.org/jira/browse/BEAM-10200
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Priority: P2
>              Labels: stale-P2, stale-assigned, starter
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> We have a Profiler[1] that is integrated with the SDK worker[1a]; however, it saves only CPU metrics[1b].
> We also have a MemoryReporter util[2] that can log heap dumps; however, it is not documented on the Beam website and does not respect the --profile_memory and --profile_location options[3]. The --profile_memory flag currently works only for Dataflow Runner users who run non-portable batch pipelines, and profiles are saved only if memory usage between samples exceeds 1000M.
> We should improve the memory profiling experience for Portable Python users and consider adding a guide to the Beam website on how users can investigate OOMing pipelines.
>  
> [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
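
For illustration only, here is a minimal sketch of the kind of threshold-gated heap reporting the description talks about. It is not Beam's MemoryReporter and does not use any Beam API: it swaps in the standard-library tracemalloc module, and the class name HeapGrowthReporter, its profile_location and threshold_bytes parameters, and the report file naming are all hypothetical. A report is written only when traced memory grows by more than the threshold between two samples, and it goes to a caller-supplied directory, which is roughly the behaviour one might expect from honoring a profile location option.

```python
import os
import tempfile
import tracemalloc


class HeapGrowthReporter:
    """Hypothetical helper (not part of Beam).

    Writes a report of the top allocation sites whenever the memory traced
    by tracemalloc grows by more than `threshold_bytes` between two samples.
    """

    def __init__(self, profile_location, threshold_bytes, top_n=10):
        self.profile_location = profile_location  # directory for reports
        self.threshold_bytes = threshold_bytes
        self.top_n = top_n
        # Baseline is whatever is traced right now (0 if tracing is off).
        self._last_bytes = tracemalloc.get_traced_memory()[0]
        self._report_count = 0

    def sample(self):
        """Takes one sample; returns the report path, or None if no report."""
        current_bytes, _peak = tracemalloc.get_traced_memory()
        growth = current_bytes - self._last_bytes
        self._last_bytes = current_bytes
        if growth <= self.threshold_bytes:
            return None

        # Group live allocations by source line and keep the largest ones.
        stats = tracemalloc.take_snapshot().statistics('lineno')[:self.top_n]
        os.makedirs(self.profile_location, exist_ok=True)
        self._report_count += 1
        path = os.path.join(
            self.profile_location, 'heap-%04d.txt' % self._report_count)
        with open(path, 'w') as f:
            f.write('traced bytes: %d (grew by %d)\n' % (current_bytes, growth))
            for stat in stats:
                f.write('%s\n' % stat)
        return path


def demo():
    """Shows one sample below the threshold and one above it."""
    tracemalloc.start()
    try:
        out_dir = tempfile.mkdtemp(prefix='heap_profiles_')
        reporter = HeapGrowthReporter(out_dir, threshold_bytes=1_000_000)
        first = reporter.sample()  # almost nothing allocated yet
        hog = [bytes(1024) for _ in range(5000)]  # roughly 5 MB of objects
        second = reporter.sample()  # growth exceeds the 1 MB threshold
        return first, second, len(hog)
    finally:
        tracemalloc.stop()


if __name__ == '__main__':
    print(demo())
```

In a worker, sample() would be called periodically (for example from a background thread); the threshold and sampling interval here are arbitrary demo values, not the ones Beam uses.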



--
This message was sent by Atlassian Jira
(v8.3.4#803005)