Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/09/19 17:08:02 UTC

[jira] [Commented] (BEAM-10200) Improve memory profiling for users of Portable Beam Python

    [ https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198788#comment-17198788 ] 

Beam JIRA Bot commented on BEAM-10200:
--------------------------------------

This issue is assigned but has not received an update in 30 days so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
>                 Key: BEAM-10200
>                 URL: https://issues.apache.org/jira/browse/BEAM-10200
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Assignee: Yichi Zhang
>            Priority: P2
>              Labels: stale-P2, stale-assigned, starter
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We have a Profiler[1] that is integrated with the SDK worker[1a]; however, it only saves CPU metrics[1b].
> We have a MemoryReporter util[2] which can log heap dumps; however, it is not documented on the Beam website and does not respect the --profile_memory and --profile_location options[3]. The profile_memory flag currently works only for Dataflow Runner users who run non-portable batch pipelines; profiles are saved only if memory usage between samples exceeds 1000M.
> We should improve the memory profiling experience for Portable Python users and consider writing a guide on the Beam website on how users can investigate OOMing pipelines.
>
> [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
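
The description above mentions that profiles are saved only when memory usage between samples exceeds a threshold. As a rough illustration of that idea, here is a minimal sketch using only Python's standard tracemalloc module. It is not Beam's MemoryReporter and does not use any Beam API; the class name ThresholdMemorySampler, its parameters, and the 1 MB threshold are hypothetical and chosen only for the example.

```python
import tracemalloc


class ThresholdMemorySampler:
    """Hypothetical helper: report top allocations only when traced
    memory grew by at least `threshold_bytes` since the last sample."""

    def __init__(self, threshold_bytes, top_n=5):
        self.threshold_bytes = threshold_bytes
        self.top_n = top_n
        # Baseline is whatever tracemalloc currently traces (0 if not started).
        self.last_size = tracemalloc.get_traced_memory()[0]
        self.reports = []

    def sample(self):
        current, _peak = tracemalloc.get_traced_memory()
        growth = current - self.last_size
        self.last_size = current
        if growth < self.threshold_bytes:
            return None  # growth below threshold: nothing is recorded
        # Growth exceeded the threshold: summarize the largest allocation sites.
        snapshot = tracemalloc.take_snapshot()
        top = snapshot.statistics("lineno")[: self.top_n]
        report = [str(stat) for stat in top]
        self.reports.append(report)
        return report


tracemalloc.start()
sampler = ThresholdMemorySampler(threshold_bytes=1_000_000)

small = sampler.sample()                # almost no growth since the baseline
hog = bytearray(8 * 1024 * 1024)        # allocate about 8 MiB
big = sampler.sample()                  # growth now exceeds the threshold

tracemalloc.stop()
```

In this sketch the first call records nothing, while the second returns a short list of allocation sites. A real worker would presumably call sample() periodically and write the report to a configured location rather than keep it in memory.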



--
This message was sent by Atlassian Jira
(v8.3.4#803005)