Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/08/04 17:07:00 UTC

[jira] [Commented] (BEAM-10200) Improve memory profiling for users of Portable Beam Python

    [ https://issues.apache.org/jira/browse/BEAM-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170964#comment-17170964 ] 

Beam JIRA Bot commented on BEAM-10200:
--------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.


> Improve memory profiling for users of Portable Beam Python
> ----------------------------------------------------------
>
>                 Key: BEAM-10200
>                 URL: https://issues.apache.org/jira/browse/BEAM-10200
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Priority: P2
>              Labels: stale-P2, starter
>
> We have a Profiler[1] that is integrated with the SDK worker[1a]; however, it saves only CPU metrics[1b].
> We also have a MemoryReporter util[2] that can log heap dumps; however, it is not documented on the Beam website and does not respect the --profile_memory and --profile_location options[3]. The --profile_memory flag currently works only for Dataflow Runner users who run non-portable batch pipelines, and profiles are saved only if memory usage between samples exceeds 1000M.
> We should improve the memory profiling experience for Portable Python users and consider adding a guide to the Beam website on how users can investigate OOMing pipelines.
>  
> [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
> [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
> [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
> [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
> [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
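
For readers unfamiliar with the "save a profile only when memory grows past a threshold between samples" behavior described in the issue, here is a minimal sketch of that idea using only Python's standard-library tracemalloc module. It is not Beam's MemoryReporter and it does not read Beam's --profile_memory or --profile_location options; the class name ThresholdMemorySampler, its parameters, and the report file naming are hypothetical and exist only for illustration.

```python
import os
import tempfile
import tracemalloc


class ThresholdMemorySampler:
    """Hypothetical helper (not part of Apache Beam).

    Writes a short allocation report to `profile_location` whenever traced
    memory has grown by more than `threshold_bytes` since the previous sample.
    """

    def __init__(self, profile_location, threshold_bytes):
        self.profile_location = profile_location
        self.threshold_bytes = threshold_bytes
        self._last_bytes = None  # traced size at the previous sample
        self._count = 0          # number of reports written so far

    def sample(self):
        """Take one sample; return the report path, or None if nothing was written."""
        current, _peak = tracemalloc.get_traced_memory()
        previous = self._last_bytes
        self._last_bytes = current
        # The first call only records a baseline.
        if previous is None or current - previous <= self.threshold_bytes:
            return None

        # Growth exceeded the threshold: record the top allocation sites.
        snapshot = tracemalloc.take_snapshot()
        top_stats = snapshot.statistics("lineno")[:10]
        os.makedirs(self.profile_location, exist_ok=True)
        path = os.path.join(self.profile_location, "memory-%04d.txt" % self._count)
        self._count += 1
        with open(path, "w") as report:
            report.write("traced bytes: %d (previous sample: %d)\n" % (current, previous))
            for stat in top_stats:
                report.write("%s\n" % stat)
        return path


if __name__ == "__main__":
    tracemalloc.start()
    sampler = ThresholdMemorySampler(tempfile.mkdtemp(), threshold_bytes=1000000)
    sampler.sample()                # establishes the baseline
    payload = bytearray(5000000)    # simulate roughly 5 MB of growth
    print(sampler.sample())         # path of the report that was written
    tracemalloc.stop()
```

A real worker would call sample() periodically from a background thread. tracemalloc only sees allocations made through Python's allocator, so this sketch would miss growth inside native extensions.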


