Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 16:29:43 UTC

[GitHub] [beam] damccorm opened a new issue, #20298: Improve memory profiling for users of Portable Beam Python

damccorm opened a new issue, #20298:
URL: https://github.com/apache/beam/issues/20298

   We have a Profiler[1] that is integrated with the SDK worker[1a]; however, it only saves CPU metrics[1b].
   We have a MemoryReporter util[2] that can log heap dumps; however, it is not documented on the Beam website and does not respect the --profile_memory and --profile_location options[3]. The profile_memory flag currently works only for Dataflow Runner users who run non-portable batch pipelines, and profiles are saved only if memory usage between samples exceeds 1000M.
   
   We should improve the memory profiling experience for Portable Python users and consider adding a guide to the Beam website on how users can investigate pipelines that run out of memory (OOM).
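
For illustration only, the "save a profile when memory grows past a threshold between samples" behaviour described above could be sketched with Python's standard `tracemalloc` module. This is not Beam's Profiler or MemoryReporter, and it does not read the profile_memory or profile_location options; the function names `top_allocations`, `should_report`, and `sample_memory`, and the threshold value, are hypothetical and chosen only for this sketch.

```python
import tracemalloc


def top_allocations(snapshot, limit=5):
    """Return the `limit` largest allocation sites as (location, size_bytes)."""
    stats = snapshot.statistics("lineno")[:limit]
    return [
        (f"{stat.traceback[0].filename}:{stat.traceback[0].lineno}", stat.size)
        for stat in stats
    ]


def should_report(previous_bytes, current_bytes, threshold_bytes):
    """True when traced memory grew by more than the threshold since the last sample."""
    return current_bytes - previous_bytes > threshold_bytes


def sample_memory(previous_bytes, threshold_bytes):
    """Take one sample; return (current_bytes, report), where report may be None.

    Assumes tracemalloc.start() has already been called by the caller.
    """
    current_bytes, _peak = tracemalloc.get_traced_memory()
    report = None
    if should_report(previous_bytes, current_bytes, threshold_bytes):
        report = top_allocations(tracemalloc.take_snapshot())
    return current_bytes, report


if __name__ == "__main__":
    tracemalloc.start()
    # First sample only records a baseline (infinite threshold, so no report).
    baseline, _ = sample_memory(0, float("inf"))

    # Simulate a leak: roughly 10 MB of byte strings kept alive.
    leaked = [bytes(1024) for _ in range(10_000)]

    # Hypothetical threshold of 1 MB between samples.
    current, report = sample_memory(baseline, 1_000_000)
    for location, size in report or []:
        print(f"{size / 1e6:.1f} MB allocated at {location}")
    tracemalloc.stop()
```

In a real worker, the sampling call would presumably run on a timer or between bundles, and the report would be written to a user-chosen location rather than printed; both of those choices are left open here.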
    
   [1] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L46
   [1a] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L157
   [1b] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L112
   [2] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/utils/profiler.py#L124
   [3] https://github.com/apache/beam/blob/095589c28f5c427bf99fc0330af91c859bb2ad6b/sdks/python/apache_beam/options/pipeline_options.py#L846
   
   Imported from Jira [BEAM-10200](https://issues.apache.org/jira/browse/BEAM-10200). Original Jira may contain additional context.
   Reported by: tvalentyn.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] damccorm commented on issue #20298: Improve memory profiling for users of Portable Beam Python

Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on issue #20298:
URL: https://github.com/apache/beam/issues/20298#issuecomment-1547888993

   Hey @blazingbhavneek, I would recommend looking at issues with the `good first issue` tag (like this one), trying to solve them, and asking questions if you have specific questions about the issue. You can also find our contribution guide here: https://beam.apache.org/contribute/


[GitHub] [beam] blazingbhavneek commented on issue #20298: Improve memory profiling for users of Portable Beam Python

Posted by "blazingbhavneek (via GitHub)" <gi...@apache.org>.
blazingbhavneek commented on issue #20298:
URL: https://github.com/apache/beam/issues/20298#issuecomment-1547927270

   Hi @damccorm! (Apologies for the apparent spamming. I feel motivated when I KNOW I have a lot on my plate; otherwise I spend a lot of time just looking for what to do and eventually start wasting time on the internet.) I will start looking into the codebase and update you with specific questions. Thank you!


[GitHub] [beam] RhysJohnLewis commented on issue #20298: Improve memory profiling for users of Portable Beam Python

Posted by "RhysJohnLewis (via GitHub)" <gi...@apache.org>.
RhysJohnLewis commented on issue #20298:
URL: https://github.com/apache/beam/issues/20298#issuecomment-1618876430

   .take-issue


[GitHub] [beam] blazingbhavneek commented on issue #20298: Improve memory profiling for users of Portable Beam Python

Posted by "blazingbhavneek (via GitHub)" <gi...@apache.org>.
blazingbhavneek commented on issue #20298:
URL: https://github.com/apache/beam/issues/20298#issuecomment-1546163296

   Hey there! 👋 I'm new to this repository and eager to contribute! 🌟 Could you kindly suggest some entry points or files to look into?

