Posted to user@beam.apache.org by Xander Song <ia...@gmail.com> on 2020/03/05 23:09:44 UTC
Workers running out of memory
I am running a Beam batch pipeline on Dataflow using the Python SDK. When I
turn off autoscaling and specify a large number of workers (> 100), the
job succeeds. When I specify a smaller number of workers (e.g., 20),
however, the job fails. I believe the cause is that workers are running out
of memory: in the worker logs, I see many workers reaching memory usage of
around 530-540 MB before the first exceptions are raised.
[image: Screen Shot 2020-03-05 at 2.52.14 PM.png]
I am looking for suggestions on how to debug this issue. Some options I've
been exploring are:
1. Setting up Cloud Stackdriver with Beam. I've found a guide to setting
up Cloud Stackdriver with the Java Beam SDK (
https://medium.com/google-cloud/profiling-dataflow-pipelines-ddbbef07761d),
but haven't found instructions on how to set it up with Python.
2. I noticed in the Beam pipeline options source that there is a
--profile_memory flag. If I specify this flag, how do I access the
profiling output?
Any suggestions or advice are welcome. Thank you!