Posted to user@beam.apache.org by OrielResearch Eila Arich-Landkof <ei...@orielresearch.org> on 2020/08/01 21:54:20 UTC

Job is taking long time - machine scaling issue? Something else?

Hello,

I am running an Apache Beam job on the Dataflow runner with the following options:

from apache_beam.options.pipeline_options import (
    GoogleCloudOptions, PipelineOptions, SetupOptions, StandardOptions, WorkerOptions)

options = PipelineOptions()
standard_cloud_options = options.view_as(StandardOptions)
standard_cloud_options.runner = RUNNER  # 'DataflowRunner' or 'DirectRunner'
worker_cloud_options = options.view_as(WorkerOptions)
setup_cloud_options = options.view_as(SetupOptions)
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = PROJECT_ID

In practice, the job runs for a long time (days). See the attached pipeline visualization and resource report. What am I missing?




Thank you so much,


——————
Eila

www.orielresearch.com <http://www.orielresearch.com/>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>

Re: Job is taking long time - machine scaling issue? Something else?

Posted by Luke Cwik <lc...@google.com>.
Are your PCollection element counts increasing?
Are you seeing errors in the logs?
Have you taken a look at the troubleshooting page[1]?
Have you tried opening a support case with Google Cloud?

[1] https://cloud.google.com/dataflow/docs/guides/troubleshooting-your-pipeline
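
On the machine-scaling part of the question: the options shown in the original post set only the runner and the project, so worker count and machine type fall back to whatever the service defaults are. Below is a minimal sketch of passing scaling-related settings as command-line style flags. The helper name dataflow_scaling_flags and the project, region, bucket, worker count and machine type values are placeholders chosen for illustration, not values from this thread. The flag names themselves are, to my knowledge, the ones the Beam Python SDK accepts for Dataflow, but it is worth checking them against the SDK version in use.

```python
def dataflow_scaling_flags(project, region, temp_location,
                           max_num_workers=10,
                           machine_type="n1-standard-4"):
    """Build command-line style flags for a Dataflow job with autoscaling.

    All default values here are illustrative assumptions, not recommendations.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Let the service add workers based on throughput, up to the cap below.
        "--autoscaling_algorithm=THROUGHPUT_BASED",
        f"--max_num_workers={max_num_workers}",
        f"--machine_type={machine_type}",
    ]


# Placeholder project, region and bucket names.
flags = dataflow_scaling_flags(
    "my-project", "us-central1", "gs://my-bucket/tmp", max_num_workers=20)

# With Apache Beam installed, the flags can be handed to PipelineOptions:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   options = PipelineOptions(flags)
```

Raising the worker cap only helps if the slow step can be parallelized. If the pipeline visualization shows a single step holding most of the wall time while element counts barely move, the logs and the troubleshooting page [1] are likely to be more informative than more machines.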

