Posted to commits@beam.apache.org by "Thomas Groh (JIRA)" <ji...@apache.org> on 2017/08/16 20:35:00 UTC

[jira] [Resolved] (BEAM-2702) Dataflow pipeline stalls after autoscaling

     [ https://issues.apache.org/jira/browse/BEAM-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Groh resolved BEAM-2702.
-------------------------------
       Resolution: Later
    Fix Version/s: Not applicable

> Dataflow pipeline stalls after autoscaling
> ------------------------------------------
>
>                 Key: BEAM-2702
>                 URL: https://issues.apache.org/jira/browse/BEAM-2702
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.0.0
>            Reporter: Johann Steinbrecher
>            Assignee: Thomas Groh
>             Fix For: Not applicable
>
>
> A four-step Dataflow pipeline (Pubsubio.Read, windowing, message parsing, DatastoreV1.write) stalls as soon as the autoscaling algorithm increases the number of workers from 1 to 4.
> *Expected*:
> Throughput (elements/sec) for each pipeline step increases due to more workers.
> *Actual*:
> Throughput (elements/sec) goes to 0 for all steps. The number of processed elements in the first step equals the number of processed elements in the last step. The number of workers stays high.
> Runner: google-cloud-platform managed dataflow runner
> Sample dataflow job id (log level debug):
> 2017-07-27_14_51_37-4624978117098944513
> Log message after autoscaling:
> Rpc to .. completed with error DEADLINE_EXCEEDED (cause or symptom?)
> Autoscaling configuration:
> --autoscalingAlgorithm=THROUGHPUT_BASED
> --maxNumWorkers=4
> Machine types tested:
> - n1-highmem-2
> - n1-standard-1
> Zone: us-east1-d
> SDK version:
> org.apache.beam@2.0.0
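
For anyone trying to reproduce the report, the configuration above can be collected into a launch argument list roughly as follows. This is a minimal sketch, not the reporter's actual launch code: the class name ReportedDataflowArgs and the helper buildArgs are hypothetical. Only --autoscalingAlgorithm and --maxNumWorkers appear as flags in the report. The --runner, --workerMachineType and --zone flags are an assumption about how the runner, machine type and zone were passed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the pipeline arguments described in the report.
public class ReportedDataflowArgs {

    // Builds the argument list for one of the tested machine types.
    static List<String> buildArgs(String machineType) {
        List<String> args = new ArrayList<>();
        // Assumption: the managed Dataflow runner was selected with this flag.
        args.add("--runner=DataflowRunner");
        // These two flags are quoted verbatim in the report.
        args.add("--autoscalingAlgorithm=THROUGHPUT_BASED");
        args.add("--maxNumWorkers=4");
        // Assumption: flag names; only the values come from the report.
        args.add("--workerMachineType=" + machineType);
        args.add("--zone=us-east1-d");
        return args;
    }

    public static void main(String[] unused) {
        // Print one argument line per machine type listed in the report.
        for (String machineType : new String[] {"n1-highmem-2", "n1-standard-1"}) {
            System.out.println(String.join(" ", buildArgs(machineType)));
        }
    }
}
```

In a Beam pipeline, a list like this would typically be handed to PipelineOptionsFactory.fromArgs(...) in the pipeline's main method; that step is omitted here so the sketch has no dependencies beyond the JDK.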



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)