Posted to issues@beam.apache.org by "Robert Burke (JIRA)" <ji...@apache.org> on 2019/04/05 20:21:00 UTC

[jira] [Updated] (BEAM-6724) Go SDK on Dataflow processing step emits but it doesn't reach framework

     [ https://issues.apache.org/jira/browse/BEAM-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Burke updated BEAM-6724:
-------------------------------
    Component/s: sdk-go

> Go SDK on Dataflow processing step emits but it doesn't reach framework
> -----------------------------------------------------------------------
>
>                 Key: BEAM-6724
>                 URL: https://issues.apache.org/jira/browse/BEAM-6724
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow, sdk-go
>            Reporter: Robin Palotai
>            Priority: Minor
>
> When I send a job with a larger input (not that large, about 30MB) to the Dataflow runner, I can see in the worker logs that a given step emits everything, but the framework (I'm not sure which part, the harness or something above it) doesn't seem to register that the step reached the finish state (or maybe the step never reaches it).
> For a smaller input (~1MB) the whole pipeline runs fine.
> Do you have any pointers on how to debug the cause of the hang (say, by adding logging)? Maybe some buffers are not flushed? Or a state transition is missed? Generally, which part of the Beam Go codebase is responsible for these transitions?
> Thank you!
> Version: current Go SDK from HEAD + https://github.com/apache/beam/pull/7889 patches to make plan check
> The Cloud Dataflow console says "Apache Beam SDK for Go 0.5.0".
> Logs:
> 16:35:39.788 CET Starting MapTask stage s02
> 16:35:44.508 CET <first worker progress message>
> 16:36:34.588 CET <last progress message + done>
> 16:36:34.963 CET "DataSource: 2 elements in 53507659897 ns"
> 16:40:45.227 CET <new worker is started in place of one perceived as failed> Initializing Go harness: /opt/apache/beam/boot --id=1 --logging_endpoint=localhost:12370 --control_endpoint=localhost:12371 --artifact_endpoint=localhost:12372 --provision_endpoint=localhost:12373 --semi_persist_dir=/var/opt/google undefined
> 16:42:29.027 CET Processing stuck in step s02 for at least 05m00s without outputting or completing in state finish
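
To make the timing in the logs above easier to read, here is a minimal sketch in Go (standard library only) that converts the nanosecond count from the "DataSource" line into a duration and computes the gaps between the quoted timestamps. The helper names parseDataSource and gap are hypothetical and not part of the Beam SDK, and the regular expression only assumes the line format exactly as it appears in this report.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// dataSourceRe matches the log line format quoted in this report, e.g.
// "DataSource: 2 elements in 53507659897 ns". The format is assumed from
// that single line, not taken from the SDK source.
var dataSourceRe = regexp.MustCompile(`DataSource: (\d+) elements in (\d+) ns`)

// parseDataSource (hypothetical helper) extracts the element count and the
// elapsed time from a "DataSource" log line.
func parseDataSource(line string) (int64, time.Duration, error) {
	m := dataSourceRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, fmt.Errorf("not a DataSource line: %q", line)
	}
	n, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return 0, 0, err
	}
	ns, err := strconv.ParseInt(m[2], 10, 64)
	if err != nil {
		return 0, 0, err
	}
	return n, time.Duration(ns), nil
}

// gap (hypothetical helper) returns the time between two same-day log
// timestamps written as HH:MM:SS.mmm.
func gap(from, to string) (time.Duration, error) {
	const layout = "15:04:05.000"
	a, err := time.Parse(layout, from)
	if err != nil {
		return 0, err
	}
	b, err := time.Parse(layout, to)
	if err != nil {
		return 0, err
	}
	return b.Sub(a), nil
}

func main() {
	n, elapsed, err := parseDataSource(`"DataSource: 2 elements in 53507659897 ns"`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("DataSource read %d elements in %v\n", n, elapsed)

	// Timestamps copied from the log excerpt in this report.
	steps := []struct{ label, from, to string }{
		{"stage start to DataSource line", "16:35:39.788", "16:36:34.963"},
		{"DataSource line to new worker", "16:36:34.963", "16:40:45.227"},
		{"DataSource line to stuck warning", "16:36:34.963", "16:42:29.027"},
	}
	for _, s := range steps {
		d, err := gap(s.from, s.to)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %v\n", s.label, d)
	}
}
```

By this arithmetic the DataSource reports about 53.5 s of work, which fits within the roughly 55.2 s between the stage start and the DataSource line. The replacement worker starts about 4 min 10 s after that line, and the "stuck in step s02" warning follows about 5 min 54 s after it, so the logs show no further progress in that window.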


