Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2022/03/17 17:49:00 UTC

[jira] [Commented] (BEAM-14080) Portable runner does not return job exit status to client after long-running job

    [ https://issues.apache.org/jira/browse/BEAM-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508343#comment-17508343 ] 

Kenneth Knowles commented on BEAM-14080:
----------------------------------------

[~ibzib] does this sound familiar at all?

> Portable runner does not return job exit status to client after long-running job
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-14080
>                 URL: https://issues.apache.org/jira/browse/BEAM-14080
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink, sdk-py-core
>    Affects Versions: 2.36.0
>            Reporter: Janek Bevendorff
>            Priority: P2
>
> I submit Python Beam jobs to our Flink cluster with the PortableRunner through a remote job server. If a job finishes within a few seconds or minutes, the exit status (including a dump of any Python exceptions if there was an error) is returned to the client upon completion.
> If the job runs for longer (say, hours), however, the client and job server seem to lose their connection. The client then hangs forever, even long after the actual job has completed, until I press Ctrl+C to terminate it (which has no effect whatsoever on the actual job).
> Example pseudo job:
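> {code:python}
> import apache_beam as beam
>
> print('Job started')
> # DoSomething() is a placeholder for the actual transforms.
> # Leaving the with block runs the pipeline and waits for it to finish.
> with beam.Pipeline() as pipeline:
>     pipeline | DoSomething()
> print('Job finished'){code}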
> If the pipeline finishes quickly, it looks like this from the client's perspective:
> {code:bash}
> $ python3 myjob.py
> Job started
> Job finished
> $ _{code}
> If the job runs for longer, then the {{with}} statement never finishes and I have to abort the Python script with Ctrl+C:
> {code:bash}
> $ python3 myjob.py
> Job started
> ^C
> $ _{code}
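
One possible client-side mitigation for the hang described above is to wait in bounded intervals and re-check the job state after each one, rather than blocking inside the {{with}} statement. The following is only a sketch under stated assumptions, not a confirmed fix for this issue: the helper name wait_with_polling and the hard-coded terminal state names are hypothetical, and the result object is assumed to behave like a Beam PipelineResult, offering wait_until_finish(duration=<milliseconds>) and a state attribute.

```python
# Assumed terminal state names, mirroring Beam's PipelineState constants.
TERMINAL_STATES = frozenset(
    {'DONE', 'FAILED', 'CANCELLED', 'UPDATED', 'DRAINED'})


def wait_with_polling(result, poll_interval_ms=60_000, max_polls=None):
    """Poll `result` until it reports a terminal state or polls run out.

    `result` is assumed to behave like a Beam PipelineResult: it must offer
    wait_until_finish(duration=<milliseconds>) and a `state` attribute.
    Returns the terminal state, or None if max_polls was exhausted first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        # Bounded wait instead of blocking until the job ends.
        result.wait_until_finish(duration=poll_interval_ms)
        # Re-read the state after every bounded wait.
        state = result.state
        if state in TERMINAL_STATES:
            return state
        polls += 1
    return None
```

To try this with a real job, one would call pipeline.run() explicitly instead of using the {{with}} block and pass the returned result to the helper. Whether re-reading the state actually gets around the lost connection has not been verified here.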



--
This message was sent by Atlassian Jira
(v8.20.1#820001)