Posted to issues@beam.apache.org by "Brian Hulette (Jira)" <ji...@apache.org> on 2022/01/29 00:42:00 UTC

[jira] [Updated] (BEAM-13760) Add randomness to default Dataflow job name in Python sdk

     [ https://issues.apache.org/jira/browse/BEAM-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Hulette updated BEAM-13760:
---------------------------------
    Status: Open  (was: Triage Needed)

> Add randomness to default Dataflow job name in Python sdk
> ---------------------------------------------------------
>
>                 Key: BEAM-13760
>                 URL: https://issues.apache.org/jira/browse/BEAM-13760
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Will Nicholson
>            Priority: P2
>   Original Estimate: 48h
>          Time Spent: 50m
>  Remaining Estimate: 47h 10m
>
> Currently, when a Dataflow job is created with the default name in Python, the name is a concatenation of the word "beamapp", the username, and the current time with microsecond precision, as seen [here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L415-L428].
> Therefore, when two jobs are created by the same user at the same time, the job names collide and the second job fails.
> However, the Java SDK has already solved this problem by appending a random hex string to the job name, as seen [here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L338-L351].
> The objective of this issue is to align the Python SDK with the Java SDK by appending a random string to the default job name.
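
A minimal sketch of the proposed behaviour, not the actual Beam implementation: the function name default_job_name, its parameters, the timestamp format, and the 4-byte suffix length are all assumptions made for illustration. It builds a name from "beamapp", the username, and a microsecond timestamp as described above, then appends a random hex string so that two jobs started by the same user at the same instant still get different names.

```python
import getpass
import re
import secrets
from datetime import datetime, timezone


def default_job_name(user=None, now=None, random_bytes=4):
    """Hypothetical helper: build a default job name with a random suffix.

    All parameters are illustrative and are not part of the Beam API.
    """
    # Fall back to the current OS user and the current UTC time.
    user = user or getpass.getuser()
    now = now or datetime.now(timezone.utc)

    # Keep only lowercase letters, digits and hyphens in the username
    # (assumed here to be the characters allowed in a job name).
    safe_user = re.sub(r'[^-a-z0-9]', '', user.lower())

    # Timestamp with microsecond precision (assumed format).
    stamp = now.strftime('%m%d%H%M%S-%f')

    # Random hex suffix: random_bytes bytes give 2 * random_bytes hex digits.
    suffix = secrets.token_hex(random_bytes)

    return f'beamapp-{safe_user}-{stamp}-{suffix}'


if __name__ == '__main__':
    # Two names generated back to back differ in the random suffix
    # even if the timestamps happen to be identical.
    print(default_job_name())
    print(default_job_name())
```

The suffix comes from secrets.token_hex, which is in the Python standard library; random or uuid would serve equally well here, since the goal is only to avoid collisions.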



--
This message was sent by Atlassian Jira
(v8.20.1#820001)