Posted to issues@beam.apache.org by "Brian Hulette (Jira)" <ji...@apache.org> on 2022/01/29 00:48:00 UTC

[jira] [Assigned] (BEAM-13760) Add randomness to default Dataflow job name in Python sdk

     [ https://issues.apache.org/jira/browse/BEAM-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Hulette reassigned BEAM-13760:
------------------------------------

    Assignee: Will Nicholson

> Add randomness to default Dataflow job name in Python sdk
> ---------------------------------------------------------
>
>                 Key: BEAM-13760
>                 URL: https://issues.apache.org/jira/browse/BEAM-13760
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Will Nicholson
>            Assignee: Will Nicholson
>            Priority: P2
>   Original Estimate: 48h
>          Time Spent: 1h
>  Remaining Estimate: 47h
>
> Currently, when a Dataflow job is created with the default name in Python, the name is a concatenation of the word "beamapp", the username, and the time in microseconds, as seen [here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L415-L428].
> Therefore, when two jobs are created by the same user at the same time, the job names collide and the second job fails.
> However, the Java SDK has already solved this problem by appending a random hex string to the job name, as seen [here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L338-L351].
> The objective of this issue is to align the Python SDK with the Java SDK by appending a random string to the default job name.
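
For illustration only, a minimal sketch of the proposed naming scheme follows. It is not the code in apiclient.py: the function name default_job_name, its parameters, the timestamp layout, and the 4-byte suffix length are all assumptions made for this example, as is the set of characters kept in the username.

```python
import getpass
import re
import secrets
from datetime import datetime, timezone


def default_job_name(prefix="beamapp", now=None, suffix_bytes=4):
    """Hypothetical helper: build a default job name with a random hex suffix.

    The name, the parameters, and the exact layout are assumptions for this
    sketch; they do not mirror the Beam source.
    """
    try:
        user = getpass.getuser()
    except Exception:
        # getuser() can fail when no login name is available.
        user = "unknown"
    # Assumption: keep only lowercase letters, digits, and hyphens.
    user = re.sub(r"[^a-z0-9-]", "", user.lower()) or "unknown"

    now = now or datetime.now(timezone.utc)
    # Month, day, hour, minute, second, then microseconds.
    timestamp = now.strftime("%m%d%H%M%S-%f")

    # Random hex suffix: two jobs started by the same user in the same
    # microsecond still get different names with very high probability.
    suffix = secrets.token_hex(suffix_bytes)
    return f"{prefix}-{user}-{timestamp}-{suffix}"


if __name__ == "__main__":
    same_instant = datetime(2022, 1, 29, 0, 48, 0, 123456)
    print(default_job_name(now=same_instant))
    print(default_job_name(now=same_instant))
```

With a fixed timestamp, the two printed names share everything except the trailing hex string. A longer suffix lowers the chance of collision further at the cost of a longer job name.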



--
This message was sent by Atlassian Jira
(v8.20.1#820001)