Posted to issues@beam.apache.org by "Will Nicholson (Jira)" <ji...@apache.org> on 2022/01/27 18:46:00 UTC

[jira] [Created] (BEAM-13760) Add randomness to default Dataflow job name in Python sdk

Will Nicholson created BEAM-13760:
-------------------------------------

             Summary: Add randomness to default Dataflow job name in Python sdk
                 Key: BEAM-13760
                 URL: https://issues.apache.org/jira/browse/BEAM-13760
             Project: Beam
          Issue Type: Improvement
          Components: runner-dataflow
            Reporter: Will Nicholson


Currently, when a Dataflow job is created with the default name in Python, the name is a concatenation of the string "beamapp", the username, and the current time in microseconds, as seen [here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L415-L428].

Therefore, when two jobs are created by the same user at the same time, the job names collide and the second job fails.

The Java SDK has already solved this problem by appending a random hex string to the job name, as seen [here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L338-L351].

The objective of this issue is to align the Python SDK with the Java SDK by appending a random string to the default job name.
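
A rough sketch of what the proposed default name could look like is below. It is not the existing code in apiclient.py nor a port of the Java implementation: the function name default_job_name, its parameters, the username sanitization, and the 8-character suffix length are all assumptions made for illustration.

```python
import getpass
import re
import secrets
from datetime import datetime, timezone


def default_job_name(user=None, now=None, suffix_bytes=4):
    """Hypothetical helper: 'beamapp-<user>-<timestamp>-<random hex>'."""
    user = user or getpass.getuser()
    now = now or datetime.now(timezone.utc)
    # Assumption: keep only lowercase letters and digits in the username
    # so the result stays within a conservative job-name character set.
    safe_user = re.sub(r"[^a-z0-9]", "", user.lower()) or "user"
    # Month, day, hour, minute, second, then microseconds.
    timestamp = now.strftime("%m%d%H%M%S-%f")
    # The random hex suffix keeps two jobs started by the same user in the
    # same microsecond from getting the same name (barring a suffix clash).
    random_suffix = secrets.token_hex(suffix_bytes)
    return f"beamapp-{safe_user}-{timestamp}-{random_suffix}"


if __name__ == "__main__":
    print(default_job_name())
```

With 4 random bytes, two names generated at the same instant for the same user still collide with probability about 1 in 2^32; a longer suffix lowers that further at the cost of a longer job name.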



--
This message was sent by Atlassian Jira
(v8.20.1#820001)