Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/06/11 21:24:46 UTC

[GitHub] [beam] udim commented on a change in pull request #11982: [BEAM-6892] Supporting bucket auto-creation for Dataflow.

udim commented on a change in pull request #11982:
URL: https://github.com/apache/beam/pull/11982#discussion_r439078406



##########
File path: sdks/python/apache_beam/io/gcp/gcsio.py
##########
@@ -110,6 +110,27 @@ def parse_gcs_path(gcs_path, object_optional=False):
   return match.group(1), match.group(2)
 
 
+def default_gcs_bucket_name(project, region):
+  from hashlib import md5
+  return 'dataflow-staging-%s-%s' % (
+      region, md5(project.encode('utf8')).hexdigest())
+
+
+def get_or_create_default_gcs_bucket(project, region, kms_key=None):

Review comment:
       I realize that `_create_default_gcs_bucket` already checks for this, but this function should also fail if `kms_key` is set. Otherwise, it implies that a default bucket with a KMS key is acceptable.
   
   To be explicit: the default bucket should not use a KMS key, and if a KMS key is specified, Beam should not use a bucket encrypted with some other key (whether KMS or Google-managed).
   
   Refs:
   https://github.com/apache/beam/pull/8135#discussion_r274695249
   https://github.com/apache/beam/pull/8830/files
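
A minimal sketch of the guard being requested, assuming the check sits at the top of `get_or_create_default_gcs_bucket`. The `get_bucket` and `create_bucket` parameters are hypothetical stand-ins for the real GCS calls, which are not shown in this hunk. The exception type and message are also assumptions, not what the PR uses.

```python
from hashlib import md5


def default_gcs_bucket_name(project, region):
  # Same naming scheme as in the diff above.
  return 'dataflow-staging-%s-%s' % (
      region, md5(project.encode('utf8')).hexdigest())


def get_or_create_default_gcs_bucket(
    project, region, kms_key=None, get_bucket=None, create_bucket=None):
  # Fail before any lookup, so an existing default bucket is never
  # returned when a KMS key was requested.
  if kms_key:
    raise ValueError(
        'A default bucket cannot be used together with a KMS key; '
        'specify a bucket explicitly.')

  name = default_gcs_bucket_name(project, region)
  # get_bucket / create_bucket are hypothetical callables for illustration.
  bucket = get_bucket(name) if get_bucket else None
  if bucket is None and create_bucket:
    bucket = create_bucket(name)
  return bucket
```

With the check at this level, the get path and the create path both reject a KMS key, rather than only the create path inside `_create_default_gcs_bucket`.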



