Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/05 00:13:52 UTC

[GitHub] [beam] damccorm opened a new issue, #21518: Specify GCS Object Version in apache_beam.io.gcp.gcsio

damccorm opened a new issue, #21518:
URL: https://github.com/apache/beam/issues/21518

   I would like to specify a generation when accessing a GCS object via the Beam filesystem.
   On the command line, gsutil can access a specific version with the following syntax:
   
   ```
   gsutil cp gs://{bucket}/{object_path}#{generation} .
   ```
   
   
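   So the corresponding Python code would look something like this:
   ```
   from apache_beam.io.filesystems import FileSystems

   # Proposed syntax (not currently supported): "#{generation}" selects the object version.
   with FileSystems.open("gs://{bucket}/{object_path}#{generation}") as f:
       pass
   ```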
   
   
   Fortunately, the [StorageObjectsGetRequest](https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/internal/clients/storage/storage_v1_messages.py#L2133) can already be passed a generation. 
   However, this is **not done** within the [GcsDownloader](https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/gcsio.py#L611).
   
   I think when [parsing the GCS path](https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/gcsio.py#L583) the generation should be extracted as well. 
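
A rough sketch of what such parsing might look like, assuming the gsutil-style `#{generation}` suffix described above. The names `parse_gcs_path_with_generation` and `GcsLocation` are hypothetical and not part of apache_beam; this is not how gcsio.py currently parses paths.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: bucket, object name, and an optional trailing
# "#<digits>" generation suffix (assumed to mirror gsutil's syntax).
_GCS_PATH = re.compile(r"gs://([^/]+)/(.+?)(?:#(\d+))?")


class GcsLocation(NamedTuple):
    bucket: str
    name: str
    generation: Optional[int]  # None means "latest version"


def parse_gcs_path_with_generation(path: str) -> GcsLocation:
    """Split a gs:// path into bucket, object name, and optional generation."""
    match = _GCS_PATH.fullmatch(path)
    if match is None:
        raise ValueError(
            f"GCS path must look like gs://<bucket>/<object>: {path!r}")
    bucket, name, generation = match.groups()
    # Only an all-digit suffix after the last "#" is treated as a generation;
    # any other "#" stays part of the object name.
    return GcsLocation(bucket, name, int(generation) if generation else None)
```

The parsed generation could then be forwarded to the `generation` field of StorageObjectsGetRequest mentioned above. One caveat of this design: an object whose name genuinely ends in `#` followed by digits would be misread as a versioned path, so a real implementation would need to decide how to handle that ambiguity.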
   
   
   
   
   
   Imported from Jira [BEAM-14165](https://issues.apache.org/jira/browse/BEAM-14165). Original Jira may contain additional context.
   Reported by: l_karls.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org