Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2022/04/14 17:57:00 UTC

[jira] [Commented] (BEAM-14165) Specify GCS Object Version in apache_beam.io.gcp.gcsio

    [ https://issues.apache.org/jira/browse/BEAM-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522469#comment-17522469 ] 

Kenneth Knowles commented on BEAM-14165:
----------------------------------------

[~chamikara]

> Specify GCS Object Version in apache_beam.io.gcp.gcsio
> ------------------------------------------------------
>
>                 Key: BEAM-14165
>                 URL: https://issues.apache.org/jira/browse/BEAM-14165
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>    Affects Versions: 2.37.0
>            Reporter: Lasse Karls
>            Priority: P2
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I would like to specify a generation when accessing a GCS object via the Beam filesystem.
> On the CLI, the gsutil command can access a specific version with the following syntax:
> {code:sh}
> gsutil cp gs://{bucket}/{object_path}#{generation} .
> {code}
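> So the corresponding Python code would look something like this:
> {code:python}
> from apache_beam.io.filesystems import FileSystems
> # Proposed syntax: the trailing "#{generation}" would select the object version.
> with FileSystems.open("gs://{bucket}/{object_path}#{generation}") as f:
>     pass
> {code}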
> {code}
> Fortunately, the [StorageObjectsGetRequest|https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/internal/clients/storage/storage_v1_messages.py#L2133] can already be passed a generation. 
> However, this is +*not done*+ within the [GcsDownloader|https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/gcsio.py#L611]. 
> I think when [parsing the GCS path|https://github.com/apache/beam/blob/14862ccbdf2879574b6ce49149bdd7c9bf197322/sdks/python/apache_beam/io/gcp/gcsio.py#L583] the generation should be extracted as well. 
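
The path parsing suggested in the description could look roughly like the sketch below. It is only an illustration, not Beam's actual implementation: the function name `parse_gcs_path_with_generation` and the `GcsPath` tuple are hypothetical, and it assumes a generation is written as a trailing `#` followed by digits, mirroring the gsutil syntax quoted above. Anything else after a `#` is treated as part of the object name.

```python
import re
from typing import NamedTuple, Optional


class GcsPath(NamedTuple):
    """Hypothetical container for the parsed parts of a gs:// path."""
    bucket: str
    object_name: str
    generation: Optional[int]


# Assumption: a generation is a trailing "#<digits>" suffix, as in gsutil.
_GCS_PATH_RE = re.compile(r"gs://([^/]+)/(.+?)(?:#(\d+))?", re.DOTALL)


def parse_gcs_path_with_generation(gcs_path: str) -> GcsPath:
    """Split a gs:// path into bucket, object name and optional generation."""
    match = _GCS_PATH_RE.fullmatch(gcs_path)
    if match is None:
        raise ValueError(
            "GCS path must be in the form gs://<bucket>/<object>[#<generation>]")
    bucket, object_name, generation = match.groups()
    # Keep None when no numeric suffix is present, so callers can fall back
    # to the live version of the object.
    return GcsPath(
        bucket, object_name, int(generation) if generation is not None else None)
```

The parsed generation could then be forwarded to the StorageObjectsGetRequest linked above when it is not None; where exactly that would be wired into the downloader is left open here.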



--
This message was sent by Atlassian Jira
(v8.20.1#820001)