Posted to commits@beam.apache.org by "Daniel Halperin (JIRA)" <ji...@apache.org> on 2017/05/08 17:35:04 UTC

[jira] [Comment Edited] (BEAM-2150) Support for recursive wildcards in GcsPath

    [ https://issues.apache.org/jira/browse/BEAM-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001163#comment-16001163 ] 

Daniel Halperin edited comment on BEAM-2150 at 5/8/17 5:34 PM:
---------------------------------------------------------------

Or rather, maybe we should change our thinking. Rather than thinking of this as adding recursive glob support, we should think of this as relaxing the regex accepted by the transform.

The key question is whether {{**}} is a meaningful glob -- if so, we are treating this as a filesystem path and should probably try to match the behavior of {{gsutil}}. But if we just say these are normal regular expressions, then {{**}} is actually the same as {{*}} and your changes make sense.

I'm fine with _either_, as long as we document this (say, on GCS FileSystem).
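
To make the glob reading concrete, here is a minimal sketch of a glob-to-regex translation in which {{*}} stops at a path separator and {{**}} does not. The class {{GlobSketch}} and its methods are hypothetical names used only for illustration; this is not the existing Beam code, and it only approximates how {{gsutil}} treats the two wildcards.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Beam's or gsutil's implementation.
final class GlobSketch {
  private GlobSketch() {}

  /** Translates a glob to a regex: "**" may cross '/', while "*" and "?" may not. */
  static String globToRegex(String glob) {
    StringBuilder regex = new StringBuilder();
    int i = 0;
    while (i < glob.length()) {
      char c = glob.charAt(i);
      if (c == '*') {
        if (i + 1 < glob.length() && glob.charAt(i + 1) == '*') {
          regex.append(".*");      // recursive wildcard: any characters, including '/'
          i += 2;
        } else {
          regex.append("[^/]*");   // single wildcard: stays within one path segment
          i += 1;
        }
      } else if (c == '?') {
        regex.append("[^/]");      // exactly one non-separator character
        i += 1;
      } else {
        regex.append(Pattern.quote(String.valueOf(c))); // everything else is literal
        i += 1;
      }
    }
    return regex.toString();
  }

  /** Returns true if the whole path matches the glob under the rules above. */
  static boolean matches(String glob, String path) {
    return Pattern.matches(globToRegex(glob), path);
  }

  public static void main(String[] args) {
    System.out.println(matches("logs/*.txt", "logs/2017/a.txt"));
    System.out.println(matches("logs/**.txt", "logs/2017/a.txt"));
  }
}
```

Under these assumed rules, {{logs/*.txt}} matches {{logs/a.txt}} but not {{logs/2017/a.txt}}, while {{logs/**.txt}} matches both. Under the alternative reading, where {{**}} is no different from {{*}}, the first branch would simply be dropped, and whichever behavior is chosen is what would need documenting.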


> Support for recursive wildcards in GcsPath
> ------------------------------------------
>
>                 Key: BEAM-2150
>                 URL: https://issues.apache.org/jira/browse/BEAM-2150
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core, sdk-java-gcp
>            Reporter: Devon Meunier
>            Assignee: Devon Meunier
>            Priority: Minor
>
> When working with heavily nested folder structures in Google Cloud Storage, it is useful to be able to use recursive wildcards, which the current API explicitly does not support.
> This code hasn't been touched in 2 years so it's likely that simply no one's gotten around to it yet.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)