You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/18 17:24:43 UTC

[GitHub] [beam] gonzojive opened a new pull request, #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

gonzojive opened a new pull request, #21943:
URL: https://github.com/apache/beam/pull/21943

   This makes the memfs implementation consistent with the filesystem
   implementation of List().
   
   The docstring for filesystem.Interface.List does not specify how the pattern
   should be interpretted. It says: "List expands a pattern to a list of
   filenames." Perhaps that docstring should be updated to be more specific.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck merged pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
lostluck merged PR #21943:
URL: https://github.com/apache/beam/pull/21943


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gonzojive commented on a diff in pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
gonzojive commented on code in PR #21943:
URL: https://github.com/apache/beam/pull/21943#discussion_r906638189


##########
sdks/go/pkg/beam/io/filesystem/memfs/memory.go:
##########
@@ -54,14 +55,18 @@ func (f *fs) List(_ context.Context, glob string) ([]string, error) {
 	f.mu.Lock()
 	defer f.mu.Unlock()
 
-	pattern, err := regexp.Compile(glob)
-	if err != nil {
-		return nil, err
+	globNoScheme := strings.TrimPrefix(glob, "memfs://")
+	if globNoScheme == glob {
+		return nil, fmt.Errorf("invalid pattern %q does not have prefix memfs://", glob)
 	}
 
 	var ret []string
 	for k := range f.m {
-		if pattern.MatchString(k) {
+		matched, err := path.Match(globNoScheme, strings.TrimPrefix(k, "memfs://"))

Review Comment:
   Done.
   
   Please note that I also made the pattern accepted by List() more tolerant. It no longer requires a memfs:// prefix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519407

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gonzojive commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
gonzojive commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1166197500

   I'm not sure what the etiquette is for resolving comments on github, but I think I addressed the comment, so I resolved it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519408

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
lostluck commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1162017374

   In this instance, I think we should change it.
   
   1. We don't really demonstrate usage of the memfs anywhere. https://github.com/apache/beam/search?l=Go&q=memfs
   2. We don't have properly defined behavior.
   3. Regex in this case is surprising, especially if coming from the local or gcs file systems.
   
   Looking at the local and gcs implementations, they use filepath.Match/Glob instead of plain path.Match. filepath is OS separator aware.
   
   A github search doesn't reveal any usage, and neither does the [pkg page for v2](https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam/io/filesystem/memfs?tab=importedby).  Checking the pre-modules version has [1 package ](https://pkg.go.dev/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/memfs?tab=importedby) that hasn't updated in 3 years, and doesn't use it explicitly (but does use local and gcs).
   
   As a result, I think a move towards consistency, and documenting the behavior is the right one.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] codecov[bot] commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519769

   # [Codecov](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#21943](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c9d950f) into [master](https://codecov.io/gh/apache/beam/commit/525a169e6f807e301f1ac5e039645d4961da18d7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (525a169) will **increase** coverage by `0.00%`.
   > The diff coverage is `62.50%`.
   
   ```diff
   @@           Coverage Diff           @@
   ##           master   #21943   +/-   ##
   =======================================
     Coverage   73.98%   73.98%           
   =======================================
     Files         702      702           
     Lines       92845    92849    +4     
   =======================================
   + Hits        68687    68691    +4     
     Misses      22903    22903           
     Partials     1255     1255           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | go | `50.83% <62.50%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [sdks/go/pkg/beam/io/filesystem/memfs/memory.go](https://codecov.io/gh/apache/beam/pull/21943/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9maWxlc3lzdGVtL21lbWZzL21lbW9yeS5nbw==) | `92.40% <62.50%> (+0.40%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [525a169...c9d950f](https://codecov.io/gh/apache/beam/pull/21943?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519406

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519405

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159519404

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1159526585

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @damccorm for label go.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #21943:
URL: https://github.com/apache/beam/pull/21943#issuecomment-1172888589

   Reminder, please take a look at this pr: @damccorm 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #21943: Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp

Posted by GitBox <gi...@apache.org>.
lostluck commented on code in PR #21943:
URL: https://github.com/apache/beam/pull/21943#discussion_r902860553


##########
sdks/go/pkg/beam/io/filesystem/memfs/memory.go:
##########
@@ -54,14 +55,18 @@ func (f *fs) List(_ context.Context, glob string) ([]string, error) {
 	f.mu.Lock()
 	defer f.mu.Unlock()
 
-	pattern, err := regexp.Compile(glob)
-	if err != nil {
-		return nil, err
+	globNoScheme := strings.TrimPrefix(glob, "memfs://")
+	if globNoScheme == glob {
+		return nil, fmt.Errorf("invalid pattern %q does not have prefix memfs://", glob)
 	}
 
 	var ret []string
 	for k := range f.m {
-		if pattern.MatchString(k) {
+		matched, err := path.Match(globNoScheme, strings.TrimPrefix(k, "memfs://"))

Review Comment:
   Please switch this to (path/filepath) filepath.Match instead, for consistency with the gcs and local implementations. In practice it's always linux.  Please also update the documentation in filesystem.Interface.List to be explicit as you suggested.
   
   Thank you for the contribution!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org