You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2014/11/21 16:37:28 UTC

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

GitHub user zentol opened a pull request:

    https://github.com/apache/incubator-flink/pull/224

    [FLINK-1270] FS.get() supports relative paths

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/incubator-flink fs_get_rel

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-flink/pull/224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #224
    
----
commit 86916b0ba818f8dc08c938e9a57441f51d94739e
Author: zentol <s....@web.de>
Date:   2014-11-21T15:33:59Z

    [FLINK-1270] FS.get() supports relative paths

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-flink/pull/224


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/incubator-flink/pull/224#issuecomment-64629081
  
    Very cool, looks good.
    
    +1 to merge this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/incubator-flink/pull/224#issuecomment-64234711
  
    Okay, I see that point and agree that it would be good to have.
    
    Could you add a few tests that check exactly those cases and make sure that the system behaves correctly? It would be cool to also have a check for the *hdfs* scheme with and without authority, so see if those are cases are correctly parsed with the changes.
    
    (We should have had such test in the first place, but this seems like a good point to add them.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/incubator-flink/pull/224#issuecomment-64086044
  
    This change has probably a lot of implications. We may loose some normalization of the paths by skipping that call.
    
    Also, in commit 8b39ba9e45145a30c2b95c47d7b03bc3ed2a8a3a I played with various versions of paths. Turns out that it is very easy to forget a slash in the scheme part of the URL, which is currently caught by the check for absolute paths. We may loose that as well.
    
    The `get()` method only obtains the file system, it does not actually open a file stream. The only important parts here are the scheme and the authority parts of the URI. Is it any problem when interpreting the path as absolute?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/incubator-flink/pull/224#issuecomment-64092085
  
    ```
    Path relativePath = new Path("wordcount.py");
    Path absolutePath = relativePath.makeQualified(relativePath.getFileSystem());
    ```
    This code fails since its not an absolute path. I think it absolutely should work since relative paths are so common. without *some* change, supporting relative paths is always a clunky affair since you have to check for scheme's yourself before using `getAbsolutePath()`.
    
    Regarding normalization: We could move `uri = new URI("file", null, new File(uri.getPath()).getAbsolutePath(), null);` into the URISyntaxException block. It will only be executed if the current code fails anyway, and as such shouldn't affect working code.
    
    Regarding forgotten slashes:  If you go through the Path class the scheme will be correctly identified no matter how many slashes are behind `file:`, and if there are none it throws an exception in the constructor. The only problematic case i can think of is : `FileSystem.get(new URI("file:world.txt"))`. In this case, getPath would return null. But since the scheme is correctly identified as `file` the code in questions will not be executed, and as such has no bearing on this issue.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-flink pull request: [FLINK-1270] FS.get() supports relat...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the pull request:

    https://github.com/apache/incubator-flink/pull/224#issuecomment-64597521
  
    I've addressed the following:
     1. the path returned by `getAbsolutePath()` is also normalized using `Path`
     2. added tests for Path.makeQualified() to check that absolute/relative local paths work
     3. added tests for the parsing in the Path constructor. I used various hdfs paths, with(-out) authority and varying # of slashes
     4. added tests for FileSystem.get(), making sure that both absolute/relative local paths work


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---