Posted to commits@beam.apache.org by "Stephen Sisk (JIRA)" <ji...@apache.org> on 2017/04/24 23:41:04 UTC

[jira] [Commented] (BEAM-2069) Remove ResourceId.getCurrentDirectory()?

    [ https://issues.apache.org/jira/browse/BEAM-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982123#comment-15982123 ] 

Stephen Sisk commented on BEAM-2069:
------------------------------------

It may be worth moving the directory check onto the FileSystem implementation. It's not clear that a wrapper around a string (which is what ResourceId is) will ever be able to answer this question, and this is how Hadoop handles it: org.apache.hadoop...FileSystem has an isDirectory() method.

It was pointed out to me that I could inject Hadoop's FileSystem into the ResourceId and call isDirectory there, so there is likely a solution for Hadoop.
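
To make the injection idea concrete, here is a rough sketch under stated assumptions. None of these names are Beam or Hadoop APIs: DirectoryOracle, SketchResourceId and InMemoryOracle are hypothetical, and DirectoryOracle stands in for whatever directory check the real file system offers (for Hadoop, presumably a thin wrapper around FileSystem's isDirectory). The sketch also assumes absolute, /-separated paths.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the single question we need a file system to answer.
interface DirectoryOracle {
    boolean isDirectory(String path);
}

// Hypothetical resource id: a path string plus the injected oracle.
// Assumes absolute, '/'-separated paths.
final class SketchResourceId {
    private final String path;
    private final DirectoryOracle fs;

    SketchResourceId(String path, DirectoryOracle fs) {
        // Trailing slashes are stripped, so the string alone carries no
        // hint about whether it names a directory.
        this.path = stripTrailingSlash(path);
        this.fs = fs;
    }

    // Returns this id if it is a directory, otherwise its parent directory.
    SketchResourceId getCurrentDirectory() {
        if (fs.isDirectory(path)) {
            return this;
        }
        int slash = path.lastIndexOf('/');
        String parent = slash <= 0 ? "/" : path.substring(0, slash);
        return new SketchResourceId(parent, fs);
    }

    String path() {
        return path;
    }

    private static String stripTrailingSlash(String p) {
        return (p.length() > 1 && p.endsWith("/")) ? p.substring(0, p.length() - 1) : p;
    }
}

// Toy oracle for illustration only: a fixed set of known directories.
final class InMemoryOracle implements DirectoryOracle {
    private final Set<String> dirs = new HashSet<>();

    InMemoryOracle(String... directories) {
        for (String d : directories) {
            dirs.add(d);
        }
    }

    @Override
    public boolean isDirectory(String path) {
        return dirs.contains(path);
    }
}
```

With an oracle that knows /data and /data/logs are directories, getCurrentDirectory() on /data/logs returns the same path, while on /data/logs/part-0 it returns /data/logs. The path string is identical in form in both cases; only the injected check tells them apart, which is the reason a bare string wrapper cannot implement this method.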

> Remove ResourceId.getCurrentDirectory()?
> ----------------------------------------
>
>                 Key: BEAM-2069
>                 URL: https://issues.apache.org/jira/browse/BEAM-2069
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: First stable release
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>              Labels: backward-incompatible
>
> Beam ResourceId currently has a getCurrentDirectory method that returns the current resource id if it's a directory, or the parent directory if it's a file.
> To implement this you need to know whether a particular path is a directory.
> I'm trying to write the Hadoop ResourceId implementation, and it's not clear that this is possible. Hadoop's Paths do not end with a / if they are a directory (the trailing slash is stripped), nor do Hadoop paths tell you whether something is a directory, so it's not possible to determine whether a given path is a file without a suffix or a directory.
> It's not clear to me that all file systems can determine whether a path is a directory, and thus I don't believe it can be implemented reliably.
> The only usages of getCurrentDirectory that I could find are in tests, so it's not clear we actually need this.
> I propose that we remove this method.
> cc [~davor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)