Posted to common-dev@hadoop.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2007/09/14 03:36:32 UTC

[jira] Created: (HADOOP-1891) "." is converted to an empty path

"." is converted to an empty path
---------------------------------

                 Key: HADOOP-1891
                 URL: https://issues.apache.org/jira/browse/HADOOP-1891
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.14.1
         Environment: Linux
            Reporter: Olga Natkovich


Path p = new Path(".");
System.out.println("path=(" + p.toString() +")");

path=()

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530154 ] 

Doug Cutting commented on HADOOP-1891:
--------------------------------------

> Since this fails to do what one would reasonably expect with "." as a path [ ... ]

Hmm.  It does what I'd expect.  "./foo" and "foo" name the same file, no?  What's unexpected?

> new Path(".") should throw an exception [ ... ]

I don't see why.  Having an unresolved Path that represents the connected directory seems reasonable to me.



[jira] Assigned: (HADOOP-1891) "." is converted to an empty path

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas reassigned HADOOP-1891:
-------------------------------------

    Assignee: Chris Douglas



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530261 ] 

Chris Douglas commented on HADOOP-1891:
---------------------------------------

I see now what you meant, and I retract my point: the existing behavior matches expectations, except as in the original example.

Coupled with HADOOP-1909, I like the idea of leaving Paths relative until dereferenced within a FileSystem. Would it make sense to go further and *require* all Paths to be dereferenced this way? There's a lot of string manipulation and special-casing in Path, particularly for Windows filesystems. Pushing that out to the FS seems like a reasonable abstraction. Introducing a new type would also let users employ POSIX semantics for Paths, but URI semantics for Hadoop Paths (as in HADOOP-1858). The new type could even be a subtype of Path, where Path assumes the default FileSystem where it's used in a URI context (just as it does now). It would be a pervasive/risky change, though...




[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530047 ] 

dhruba borthakur commented on HADOOP-1891:
------------------------------------------

My thinking is that new Path(".") should throw an exception if there isn't enough information to convert it into an absolute path name.



[jira] Issue Comment Edited: (HADOOP-1891) "." is converted to an empty path

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530191 ] 

chris.douglas edited comment on HADOOP-1891 at 9/25/07 11:20 AM:
-----------------------------------------------------------------

> Hmm. It does what I'd expect. "./foo" and "foo" name the same file, no? What's unexpected?

Well, "." and {{fs.getWorkingDirectory()}} aren't the same thing, as in the above example. That was surprising to me, at least. Path can keep enough information after URI normalization to know that the original was a relative path when the string is "./foo", but not when it's simply "."

Path already throws when it gets an empty string; would it be reasonable to assume that a Path successfully constructed as the empty string refers to the working directory? I can't think of a situation where reporting its URI as Path.CUR_DIR would be an error. It would also work in {{new Path("foo/bar", "../..")}}, etc.

What problem is this causing?

[Edit]
We'd also see fewer bugs like HADOOP-1902.
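
The fallback proposed in this comment (report an empty normalized path as the current directory) can be sketched with java.net.URI alone. This is only an illustration under stated assumptions: CurDirFallback and display() are hypothetical names, the local CUR_DIR constant stands in for Path.CUR_DIR and is assumed to be ".", and none of this is Hadoop's actual Path code.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop.
final class CurDirFallback {
    // Stand-in for Path.CUR_DIR; assumed to be ".".
    static final String CUR_DIR = ".";

    /** Normalizes a relative path string and reports an empty result as CUR_DIR. */
    static String display(String raw) {
        String normalized = URI.create(raw).normalize().getPath();
        // Normalization excises "." segments, so "." collapses to "".
        return normalized.isEmpty() ? CUR_DIR : normalized;
    }
}
```

With this fallback, display(".") yields "." instead of an empty string, while display("./foo") still yields "foo". A path whose segments cancel out, such as "foo/bar/../..", would also be reported as ".".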



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530206 ] 

Doug Cutting commented on HADOOP-1891:
--------------------------------------

> Well, "." and fs.getWorkingDirectory() aren't the same thing, as in the above example.

Can you describe what you'd expect the example to print?  Perhaps the fix is to avoid normalizing URIs until they are dereferenced within a FileSystem implementation?  That way "./foo" would print as "./foo" rather than just "foo".
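
One way to picture the deferred normalization suggested here is a wrapper that stores the string exactly as given and only resolves it when a working directory is supplied. This is a rough sketch, not a proposed patch: LazyPath and resolveAgainst() are hypothetical names, the working directory is assumed to be a URI whose path ends in "/", and a real FileSystem implementation would have more cases to handle (schemes, Windows names).

```java
import java.net.URI;

// Hypothetical illustration only; this is not Hadoop's Path class.
final class LazyPath {
    private final String raw;   // kept exactly as the caller passed it in

    LazyPath(String raw) {
        if (raw == null || raw.isEmpty()) {
            throw new IllegalArgumentException("Can not create a path from an empty string");
        }
        this.raw = raw;
    }

    /** Prints the unresolved form, so "./foo" stays "./foo" and "." stays ".". */
    @Override
    public String toString() {
        return raw;
    }

    /** Resolution (and therefore normalization) happens only here. */
    URI resolveAgainst(URI workingDir) {
        return workingDir.resolve(URI.create(raw));
    }
}
```

Under these assumptions, new LazyPath(".").toString() is "." and resolving it against a working directory of "/home/user/" gives the path "/home/user/", while "./foo" resolves to "/home/user/foo".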



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530021 ] 

Chris Douglas commented on HADOOP-1891:
---------------------------------------

I'm uncertain of the correct behavior here. Absent a filesystem, or a configuration to determine the default filesystem, there's no "working directory" to resolve. Regrettably, this:

{noformat}
Configuration conf = new Configuration();
Path cwd = new Path(".");
Path kid1 = new Path(cwd, "blah");
Path kid2 = new Path(FileSystem.get(conf).getWorkingDirectory(), "blah");
// kid1: blah
// kid2: /home/user/blah
{noformat}

is neither intuitive nor succinct. Paths are evaluated at construction and segments matching dot are summarily excised as part of URI normalization. Since this fails to do what one would reasonably expect with "." as a path, would it make sense to throw in this case? Certainly, Path doesn't have enough information to do much else.
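
The normalization step described above can be reproduced with java.net.URI on its own. The sketch below is only an illustration: DotNormalization and normalizedPath() are hypothetical names, and it assumes Path hands its string to URI normalization as this comment describes, without involving Hadoop itself.

```java
import java.net.URI;

// Hypothetical demo class, not part of Hadoop.
final class DotNormalization {
    /** Returns the path component that survives java.net.URI normalization. */
    static String normalizedPath(String raw) {
        return URI.create(raw).normalize().getPath();
    }

    public static void main(String[] args) {
        // The lone "." segment is removed, leaving an empty path.
        System.out.println("(" + normalizedPath(".") + ")");      // prints ()
        // The leading "." segment is removed, leaving "foo".
        System.out.println("(" + normalizedPath("./foo") + ")");  // prints (foo)
    }
}
```

The first line of output matches the empty path in the original report; the second shows why "./foo" and "foo" end up indistinguishable after construction.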



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530174 ] 

Owen O'Malley commented on HADOOP-1891:
---------------------------------------

Paths can be relative and that is handy. Most applications want to make them fully qualified sooner rather than later, but I don't think an exception is the right answer.



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530271 ] 

Doug Cutting commented on HADOOP-1891:
--------------------------------------

> There's a lot of string manipulation and special-casing in Path, particularly for Windows filesystems. Pushing that out to the FS seems like a reasonable abstraction.

One problem is that there's lots of code that passes things returned by File#getPath() to 'new Path(String)', and Windows file names are invalid URI paths.  When we added Path.java to Hadoop we needed to do so back compatibly, since lots of user code manipulates file names and we didn't want to break it.

To avoid processing Windows-specifics in Path.java and stay compatible, we'd need to either avoid creating URIs in a Path at all, or we'd have to escape backslashes and colons in the URI's path, and have FileSystem implementations remove those escapes.  Perhaps that would work, although it might be hard to make it back-compatible with existing code.

I've pulled a lot of my hair out in the process of getting Path to work on Windows and am personally reluctant to revisit this.  But feel free to experiment and see if you can find a cleaner approach.



[jira] Commented: (HADOOP-1891) "." is converted to an empty path

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530191 ] 

Chris Douglas commented on HADOOP-1891:
---------------------------------------

> Hmm. It does what I'd expect. "./foo" and "foo" name the same file, no? What's unexpected?

Well, "." and {{fs.getWorkingDirectory()}} aren't the same thing, as in the above example. That was surprising to me, at least. Path can keep enough information after URI normalization to know that the original was a relative path when the string is "./foo", but not when it's simply "."

Path already throws when it gets an empty string; would it be reasonable to assume that a Path successfully constructed as the empty string refers to the working directory? I can't think of a situation where reporting its URI as Path.CUR_DIR would be an error. It would also work in {{new Path("foo/bar", "../..")}}, etc.

What problem is this causing?
