You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Matt Foley (JIRA)" <ji...@apache.org> on 2011/06/15 00:10:47 UTC

[jira] [Commented] (HADOOP-7360) FsShell does not preserve relative paths with globs

    [ https://issues.apache.org/jira/browse/HADOOP-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049467#comment-13049467 ] 

Matt Foley commented on HADOOP-7360:
------------------------------------

TestPathData:

Multiple places: The preferred way to express "new Path(testDir+"/d1")" is "new Path(testDir, "d1")".  This avoids hard-coding the path delimiter for a local filesystem path.

initialize():
* Consider "fs.createNewFile(Path)" instead of "fs.create(Path).close()"
* If you put the "setWorkingDirectory" command near the beginning of the method, wouldn't all the creates and mkdirs get simpler?

I'm concerned that testDir is a relative path.  Can you convert it to an absolute path as early in the process as possible, before anyone might have set a non-default working directory?

Why did you remove testWithFsAndPath()?  It is not a null test.

getParent() should be named testGetParent().

testRelativeGlobBack(): see previous concern about making "testDir" absolute before interacting with second and later setWorkingDirectory() invocations.  This might give you testDir/testDir/d1 ?

PathData:

Agree with your elimination of PathData(FileSystem, Path, FileStatus),
but suggest also eliminating PathData(FileSystem, String, FileStatus)
and PathData(FileSystem, FileStatus).
Basically, the status should be looked up in the FS, not set as an argument.
Should do so at the end of method PathData(FileSystem, String).
In comment for PathData(FileSystem, String), remove "Convenience" adjective.  It becomes the primary ctor.

setStat(boolean ignoreFNF) is counter-intuitive.  Most people seeing "setStat(boolean)" will assume the status is being set to the argument.  Please call it lookupStatus(boolean ignoreFNF).  Or could use two methods, both with zero arguments, called setStat() and setStatIgnoreFNF().

Suggest letting lookupStatus(ignoreFNF) return the stat value instead of void.  Then refreshStatus just becomes "return lookupStatus(false);".

getChecksumFile(): Why is this method needed (returns PathData), vs just using Fs.getChecksumFile() (returns Path)?

getParent(): Several concerns:
* Why is this method needed instead of just using Path.getParent()?  There are currently no callers of this new method.
* Why is it important to preserve the relative-ness of the parent path?  Trying to do so exposes you to serious issues because the current working directory may have changed since PathData#string was stored, which means the value returned by this method could be meaningless if string was relative.  I think I would go so far as to say this can't work.  I would much rather let Path.getParent() do what it was carefully implemented to do.
* The same issue applies to the next two chunks.

This brings up the issue of why PathData saves the value of #string?
This just begs for such mis-use.  I've opened HADOOP-7393 for this.


> FsShell does not preserve relative paths with globs
> ---------------------------------------------------
>
>                 Key: HADOOP-7360
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7360
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.23.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-7360.patch
>
>
> FsShell currently preserves relative paths that do not contain globs.  Unfortunately the method {{fs.globStatus()}} is fully qualifying all returned paths.  This is causing inconsistent display of paths.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira