You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Daryn Sharp (Created) (JIRA)" <ji...@apache.org> on 2012/03/05 18:27:57 UTC

[jira] [Created] (HADOOP-8139) Path does not allow metachars to be escaped

Path does not allow metachars to be escaped
-------------------------------------------

                 Key: HADOOP-8139
                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.23.0, 0.24.0
            Reporter: Daryn Sharp
            Priority: Blocker


Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.

Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
{noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
"hadoop fs -rmr -skipTrash /user/me/*"{noformat}
* User/Admin: Nuked their home directory or any given directory

{noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
"hadoop fs -rmr -skipTrash /*"{noformat}
* User:  Deleted _everything_ they have access to on the cluster
* Admin: *Nukes the entire cluster*

Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227199#comment-13227199 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

Daryn, why are we trying to provide incomplete support for back slash based paths? Why not just remove support for it. I thought we decided on this approach and agreed that it is backward incompatible change.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226434#comment-13226434 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

This patch looks like a good start.
 - Someone should test this on Windows before it is committed.  I suspect there will be several other places besides RawLocalFileSystem where Windows paths with backslashes enter the system.
 - I like the new Path(File) constructor.  Rather than using toString() maybe it could use toURI()?
 - Rather than removing the {code}new Path("c:\\a\\b"){code} test, you might change it to something like {code}
for (File root : File.listRoots())
  assertEquals(???, new Path(new File(root, "a"+File.separator+"b")));
{code}
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226574#comment-13226574 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> if it's drive-letter:\, then it's converted as if it's a file

That might work if most of the cases are fully-qualified with a drive letter, which they probably are.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228052#comment-13228052 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> From what I know for a while windows tests have not been run on Hadoop.

If tests have not been run in a while is that a license to knowingly break things and make incompatible changes?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226291#comment-13226291 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

bq.  You can't always tell whether it's a local filesystem URI. Lots of relative paths are constructed that do not contain a scheme.

Yes, but I might be able to pull it with some creative path manipulations where ultimately RLFS on windows converts ^ to \...
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227554#comment-13227554 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

Yes, I think we all agreed on the incompatible change, but Doug wanted additional blockers that I interpreted as making this jira unfit for 23.2.  Re:

bq.  Doug: I think this is a good approach but it will break things on Windows and we should file blockers for these regressions or fix them here.

The latest patch should work in the vast majority of cases.  I believe the only case it arguably remains incompatible is {{new Path("dir", "dir2\foo")}}.  The first component of the path does not look like a native windows path so it is interpreted as {{dir/dir2\foo}} with the assumption it's a normal URI.

I need clarification and agreement from you and Doug on what is sufficient to warrant a possible followup jira, but avoid a blocker.  Thanks.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222522#comment-13222522 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

Daryn, are you referring to Path#normalizePath() where "\\" is replaced by "/"?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139-4.patch

Ok, only change to unix is quotable meta-chars.

Windows rules, please confirm:
* c:/... - no conversion
* c:\... - is converted
* scheme:... - no conversion
* \... - is converted (correct??)
* /... - not converted
* multi-arg ctors: first path determines how next path is handled; ie. handled the same

I've added some more tests, although I think more are needed as suggested by Doug.  I've tested as much as I can by flipping WINDOWS to true in the source.  It dies on cygwin.

Unfortunately I do not have access to a windows machine.  I will be eternally grateful if somone within the community will help me test on windows.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226429#comment-13226429 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

I was rushing to meeting, so allow me to re-summarize what (I think) I did.
* Existing Path ctors require the path to have a dir sep of / because they expected to be URIs
* Path(File) can instantiate path objects for the local fs.  For unix, it does the same as Path(file.toString()), so no change.  For windows, it allows \ paths by rewriting the File to sub \ to /.
* RLFS will rewrite windows paths to sub ^ with \ in order to preserve windows quoting semantics.  Unix is unchanged.

This means that windows users can use \ paths only with {{File}} objects, and convert to {{Path}} with {{new Path(new File("\foo\bar"))}} which turns into {{/foo/bar}}.  However, {{new Path("\foo\bar")}} is presumed to be a URI so it remains {{\foo\bar}}.  RLFS for windows will allow {{^*}} to be rewritten into {{\ *}} so the normal globbing code will operate as expected.

In short:
# Behavior is consistent between unix and windows, unless windows uses File objects to obtain local semantics
# Unix allows \ as a meta-quote.
# Windows RLFS paths may use \ and ^ semantics, while other paths follow the URI semantics of / and \.

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223878#comment-13223878 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. I suspect there are folks who manage to run Hadoop on Windows with Cygwin today that such a change would break. This would be a regression.
I am not sure how your change this addresses comment from Doug. See the comment below.

Here are the scenarios to consider (sorry I do not have access to Windows PC).
# On windows terminal paths use '/' and WINDOWS is true.
# On Cygwin terminal paths use '\' and WINDOWS is (probably) true. In this case the patch might introduce unnecessary behavior?


 
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139.patch

I haven't run the full suite of tests, but I want to get the patch up for comments this evening.  I did have to make a couple of lines to the glob parser due to directly bugs found during testing this patch.  I'll make it a separate jira if there are objections to including it.

Glob will try to build a regexp for each path component.  If it doesn't see an unescaped shell metachar, then it falls back to using the raw path component string.  In the case of quoted metachars, the quoting is never removed.  I fixed that.

Ironically, the glob quoting would only work if there was also an unquoted metachar.  This forced the use of a regexp where the unstripped quoting was valid.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225606#comment-13225606 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. ... this introduces regressions on Windows

Doug, not sure what you mean by this. Hadoop can still use through java File Windows right? All we are changing is, the file paths to/through Hadoop now are based on forward slash.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225647#comment-13225647 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

I'm sorry,  I misunderstood what I was expected to do.  We'd like to get this into 23.2 so I've been trying to avoid an in-depth change.  Adding a windows blocker is undesirable too.

Apparently windows NT has long internally handled / as \, plus the JVM will implicitly do the conversion.  I read it on the internet, so it must be true!  However, for File objects returned from operations, I'm not sure if it's smart enough to use the same delimiter used in the input path.

Here's some options.  Doug, I think you are essentially proposing #2?
# Add a config option to sub \ to /
#* Pros: backward-compatibility for those that want it (ie. no quoting), or Unix and windows can quote.
#* Cons: none?
# No subbing of \ to / (latest patch)
#* Pros: Unix and windows can quote metas
#* Cons: File objects returned from Windows might not convert to Path.  Requires changes to RLFS (*)
# Use the WINDOWS conditional for subbing \ to /
#* Pros: Unix can quote metas.
#* Cons: Unix can't use windows-style paths to access hdfs. Windows can't quota metas.  Not a regression because it can't do that today, but now inconsistent with unix feature.
# Use a WINDOWS conditional for subbing \ to /, and ^ to \ (prior patch)
#* Pros: Unix and windows can quota metas
#* Cons: The quoting requires different/inconsistent chars that are system-dependent.  Unix can't use windows-style paths.

(*) A new Path ctor might make this easy to fix:
{code}
Path(File file) {
  (File.separatorChar == Path.SEPARATOR_CHAR)
    ? this(file.toString())
    : this(file.toString().replace(File.separatorChar, Path.SEPARATOR_CHAR));
  }
}
{code}
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222609#comment-13222609 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

I'm hoping someone with more {{FileSystem}} expertise and historical knowledge will fix this bug.  I'm unclear if/how we can use \ and / inter-changably.  I'd think we should just remove it.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Target Version/s: 0.24.0, 0.23.3  (was: 0.23.3, 0.24.0)
              Status: Open  (was: Patch Available)
    
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp reassigned HADOOP-8139:
-----------------------------------

    Assignee: Daryn Sharp
    
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228060#comment-13228060 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. If tests have not been run in a while is that a license to knowingly break things and make incompatible changes?
Doug, I am confused by this statement.

The last proposal was to add {{ if (windows) }} check for the code that replaces "\\" with "/". For a larger change of removing support for "\\" altogether, I agree, we need more elaborate tests. So for that one line change, I was suggesting, no windows testing is needed, given no one seems to have windows machines.

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227925#comment-13227925 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Why? Once the decision is made to support only / based paths, HADOOP-8079 will work also with that path format right?

Since the folks working on that will have to pick up the pieces then perhaps they ought to be consulted first, or at least warned?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226508#comment-13226508 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

How this change to Path(String): if it's drive-letter:\, then it's converted as if it's a file.  The path remembers it's file-based so the 2-arg ctors can convert the child path too.  Not perfect, but as about as close to backwards-compatible as reasonably possible.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Alexander Stojanovic (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227924#comment-13227924 ] 

Alexander Stojanovic commented on HADOOP-8139:
----------------------------------------------

We are only using backslashes (\) for escaping meta-characters and not as alternative path-separators. So, things should work in a uniform manner.     
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225468#comment-13225468 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Disallow '\' as separator in paths. I believe, Hadoop should choose a format for the path and not changed based on the OS. 

I'm fine with this approach too.  It'll be more work to get things working on Windows again this way, and that work will have to be done.  So it's not so much a back-compatibility issue as it will introduce regressions on Windows that will need to be fixed by someone.

> The behavior on Windows on Cygwin and without Cygwin could be very different.

I don't think Cygwin has much to do with it.  Cygwin's mostly just used to run shell scripts.  Directory listings from java.io.File, classpath entries, resource paths, etc. are all the same with or without Cygwin, since these come from the JVM.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226462#comment-13226462 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

Thanks for the feedback, Doug.  I'm glad we appear to be zeroing in on a solution.

bq. I suspect there will be several other places besides RawLocalFileSystem where Windows paths with backslashes enter the system.
Do you have any idea what or where these may lurk...?  Once a user has instantiated a path via a file, further path manipulations _should_ work.  I'll have to think about it some more.

bq. I like the new Path(File) constructor. Rather than using toString() maybe it could use toURI()?
I tried that first, but it fully qualifies the path.  It uses the actual current working directory, instead of the one set in the filesystem.  It also percent encodes the path, so I had to use URLDecoder.  Messy.

Re: the tests, good idea.  I didn't update the test because I was trying to throw up a proof of concept to see if it was a reasonable approach before I spent too much time on it.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225560#comment-13225560 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517612/HADOOP-8139-2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/689//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/689//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224575#comment-13224575 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> For windows only, \ is converted to /, and the windows escape char ^ is converted to \.

Ah.  Now I get it.  I like this approach!  +1
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229557#comment-13229557 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

@Alexander - Yes, nearly every FsShell command supports unix style globs.  "tail" is the only exception that comes to mind.  The problem is the subbing of \ to / rendered it impossible to quote metachars.  For instance, \ * turned into /*!

One of the problems I encountered with trying to modify RLFS is it seems, based only on reading, that local files may be returned with \ or with /.  If true, we probably don't want to blindly convert \ to /.

If we don't want the as-backwards-compatible-as-possible solution I tried to implement, we can try modifying RLFS to sub \ to / during File->Path conversion if File.separatorChar != Path.SEPARATOR.  RLFS will also need to override globStatus to convert \ to ^ for quoting of metachars.

I think this is doable as long as we are willing to sacrifice c:\dir\... in 23.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223653#comment-13223653 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

What about something like this?
{code}
   private String normalizePath(String path) {
     // remove double slashes & backslashes
     path = StringUtils.replace(path, "//", "/");
-    path = StringUtils.replace(path, "\\", "/");
+    if (WINDOWS) {
+      path = StringUtils.replace(path, "\\", "/");
+      path = StringUtils.replace(path, "^", "\\");
+    }
{code}
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139-5.patch

I believe this patch should not introduce incompatibilities with current behavior.

Had to remove Path(File) logic from RLFS because hadoop can return home dir paths, etc with / in them.  Forcing them to parse as local is inappropriate in those cases.  The "intelligence" when parsing windows paths should now handle the paths correctly.  Moved ^ handling into FileSystem when the path is being converted from native to ensure consistency in path handling.

Minor change to not parse windows path as native if it contains a /.  Corrects inconsistency where the user may be trying to use URIs and quoting the first char of the path.

I was able to test by manually flipping the value of Path.WINDOWS.  The final test case of course failed when trying to call cygwin.

Please let me know if this is sufficient for 23.2 w/o a blocker.  Thanks!
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226619#comment-13226619 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517769/HADOOP-8139-3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.fs.TestPath

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/695//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/695//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224444#comment-13224444 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

bq. It'd be nice to fix the problem for Windows users too.
True, which I believe it would it in the small snippet above.  For windows only, \ is converted to /, and the windows escape char ^ is converted to \.

bq. Why not try fixing it in RawLocalFileSystem?
I think that could work, but windows users who might be relying on "\my\hdfs\path" to work will be in for an incompatible surprise.  While it might be the right long-term approach, I think presents more risk than conditionalizing the \ to / for now.

bq. Thinking further, perhaps this is not a bug. Perhaps the user error is assuming that backslash is an escape character in paths. The escape character in URIs is %. Does this work with 'rm %2A'?

No, it doesn't work because paths are not subject to percent encoding.  Path has hybrid behavior of a URI and a File.  Path is really only using a URI to hold the scheme and authority.  The rest of the URI is considered a literal path.  Percent encoding would require many non-alphanumeric characters to be escaped, including almost all of the glob metachars.

Changing Path to require percent encoding would be extremely disruptive and incompatible.


                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227995#comment-13227995 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Doug, is this acceptable?

I trust Alexander on this one.  If he's okay with it I'm okay with it.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139.patch

Sorry, wrong patch.  Finally learned that I can't select my patch file, leave the dialog open, sometime later overwrite the patch & hit submit, and then expect it to pick up the modified file...
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228639#comment-13228639 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

What do people feel is the next step?  Should this jira be resolved as won't fix since Alexander seems to be eliminating support for windows native paths?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223884#comment-13223884 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Any recommendations? Maybe only convert \ to / if the OS is windows?

It'd be nice to fix the problem for Windows users too.

Why not try fixing it in RawLocalFileSystem?

Thinking further, perhaps this is not a bug.  Perhaps the user error is assuming that backslash is an escape character in paths.  The escape character in URIs is %.  Does this work with 'rm %2A'?

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224495#comment-13224495 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

{quote}
Here are the scenarios to consider (sorry I do not have access to Windows PC).
# On windows terminal paths use '/' and WINDOWS is true.
# On Cygwin terminal paths use '\' and WINDOWS is (probably) true. In this case the patch might introduce unnecessary behavior?
{quote}

Good questions. The pre-existing WINDOWS flag is based on {{System.getProperty("os.name").startsWith("Windows");}} so it should be true for both cygwin and non-cygwin.

If the "^" to "\" substitution is removed, then no behavior changes for windows.  It just won't be able to quote metachars.  I'm fine with that.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225397#comment-13225397 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

Hadoop currently allows both paths with '\' or '/' as separators and some times '\' is used as an escape character. This results in issues when the escape characters is interprted as separator (as this jira describes).

What we need is (as you have proposed):
# Paths with '\' as separator only use '^' as escape character.
# Paths with '/' as separator only use '\' as escape character.

But the main problem is to decipher the intended separator in a path.

Some possible directions for solving this problem:
# Disallow '\' as separator in paths. I believe, Hadoop should choose a format for the path and not changed based on the OS. This is my preferred solution - has the following issues.
#* This is backward incompatible. I am guessing the impact of this on existing users is minimmal.
# Allow both separators, the approach you have described.
#* An application that uses '\' based path, works from both windows and non windows system, before this change. With this change, an application needs to use the right path based on the OS. This could be very confusing to the users.
#* The behavior on Windows on Cygwin and without Cygwin could be very different.

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228647#comment-13228647 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

>  Should this jira be resolved as won't fix [ ... ]

Can meta-characters be escaped on Windows now?  If not, then an issue should remain open indicating this, no?  We could perhaps create a new sub-issue of HADOOP-8164, or mark this as a dependency of that one.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224954#comment-13224954 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517501/HADOOP-8139.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.fs.TestPath
                  org.apache.hadoop.ipc.TestRPCCallBenchmark
                  org.apache.hadoop.fs.viewfs.TestViewFsTrash

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/684//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/684//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Alexander Stojanovic (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229455#comment-13229455 ] 

Alexander Stojanovic commented on HADOOP-8139:
----------------------------------------------

@doug and @ daryn - Does the presence of * in paths work as expected in Linux/Unix? This seems to be the parallel issue being partially masked by the discussion of \ vs. / in path specification.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223576#comment-13223576 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

Any recommendations?  Maybe only convert \ to / if the OS is windows?  Windows will still have the bug of not being able to escape metas, but unix will work as expected.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139-2.patch

Changed patch to eliminate conversion of \ to /.  Updated a couple tests in TestPath to no longer expects the conversion.  Thanks to Doug & Suresh for discussion on the issue.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226224#comment-13226224 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> on windows (#4) we are going to have nasty cross-platform script compatibility in the future

Command-line scripts on Windows tend to be .bat files while shell scripts are used on Unix.  What would be required for this to be a problem is a cross-platform scripting environment that doesn't treat backslash as a directory separator on Windows, right?  Do you know of any such?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227931#comment-13227931 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

Alexander filed HADOOP-8079, so I think he's been warned. :)  Alexander, if I understand you correctly, the only change needed to {{Path}} is to completely remove the \ to / substitution?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227973#comment-13227973 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

I think what it's coming down to is HADOOP-8139-2.patch is the desired implementation.  Alexander, please confirm?  Doug, is this acceptable?

We have internal customer testing blocked by this jira, so we please need to find an acceptable closure today.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227928#comment-13227928 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518078/HADOOP-8139-5.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.fs.viewfs.TestViewFsTrash

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/701//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/701//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228035#comment-13228035 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. It looks reasonable to me, but someone should probably run tests on Windows before it is committed, don't you think?
If it is the simpler version of the patch, it would be good run some tests on windows. However, if windows is not available for testing, I am okay with it being not tested on Windows. From what I know for a while windows tests have not been run on Hadoop.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225310#comment-13225310 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517573/HADOOP-8139.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.fs.viewfs.TestViewFsTrash

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/686//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/686//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226298#comment-13226298 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> I feel we should consider removing support for paths using \ as separator.

I think this is a good approach but it will break things on Windows and we should file blockers for these regressions or fix them here.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227902#comment-13227902 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

It looks reasonable to me, but someone should probably run tests on Windows before it is committed, don't you think?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-8139:
------------------------------------

    Target Version/s: 0.24.0, 0.23.3  (was: 0.23.2, 0.24.0)

I created HADOOP-8164. This jira will be used for removing the back slash based path support, once the required testing on Windows is done.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227897#comment-13227897 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. This should be coordinated with HADOOP-8079, no?
Why? Once the decision is made to support only / based paths, HADOOP-8079 will work also with that path format right?

Given that this is a blocker for 0.23, and can cause accidental deletion of data, we need to fix this ASAP. I propose going with earlier solution proposed by Daryn - substituting "\\" with "/", only if it is Windows. Given that this is an incomplete fix, lets do this change in a new jira. We could keep this jira open for more complete solution/testing.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222528#comment-13222528 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

{noformat}
"\\" is replaced by "/"
{noformat}
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223456#comment-13223456 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Does windows resolve paths with forward slashes?

Java on Windows does.

> I was under the impression that hadoop in its current state didn't work on windows [ ... ]

It has worked in the past and may still work with Cygwin for shell scripts.

> could we consider punting, remove the \ stripping, and let those leading the windows effort think about how to fix it correctly?

I suspect there are folks who manage to run Hadoop on Windows with Cygwin today that such a change would break.  This would be a regression.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226728#comment-13226728 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517815/HADOOP-8139-4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.fs.viewfs.TestViewFsTrash

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/696//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/696//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226216#comment-13226216 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> If it's windows and it's a local filesystem URI, do the os-specific subbing.

You can't always tell whether it's a local filesystem URI.  Lots of relative paths are constructed that do not contain a scheme.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229177#comment-13229177 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. Should this jira be resolved as won't fix 
This bug should still be fixed. As discussed earlier, we need to remove support for back slash based paths, make the changes needed in RLFS and test it on Windows. Alexander is currently working on 1.0 experimental branch. Not sure if he will be able to do this in 23 time frame.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226437#comment-13226437 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

Note that even when things are made to work again on Windows after this change it will still be an incompatible change, since programs that created Paths with backslashes won't work the same.  This may be well warranted, but it should still be documented as an incompatible change.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228085#comment-13228085 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. Wrapping it in 'if (WINDOWS) { ... }' fixes the issue for Unix but not for Windows, so a follow up issue for Windows should probably be filed, no?

Agreed. I was actually thinking of doing the simple change in a new jira and make this jira blocker for 0.23. That way the history of these discussions are preserved. 
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139-3.patch

Completely untested, requesting comments.  Add Path(File) ctor that converts \ to / on windows.  RLFS with convert ^ to \ when globbing.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222617#comment-13222617 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

I suspect that simply removing this will break things on Windows.  Windows directory listings return paths with backslashes as the directory separator, but URIs expect slashes as directory separators, and Java on Windows permits slashes as directory separators.

An approach to fixing this might be to change RawLocalFileSystem#listStatus() to replace backslash with slash on Windows, then remove this in Path#normalizePath().  Slash is reserved in Windows path names, so this should be safe there.

There are probably other places that would need fixing too.  On Windows, paths from the classpath and environment will contain backslashes that confuse the directory parsing of Path and URI.  So we'd need to find each place such a path enters and perform substitution there.

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229483#comment-13229483 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

Hadoop supports globbing in some places, like the command line 'rm'.  The glob pattern is passed in as a Path.  The glob patterns are defined in the javadoc:

http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226274#comment-13226274 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

bq. If it's windows and it's a local filesystem URI, do the os-specific subbing.
I already had raised a few issues with this approach. Let me state it again:
# Applications with paths using / as separator could be used from/on Unix/Windows:
#* Some times \ becomes escape character and some times separator. That means the problem stated in this jira some times happens and some times not! We could argue that it is better than what we have today - that is - every time \ is misinterpreted as path separator. Still, this will be very confusing for the users. The only way to around this, I can see is, all users running apps on Unix must use / and windows use \ as separators, which is even more confusing.
# Applications with paths using \ as separator could be used from/on Unix/Windows:
#* Here on windows, things works. The same path will not work at all for Unix!

I feel we should consider removing support for paths using \ as separator. Some of the arguments around RawLocalFileSystem, I have hard time following, may be because I have not seen them.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222602#comment-13222602 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

That appears to be the culprit, but I didn't investigate deeper.  A user reported blowing away all the directory contents while trying to delete a sub-directory erroneously named "*".  I initially thought it was due to improper quoting by the user, until I discovered what {{Path}} does.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp reassigned HADOOP-8139:
-----------------------------------

    Assignee:     (was: Daryn Sharp)

I do not have the resources or knowledge necessary to test on windows.  I hope a windows user will find one of my patches useful.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225425#comment-13225425 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

bq.  Hadoop currently allows both paths with '\' or '/' as separators and some times '\' is used as an escape character.
Yes, hadoop currently allows both as separators, which means currently '\' _never_ works as an escape as intended.

bq.  Disallow '\' as separator in paths. I believe, Hadoop should choose a format for the path and not changed based on the OS.
I completely agree!  I originally suggested "punting" the windows issue, but Doug has valid compatibility concerns.

bq. Allow both separators [...] an application needs to use the right path based on the OS. This could be very confusing to the users.
Agreed.  I don't like inconsistency.

bq. The behavior on Windows on Cygwin and without Cygwin could be very different.
I thought hadoop currently requires cygwin, so do we need to worry about non-cygwin behavior right now?  

My goal is the most minimal non-invasive change to minimize risk. If you and Doug reach a consensus, I'll implement either approach.


                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226538#comment-13226538 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

> Do you have any idea what or where these may lurk...?

It's hard to know without testing on Windows.  Maybe there won't be that any or many.  I find a number of places that call new Path(file.getCanonicalPath()) which might cause problems or might not, depending on what's done with the Path returned.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228061#comment-13228061 ] 

Suresh Srinivas commented on HADOOP-8139:
-----------------------------------------

> code that replaces "
code that replaces backslash
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226131#comment-13226131 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

#2 doesn't look easy.  So how about a more natural hybrid solution?  If it's windows _and_ it's a local filesystem URI, do the os-specific subbing.  For all other URIs, the standard policy of / is dir sep, and \ is quote is used.  The patch will be a bit more complex but I don't think it will be too hard.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Target Version/s: 0.24.0, 0.23.2  (was: 0.23.2, 0.24.0)
              Status: Patch Available  (was: Open)
    
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228151#comment-13228151 ] 

Hadoop QA commented on HADOOP-8139:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518120/HADOOP-8139-6.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/703//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/703//console

This message is automatically generated.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225672#comment-13225672 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

I don't understand the motivation for (1) over (3).  When would a non-windows user want subbing?  When would a Windows user not require it?

(2) will probably require changes to more than RawLocalFileSystem.  You won't know unless you try to run things on Windows.  Last time I worked on this (years ago) I was surprised at the number of places that paths with backslashes crept in.  This is probably the best long-term approach, but it may be a lot of work to stomp all the bugs.

(4) Is simplest, but system-dependent escapes would be unfortunate.

I don't see a clear winner.  I prefer (2) or (4) to (1) or (3): if the ability to escape paths is a critical feature for Unix users then it's probably a critical feature for Windows users.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225755#comment-13225755 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

bq. When would a non-windows user want subbing? When would a Windows user not require it?

I'm just presenting the options, but if we consider hadoop to be generally URI-based, then no subbing should ever occur for URIs.  MS deprecated their non-standard use of \ in URIs back in 2006.  A valid use case _may_ be C:\dir\file.  It's easy to identify the drive letter and convert to file:///C:/dir/file -- official file URI per MS support docs.  However, once you start using relative paths, it becomes murky whether \ is a dir sep or quoting char.  If we want consistent paths and quoting that are system-independent, then we should mandate /.  I think if we run external paths through File (at least for windows), it might normalize for us, but I need to find myself a windows machine to test.

All said, I agree #2 is the ideal for consistency.  #4 or #3 is an immediate remedy for unix.  #3 would not alter the existing windows behavior, giving us time to figure out how we want to improve it.  I'm somewhat concerned that if we let the cat out the bag with a "^" on windows (#4) we are going to have nasty cross-platform script compatibility in the future.

I'll investigate #2 tomorrow morning, but you've concerned me that it may be harder than it looks, and I don't have the means to stomp down windows bugs.  What would you be willing to accept in 23.2 if #2 proves difficult?

                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228074#comment-13228074 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

Sorry, Suresh, I thought we were discussing just removing the line.

Wrapping it in 'if (WINDOWS) { ... }' fixes the issue for Unix but not for Windows, so a follow up issue for Windows should probably be filed, no?  I'll let others determine its priority, but if this is a blocker for Unix users, that one should probably be a blocker for Windows users.  Perhaps not for 0.23.2, which is still a beta, but probably for the next stable release, which I expect should support Windows, since prior stable releases do.  Does that make sense?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222709#comment-13222709 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

I too had thought about {{RawLocalFileSystem}} being the place to hide the translation.  Does windows resolve paths with forward slashes?  If not, there are interesting edge cases like paths containing "useless" escapes.  The \ will now look like a windows directory sep so I guess \ gets stripped and then / turned into \.  Hopefully nobody expects a path with a component name like "foo\bar" since it works in other filesystems...

I was under the impression that hadoop in its current state didn't work on windows, although I know an effort is underway to fix it.  If it doesn't work, could we consider punting, remove the \ stripping, and let those leading the windows effort think about how to fix it correctly?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226295#comment-13226295 ] 

Daryn Sharp commented on HADOOP-8139:
-------------------------------------

But I also agree with Suresh that it sure would be easier to say windows has to use / too.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225569#comment-13225569 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

Do you plan to make any changes to RawLocalFileSystem or test anything on Windows?  If not, since this introduces regressions on Windows, you should probably at least file some issues to fix these.  As regressions, these should probably be blockers on releases that include this change.  These might be sub-tasks of HADOOP-8079.  Should we link this issue to HADOOP-8079?
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Daryn Sharp (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-8139:
--------------------------------

    Attachment: HADOOP-8139-6.patch

Per request by Suresh, posting a patch that fixes unix, and leaves windows unchanged.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139-3.patch, HADOOP-8139-4.patch, HADOOP-8139-5.patch, HADOOP-8139-6.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8139) Path does not allow metachars to be escaped

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225636#comment-13225636 ] 

Doug Cutting commented on HADOOP-8139:
--------------------------------------

If a directory containing backslashes (from RawLocalFileSystem, CLASSPATH, or somewhere) is turned into a Path and someone tries to access the parent directory of that Path then that not work correctly unless backslash is interpreted as a directory separator.  Lots of, e.g., MapReduce unit tests depend on this in order to pass on Windows.  That's why the substitution you're removing was added.
                
> Path does not allow metachars to be escaped
> -------------------------------------------
>
>                 Key: HADOOP-8139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8139
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Blocker
>         Attachments: HADOOP-8139-2.patch, HADOOP-8139.patch, HADOOP-8139.patch
>
>
> Path converts "\" into "/", probably for windows support?  This means it's impossible for the user to escape metachars in a path name.  Glob expansion can have deadly results.
> Here are the most egregious examples. A user accidentally creates a path like "/user/me/*/file".  Now they want to remove it.
> {noformat}"hadoop fs -rmr -skipTrash '/user/me/\*'" becomes...
> "hadoop fs -rmr -skipTrash /user/me/*"{noformat}
> * User/Admin: Nuked their home directory or any given directory
> {noformat}"hadoop fs -rmr -skipTrash '\*'" becomes...
> "hadoop fs -rmr -skipTrash /*"{noformat}
> * User:  Deleted _everything_ they have access to on the cluster
> * Admin: *Nukes the entire cluster*
> Note: FsShell is shown for illustrative purposes, however the problem is in the Path object, not FsShell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira