You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Rajiv Chittajallu (JIRA)" <ji...@apache.org> on 2008/04/04 18:19:24 UTC

[jira] Created: (HADOOP-3173) inconsistent globbing support for dfs commands

inconsistent globbing support for dfs commands
----------------------------------------------

                 Key: HADOOP-3173
                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
         Environment: Hadoop 0.16.1
            Reporter: Rajiv Chittajallu


hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob

$ hadoop dfs -mkdir /user/rajive/a/*/foo
$ hadoop dfs -ls /user/rajive/a
Found 4 items
/user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
/user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
/user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
/user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users

$ hadoop dfs -ls /user/rajive/a/*
/user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users

$ hadoop dfs -rmr /user/rajive/a/*
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d


I am not able to escape '*' from being expanded.

$ hadoop dfs -rmr '/user/rajive/a/*'
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

$ hadoop dfs -rmr  '/user/rajive/a/\*'
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

$ hadoop dfs -rmr  /user/rajive/a/\* 
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3173:
--------------------------------

         Priority: Blocker  (was: Major)
    Fix Version/s: 0.17.0

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Mukund Madhugiri (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukund Madhugiri updated HADOOP-3173:
-------------------------------------

    Fix Version/s:     (was: 0.18.0)

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>            Assignee: Chris Douglas
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-3173:
----------------------------------

    Status: Patch Available  (was: Open)

I'm marking this PA, mostly for discussion purposes. I'm not ecstatic about the current patch, but it passes tests on my machine and resolves the issue called out in this JIRA.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597675#action_12597675 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3173:
------------------------------------------------

> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob

Special characters like * is really problematic: even JIRA cannot display it correctly.  :)

Again, we need to escape:
hadoop dfs -mkdir /user/\*/bar creates a directory "/user/\*/bar" and you cant deleted /user/* as -rmr expands the glob

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-3173:
----------------------------------

         Priority: Major  (was: Blocker)
    Fix Version/s:     (was: 0.17.0)
                   0.18.0
         Assignee:     (was: Edward J. Yoon)

This is a known issue and is not a regression. See HADOOP-1995. So I am going to mark it a blocker to release 0.17. Code-wise we use backslash to escape glob special characters. But backslash collides with the windows path separator. It does not work.

For now I would suggest a user to programatically remove a directory/file whose name contains glob special characters.

Also Rajiv, you should quote all dfs globs in the command line. Otherwise, the local file system will interpret it first.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586607#action_12586607 ] 

hairong edited comment on HADOOP-3173 at 4/7/08 5:24 PM:
---------------------------------------------------------------

This is a known issue and is not a regression. See HADOOP-1995. So I am not going to mark it a blocker to release 0.17. Code-wise we use backslash to escape glob special characters. But backslash collides with the windows path separator. It does not work.

For now I would suggest a user to programatically remove a directory/file whose name contains glob special characters.

Also Rajiv, you should quote all dfs globs in the command line. Otherwise, the local file system will interpret it first.

      was (Author: hairong):
    This is a known issue and is not a regression. See HADOOP-1995. So I am going to mark it a blocker to release 0.17. Code-wise we use backslash to escape glob special characters. But backslash collides with the windows path separator. It does not work.

For now I would suggest a user to programatically remove a directory/file whose name contains glob special characters.

Also Rajiv, you should quote all dfs globs in the command line. Otherwise, the local file system will interpret it first.
  
> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600672#action_12600672 ] 

Chris Douglas commented on HADOOP-3173:
---------------------------------------

I've been asked to clarify the implications of this proposal. There are 6 Path constructors:
# Path(String, String)
# Path(Path, String)
# Path(String, Path)
# Path(Path, Path)
# Path(String)
# Path(String, String, String)

Constructors 5 and 6 would preserve the path component of the URI as a String (the "rawPath") used only for globbing; all other Path operations would continue to work as they always have. For the following, let {{p}}, {{q}} be Paths, where {{q}} is initialized as:
{noformat}
Path q = new Path(p.toString());
{noformat}

Given the following initializations for {{p}}:
{noformat}
1. p = new Path("/foo/\\*/bar");
2. p = new Path("hdfs://foobar:8020/foo/p-1{\?}");
{noformat}

{{globStatus\(x)}} would return different results. In the first instance, globbing {{p}} would return the directory "\*", as _expected_ in this JIRA, while globbing {{q}} would have the result as _observed_ in this JIRA. In the second, p would be a legal glob (the escape prior to '?' wouldn't be converted to a path separator), so given:
{noformat}
user@host$ bin/hadoop dfs -ls 'foo/bar/'
Found 5 items:
1    0           2008-05-28 20:00  -rw-r--r--  chrisdo  supergroup  /user/chrisdo/foo/bar/p-00
1    0           2008-05-28 20:01  -rw-r--r--  chrisdo  supergroup  /user/chrisdo/foo/bar/p-01
1    0           2008-05-28 20:01  -rw-r--r--  chrisdo  supergroup  /user/chrisdo/foo/bar/p-10
1    0           2008-05-28 20:01  -rw-r--r--  chrisdo  supergroup  /user/chrisdo/foo/bar/p-11
1    0           2008-05-28 20:03  -rw-r--r--  chrisdo  supergroup  /user/chrisdo/foo/bar/p-1?
{noformat}

One could specify both '{{foo/bar/p-1{\?}}}' (file 5) and '{{foo/bar/p-1?}}' (files 3-5).

There are two primary "globbers" in the codebase, FsShell and FileInputFormats. In the current proposal, the latter would continue to be in the "{{q}} case", i.e. there would be no change to its behavior. FsShell, however, would be in the "{{p}} case", i.e. the user string would be used for globbing without first passing through Path and URI normalization. This has the advantage of resolving this JIRA, but the significant disadvantage of making globbing in FsShell and map/reduce inconsistent. If a user were to test out a pattern in the shell and try to use it as a pattern for their FileInputFormat derivative, they could get different results.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602060#action_12602060 ] 

Chris Douglas commented on HADOOP-3173:
---------------------------------------

The possible solutions are constrained by the fact that Path objects go through two passes of normalization before they're globbed. Unless we preserve the path as a String prior those transformations, we can only glob the result, which yields the behavior in this JIRA. Come to think of it, how does the escape get to the globbing code? Given:

{code:title=Path.java::normalizePath}
    path = path.replace("\\", "/");
{code}

How does

{code:title=FileSystem.GlobFilter::setRegex}
if (pCh == PAT_ESCAPE) {
  fileRegex.append(pCh);
  i++;
  if (i >= len)
    error("An escaped character does not present", filePattern, i);
  pCh = filePattern.charAt(i);
}
{code}

ever occur, when the pattern comes from {{pathPattern.toUri().getPath()}} ? Anyway- I agree with Hairong, the proposed solution is clearly a hack, but I don't see how we can avoid something like it.

I propose we mark this as "Won't fix" and live with programmatically manipulating the odd file until we have a model that can handle this elegantly.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Rajiv Chittajallu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585616#action_12585616 ] 

Rajiv Chittajallu commented on HADOOP-3173:
-------------------------------------------

Escaped formatting:

{hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant just deleted /user/* as -rmr expands the glob}

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597672#action_12597672 ] 

Chris Douglas commented on HADOOP-3173:
---------------------------------------

I mostly agree with Hairong; this is easy to do programmatically, and while there are a few alternatives (different escape character, URI encoding, new "literal" FsShell commands, etc), most appear to make the general case worse to accommodate a fairly esoteric use.

On the other hand, there are only a few places (FsShell and FileInputFormat, mainly) where we call globStatus, and in each case a String is converted to a Path before being converted back into a String in globStatus. Without the conversion, the pattern syntax can mandate that the path separator must be '/' independent of the Path syntax. Unfortunately, actually effecting this change is awkward, primarily because one must still create a Path of the glob string to obtain the FileSystem to resolve it against. If the glob string creates a Path to be resolved against a FileSystem other than the default, then the scheme, authority, etc. must be excised from the original string to preserve the escaping, etc., which will ultimately duplicate much of the URI parsing that's already happening in Path. Particularly for FileInputFormat and its users, pulling out all the Path dependencies (i.e. changing users of the globbing API) is a huge job with a modest payback.

Since Path(String) already isolates this segment, we could introduce Path::getRawPath that would preserve the path before Path::normalizePath and URI::normalize. With this, globStatus would resolve Path::getRawPath instead of p.toUri().getPath(). Unfortunately, this would mean that globStatus(p) might return different results than globstatus(new Path(p.toString())), which means FileInputFormat would still have this issue. Even if Path(Path, String) and variants preserved a raw path, its semantics would be unclear. In Path(Path, String), is the raw Path only eq the raw Path from the second arg if it is absolute? Is it the raw path from the first arg preserved in some way? We could just assert that the raw path is only different from p.toUri().getPath() if it was created with Path(String), but this could be confusing when creating globs from a base path (i.e. Path(Path, String) or possibly more confusing, Path(String, Path)). The URI normalization also removes all the ".." and "." entries in the Path, which the regexp would then have to handle (e.g. "a/b/../c*" is resolved to "a/c*" now, but using the raw path, GlobFilter would accept "a/b/dd/c" since '.' matches GlobFilter::PAT_ANY). That said, FileInputFormats and all Strings that were once Paths wouldn't have to deal with this, while utilities like FsShell could match "a/b/../c" as regexps, which might not be a bad thing.

If we want to fix this, I'd propose adding Path::getRawPath which would be used in FileSystem::globStatus, but could only be different from p.getUri().getPath() when the Path was created from a String. This covers cases where one wants to create a Path regexp manually and use it as a glob (as in FsShell), but should not change behavior elsewhere.

Thoughts?

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598136#action_12598136 ] 

Hadoop QA commented on HADOOP-3173:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12382236/3173-0.patch
  against trunk revision 657985.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 8 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2501/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2501/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2501/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2501/console

This message is automatically generated.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597678#action_12597678 ] 

Runping Qi commented on HADOOP-3173:
------------------------------------

Why don't we declare  not to allow those special characters?
It will simplify things a lot and make everybody's life easier.
I don't think that causes serious limitation.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601205#action_12601205 ] 

Hairong Kuang commented on HADOOP-3173:
---------------------------------------

Chris's proposal solves the problem. But the patch gives different semantics to the constructors of Path. It seems not elegant and might bring confusion to users.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler reassigned HADOOP-3173:
---------------------------------------

    Assignee: Chris Douglas

Assigning to whomever submitted a patch so as to better manage committing things that are ready for prime time. Although this one remains controversial ...

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>            Assignee: Chris Douglas
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-3173:
----------------------------------

    Attachment: 3173-0.patch

Attached an example of the getRawPath approach.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-3173:
----------------------------------

    Resolution: Won't Fix
        Status: Resolved  (was: Patch Available)

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>            Assignee: Chris Douglas
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597695#action_12597695 ] 

Chris Douglas commented on HADOOP-3173:
---------------------------------------

One more drawback: on windows, if one's path has backslashes in it, then Paths created with Path(String) cannot be globbed. This may not be a serious limitation, since any paths with backslashes are likely to be from environment variables, so real uses will come from Path(Path, String) and other variants.

bq. Why don't we declare not to allow those special characters?

We could. The namenode would have to check for special characters when creating new nodes, and we'd also need to provide an upgrade path for users. In theory it's a pretty dramatic change to cut down the charset, but in practice I doubt many will notice.

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>             Fix For: 0.18.0
>
>         Attachments: 3173-0.patch
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-3173) inconsistent globbing support for dfs commands

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon reassigned HADOOP-3173:
--------------------------------------

    Assignee: Edward J. Yoon

> inconsistent globbing support for dfs commands
> ----------------------------------------------
>
>                 Key: HADOOP-3173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3173
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>         Environment: Hadoop 0.16.1
>            Reporter: Rajiv Chittajallu
>            Assignee: Edward J. Yoon
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> hadoop dfs -mkdir /user/*/bar creates a directory "/user/*/bar" and you cant deleted /user/* as -rmr expands the glob
> $ hadoop dfs -mkdir /user/rajive/a/*/foo
> $ hadoop dfs -ls /user/rajive/a
> Found 4 items
> /user/rajive/a/*	<dir>		2008-04-04 16:09	rwx------	rajive	users
> /user/rajive/a/b	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/c	<dir>		2008-04-04 16:08	rwx------	rajive	users
> /user/rajive/a/d	<dir>		2008-04-04 16:08	rwx------	rajive	users
> $ hadoop dfs -ls /user/rajive/a/*
> /user/rajive/a/*/foo	<dir>		2008-04-04 16:09	rwx------	rajive	users
> $ hadoop dfs -rmr /user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> I am not able to escape '*' from being expanded.
> $ hadoop dfs -rmr '/user/rajive/a/*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  '/user/rajive/a/\*'
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d
> $ hadoop dfs -rmr  /user/rajive/a/\* 
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/*
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/b
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/c
> Moved to trash: hdfs://namenode-1:8020/user/rajive/a/d

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.