Posted to common-issues@hadoop.apache.org by "Karl Kuntz (JIRA)" <ji...@apache.org> on 2011/04/12 16:55:06 UTC

[jira] [Created] (HADOOP-7222) Inconsistent behavior when passing a path with special characters as literals to some FsShell commands

Inconsistent behavior when passing a path with special characters as literals to some FsShell commands
------------------------------------------------------------------------------------------------------

                 Key: HADOOP-7222
                 URL: https://issues.apache.org/jira/browse/HADOOP-7222
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs, scripts
    Affects Versions: 0.20.2
         Environment: Unix, Java 1.6, hadoop 0.20.2
            Reporter: Karl Kuntz


hadoop dfs -put test^ing /tmp             <- works
hadoop dfs -ls /tmp                       <- works, shows the file in the dir
hadoop dfs -ls /tmp/test^ing              <- fails, returns "ls: Cannot access /tmp/test^ing: No such file or directory."
hadoop dfs -get /tmp/test^ing test^ing    <- fails, returns "get: null"

It is possible to put a file whose name contains special characters, such as ^, using the Hadoop shell.  But once it is put, one cannot ls, cat, or get the file because of the way some commands handle file globbing.  Harsh J suggested on the mailing list that a flag to turn off globbing could be implemented.  Alternatively, single-quoting the file path on the command line to disable globbing might work as well.
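
To make the failure mode concrete, the following stand-alone snippet (plain java.util.regex, not the actual Hadoop code) shows what happens when a literal name containing ^ is treated as a glob and turned into a regex without escaping: the regex engine reads ^ as an anchor, so the file's own name no longer matches, which is consistent with the ls and get failures above.

    import java.util.regex.Pattern;

    // Illustration only: a literal file name containing '^', passed through
    // unescaped into a regex, no longer matches itself.
    public class CaretLiteralDemo {
      public static void main(String[] args) {
        String fileName = "test^ing";
        // '^' is an anchor in a regex, so "test^ing" as a pattern cannot
        // match "test^ing" as a plain string.
        System.out.println(Pattern.matches("test^ing", fileName)); // prints: false
      }
    }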

As an example, in the 0.20.2 source the ^ character in particular is not escaped in the output pattern built by setRegex(String filePattern) in FileSystem.java (around line 1050):

...
        } else if (pCh == '[' && setOpen == 0) {
          setOpen++;
          hasPattern = true;
        } else if (pCh == '^' && setOpen > 0) {
        } else if (pCh == '-' && setOpen > 0) {
          // Character set range
          setRange = true;
...

Looking at trunk, this appears to have been addressed in later versions (the glob-to-regex conversion was refactored into GlobPattern.java):

...
        case '^': // ^ inside [...] can be unescaped
          if (setOpen == 0) {
            regex.append(BACKSLASH);
          }
          break;
        case '!': //
...
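
The rule the trunk code applies can be sanity-checked with plain java.util.regex (again an illustration, not GlobPattern itself): ^ needs a backslash when it appears outside a character set, while inside [...] it keeps its negation meaning and can be left unescaped.

    import java.util.regex.Pattern;

    // Illustration only: where '^' does and does not need escaping.
    public class CaretEscapingRule {
      public static void main(String[] args) {
        // Outside a set: escaped, so the literal file name matches again.
        System.out.println(Pattern.matches("test\\^ing", "test^ing")); // true

        // Inside a set: '^' means negation and needs no escape.
        System.out.println(Pattern.matches("[^0-9]est", "test")); // true
        System.out.println(Pattern.matches("[^0-9]est", "9est")); // false
      }
    }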


However, even after backporting that change to 0.20.2 and testing, it appears to resolve the issue for commands like ls but not for get, so there may be more to do for other commands.
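
For get, the "get: null" message suggests the problem is less the regex itself than what happens when glob expansion of the source argument comes back empty. A minimal sketch of the kind of defensive handling that might be needed, assuming the get code path expands its source with FileSystem.globStatus() (the expandSource helper below is hypothetical, not the actual FsShell code):

    import java.io.IOException;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical sketch: fall back to the literal path when the glob
    // expansion of the source argument matches nothing, instead of passing
    // a null/empty result along and failing with an unhelpful "get: null".
    class GetPathExpansion {
      static Path[] expandSource(FileSystem fs, Path src) throws IOException {
        FileStatus[] matches = fs.globStatus(src);
        if (matches == null || matches.length == 0) {
          if (fs.exists(src)) {
            // The name contains glob metacharacters (e.g. '^') but exists
            // literally, so fetch it as-is.
            return new Path[] { src };
          }
          throw new IOException("get: " + src + ": No such file or directory");
        }
        Path[] expanded = new Path[matches.length];
        for (int i = 0; i < matches.length; i++) {
          expanded[i] = matches[i].getPath();
        }
        return expanded;
      }
    }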




[jira] [Updated] (HADOOP-7222) Inconsistent behavior when passing a path with special characters as literals to some FsShell commands

Posted by "Karl Kuntz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Kuntz updated HADOOP-7222:
-------------------------------

    Description: 
hadoop dfs -put test^ing /tmp             <- works
hadoop dfs -ls /tmp                       <- works, shows the file in the dir
hadoop dfs -ls /tmp/test^ing              <- fails, returns "ls: Cannot access /tmp/test^ing: No such file or directory."
hadoop dfs -get /tmp/test^ing test^ing    <- fails, returns "get: null"

It is possible to put a file whose name contains special characters, such as ^, using the Hadoop shell.  But once it is put, one cannot ls, cat, or get the file because of the way some commands handle file globbing.  Harsh J suggested on the mailing list that a flag to turn off globbing could be implemented.  Alternatively, single-quoting the file path on the command line to disable globbing might work as well.

As an example, in the 0.20.2 source the ^ character in particular is not escaped in the output pattern built by setRegex(String filePattern) in FileSystem.java (around line 1050):

...
        } else if (pCh == '[' && setOpen == 0) {
          setOpen++;
          hasPattern = true;
        } else if (pCh == '^' && setOpen > 0) {
        } else if (pCh == '-' && setOpen > 0) {
          // Character set range
          setRange = true;
...

Looking at trunk, this appears to have been addressed in later versions (the glob-to-regex conversion was refactored into GlobPattern.java):

...
        case '^': // ^ inside [...] can be unescaped
          if (setOpen == 0) {
            regex.append(BACKSLASH);
          }
          break;
        case '!': //
...


However, even after backporting that change to 0.20.2 and testing, it appears to resolve the issue for commands like ls but not for get, so there may be more to do for other commands.





[jira] [Updated] (HADOOP-7222) Inconsistent behavior when passing a path with special characters as literals to some FsShell commands

Posted by "Karl Kuntz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Kuntz updated HADOOP-7222:
-------------------------------

    Description: 
The following work:
hadoop dfs -put test^ing /tmp
hadoop dfs -ls /tmp

The following do not:
hadoop dfs -ls /tmp/test^ing
hadoop dfs -get /tmp/test^ing test^ing

The first fails with "ls: Cannot access /tmp/test^ing: No such file or directory."
The second fails with "get: null".
 
It is possible to put a file whose name contains special characters, such as ^, using the Hadoop shell.  But once it is put, one cannot ls, cat, or get the file because of the way some commands handle file globbing.  Harsh J suggested on the mailing list that a flag to turn off globbing could be implemented.  Alternatively, single-quoting the file path on the command line to disable globbing might work as well.

As an example, in the 0.20.2 source the ^ character in particular is not escaped in the output pattern built by setRegex(String filePattern) in FileSystem.java (around line 1050):

...
        } else if (pCh == '[' && setOpen == 0) {
          setOpen++;
          hasPattern = true;
        } else if (pCh == '^' && setOpen > 0) {
        } else if (pCh == '-' && setOpen > 0) {
          // Character set range
          setRange = true;
...

Looking at trunk, this appears to have been addressed in later versions (the glob-to-regex conversion was refactored into GlobPattern.java):

...
        case '^': // ^ inside [...] can be unescaped
          if (setOpen == 0) {
            regex.append(BACKSLASH);
          }
          break;
        case '!': //
...


However, even after backporting that change to 0.20.2 and testing, it appears to resolve the issue for commands like ls but not for get, so there may be more to do for other commands.




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira