Posted to common-issues@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2010/03/05 22:51:27 UTC

[jira] Created: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

fs -ls does not work if a path name contains the ^ character
------------------------------------------------------------

                 Key: HADOOP-6618
                 URL: https://issues.apache.org/jira/browse/HADOOP-6618
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
            Reporter: Tsz Wo (Nicholas), SZE


With a wildcard (\?) in place of the ^ character, the file is found.
{noformat}
-bash-3.1$ hadoop fs -ls k20d2f4/bin-2\?04+1_AF650AE776488A4D
Found 1 items
-rw-------   3 tsz users         17 2010-03-05 19:43 /user/tsz/k20d2f4/bin-2^04+1_AF650AE776488A4D
{noformat}
When the wildcard is replaced with the literal ^ character, the file is not found.
{noformat}
-bash-3.1$ hadoop fs -ls k20d2f4/bin-2^04+1_AF650AE776488A4D
ls: Cannot access k20d2f4/bin-2^04+1_AF650AE776488A4D: No such file or directory.
{noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873196#action_12873196 ] 

Luke Lu commented on HADOOP-6618:
---------------------------------

@Nicholas, if your patch adds ^ to the list of escaped characters, the [^ set-negation functionality would break.

BTW, the glob pattern [! is defined by the POSIX standard, while [^ is actually undefined (see the glob(7) man page linked in my previous comment for details). We can support both, though.



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873184#action_12873184 ] 

Luke Lu commented on HADOOP-6618:
---------------------------------

The ^ handling is definitely broken in the current code, and so is the patch.

{code}
} else if (pCh == '^' && setOpen > 0) {
  // looks like ^ is skipped here?!
} else if (pCh == '-' && setOpen > 0) {
{code}

The correct fix, according to the POSIX.2 standard for glob (http://www.kernel.org/doc/man-pages/online/pages/man7/glob.7.html), seems to be to escape the ^ and to translate "[!" to "[^" to handle set negation, which is missing right now.

I'll post a patch at HADOOP-6787.
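
For illustration, the approach described above (escape ^, translate "[!" to "[^") might look like the following minimal sketch. It is not the HADOOP-6787 patch and not Hadoop's GlobFilter: the class GlobSketch and its methods are hypothetical, it handles only ?, * and [...] sets, and it assumes well-formed globs with ASCII names.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not Hadoop code: a tiny glob-to-regex translator.
class GlobSketch {

    // Letters and digits are emitted as-is; any other character is
    // backslash-escaped, which Java regex allows for non-alphabetic characters.
    // Digits must not be escaped, because \1 would be a back-reference.
    static void appendLiteral(StringBuilder re, char c) {
        if (!Character.isLetterOrDigit(c)) {
            re.append('\\');
        }
        re.append(c);
    }

    static String globToRegex(String glob) {
        StringBuilder re = new StringBuilder();
        boolean inSet = false;
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (inSet) {
                if (c == ']') {
                    inSet = false;
                    re.append(']');
                } else if (c == '-') {
                    re.append('-');           // keep ranges such as a-c
                } else {
                    appendLiteral(re, c);     // ^ inside a set is a literal here
                }
            } else if (c == '[') {
                inSet = true;
                re.append('[');
                // POSIX set negation "[!" becomes regex negation "[^".
                if (i + 1 < glob.length() && glob.charAt(i + 1) == '!') {
                    re.append('^');
                    i++;
                }
            } else if (c == '?') {
                re.append('.');
            } else if (c == '*') {
                re.append(".*");
            } else {
                appendLiteral(re, c);         // ^ outside a set is escaped too
            }
        }
        return re.toString();
    }

    static boolean matches(String glob, String name) {
        return Pattern.matches(globToRegex(glob), name);
    }

    public static void main(String[] args) {
        String name = "bin-2^04+1_AF650AE776488A4D";
        System.out.println(globToRegex("[!p]*.*"));        // [^p].*\..*
        System.out.println(matches(name, name));           // true
        System.out.println(matches("[!p]*.*", "p.zip"));   // false
    }
}
```

In this sketch a glob written as [^p] would match the characters ^ or p rather than negate the set; passing a leading ^ through unescaped would be one way to accept both spellings.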



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873185#action_12873185 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6618:
------------------------------------------------

> ... translating "[!" to "[^" to handle the set negation, which is missing right now.
Support for the [^ pattern is not missing.  Have you tried it?

{noformat}
bash-3.1$ hadoop fs -ls \*.\*
-rw-------   3 tsz users       1366 2010-02-08 23:29 /user/tsz/README.txt
-rw-------   3 tsz users       2936 2009-07-13 18:05 /user/tsz/cmd.txt
-rw-------   3 tsz users     208348 2010-02-08 22:05 /user/tsz/excite-small.log
-rw-------  32 tsz users  514786675 2009-12-03 00:26 /user/tsz/p.zip
-bash-3.1$ hadoop fs -ls [^p]\*.\*
-rw-------   3 tsz users       1366 2010-02-08 23:29 /user/tsz/README.txt
-rw-------   3 tsz users       2936 2009-07-13 18:05 /user/tsz/cmd.txt
-rw-------   3 tsz users     208348 2010-02-08 22:05 /user/tsz/excite-small.log
{noformat}



[jira] Updated: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-6618:
-------------------------------------------

    Attachment: c6618_20100305.patch

It seems that the bug is in FileSystem.GlobFilter.isJavaRegexSpecialChar(..): the ^ character is not included among the special characters it checks.

c6618_20100305.patch: adds ^ to the special characters.
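
To illustrate the diagnosis above: in a Java regular expression an unescaped ^ outside a character class is a start-of-input anchor, so a pattern built from the file name with ^ left as-is cannot match that name. The following is a minimal standalone sketch, not Hadoop code; the class CaretDemo and its helper are hypothetical, and the two regex strings are written by hand to mimic what a converter would produce with and without ^ in its escape list.

```java
import java.util.regex.Pattern;

// Hypothetical demo, not Hadoop code: effect of leaving ^ unescaped.
class CaretDemo {

    // True if the whole name matches the given regular expression.
    static boolean matches(String regex, String name) {
        return Pattern.compile(regex).matcher(name).matches();
    }

    public static void main(String[] args) {
        String name = "bin-2^04+1_AF650AE776488A4D";

        // '+' is escaped but '^' is not: ^ acts as an anchor in mid-pattern.
        String caretUnescaped = "bin-2^04\\+1_AF650AE776488A4D";

        // Both '^' and '+' are escaped and therefore literal.
        String caretEscaped = "bin-2\\^04\\+1_AF650AE776488A4D";

        System.out.println(matches(caretUnescaped, name)); // false
        System.out.println(matches(caretEscaped, name));   // true
    }
}
```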



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842050#action_12842050 ] 

Todd Lipcon commented on HADOOP-6618:
-------------------------------------

Can we use Pattern.quote rather than hard-coding the list of special characters?



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842073#action_12842073 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6618:
------------------------------------------------

> .. just seems inelegant to manually specify this list)
I totally agree, but that is how the current implementation works, and it is quite complicated.  Indeed, I am not sure whether other special characters are still missing from the list.



[jira] Resolved: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE resolved HADOOP-6618.
--------------------------------------------

    Resolution: Duplicate

Thanks Luke for fixing this in HADOOP-6787.



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842057#action_12842057 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-6618:
------------------------------------------------

How exactly would we use Pattern.quote?  We cannot simply quote the entire filename, since the glob wildcards in it must still be interpreted.



[jira] Commented: (HADOOP-6618) fs -ls does not work if a path name contains the ^ character

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842066#action_12842066 ] 

Todd Lipcon commented on HADOOP-6618:
-------------------------------------

It would be a bigger change, but you could call Pattern.quote() on each character that wasn't already determined to have special significance in the other branches of the if chain.

(I'm fine with your patch; it just seems inelegant to specify this list manually.)
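
As a rough sketch of the per-character Pattern.quote() idea, assuming a simplified converter in which only ? and * have glob meaning: the class QuoteSketch and its methods are hypothetical and stand in for the real if chain in GlobFilter, which handles more cases (sets, ranges, escapes).

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not Hadoop code: quote every character that no
// earlier branch has claimed as a glob metacharacter.
class QuoteSketch {

    static String globToRegex(String glob) {
        StringBuilder re = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '?') {
                re.append('.');       // any single character
            } else if (c == '*') {
                re.append(".*");      // any run of characters
            } else {
                // No hard-coded list of regex specials: Pattern.quote wraps the
                // character in \Q...\E so it is always matched literally.
                re.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return re.toString();
    }

    static boolean matches(String glob, String name) {
        return Pattern.matches(globToRegex(glob), name);
    }

    public static void main(String[] args) {
        String name = "bin-2^04+1_AF650AE776488A4D";
        System.out.println(matches("bin-2?04+1_AF650AE776488A4D", name)); // true
        System.out.println(matches(name, name));                          // true
        System.out.println(matches(name, "bin-2x04+1_AF650AE776488A4D")); // false
    }
}
```

Quoting one character at a time produces a verbose regex; quoting whole runs of literal characters would be an equivalent, more compact variation.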
