Posted to common-issues@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2012/09/25 23:20:07 UTC

[jira] [Created] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Harsh J created HADOOP-8845:
-------------------------------

             Summary: When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
                 Key: HADOOP-8845
                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 2.0.0-alpha
            Reporter: Harsh J
            Assignee: Harsh J


A brief description from my colleague Stephen Fritz who helped discover it:

{quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because the glob tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a regular file and not a parent path to be looked up under.

Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be examined in another JIRA.

This JIRA targets a client-side fix that avoids issuing such /path/file/dir lookups.
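
To make the intended client-side behavior concrete, here is a minimal sketch of the kind of filtering the summary asks for. This is not the attached patch; the class, method, and parameter names below are hypothetical, and only the public Hadoop FileSystem/FileStatus/Path calls are real API.

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobParentFilter {
  /**
   * Given the candidates produced by expanding the parent portion of a
   * pattern (e.g. /tmp/testdir/*), descend into directories only, so that a
   * regular file such as /tmp/testdir/testfile is never treated as a parent
   * and no lookup of /tmp/testdir/testfile/testfile is ever issued.
   */
  static List<FileStatus> childrenOfDirectoryMatches(
      FileSystem fs, FileStatus[] parentCandidates, String childName)
      throws IOException {
    List<FileStatus> results = new ArrayList<FileStatus>();
    if (parentCandidates == null) {
      return results; // the parent pattern matched nothing
    }
    for (FileStatus parent : parentCandidates) {
      if (!parent.isDirectory()) {
        continue; // skip plain files when looking for parent paths
      }
      Path child = new Path(parent.getPath(), childName);
      if (fs.exists(child)) {
        results.add(fs.getFileStatus(child));
      }
    }
    return results;
  }
}
{code}

With a filter like this in place, expanding /tmp/testdir/* for the pattern /tmp/testdir/*/testfile descends only into /tmp/testdir/1, so a non-superuser never hits the AccessControlException shown above.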

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467162#comment-13467162 ] 

Robert Joseph Evans commented on HADOOP-8845:
---------------------------------------------

I would argue that even if there is a specific need for non-standard globbing, we don't want to support it. POSIX compliance is what most people expect from HDFS; when we deviate from it, users get confused and angry, especially if rm deletes more files than they want.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463603#comment-13463603 ] 

Hadoop QA commented on HADOOP-8845:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546641/HADOOP-8845.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.ha.TestZKFailoverController
                  org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
                  org.apache.hadoop.hdfs.TestPersistBlocks

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1524//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1524//console

This message is automatically generated.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Attachment: HADOOP-8845.patch

Here is a failing test case first, to demonstrate the issue.

Fix to follow.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466965#comment-13466965 ] 

Eli Collins commented on HADOOP-8845:
-------------------------------------

globPathsLevel is a generic method, and globStatus, which calls it, claims to return all matching path names; why is it OK to unconditionally filter all files out of its results? Since * can match the empty string, in other contexts it could be appropriate to return "/tmp/testdir/testfile" for "/tmp/testdir/*/testfile".

I.e., is there a place where we know we should only be checking directory path elements? The comment in globStatusInternal ("// list parent directories and then glob the results") next to one of the cases indicates that is the intent, but it is valid to pass both files and directories to listStatus.
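
As a side note on that last point: FileSystem.listStatus does indeed accept a plain file, in which case it simply returns that file's own status, so listStatus by itself never tells the glob code that a candidate parent is unusable. A small illustrative snippet (the path is the one from this report; nothing here is from the patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListStatusOnFile {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Passing a regular file is valid: the result is a one-element array
    // holding the file's own FileStatus, not an error.
    FileStatus[] statuses = fs.listStatus(new Path("/tmp/testdir/testfile"));
    for (FileStatus st : statuses) {
      System.out.println(st.getPath() + " isDirectory=" + st.isDirectory());
    }
  }
}
{code}

That is why any "parents must be directories" rule has to be applied explicitly by the glob code rather than relied upon as a property of listStatus.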

                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467110#comment-13467110 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

bq. Since * can match the empty string, in other contexts it could be appropriate to return ""/tmp/testdir/testfile" for "/tmp/testdir/*/testfile".

Nice catch. I will add a test for this to check whether we are already handling it.

bq. Ie is there a place where we know we should just be checking directory path elements? The comment in globStatusInternal ("// list parent directories and then glob the results") by one of the cases indicates is the intent but it's valid to pass both files and directories to listStatus.

The parts I've changed try to fetch "parents", which can't mean anything but directories AFAICT.
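
A minimal sketch of the kind of regression test mentioned above, written against the local FileSystem so it is self-contained; the test class name and the /tmp/globtest layout are made up for illustration and this is not the attached patch:

{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

public class TestGlobSkipsFileParents {
  @Test
  public void globDoesNotDescendIntoFiles() throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path base = new Path("/tmp/globtest");          // hypothetical scratch dir
    fs.delete(base, true);
    fs.mkdirs(new Path(base, "1"));
    fs.create(new Path(base, "1/testfile")).close();
    fs.create(new Path(base, "testfile")).close();   // file sibling of the dir

    FileStatus[] matches = fs.globStatus(new Path(base, "*/testfile"));

    // Only the file under the directory should match; the plain file
    // /tmp/globtest/testfile must be skipped, and no lookup should be
    // attempted under /tmp/globtest/testfile/.
    assertEquals(1, matches.length);
    assertEquals("testfile", matches[0].getPath().getName());
    assertEquals("1", matches[0].getPath().getParent().getName());
  }
}
{code}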
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Labels: glob  (was: )
    
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464889#comment-13464889 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

{quote}
-1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.ha.TestZKFailoverController
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
org.apache.hadoop.hdfs.TestPersistBlocks
{quote}

These tests do not rely on globbing at all, so the failures are unrelated to this patch on the core side.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Description: 
A brief description from my colleague Stephen Fritz who helped discover it:

{quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{quote}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because the glob tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a regular file and not a parent path to be looked up under.

Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be examined in another JIRA.

This JIRA targets a client-side fix that avoids issuing such /path/file/dir lookups.

  was:
A brief description from my colleague Stephen Fritz who helped discover it:

{quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because the glob tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a regular file and not a parent path to be looked up under.

Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be examined in another JIRA.

This JIRA targets a client-side fix that avoids issuing such /path/file/dir lookups.

    
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {quote}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {quote}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471267#comment-13471267 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

Thanks all. I am addressing all your comments. What should the intended behavior be when symlinks enter the picture: resolve them or not?

It is quite unfortunate that the globbing code is copied rather than shared. The FileContext copy does not even have tests of its own!

But I can't see how to share the code if we begin filtering or resolving symlinks, given two implementations, one that handles symlinks and one that doesn't.
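
One hedged way to picture the sharing problem: if the glob engine only talked to the underlying store through a small seam, the symlink-resolution difference could live behind that seam instead of in two copies of the glob logic. The interface below is purely a hypothetical sketch (these names exist nowhere in the codebase); only FileSystem.listStatus, FileContext.util().listStatus, getFileStatus, and FileContext.getFileLinkStatus are real calls.

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

/** Hypothetical seam a shared glob implementation could be written against. */
interface GlobBackend {
  /** List the children of a directory (FileSystem.listStatus or
      FileContext.util().listStatus under the covers). */
  FileStatus[] listDirectory(Path dir) throws IOException;

  /** Stat a single path. A FileSystem-backed implementation would use
      getFileStatus; a FileContext-backed one could use getFileStatus or
      getFileLinkStatus depending on the desired symlink policy. */
  FileStatus stat(Path path) throws IOException;
}
{code}

Whether resolving or not resolving symlinks is the right default would still have to be decided per the question above; the seam only keeps that decision out of the shared matching code.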
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Comment Edited] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467853#comment-13467853 ] 

Eli Collins edited comment on HADOOP-8845 at 10/3/12 3:52 AM:
--------------------------------------------------------------

Harsh,

Per the discussion, my earlier comment was incorrect: {{/tmp/testdir/\*/testfile}} should *not* match {{/tmp/testdir/testfile}}. Let's add a test for that if we don't have one.

bq. The parts I've changed this under, try to fetch "parents", which can't mean anything but directories AFAICT.

I took another look, and that appears to be true for FileSystem, but not for FileContext, which also needs to handle symlinks. Unfortunately it looks like this glob handling code was duplicated, so the equivalent change needs to be made to the same code in FileContext. Shall we file a jira for sharing it across FileSystem and FileContext? That can be done in a separate change.
                
      was (Author: eli):
    Harsh,

Per the discussion, my earlier comment was incorrect: "/tmp/testdir/*/testfile" should *not* match "/tmp/testdir/testfile". Let's add a test for that if we don't have one.

bq. The parts I've changed this under, try to fetch "parents", which can't mean anything but directories AFAICT.

I took another look, and that appears to be true for FileSystem, but not for FileContext, which also needs to handle symlinks. Unfortunately it looks like this glob handling code was duplicated, so the equivalent change needs to be made to the same code in FileContext. Shall we file a jira for sharing it across FileSystem and FileContext? That can be done in a separate change.
                  
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467853#comment-13467853 ] 

Eli Collins commented on HADOOP-8845:
-------------------------------------

Harsh,

Per the discussion, my earlier comment was incorrect: "/tmp/testdir/*/testfile" should *not* match "/tmp/testdir/testfile". Let's add a test for that if we don't have one.

bq. The parts I've changed this under, try to fetch "parents", which can't mean anything but directories AFAICT.

I took another look, and that appears to be true for FileSystem, but not for FileContext, which also needs to handle symlinks. Unfortunately it looks like this glob handling code was duplicated, so the equivalent change needs to be made to the same code in FileContext. Shall we file a jira for sharing it across FileSystem and FileContext? That can be done in a separate change.
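
For comparison, standard glob matching agrees with that reading: a * component has to match an actual path-name component, so the pattern cannot collapse onto the file itself. The snippet below uses the JDK's own glob matcher purely as an illustration of the expected semantics; it is not part of Hadoop's glob code.

{code}
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobSemanticsCheck {
  public static void main(String[] args) {
    PathMatcher m = FileSystems.getDefault()
        .getPathMatcher("glob:/tmp/testdir/*/testfile");

    // false: '*' cannot stand in for a missing path component
    System.out.println(m.matches(Paths.get("/tmp/testdir/testfile")));

    // true: '*' matches the directory name "1"
    System.out.println(m.matches(Paths.get("/tmp/testdir/1/testfile")));
  }
}
{code}

A test asserting the same thing against FileSystem.globStatus would pin the behavior down on the Hadoop side.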
                

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465519#comment-13465519 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

Ping? It's a trivial fix to not look up non-directories, and I have tests attached (similar to pClosure5, but that test mkdir'ed everything and so couldn't run into the EXECUTE-less issue described here).
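
As a rough illustration of the idea (not the exact patch), the parent expansion can drop anything that is not a directory before descending:

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;

// Illustrative helper: when expanding a non-final glob component, keep only
// directories so we never ask for paths like /tmp/testdir/testfile/testfile
// (a file cannot be a parent of further path components).
class DirectoryOnlyFilter {
  static List<FileStatus> parentsOnly(FileStatus[] candidates) {
    List<FileStatus> dirs = new ArrayList<FileStatus>();
    for (FileStatus status : candidates) {
      if (status.isDirectory()) {
        dirs.add(status);
      }
    }
    return dirs;
  }
}
{code}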
                

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Description: 
A brief description from my colleague Stephen Fritz who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.

{code}
2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
{code}

Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.

This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

  was:
A brief description from my colleague Stephen Fritz who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.

Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.

This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

    

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501261#comment-13501261 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

I'll revise the patch here to include fixes for FileContext's globbing implementation as well; for the code-reuse goal I have filed HADOOP-9068 to handle later.
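
For reference, the same glob can be exercised through FileContext too; a rough sketch (assuming the Hadoop 2.x FileContext.Util API, not the patch itself):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Sketch: the duplicated glob code lives under FileContext.Util, so the same
// "skip non-directories while expanding parents" behaviour needs checking here.
public class FileContextGlobCheck {
  public static void main(String[] args) throws Exception {
    FileContext fc = FileContext.getFileContext(new Configuration());
    FileStatus[] matches = fc.util().globStatus(new Path("/tmp/testdir/*/testfile"));
    if (matches != null) {
      for (FileStatus status : matches) {
        System.out.println(status.getPath()); // expected: only /tmp/testdir/1/testfile
      }
    }
  }
}
{code}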
                

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Attachment: HADOOP-8845.patch

This patch fixes the findbugs warning (I had a @Nullable).
                

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Status: Open  (was: Patch Available)
    

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Andy Isaacson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467158#comment-13467158 ] 

Andy Isaacson commented on HADOOP-8845:
---------------------------------------

bq. Since * can match the empty string, in other contexts it could be appropriate to return "/tmp/testdir/testfile" for "/tmp/testdir/*/testfile".

That's not right for POSIX-style path globbing: /usr/*/bin does match /usr/X11/bin but does not match /usr/bin, even though /usr//bin is a valid synonym for /usr/bin; this is an important feature that scripts commonly depend on. For example, an admin might {{rm /var/www/user/*/.htaccess}} to remove the .htaccess files in every subdirectory while leaving {{/var/www/user/.htaccess}} itself intact.

So unless there's a specific need for that kind of funky glob, I don't think we need to support it?
                

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Andy Isaacson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467229#comment-13467229 ] 

Andy Isaacson commented on HADOOP-8845:
---------------------------------------

(Sorry for the markup mess-up in my last comment.)

The currently pending patch specifically checks in {{pTestClosure6}} that the case I mentioned is handled correctly, so I think we're all on the same page. :)

Code-wise, one minor comment:
{code}
+              public boolean apply(FileStatus input) {
+                return input.isDirectory() ? true : false;
+              }
{code}

This is an anti-pattern; {{foo() ? true : false}} is the same as {{foo()}}.
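
That is, the predicate body can just return the boolean directly:

{code}
public boolean apply(FileStatus input) {
  // isDirectory() already yields the boolean we need
  return input.isDirectory();
}
{code}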

Other than that, LGTM on the code level. I haven't carefully read the GlobFilter implementation to see if there's a cleaner/simpler way to implement this bugfix.
                

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Description: 
A brief description from my colleague Stephen Fritz who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.

Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.

This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

  was:
A brief description from my colleague Stephen Fritz who helped discover it:

{quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{quote}

Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile cause we tried to access the /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as the testfile is a file and not a path parent to be looked up upon.

Surprisingly the superuser avoids hitting into the error, as a result of bypassing permissions, but that can be looked up on another JIRA - if it is fine to let it be like that or not.

This JIRA targets a client-sided fix to not cause such /path/file/dir kinda lookups.

    

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503750#comment-13503750 ] 

Harsh J commented on HADOOP-8845:
---------------------------------

Hi Daryn,

Indeed, the UGI test in my pTestClosure6 passes on trunk now, so HADOOP-8906 should have fixed this case completely. Thanks for the heads up!

Shall we retarget this JIRA to fix the same issue in FileContext (sort of a clone of HADOOP-8906 for the newer FC)?

Or would the UGI test be worth adding anyway? We couldn't catch this issue unless it ran as a non-superuser.
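
For illustration, such a check might wrap the glob in a doAs() for a plain user; a sketch with illustrative names, assuming a permission-enforcing HDFS is already configured (this is not the attached test):

{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

// Sketch: run the glob as a non-superuser so permission checks actually apply;
// it must not throw AccessControlException for the file /tmp/testdir/testfile.
public class GlobAsPlainUserSketch {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    UserGroupInformation testUser =
        UserGroupInformation.createUserForTesting("testuser", new String[] { "users" });
    FileStatus[] matches = testUser.doAs(new PrivilegedExceptionAction<FileStatus[]>() {
      public FileStatus[] run() throws Exception {
        FileSystem fs = FileSystem.get(conf);
        return fs.globStatus(new Path("/tmp/testdir/*/testfile"));
      }
    });
    if (matches != null) {
      for (FileStatus status : matches) {
        System.out.println(status.getPath());
      }
    }
  }
}
{code}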
                

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Daryn Sharp (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501362#comment-13501362 ] 

Daryn Sharp commented on HADOOP-8845:
-------------------------------------

Are you sure this patch is still needed?  This should have already been fixed by HADOOP-8906.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.
> This JIRA targets a client-side fix that avoids such /path/file/dir or /path/file/file lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Description: 
A brief description from my colleague Stephen Fritz who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.

{code}
2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
{code}

Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.

This JIRA targets a client-side fix that avoids such /path/file/dir or /path/file/file lookups.

  was:
A brief description from my colleague Stephen Fritz who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!

-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!  
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- lets try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good

[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.

{code}
2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
{code}

Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.

This JIRA targets a client-side fix that avoids such /path/file/dir lookups.

    
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.
> This JIRA targets a client-side fix that avoids such /path/file/dir or /path/file/file lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463520#comment-13463520 ] 

Hadoop QA commented on HADOOP-8845:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546608/HADOOP-8845.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//console

This message is automatically generated.
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.
> This JIRA targets a client-side fix that avoids such /path/file/dir or /path/file/file lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HADOOP-8845:
----------------------------

    Attachment: HADOOP-8845.patch

Here's a patch that fixes it up as well.
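The attachment itself carries the actual change; purely as an illustration of the filtering called out in the summary, a client-side glob expansion could drop non-directory candidates before descending into the next path component, roughly along the lines below (the helper and its names are hypothetical, not taken from the patch).

{code}
// Hypothetical sketch of the idea in the summary: when more path components
// remain after a glob component, only directories can act as parents, so
// non-directory matches (e.g. /tmp/testdir/testfile) are filtered out
// instead of being probed for children.
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;

public class GlobParentFilter {
  static List<FileStatus> retainDirectories(FileStatus[] candidates) {
    List<FileStatus> parents = new ArrayList<FileStatus>();
    if (candidates == null) {
      return parents;
    }
    for (FileStatus status : candidates) {
      if (status.isDirectory()) { // skip plain files; no child lookup under them
        parents.add(status);
      }
    }
    return parents;
  }
}
{code}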
                
> When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8845
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8845
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Harsh J
>              Labels: glob
>         Attachments: HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignore the regular file '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, since testfile is a file and not a parent path that should be looked up under.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly, the superuser avoids the error because it bypasses permission checks; whether that behavior is acceptable can be taken up in another JIRA.
> This JIRA targets a client-side fix that avoids such /path/file/dir lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira