Posted to mapreduce-issues@hadoop.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2014/01/03 23:32:50 UTC

[jira] [Created] (MAPREDUCE-5706) toBeDeleted parent directories aren't being cleaned up

Robert Kanter created MAPREDUCE-5706:
----------------------------------------

             Summary: toBeDeleted parent directories aren't being cleaned up
                 Key: MAPREDUCE-5706
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5706
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: security
    Affects Versions: 0.22.0
            Reporter: Robert Kanter
            Assignee: Robert Kanter


When security is enabled on 0.22, MRAsyncDiskService doesn't always delete the parent directories under {{toBeDeleted}}.

MRAsyncDiskService goes through {{toBeDeleted}} and creates "tasks" to delete the directories under it using the LinuxTaskController. It chooses which user to run each task as by looking at who owns the directory being deleted.
For example:
{noformat}
ls -al /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0
total 12
drwxr-xr-x 3 mapred mapred 4096 Jul  5 05:37 .
drwxr-xr-x 5 mapred mapred 4096 Dec 19 10:15 ..
drwxr-s--- 4 test   mapred 4096 Jul  2 02:54 test
{noformat}

It would create a task to use "test" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0/test (there could be more in there for other users). It then creates a task to use "mapred" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0.
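The task planning described above can be sketched in plain Java as follows. This is only an illustration, not the actual MRAsyncDiskService code: the class {{DeletionPlanSketch}}, the {{DeleteTask}} type and the {{plan}} method are hypothetical names, and it uses java.nio.file rather than Hadoop's own file system classes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Hadoop.
class DeletionPlanSketch {

    /** One planned deletion: the user to run as and the path to delete. */
    static final class DeleteTask {
        final String runAsUser;
        final Path path;

        DeleteTask(String runAsUser, Path path) {
            this.runAsUser = runAsUser;
            this.path = path;
        }

        @Override
        public String toString() {
            return runAsUser + " -> " + path;
        }
    }

    /**
     * Plans one task per child directory, run as that child's owner,
     * followed by one task for the timestamped parent, run as the parent's owner.
     */
    static List<DeleteTask> plan(Path timestampedDir) throws IOException {
        List<DeleteTask> tasks = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(timestampedDir)) {
            for (Path child : children) {
                tasks.add(new DeleteTask(Files.getOwner(child).getName(), child));
            }
        }
        // The parent comes last so that it is empty by the time it is deleted.
        tasks.add(new DeleteTask(Files.getOwner(timestampedDir).getName(), timestampedDir));
        return tasks;
    }

    public static void main(String[] args) throws IOException {
        // Demo on a throwaway temp directory standing in for a timestamped dir.
        Path dir = Files.createTempDirectory("2013-07-05_05-37-49.052_0");
        Files.createDirectory(dir.resolve("test"));
        for (DeleteTask task : plan(dir)) {
            System.out.println(task);
        }
    }
}
```

In the listing above, this would yield a task for "test" on the {{test}} subdirectory and a final task for "mapred" on the timestamped directory itself.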

The problem is that we normally configure the LinuxTaskController to disallow the "mapred" user in /etc/hadoop/conf.cloudera.mapreduce1/taskcontroller.cfg, so the task that runs as "mapred" cannot delete the parent directory.  The permissions on the toBeDeleted dir are drwxr-xr-x mapred:mapred, which means that only "mapred" can delete things in it (i.e. the timestamped dirs).  However, the MRAsyncDiskService is already running as the mapred user, so there's no reason to use the LinuxTaskController for impersonation anyway; we can do the deletion directly from the Java code.
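A minimal sketch of the "delete directly from Java" idea, under stated assumptions: it is not the actual patch, the class and method names ({{DirectDeleteSketch}}, {{deleteIfOwnedByCurrentUser}}, and so on) are hypothetical, and it uses java.nio.file instead of Hadoop's file system API. It also assumes a Unix-like system where the file owner's name matches the {{user.name}} system property.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch; none of these names come from Hadoop.
class DirectDeleteSketch {

    /** Returns true when the path is owned by the user this JVM runs as. */
    static boolean ownedByCurrentUser(Path path) throws IOException {
        String owner = Files.getOwner(path).getName();
        return owner.equals(System.getProperty("user.name"));
    }

    /** Deletes a directory tree in-process, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(d);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /**
     * Deletes the directory directly when the current user owns it.
     * Returns false when the caller would still need impersonation.
     */
    static boolean deleteIfOwnedByCurrentUser(Path dir) throws IOException {
        if (!ownedByCurrentUser(dir)) {
            return false;
        }
        deleteRecursively(dir);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Demo on a throwaway temp directory standing in for a timestamped dir.
        Path dir = Files.createTempDirectory("toBeDeleted-demo");
        Files.createFile(dir.resolve("leftover"));
        System.out.println("deleted directly: " + deleteIfOwnedByCurrentUser(dir));
    }
}
```

With a check like this, only directories owned by other users (such as "test" in the listing above) would still need to go through the LinuxTaskController.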

Another issue is that {{MRAsyncDiskService#deletePathsInSecureCluster}} expects an absolute file path (e.g. {{/mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0}}), but {{MRAsyncDiskService#moveAndDeleteRelativePath}} passes in a relative path (e.g. {{toBeDeleted/2013-07-05_05-37-49.052_0}}).
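One way to picture the mismatch is a small helper that resolves a relative path against the local volume root before it is handed on. This is a sketch under assumptions, not the real fix: {{AbsolutePathSketch}} and {{toAbsolute}} are hypothetical names, and {{/mapred/local}} as the volume root is taken from the example listing above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; none of these names come from Hadoop.
class AbsolutePathSketch {

    /**
     * Resolves a possibly relative path name against a volume root.
     * An absolute path name is returned as-is (normalized).
     */
    static Path toAbsolute(Path volumeRoot, String pathName) {
        Path p = Paths.get(pathName);
        return p.isAbsolute() ? p.normalize() : volumeRoot.resolve(p).normalize();
    }

    public static void main(String[] args) {
        // Volume root assumed from the example listing in this issue.
        Path volume = Paths.get("/mapred/local");
        System.out.println(toAbsolute(volume, "toBeDeleted/2013-07-05_05-37-49.052_0"));
    }
}
```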





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)