Posted to mapreduce-dev@hadoop.apache.org by "Jim Finnessy (JIRA)" <ji...@apache.org> on 2010/02/09 16:25:28 UTC

[jira] Created: (MAPREDUCE-1471) FileOutputCommitter does not safely clean up its temporary files

FileOutputCommitter does not safely clean up its temporary files
----------------------------------------------------------------

                 Key: MAPREDUCE-1471
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1471
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.1
            Reporter: Jim Finnessy


When the FileOutputCommitter cleans up in its cleanupJob method, it can delete the temporary files of other concurrent jobs.

All temporary files for all concurrent jobs are written to working_path/_temporary/. As a result, any job that shares a working_path with other jobs deletes the temporary files of every job still executing there when it removes working_path/_temporary during job cleanup.

If the client application guarantees that its output file names are unique, the temporary files and directories should also be guaranteed unique so that this problem cannot occur. Suggest modifying cleanupJob to remove only the files that the job itself created.
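
A minimal sketch of the suggested behaviour, using only java.nio.file rather than the Hadoop API. This is not the FileOutputCommitter code: the class name ScopedTempCleanup, the per-job subdirectory layout working_path/_temporary/<jobId>/, and the job ids are all assumptions made for illustration. The idea is that each job writes under its own subdirectory and cleanup removes only that subdirectory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of Hadoop.
public class ScopedTempCleanup {

    // Assumed layout: working_path/_temporary/<jobId>/ instead of one
    // shared working_path/_temporary/ used by every job.
    static Path jobTempDir(Path workingPath, String jobId) {
        return workingPath.resolve("_temporary").resolve(jobId);
    }

    // Creates the job-private temporary directory.
    static Path setupJob(Path workingPath, String jobId) throws IOException {
        return Files.createDirectories(jobTempDir(workingPath, jobId));
    }

    // Removes only the directory this job created. The shared
    // _temporary parent is deliberately left in place, because other
    // jobs may still be writing beneath it.
    static void cleanupJob(Path workingPath, String jobId) throws IOException {
        deleteRecursively(jobTempDir(workingPath, jobId));
    }

    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path workingPath = Files.createTempDirectory("working_path");
        Path tmpA = setupJob(workingPath, "job_A");
        Path tmpB = setupJob(workingPath, "job_B");
        Path partB = Files.write(tmpB.resolve("part-00000"),
                                 "in progress".getBytes());

        // job_A finishes first while job_B is still running.
        cleanupJob(workingPath, "job_A");

        System.out.println("job_A temp dir exists: " + Files.exists(tmpA));
        System.out.println("job_B output exists:   " + Files.exists(partB));
    }
}
```

With this layout, cleaning up job_A leaves the in-progress output of job_B untouched. Removing the shared _temporary parent once no job is using it is left out of the sketch, since doing that safely needs coordination between jobs.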
