Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/04/07 05:35:13 UTC
[jira] Commented: (HADOOP-3578) mapred.system.dir should be accessible only to hadoop daemons
[ https://issues.apache.org/jira/browse/HADOOP-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696382#action_12696382 ]
Amar Kamat commented on HADOOP-3578:
------------------------------------
Here is the proposal :
_Terms :_
# mapred.system.dir : the common location where users (via the jobclient) upload job files (job split and job jar). This dir will have rwx-w--w- permissions.
# mapred.system.dir/jobtracker : jobtracker's private scratch space with rwx------ permissions. This is the place where the jobtracker moves files upon successful job submission (upload + validation).
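For reference, the two permission strings above can be turned into numeric modes. This is only a minimal sketch: the helper name mode_from_symbolic and the two constants are made up for illustration and are not part of Hadoop.

```python
def mode_from_symbolic(perms: str) -> int:
    """Convert a 9-character string such as 'rwx-w--w-' to an integer mode."""
    if len(perms) != 9:
        raise ValueError("expected 9 permission characters")
    mode = 0
    for i, ch in enumerate(perms):
        expected = "rwx"[i % 3]  # each triplet is read, write, execute
        if ch == expected:
            mode |= 1 << (8 - i)  # leftmost character maps to the highest bit
        elif ch != "-":
            raise ValueError(f"unexpected {ch!r} at position {i}")
    return mode


# Hypothetical names for the two directories described in the terms above.
SYSTEM_DIR_MODE = mode_from_symbolic("rwx-w--w-")      # shared upload area
JOBTRACKER_DIR_MODE = mode_from_symbolic("rwx------")  # jobtracker-only scratch space
print(oct(SYSTEM_DIR_MODE), oct(JOBTRACKER_DIR_MODE))  # prints: 0o722 0o700
```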
The process of job submission is as follows:
# jobclient/user asks jobtracker for a new jobid
# jobclient generates a new x digit random number and uploads the job files (split and jar) to mapred.system.dir/jobid-random-number
# jobclient/user passes this information and the jobconf to the jobtracker via the rpc (submitJob api).
# jobtracker reads the conf received via the rpc, performs the ACL check, and only then *accepts* the job (moves it to mapred.system.dir/jobtracker)
# jobtracker serializes the job.xml (changing the location of the split and jar files in the conf) to mapred.system.dir/jobtracker/jobid, and moves job.jar and job.split to mapred.system.dir/jobtracker/jobid. This is important because the tasktracker relies on the information in the conf to locate job.jar and job.split.
# Upon jobtracker restart, all the jobs present in mapred.system.dir/jobtracker/ will be blindly loaded, and job directories left in mapred.system.dir/ will be queued for cleanup.
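The submission steps above can be sketched roughly as follows. This is a local-filesystem simulation, not Hadoop code: the function names, the conf keys ("user", "jar.path", "split.path"), the 8-digit suffix and the JSON stand-in for job.xml are all assumptions made for illustration, and the rpc is modelled as a plain function call.

```python
import json
import secrets
import shutil
import tempfile
from pathlib import Path

RANDOM_DIGITS = 8  # assumed value; the proposal leaves the digit count ("x") open


def upload_job_files(system_dir: Path, job_id: str) -> Path:
    """Client side: stage job.split and job.jar under <system_dir>/<jobid>-<random>."""
    suffix = "".join(secrets.choice("0123456789") for _ in range(RANDOM_DIGITS))
    staging = system_dir / f"{job_id}-{suffix}"
    staging.mkdir()
    (staging / "job.split").write_bytes(b"split-data")
    (staging / "job.jar").write_bytes(b"jar-data")
    return staging


def accept_job(system_dir: Path, job_id: str, staging: Path,
               conf: dict, allowed_users: set) -> Path:
    """Jobtracker side: check ACLs on the conf passed in, then move the job."""
    if conf.get("user") not in allowed_users:  # hypothetical ACL check
        raise PermissionError(f"user not allowed to submit {job_id}")
    accepted = system_dir / "jobtracker" / job_id
    accepted.mkdir(parents=True)
    for name in ("job.jar", "job.split"):
        shutil.move(str(staging / name), str(accepted / name))
    # Rewrite the file locations so readers of the conf find the moved files.
    new_conf = dict(conf)
    new_conf["jar.path"] = str(accepted / "job.jar")
    new_conf["split.path"] = str(accepted / "job.split")
    (accepted / "job.xml").write_text(json.dumps(new_conf))  # JSON stands in for XML
    staging.rmdir()  # staging dir is empty once both files have been moved
    return accepted


root = Path(tempfile.mkdtemp())
staging = upload_job_files(root, "job_0001")
accepted = accept_job(root, "job_0001", staging, {"user": "alice"}, {"alice"})
print(sorted(p.name for p in accepted.iterdir()))
```

Because the ACL check runs on the conf that arrived with the call, a rejected job never reaches the jobtracker directory and its staging directory simply stays behind for cleanup.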
_Benefits :_
# guessing the job dir will be hard because a random number is appended to it
# separation between faulty jobs (jobs failing on access etc) and accepted jobs will be clear (helps in recovery)
# jobtracker system dir will be clean and cannot be garbled
# jobconf need not be read from the fs as it will be passed via rpc; this helps in quickly deciding whether the job is faulty or not
# re-initing jobtracker is as simple as deleting jobtracker's system.dir (mapred.system.dir/jobtracker) without touching the mapred.system.dir
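To illustrate how the separation helps on restart and re-init, here is a small local-filesystem sketch. The function names and the example directory names are hypothetical, and a real implementation would go through the Hadoop filesystem api rather than pathlib.

```python
import shutil
import tempfile
from pathlib import Path


def scan_on_restart(system_dir: Path):
    """Return (accepted job ids to reload, leftover staging dirs queued for cleanup)."""
    jobtracker_dir = system_dir / "jobtracker"
    accepted = []
    if jobtracker_dir.is_dir():
        accepted = sorted(p.name for p in jobtracker_dir.iterdir())
    # Everything else directly under the system dir was never accepted.
    leftovers = sorted(p.name for p in system_dir.iterdir() if p != jobtracker_dir)
    return accepted, leftovers


def reinit_jobtracker(system_dir: Path) -> None:
    """Drop all accepted jobs by deleting only the jobtracker's private directory."""
    shutil.rmtree(system_dir / "jobtracker", ignore_errors=True)


root = Path(tempfile.mkdtemp())
(root / "jobtracker" / "job_0001").mkdir(parents=True)  # an accepted job
(root / "job_0002-48151623").mkdir()                    # an upload that was never accepted
print(scan_on_restart(root))
reinit_jobtracker(root)
print(scan_on_restart(root))
```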
_Questions :_
# Should the default api assume that the job.xml, job.jar and job.split are still present in mapred.system.dir/jobid?
----
Thoughts? Comments?
> mapred.system.dir should be accessible only to hadoop daemons
> --------------------------------------------------------------
>
> Key: HADOOP-3578
> URL: https://issues.apache.org/jira/browse/HADOOP-3578
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
>
> Currently the jobclient accesses the {{mapred.system.dir}} to add job details. Hence the {{mapred.system.dir}} has the permissions of {{rwx-wx-wx}}. This could be a security loophole where the job files might get overwritten or tampered with after the job submission.