Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2007/12/21 20:32:43 UTC

[jira] Issue Comment Edited: (HADOOP-2116) Job.local.dir to be exposed to tasks

    [ https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554006 ] 

shv edited comment on HADOOP-2116 at 12/21/07 11:32 AM:
------------------------------------------------------------------------

This is also practically fixed by HADOOP-2227. The only thing left is to expose the shared directory through the configuration.
JobConf now has a property "mapred.jar", accessible through the getJar() method, which points to the jar file located in the jobcache
directory. That directory is in fact the common shared directory for the job's tasks.
Namely,
{code}
"mapred.jar" = "mapred.local.dir"[i]/taskTracker/jobcache/<job_id>/job.jar
{code}

So we can replace the configuration parameter "mapred.jar" with "job.local.dir", which will point to the parent directory of "mapred.jar".
JobConf.getJar() can then be implemented as
{code}
String getJar() {
    return get("job.local.dir") + "/job.jar";
}
{code}

Will that work?
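
To make the question concrete, here is a minimal sketch of the idea. It assumes a plain Map in place of the real JobConf; the class JobConfSketch and its set/get methods are hypothetical stand-ins, not Hadoop API. It builds the path with new File(parent, child) so that the result does not depend on whether "job.local.dir" ends with a separator.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobConf; not the real Hadoop class.
class JobConfSketch {
    private final Map<String, String> props = new HashMap<>();

    void set(String name, String value) {
        props.put(name, value);
    }

    String get(String name) {
        return props.get(name);
    }

    // Derive the jar location from the proposed "job.local.dir" property.
    // new File(parent, child) inserts the separator itself, so the
    // directory value does not need a trailing slash.
    String getJar() {
        String dir = get("job.local.dir");
        return dir == null ? null : new File(dir, "job.jar").getPath();
    }
}
```

With "job.local.dir" set to a job's jobcache directory, getJar() would return that directory plus job.jar; with the property unset, this sketch returns null rather than the string "null/job.jar".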

Given all of the above, I wonder why we need to use LocalDirAllocator in TaskRunner.run()
if the job cache directory (jobCacheDir) can be obtained directly from TaskRunner.conf:
{code}
File jobCacheDir = new File(new File(conf.getJar()).getParentFile(), "work");
{code}
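
The same derivation can be tried outside of TaskRunner. The following is only a sketch: JobCacheDirSketch and workDir are hypothetical names, and the jar path is passed in as a string instead of being read from a JobConf. It also guards against a jar path that has no parent directory, which the one-liner above does not.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop.
class JobCacheDirSketch {
    // Returns <directory containing the jar>/work,
    // or null if the jar path has no parent component.
    static File workDir(String jarPath) {
        File parent = new File(jarPath).getParentFile();
        return parent == null ? null : new File(parent, "work");
    }
}
```

This only computes the path. It does not check that the directory exists or create it.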


      was (Author: shv):
    This is also practically fixed by HADOOP-2227. The only thing left is to expose the shared directory through the configuration.
JobConf now has a property "mapred.jar" accessible through getJar() method, which points to the jar file located in the jobcache 
directory, which in fact is in the common shared directory for the job tasks.
Namely,
{code}
"mapred.jar" = "mapred.local.dir"[i]/taskTracker/jobcache/<job_id>/job.jar
{code}

So we can replace configuration parameter "mapred.jar" by "job.local.dir", which will point to the parent of "mapred.jar".
JobConf.getJar() can be implemented then as
{code}
String getJar() {
    return get("job.local.dir") + "job.xml";
}
{code}

Will that work?

With respect to all the above I wonder why do we need to use LocalDirAllocator in TaskRunner.run()
if job cache directory (jobCacheDir) can be obtained directly from TaskRunner.conf
{code}
File jobCacheDir = new File(new File(conf.getJar()).getParentFile(), "work");
{code}

  
> Job.local.dir to be exposed to tasks
> ------------------------------------
>
>                 Key: HADOOP-2116
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2116
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.3
>         Environment: All
>            Reporter: Milind Bhandarkar
>             Fix For: 0.16.0
>
>
> Currently, since all task cwds are created under a jobcache directory, users that need a job-specific shared directory for use as scratch space, create ../work. This is hacky, and will break when HADOOP-2115 is addressed. For such jobs, hadoop mapred should expose job.local.dir via localized configuration.

-- 
This message is automatically generated by JIRA.