Posted to mapreduce-issues@hadoop.apache.org by "Jens Rabe (JIRA)" <ji...@apache.org> on 2015/04/17 14:26:59 UTC

[jira] [Created] (MAPREDUCE-6320) Configuration of retrieved Job via Cluster is not properly set-up

Jens Rabe created MAPREDUCE-6320:
------------------------------------

             Summary: Configuration of retrieved Job via Cluster is not properly set-up
                 Key: MAPREDUCE-6320
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6320
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.4.1
            Reporter: Jens Rabe


A Job retrieved via the Cluster API does not have its configuration populated correctly.

To reproduce this:

# Submit an MR job and set an arbitrary parameter in its configuration:
{code:java}
job.getConfiguration().set("foo", "bar");
job.setJobName("foo-bug-demo");
{code}
# Get the job in a client:
{code:java}
// conf is the client's own Configuration
final Cluster c = new Cluster(conf);
final JobStatus[] statuses = c.getAllJobStatuses();
final JobStatus status = ... // get the status for the job named foo-bug-demo
final Job j = c.getJob(status.getJobID());
final Configuration jobConf = j.getConfiguration();
{code}
# Get its "foo" entry:
{code:java}
final String value = jobConf.get("foo");
{code}
# Expected: value is "bar". Actual: value is null.

The cause is that the job's configuration is stored on HDFS: the Configuration has a resource with an *hdfs://* URL. In *loadResource*, that URL is reduced to a path on the local file system (hdfs://host.domain:port/tmp/hadoop-yarn/... becomes /tmp/hadoop-yarn/...). That local path does not exist, so the configuration is never populated.
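
To illustrate the effect described above, here is a minimal sketch using only the JDK. It does not reproduce Hadoop's *loadResource* code; it only shows what is left once the scheme and authority of such a URL are dropped. The host, port, and file name are made up for the example, and the class name JobFilePathDemo is hypothetical.

```java
import java.io.File;
import java.net.URI;

final class JobFilePathDemo {
    // Returns only the path component of a URI, dropping scheme and authority.
    static String pathOnly(String resource) {
        return URI.create(resource).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical job file location; host, port and path are assumptions.
        String jobFile = "hdfs://host.domain:8020/tmp/hadoop-yarn/staging/job.xml";
        String local = pathOnly(jobFile);
        // The remaining string looks like a local path ...
        System.out.println(local);
        // ... but nothing guarantees that such a file exists on the client machine.
        System.out.println("exists locally: " + new File(local).exists());
    }
}
```

On a client machine that has no such file, a Configuration pointed at the stripped path would have nothing to load, which matches the null value seen in the reproduction steps.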

The bug happens in the *Cluster* class, where *JobConfs* are created from *status.getJobFile()*. A quick fix would be to copy this job file to a temporary file in the local file system and populate the JobConf from this file.
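
The copy step of that quick fix could look roughly like the sketch below. It is not a patch for *Cluster*: it uses only the JDK, and the class and method names (JobFileLocalCopy, copyToLocalTemp) are hypothetical. In a real fix the input stream would presumably be opened on the job file through Hadoop's FileSystem API, and the returned local path would then be used to populate the JobConf; that wiring is an assumption and is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class JobFileLocalCopy {
    // Copies the given stream into a fresh temporary file on the local
    // file system and returns the path of that file.
    static Path copyToLocalTemp(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("job-", ".xml");
        tmp.toFile().deleteOnExit();
        // createTempFile already created an empty file, so overwrite it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the remote job file contents (made-up sample data).
        String xml = "<configuration><property><name>foo</name>"
                + "<value>bar</value></property></configuration>";
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            Path local = copyToLocalTemp(in);
            // The local copy now exists and can be read like any local resource.
            System.out.println(new String(Files.readAllBytes(local), StandardCharsets.UTF_8));
        }
    }
}
```

Copying to a temporary file keeps the rest of the loading logic unchanged, at the cost of an extra local write per retrieved job.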



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)