Posted to common-dev@hadoop.apache.org by "Subramaniam Krishnan (JIRA)" <ji...@apache.org> on 2008/04/14 09:07:04 UTC

[jira] Created: (HADOOP-3244) MiniMRCluster should take a "conf" object in its constructor

MiniMRCluster should take a "conf" object in its constructor
------------------------------------------------------------

                 Key: HADOOP-3244
                 URL: https://issues.apache.org/jira/browse/HADOOP-3244
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.16.1
         Environment: all
            Reporter: Subramaniam Krishnan
            Assignee: Subramaniam Krishnan
            Priority: Blocker
             Fix For: 0.17.0


Until Hadoop 0.13 or so, at submission time the full path of the job.xml and all supporting files in DFS was given by the client to the jobtracker.

From 0.15 onwards (we did not test 0.14), the jobclient obtains the job ID from the jobtracker and creates the directory for all the supporting files using a system dir computed from the local jobconf.

Lines 696-697 in the JobClient:

    String jobId = jobSubmitClient.getNewJobId();
    Path submitJobDir = new Path(job.getSystemDir(), jobId);

This causes submissions to fail when the value of 'mapred.system.dir' on the client differs from the one on the JobTracker.

A simple way of fixing this would be to introduce a new method, 'getSystemDir()', in the JobSubmissionProtocol that returns the jobtracker system dir, and to use that dir for uploading all the files on submission.
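
A minimal Java sketch of the idea above. The types are stand-ins written for illustration, not Hadoop's actual classes: SubmissionProtocol, fixedTracker and the two submitDir helpers are hypothetical names, and paths are plain strings rather than Hadoop Path objects. It only shows that the submit directory stops depending on the client's local 'mapred.system.dir' once the jobtracker reports its own.

```java
class SystemDirSketch {
    /** Hypothetical stand-in for the jobtracker side of the submission protocol. */
    interface SubmissionProtocol {
        String getNewJobId();
        String getSystemDir(); // the proposed new method
    }

    /** Behaviour described in the report: the parent dir comes from the client's local conf. */
    static String submitDirFromLocalConf(String localSystemDir, SubmissionProtocol tracker) {
        return localSystemDir + "/" + tracker.getNewJobId();
    }

    /** Proposed behaviour: the parent dir is whatever the jobtracker reports. */
    static String submitDirFromTracker(SubmissionProtocol tracker) {
        return tracker.getSystemDir() + "/" + tracker.getNewJobId();
    }

    /** Test double that always returns the same system dir and job ID. */
    static SubmissionProtocol fixedTracker(final String systemDir, final String jobId) {
        return new SubmissionProtocol() {
            @Override public String getNewJobId() { return jobId; }
            @Override public String getSystemDir() { return systemDir; }
        };
    }

    public static void main(String[] args) {
        SubmissionProtocol tracker = fixedTracker("/tracker/system", "job_0001");
        // Client and jobtracker disagree on the system dir in this example.
        System.out.println(submitDirFromLocalConf("/client/system", tracker)); // /client/system/job_0001
        System.out.println(submitDirFromTracker(tracker));                     // /tracker/system/job_0001
    }
}
```

With the first helper the files land where the jobtracker will not look for them; with the second, client and jobtracker agree by construction.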

----
For the future: a more comprehensive approach would be to obtain a base jobConf from the jobtracker, carrying the 'final' information for each element, and then construct the job.xml on the client using the final semantics. In this case the 'mapred.system.dir' property should be set as final in the jobtracker. As some configuration properties may be sensitive and, for security reasons, should not be exposed to clients, a new flag 'private' could be introduced; only properties that don't have the 'private' flag would be sent over from the jobtracker to the jobclient for job.xml resolution.
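
A rough Java sketch of how the 'final' and 'private' flags could interact, under stated assumptions. Nothing here is an existing Hadoop API: the Property class, exportForClient, resolve and demo are hypothetical, and the property name 'secret.key' is made up for the example. The jobtracker drops private properties before sending its base conf, and the client may override any value except those marked final.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FinalPrivateConfSketch {
    /** Hypothetical property record: a value plus the two flags discussed above. */
    static final class Property {
        final String value;
        final boolean isFinal;
        final boolean isPrivate;

        Property(String value, boolean isFinal, boolean isPrivate) {
            this.value = value;
            this.isFinal = isFinal;
            this.isPrivate = isPrivate;
        }
    }

    /** Jobtracker side: send only the properties that are not flagged private. */
    static Map<String, Property> exportForClient(Map<String, Property> trackerConf) {
        Map<String, Property> exported = new LinkedHashMap<>();
        for (Map.Entry<String, Property> e : trackerConf.entrySet()) {
            if (!e.getValue().isPrivate) {
                exported.put(e.getKey(), e.getValue());
            }
        }
        return exported;
    }

    /** Client side: start from the base conf, apply client values unless the base marks them final. */
    static Map<String, String> resolve(Map<String, Property> base, Map<String, String> clientValues) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, Property> e : base.entrySet()) {
            resolved.put(e.getKey(), e.getValue().value);
        }
        for (Map.Entry<String, String> e : clientValues.entrySet()) {
            Property fromTracker = base.get(e.getKey());
            if (fromTracker == null || !fromTracker.isFinal) {
                resolved.put(e.getKey(), e.getValue());
            }
        }
        return resolved;
    }

    /** Small end-to-end example used below. */
    static Map<String, String> demo() {
        Map<String, Property> trackerConf = new LinkedHashMap<>();
        trackerConf.put("mapred.system.dir", new Property("/tracker/system", true, false));
        trackerConf.put("mapred.reduce.tasks", new Property("1", false, false));
        trackerConf.put("secret.key", new Property("s3cr3t", false, true)); // made-up name

        Map<String, String> clientValues = new LinkedHashMap<>();
        clientValues.put("mapred.system.dir", "/client/system"); // ignored: final on the tracker
        clientValues.put("mapred.reduce.tasks", "4");            // accepted: not final

        return resolve(exportForClient(trackerConf), clientValues);
    }
}
```

In demo(), the resolved conf keeps the jobtracker's 'mapred.system.dir', takes the client's 'mapred.reduce.tasks', and never sees 'secret.key'.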

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3244) MiniMRCluster should take a "conf" object in its constructor

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3244:
--------------------------------

    Fix Version/s:     (was: 0.17.0)

Removing from 0.17. Improvements must be committed before the feature freeze (which happened on 4/4 for Hadoop 0.17).

> MiniMRCluster should take a "conf" object in its constructor
> ------------------------------------------------------------
>
>                 Key: HADOOP-3244
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3244
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.16.2
>         Environment: all
>            Reporter: Subramaniam Krishnan
>            Priority: Minor
>
> MiniMRCluster should take a "conf" object in its constructor. This is required if we want to pass non-default configuration, such as the system directory.
> MiniMRCluster should use this configuration to create its internal Job Tracker.
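
A small Java sketch of the shape of the request quoted above. It does not use the real MiniMRCluster or JobTracker classes: MiniCluster, Tracker and the default system dir are hypothetical stand-ins, and the configuration is a plain map. The point is only that a conf passed to the cluster's constructor reaches the tracker it creates internally.

```java
import java.util.HashMap;
import java.util.Map;

class MiniClusterConfSketch {
    static final String SYSTEM_DIR_KEY = "mapred.system.dir";
    static final String DEFAULT_SYSTEM_DIR = "/tmp/mapred/system"; // made-up default for the sketch

    /** Hypothetical stand-in for the internal job tracker. */
    static final class Tracker {
        private final Map<String, String> conf;

        Tracker(Map<String, String> conf) {
            this.conf = new HashMap<>(conf); // defensive copy
        }

        String getSystemDir() {
            return conf.getOrDefault(SYSTEM_DIR_KEY, DEFAULT_SYSTEM_DIR);
        }
    }

    /** Hypothetical stand-in for the mini cluster used in tests. */
    static final class MiniCluster {
        private final Tracker tracker;

        /** Old shape: no way to influence the tracker's configuration. */
        MiniCluster() {
            this(new HashMap<String, String>());
        }

        /** Requested shape: the caller's conf is used to create the internal tracker. */
        MiniCluster(Map<String, String> conf) {
            this.tracker = new Tracker(conf);
        }

        Tracker getTracker() {
            return tracker;
        }
    }
}
```

A test that needs a non-default system directory would put 'mapred.system.dir' into the map and pass it to the second constructor.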



[jira] Resolved: (HADOOP-3244) MiniMRCluster should take a "conf" object in its constructor

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu resolved HADOOP-3244.
---------------------------------------------

    Resolution: Duplicate

Fixed as part of HADOOP-3296



[jira] Updated: (HADOOP-3244) MiniMRCluster should take a "conf" object in its constructor

Posted by "Subramaniam Krishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Subramaniam Krishnan updated HADOOP-3244:
-----------------------------------------

          Description: 
MiniMRCluster should take a "conf" object in its constructor. This is required if we want to pass non-default configuration, such as the system directory.

MiniMRCluster should use this configuration to create its internal Job Tracker.

  was:
Until Hadoop 0.13 or so, at submission time the full path of the job.xml and all supporting files in DFS was given by the client to the jobtracker.

From 0.15 onwards (we did not test 0.14), the jobclient obtains the job ID from the jobtracker and creates the directory for all the supporting files using a system dir computed from the local jobconf.

Lines 696-697 in the JobClient:

    String jobId = jobSubmitClient.getNewJobId();
    Path submitJobDir = new Path(job.getSystemDir(), jobId);

This causes submissions to fail when the value of 'mapred.system.dir' on the client differs from the one on the JobTracker.

A simple way of fixing this would be to introduce a new method, 'getSystemDir()', in the JobSubmissionProtocol that returns the jobtracker system dir, and to use that dir for uploading all the files on submission.

----
For the future: a more comprehensive approach would be to obtain a base jobConf from the jobtracker, carrying the 'final' information for each element, and then construct the job.xml on the client using the final semantics. In this case the 'mapred.system.dir' property should be set as final in the jobtracker. As some configuration properties may be sensitive and, for security reasons, should not be exposed to clients, a new flag 'private' could be introduced; only properties that don't have the 'private' flag would be sent over from the jobtracker to the jobclient for job.xml resolution.

             Priority: Minor  (was: Blocker)
    Affects Version/s:     (was: 0.16.1)
                       0.16.2
             Assignee:     (was: Subramaniam Krishnan)
           Issue Type: Improvement  (was: Bug)

> MiniMRCluster should take a "conf" object in its constructor
> ------------------------------------------------------------
>
>                 Key: HADOOP-3244
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3244
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.16.2
>         Environment: all
>            Reporter: Subramaniam Krishnan
>            Priority: Minor
>             Fix For: 0.17.0
>
>
> MiniMRCluster should take a "conf" object in its constructor. This is required if we want to pass non-default configuration, such as the system directory.
> MiniMRCluster should use this configuration to create its internal Job Tracker.



[jira] Commented: (HADOOP-3244) MiniMRCluster should take a "conf" object in its constructor

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591275#action_12591275 ] 

Amar Kamat commented on HADOOP-3244:
------------------------------------

I also need this for HADOOP-3296, so I will do it as part of that issue.
