Posted to common-dev@hadoop.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2008/10/17 05:48:44 UTC

[jira] Created: (HADOOP-4439) Cleanup memory related resource management

Cleanup memory related resource management
------------------------------------------

                 Key: HADOOP-4439
                 URL: https://issues.apache.org/jira/browse/HADOOP-4439
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.19.0
            Reporter: Hemanth Yamijala
            Priority: Blocker
             Fix For: 0.19.0


HADOOP-3759 and HADOOP-3581 introduced memory-based resource management. This JIRA is to clean up certain aspects of those two issues that came up while working on HADOOP-4035, which was filed to support memory-based scheduling.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640465#action_12640465 ] 

Hemanth Yamijala commented on HADOOP-4439:
------------------------------------------

The following changes have been identified:

- We remove the concept of a default memory per task on the TT, introduced in HADOOP-3759 and computed as the max memory per TT / number of slots. The problem with this model is that in a heterogeneous cluster, different TTs could give different default memory per task values for the same job, which is confusing.
- Instead, we introduce a default memory per task configuration variable that is expected to be controlled by the cluster admin. This is the value that will be used for a job that does not specify any memory requirements. The advantage of this model is that it eases configuration and makes the default value consistent for users.
- If a job has not specified any memory requirements, this variable would be set in the job's configuration, maybe via the {{Task}} object.
- We modify the algorithm for protecting RAM limits, introduced in HADOOP-3581, to use the configured memory per task instead of the default memory per task.
- We remove the reporting of the default memory per task, introduced in HADOOP-3759 and done via {{TaskTrackerStatus.ResourceStatus}}. Instead, we report the total memory available on the TT.
- When HADOOP-4035 is fixed, the above values would be used to schedule tasks.
- However, until HADOOP-4035 is fixed, these configuration parameters and the corresponding {{JobConf}} variables should not be exposed in any public API or documentation, as they could confuse users. They can be exposed after Hadoop 0.19.
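
To make the difference between the two models concrete, here is a minimal Java sketch. It is not the code from the patch: the class name, the method names, and the use of megabytes as the unit are all assumptions made for illustration. It contrasts the old per-TT derived default (max memory per TT / number of slots) with the proposed admin-controlled default that applies when a job specifies no memory requirement.

```java
import java.util.OptionalLong;

/** Hypothetical sketch; none of these names are Hadoop APIs. */
public class MemoryPerTaskSketch {

    /** Old model: each TT derives its own default from its max memory and slot count. */
    public static long perTrackerDefault(long maxMemoryOnTrackerMb, int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        return maxMemoryOnTrackerMb / slots;
    }

    /** Proposed model: use the job's own requirement if set, else the admin-configured default. */
    public static long effectiveMemoryPerTask(OptionalLong jobRequestedMb, long clusterDefaultMb) {
        return jobRequestedMb.orElse(clusterDefaultMb);
    }

    public static void main(String[] args) {
        // Two TTs with different hardware give the same job different defaults.
        System.out.println(perTrackerDefault(8192, 4)); // prints 2048
        System.out.println(perTrackerDefault(4096, 4)); // prints 1024

        // With a single admin-controlled default, the value no longer depends on the TT.
        long clusterDefaultMb = 1024; // assumed value for illustration
        System.out.println(effectiveMemoryPerTask(OptionalLong.empty(), clusterDefaultMb));  // prints 1024
        System.out.println(effectiveMemoryPerTask(OptionalLong.of(3072), clusterDefaultMb)); // prints 3072
    }
}
```

In the old model the two trackers disagree (2048 vs 1024) for the same job, which is the confusion described in the first bullet; in the proposed model the fallback is the same wherever the task lands.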


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642313#action_12642313 ] 

Hudson commented on HADOOP-4439:
--------------------------------

Integrated in Hadoop-trunk #640 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/640/])
    


[jira] Updated: (HADOOP-4439) Cleanup memory related resource management

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4439:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Hemanth!


[jira] Updated: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4439:
-------------------------------------

    Status: Patch Available  (was: Open)


[jira] Assigned: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala reassigned HADOOP-4439:
----------------------------------------

    Assignee: Hemanth Yamijala


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641285#action_12641285 ] 

Vinod K V commented on HADOOP-4439:
-----------------------------------

+1 for the latest patch.


[jira] Updated: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4439:
-------------------------------------

    Attachment: HADOOP-4439.patch

Patch correcting javadoc.

test-patch still runs fine:

     [exec] +1 overall.

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.

     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

I didn't re-run the tests because there are no code changes.


[jira] Updated: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4439:
-------------------------------------

    Attachment: HADOOP-4439.patch

The attached patch just fixes the javadoc. The rest of the code is the same.

Results of ant test-patch:
     [exec] +1 overall.

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.

     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

ant test also passed on my box.


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640501#action_12640501 ] 

Hemanth Yamijala commented on HADOOP-4439:
------------------------------------------

bq. If a job has not specified any memory requirements, this variable would be set to the job's configuration, maybe via the Task object.

This value is a cluster-wide configuration that should be the same across all tasktrackers. Hence, we were trying to see if it could be configured in the jobtracker and passed on to the tasktrackers. The only way I could find to do that is to change the {{Task}} object to hold this variable and write it via RPC when the task is transferred to the tasktrackers.

The other option is to simplify and say that all tasktrackers will be configured with this value. This carries a risk of configuration errors if the admin configures different values for different TTs. But in an offline discussion with Devaraj, we felt it was simpler to do it this way rather than introduce a new field in the {{Task}} object. I am OK with this approach. Any objections?
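
The misconfiguration risk mentioned above could be caught with a simple check. The following is a hypothetical Java sketch, not part of the patch and not an existing Hadoop facility: it assumes an admin has collected the configured default (in MB, an assumed unit) from each tasktracker into a map, and it lists the trackers whose value differs from the intended one.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; the class and method names are illustrative only. */
public class DefaultMemoryConsistencyCheck {

    /**
     * Returns the trackers whose configured default memory per task differs
     * from the value the admin intended, keyed by tracker name in sorted order.
     */
    public static Map<String, Long> findMismatches(Map<String, Long> configuredByTracker,
                                                   long expectedMb) {
        Map<String, Long> mismatches = new TreeMap<>();
        for (Map.Entry<String, Long> entry : configuredByTracker.entrySet()) {
            if (entry.getValue().longValue() != expectedMb) {
                mismatches.put(entry.getKey(), entry.getValue());
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, Long> configured = new TreeMap<>();
        configured.put("tt1", 1024L);
        configured.put("tt2", 2048L); // misconfigured tracker
        configured.put("tt3", 1024L);
        System.out.println(findMismatches(configured, 1024L)); // prints {tt2=2048}
    }
}
```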


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640560#action_12640560 ] 

Hemanth Yamijala commented on HADOOP-4439:
------------------------------------------

One other proposal was to change the max memory specification to be a percentage of the physical RAM rather than an absolute value. In the interest of time (as this is a blocker for 0.19), I decided to leave this suggestion out. As the configuration is not exposed until HADOOP-4035, we can change it as part of that patch. In fact, the current configuration can be supported in a backward-compatible manner as well, if we want.
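
As a rough illustration of how a percentage-based setting could coexist with the current absolute one, here is a hypothetical Java sketch. The `%` suffix syntax, the class and method names, and the use of megabytes are assumptions made for this example; the thread does not define any of them. A value ending in `%` is resolved against the physical RAM, and anything else is treated as an absolute value, which keeps existing configurations working.

```java
/** Hypothetical sketch; the "%" suffix convention is an assumption, not a Hadoop setting. */
public class MaxMemorySpec {

    /**
     * Resolves a max memory specification to an absolute value in MB.
     * "75%" means 75 percent of physical RAM; "4096" means 4096 MB as-is.
     */
    public static long resolveMaxMemoryMb(String spec, long physicalRamMb) {
        String trimmed = spec.trim();
        if (trimmed.endsWith("%")) {
            String number = trimmed.substring(0, trimmed.length() - 1).trim();
            double percent = Double.parseDouble(number);
            if (percent < 0 || percent > 100) {
                throw new IllegalArgumentException("percentage out of range: " + spec);
            }
            return (long) (physicalRamMb * percent / 100.0);
        }
        // Backward-compatible path: a plain number is an absolute value.
        return Long.parseLong(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(resolveMaxMemoryMb("75%", 8192));  // prints 6144
        System.out.println(resolveMaxMemoryMb("4096", 8192)); // prints 4096
    }
}
```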


[jira] Commented: (HADOOP-4439) Cleanup memory related resource management

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641260#action_12641260 ] 

Vinod K V commented on HADOOP-4439:
-----------------------------------

One minor comment: the 'return' javadoc attributes in the documentation strings of TaskTrackerStatus.ResourceStatus.setTotalMemory() and getTotalMemory() still incorrectly refer to default memory per task.

Otherwise +1 for the patch.


[jira] Updated: (HADOOP-4439) Cleanup memory related resource management

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4439:
-------------------------------------

    Attachment: HADOOP-4439.patch

Patch implementing the proposed changes.
