Posted to common-dev@hadoop.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2009/05/27 07:46:45 UTC

[jira] Created: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Memory management variables need a backwards compatibility option after HADOOP-5881
-----------------------------------------------------------------------------------

                 Key: HADOOP-5919
                 URL: https://issues.apache.org/jira/browse/HADOOP-5919
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
            Reporter: Hemanth Yamijala
            Priority: Blocker


HADOOP-5881 modified variables related to memory management without looking at the backwards compatibility angle. This JIRA is to address the gap. Marking it a blocker for 0.20.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V reassigned HADOOP-5919:
---------------------------------

    Assignee: rahul k singh



[jira] Issue Comment Edited: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717210#action_12717210 ] 

rahul k singh edited comment on HADOOP-5919 at 6/8/09 2:46 AM:
---------------------------------------------------------------

All physical memory related configuration has been removed.

mapred.task.default.maxvmem ==> removed completely.
mapred.tasktracker.vmem.reserved ==> this key is removed. Admins can attain similar behaviour using the keys mapred.cluster.map.memory.mb and mapred.cluster.reduce.memory.mb.

mapred.task.limit.maxvmem ==> split into mapred.cluster.max.map.memory.mb and mapred.cluster.max.reduce.memory.mb.
For example, if you set "mapred.task.limit.maxvmem" = 10, then "mapred.cluster.max.map.memory.mb" = 10/1024. The same applies to reduce.

mapred.task.maxvmem ==> split into mapred.job.map.memory.mb and mapred.job.reduce.memory.mb. The set behaviour is the same as for "mapred.task.limit.maxvmem".

Design.
Assumption:

1. In case there is a 1-1 mapping for a key, the value of the deprecated key is applied to the new key.

Data structures:
{code}
// map from deprecated key to its mapping
Map<String, DeprecatedKeyMapping> deprecatedKeys;

// Skeleton of DeprecatedKeyMapping
static class DeprecatedKeyMapping {
  String[] keyMappings;
  String customMessage;
}
{code}

For keys which are removed, keyMappings would be null and the key would carry only the custom message.

While setting the conf, Configuration takes care of 1-1 mappings; in case a deprecated key maps to multiple new values, the respective code paths set the value.
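
To make the intended set behaviour concrete, here is a minimal self-contained sketch. The class name, the constructor and the warning wording are assumptions for illustration only, not the patch; the deprecated key name comes from the list above.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class DeprecationSketch {
  static class DeprecatedKeyMapping {
    String[] keyMappings;   // null => key removed; only a warning is issued
    String customMessage;
    DeprecatedKeyMapping(String[] keyMappings, String customMessage) {
      this.keyMappings = keyMappings;
      this.customMessage = customMessage;
    }
  }

  static final Map<String, DeprecatedKeyMapping> deprecatedKeys =
      new HashMap<String, DeprecatedKeyMapping>();
  static {
    // a removed key carries only a custom message
    deprecatedKeys.put("mapred.task.default.maxvmem",
        new DeprecatedKeyMapping(null, "This key is no longer used."));
  }

  final Properties properties = new Properties();

  void set(String name, String value) {
    DeprecatedKeyMapping mapping = deprecatedKeys.get(name);
    if (mapping != null) {
      System.err.println("WARN: " + name + " is deprecated. " + mapping.customMessage);
      if (mapping.keyMappings != null) {
        // 1-1 mappings are applied here; splits that need unit conversion
        // are left to the respective mapreduce code paths
        for (String newKey : mapping.keyMappings) {
          properties.setProperty(newKey, value);
        }
      }
      return;
    }
    properties.setProperty(name, value);
  }
}
{code}
With this shape, a set on a removed key (null keyMappings) degenerates to a warn-only no-op.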




[jira] Updated: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated HADOOP-5919:
----------------------------------

    Attachment: hadoop-5919-2.patch

Attached new patch.



[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713865#action_12713865 ] 

eric baldeschwieler commented on HADOOP-5919:
---------------------------------------------

You are suggesting that it is possible to have an empty list of new variables to map the old one to.

In this case, a set should be a no-op that prints a WARNING, correct?



[jira] Updated: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated HADOOP-5919:
----------------------------------

    Attachment: hadoop-5919-1.patch

Attaching the first patch with comments.



[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715923#action_12715923 ] 

Hemanth Yamijala commented on HADOOP-5919:
------------------------------------------

In some more discussion, we found some problems with the proposed approach. Primarily, two problems:

- If the list of deprecated configurations is centralized in the Configuration class, what happens after the project split? If I have to deprecate a configuration in mapred, should I submit a patch to core? And since the projects can theoretically have different release cycles, wouldn't that conflict with the requirements of the different projects?

- Another problem is that the mapping as proposed above seems too simplistic for some cases. For example, consider the split of the number of slots into map slots and reduce slots that happened a couple of releases back. If a change like that needs to be supported, setting the same value for both map slots and reduce slots seems incorrect; a better mapping would be to assign half to map slots and half to reduce slots. In other words, it seems we may need a more complex mapping mechanism (a rough sketch of one possibility follows).
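
One possible shape for such a mechanism, purely as a sketch: let each deprecated key supply its own translation of the old value. The interface name and the even split below are assumptions; only the slot configuration names are real.
{code}
import java.util.HashMap;
import java.util.Map;

interface DeprecatedKeyHandler {
  // derive the new key/value pairs from the old value
  Map<String, String> map(String oldValue);
}

// Example: split a combined slot count evenly between map and reduce slots.
class SlotSplitter implements DeprecatedKeyHandler {
  public Map<String, String> map(String oldValue) {
    int total = Integer.parseInt(oldValue);
    Map<String, String> newValues = new HashMap<String, String>();
    newValues.put("mapred.tasktracker.map.tasks.maximum", String.valueOf(total / 2));
    newValues.put("mapred.tasktracker.reduce.tasks.maximum",
        String.valueOf(total - total / 2));
    return newValues;
  }
}
{code}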

Given all this, I think this should definitely be the subject of another JIRA, as people might have more comments on the approach.

And since this bug is a blocker for 0.20.1, I suggest we do a mapping specifically in the relevant classes for this bug (the memory related classes) and move over to a centralized framework for deprecating configurations once the configuration JIRA is available.

Does this make sense?



[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718378#action_12718378 ] 

Vinod K V commented on HADOOP-5919:
-----------------------------------

Looked at the patch. Comments:

General:
 - After class/function names, there should be a space before "{" or "(" or ","
 - Some lines are longer than the 80-character boundary. Wrap them.

Configuration.DeprecatedKeyMapping
 - can be private.
 - Fix the javadoc for this. Move the comments from isDeprecatedKey to here.
 - We can treat an empty array value the same way we treat a null value
 - keyMappings and customMessage should be initialized.
 - Put some documentation before the static block for the deprecated keys as to what kinds of entries can be present.
 - A single NoLongerUsedKey object can be used instead of the repetitive new DeprecatedKeyMapping(null, "The key is no longer used").
 - In the messages, reference should be made to other keys in cases where a key is no longer used.
 - The name isDeprecatedKey is inappropriate. It should be something along the lines of processDeprecatedKey.
 - isDeprecatedKey can be simplified like the following:
   {code}
      if (mapping.getKeyMappings() != null) {
        for (String newKey : mapping.getKeyMappings()) {
          properties.setProperty(newKey, value);
        }
      }
      LOG.warn(mapping.getMessage());
      return true;
   {code}

Configuration.set():
 - needs to set variables to the overlay too. We need a test case for this specific scenario.

Configuration.get():
 - document the conf.get() method regarding the treatment of deprecated keys
 - put the log statement in a {} block. In general, we put even single line conditional statements in a {} block.

Configuration.getRaw(), Configuration.get(String,String):
 - These also need the deprecated variables code.
 - I think the code should be in a common place.
 - Correspondingly, the javadoc for these two methods also needs to be fixed.

JobConf:
 - The four methods {get|set}MemoryFor{Map|Reduce}Task() need to be public. It was missed in HADOOP-5881.
 - MAPRED_TASK_MAXVMEM_PROPERTY and UPPER_LIMIT_ON_TASK_VMEM_PROPERTY should be deprecated, with appropriate references in the javadoc to what has to be used instead.
 - The deprecated methods should also carry the same/similar javadoc messages.
 - Remove useless empty log warn messages in the deprecated methods
 - getMaxVirtualMemoryForTask()
     If one of the new parameters for map or reduce is missing, the job is not accepted anyway, so the logic in this method has to change.
 - getMemoryForMapTask and getMemoryForReduceTask
     -- Remove/correct log messages
     -- val>0 && val<1 check will never happen. It should actually be done by converting values into float/double
     -- The logic can be simplified by checking for new values first (see the sketch after this list). Also, a separate null check for old values is not needed.
     -- Add documentation for the old variables and the old methods.
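
Roughly the shape being suggested, as a fragment that would sit in JobConf (a sketch only, not the patch: the default values and the bytes-to-MB conversion are assumptions here, while the key names come from this issue):
{code}
public long getMemoryForMapTask() {
  // prefer the new per-job key; fall back to the deprecated vmem key
  long mb = getLong("mapred.job.map.memory.mb", -1L);
  if (mb < 0) {
    long vmem = getLong("mapred.task.maxvmem", -1L);   // deprecated key
    if (vmem >= 0) {
      mb = vmem / (1024 * 1024);                       // assuming the old value is in bytes
    }
  }
  return mb;
}
{code}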

CapacityTaskScheduler.initializeMemoryConf
 - Separate null checks for the values are not needed.
 - val>0 && val<1 check will never happen. It should actually be done by converting values into float/double

JobTracker.initializeTaskMemoryRelatedConfig()
 - Changes are also needed in this method when some job has old values.

Haven't looked at the test cases yet; will look at them in the next iteration.



[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716168#action_12716168 ] 

Hemanth Yamijala commented on HADOOP-5919:
------------------------------------------

Discussed the issues with Owen.

We think the project split is a knotty issue and may need a resolution that merits further discussion in a separate JIRA. But we do have a solution that we think will work. Hence, for the purposes of this JIRA, we are still proposing to go ahead with introducing the deprecation mapping, and to handle the project split issue in a follow-up JIRA, maybe for Hadoop 0.21.

Regarding the second point about a more complicated mapping, again we agree that valid use cases for this could exist. We think it is not necessary for the deprecation map approach to handle every single deprecation in configuration. In other words, if a complicated use case exists, it could be handled by the application itself, outside the configuration API. Alternatively, an extension to the deprecation mechanism could be looked at as a further enhancement.



[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717210#action_12717210 ] 

rahul k singh commented on HADOOP-5919:
---------------------------------------

All physical memory related configuration has been removed.

mapred.task.default.maxvmem ==> removed completely.
mapred.tasktracker.vmem.reserved ==> this key is removed. Admins can attain similar behaviour using the keys mapred.cluster.map.memory.mb and mapred.cluster.reduce.memory.mb.

mapred.task.limit.maxvmem ==> split into mapred.cluster.max.map.memory.mb and mapred.cluster.max.reduce.memory.mb.
For example, if you set "mapred.task.limit.maxvmem" = 10, then "mapred.cluster.max.map.memory.mb" = 10/1024. The same applies to reduce.

mapred.task.maxvmem ==> split into mapred.job.map.memory.mb and mapred.job.reduce.memory.mb. The set behaviour is the same as for "mapred.task.limit.maxvmem".

Design.
Assumption:

1. In case there is a 1-1 mapping for a key, the value of the deprecated key is applied to the new key.

Data structures:

Map<String, DeprecatedKeyMapping> deprecatedKeys;

Skeleton of DeprecatedKeyMapping:

static class DeprecatedKeyMapping {
  String[] keyMappings;
  String customMessage;
}

For keys which are removed, keyMappings would be null and the key would carry only the custom message.

While setting the conf, Configuration takes care of 1-1 mappings; in case a deprecated key maps to multiple new values, the respective code paths set the value.




[jira] Commented: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713638#action_12713638 ] 

Hemanth Yamijala commented on HADOOP-5919:
------------------------------------------

In a brief discussion I had with Owen, he suggested that we use this patch as an opportunity to provide a facility in the configuration API to handle deprecated configuration variables.

The basic idea is to have a Map<String, String[]> in the Configuration class that would be statically defined and populated. The key is the deprecated config, and the value is a list of replacement configs (a list because there are instances, as in HADOOP-5881, where a single key is split into two: for example, memory-per-slot into memory-per-map-slot and memory-per-reduce-slot).

When a get is done on a deprecated key, the Configuration API will print a deprecation warning and return the value of the first string in the list.

When a set is done, likewise, the API will print a deprecation warning and set all the replacement keys to that value.

In cases where there is no replacement, we could print an ERROR message and perhaps return null, or throw an exception.
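
A rough, self-contained sketch of the proposed get behaviour (the class name and the table contents are illustrative assumptions; only the memory key names come from this issue):
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class DeprecatedGetSketch {
  // deprecated key -> replacement keys; an empty array means "no replacement"
  static final Map<String, String[]> DEPRECATED = new HashMap<String, String[]>();
  static {
    DEPRECATED.put("mapred.task.maxvmem",
        new String[] {"mapred.job.map.memory.mb", "mapred.job.reduce.memory.mb"});
    DEPRECATED.put("mapred.task.default.maxvmem", new String[0]);
  }

  final Properties props = new Properties();

  String get(String name) {
    String[] replacements = DEPRECATED.get(name);
    if (replacements == null) {
      return props.getProperty(name);              // not deprecated
    }
    if (replacements.length == 0) {
      System.err.println("ERROR: " + name + " has no replacement.");
      return null;                                 // or throw, as discussed above
    }
    System.err.println("WARN: " + name + " is deprecated; reading " + replacements[0]);
    return props.getProperty(replacements[0]);     // value of the first replacement
  }
}
{code}
set would do the reverse: print the warning and write the supplied value to every replacement key.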

Thoughts?



[jira] Updated: (HADOOP-5919) Memory management variables need a backwards compatibility option after HADOOP-5881

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated HADOOP-5919:
----------------------------------

    Attachment: hadoop-5919-3.patch
