
Posted to common-dev@hadoop.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2007/04/20 01:13:15 UTC

[jira] Created: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Configuring different number of mappers and reducers per TaskTracker
--------------------------------------------------------------------

                 Key: HADOOP-1274
                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
            Reporter: Koji Noguchi
            Priority: Minor


Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
(I'm assuming the user either has a dedicated cluster or uses HOD.)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Michael Bieniosek (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490198 ] 

Michael Bieniosek commented on HADOOP-1274:
-------------------------------------------

See my HADOOP-1245.  Currently this is possible by setting mapred.tasktracker.tasks.maximum differently for tasktrackers, but the jobtracker will assign an inappropriate number of jobs when it initially creates the job.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Priority: Minor
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543205 ] 

Doug Cutting commented on HADOOP-1274:
--------------------------------------

> Please open a new jira to fix the documentation on the hadoop website [ ...]

Why shouldn't that be included with this patch?  I think updating documentation in the same commit as APIs are changed is appropriate.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1274:
----------------------------------

    Status: Open  (was: Patch Available)

Some more comments:

1. Please remove the LOG.warn for the deprecated api.
2. Fix hadoop-default.xml, add the new configuration parameters and remove the deprecated {{mapred.tasktracker.tasks.maximum}}.
3. Fix references to {{mapred.tasktracker.tasks.maximum}} in all javadocs.
4. Please open a new jira to fix the documentation on the hadoop website (specifically, http://lucene.apache.org/hadoop/cluster_setup.html talks about {{mapred.tasktracker.tasks.maximum}}).

Thanks!
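
For readers following item 2, here is a rough sketch of how separate map and reduce limits might be read while the old key is still honored. It uses java.util.Properties as a stand-in for the Hadoop configuration. The two new key names, the fallback to the deprecated key, and the default of 2 are assumptions of this sketch; the thread itself does not spell them out.

```java
import java.util.Properties;

// Sketch only: java.util.Properties stands in for the Hadoop configuration.
class SlotLimitsSketch {
  // Deprecated key named in the review comments above.
  static final String DEPRECATED_KEY = "mapred.tasktracker.tasks.maximum";
  // Assumed names for the new per-type keys; not stated in this thread.
  static final String MAP_KEY = "mapred.tasktracker.map.tasks.maximum";
  static final String REDUCE_KEY = "mapred.tasktracker.reduce.tasks.maximum";
  // Assumed default when nothing is configured.
  static final int DEFAULT_SLOTS = 2;

  /** Read a per-type limit, falling back to the deprecated key, then to the default. */
  static int limit(Properties conf, String key) {
    String value = conf.getProperty(key);
    if (value == null) {
      value = conf.getProperty(DEPRECATED_KEY);
    }
    return value == null ? DEFAULT_SLOTS : Integer.parseInt(value.trim());
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(MAP_KEY, "4");    // more map slots ...
    conf.setProperty(REDUCE_KEY, "2"); // ... than reduce slots on this node
    System.out.println("map slots = " + limit(conf, MAP_KEY));
    System.out.println("reduce slots = " + limit(conf, REDUCE_KEY));
  }
}
```

Whether the committed patch falls back to the old key in this way is not something the thread confirms; the sketch only shows one way to keep old configurations working for a release.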

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

Submitting patch with review comments incorporated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490474 ] 

Koji Noguchi commented on HADOOP-1274:
--------------------------------------

> Currently this is possible by setting mapred.tasktracker.tasks.maximum differently for tasktrackers,

My subject was maybe misleading.
What I meant was to have more mappers than reducers within each TaskTracker.


> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Priority: Minor
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1274:
----------------------------------

    Status: Open  (was: Patch Available)

I'm sorry, but this patch removes the public API {{ClusterStatus.getMaxTasks}}. The convention is to deprecate a public API for one release before removing it. Could you please fix this and submit a new patch?


{noformat}
Index: src/java/org/apache/hadoop/mapred/ClusterStatus.java
===================================================================
--- src/java/org/apache/hadoop/mapred/ClusterStatus.java  (revision 591915)
+++ src/java/org/apache/hadoop/mapred/ClusterStatus.java  (working copy)
@@ -106,15 +108,24 @@
   }

   /**
-   * Get the maximum capacity for running tasks in the cluster.
+   * Get the maximum capacity for running map tasks in the cluster.
    *
-   * @return the maximum capacity for running tasks in the cluster.
+   * @return the maximum capacity for running map tasks in the cluster.
    */
-  public int getMaxTasks() {
-    return max_tasks;
+  public int getMaxMapTasks() {
+    return max_map_tasks;
   }

   /**
+   * Get the maximum capacity for running reduce tasks in the cluster.
+   * 
+   * @return the maximum capacity for running reduce tasks in the cluster.
+   */
+  public int getMaxReduceTasks() {
+    return max_reduce_tasks;
+  }
+  
+  /**
    * Get the current state of the <code>JobTracker</code>,
    * as {@link JobTracker.State}
    *
{noformat}
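
As an illustration of the deprecation convention mentioned above, the following sketch keeps the old accessor next to the two new ones. ClusterStatusSketch is a hypothetical stand-in, not the real Hadoop class, and having the deprecated method return the sum of the two capacities is an assumption made here for illustration only.

```java
// Sketch only: a stand-in for ClusterStatus, not the real Hadoop class.
class ClusterStatusSketch {
  private final int maxMapTasks;
  private final int maxReduceTasks;

  ClusterStatusSketch(int maxMapTasks, int maxReduceTasks) {
    this.maxMapTasks = maxMapTasks;
    this.maxReduceTasks = maxReduceTasks;
  }

  /** Get the maximum capacity for running map tasks in the cluster. */
  public int getMaxMapTasks() {
    return maxMapTasks;
  }

  /** Get the maximum capacity for running reduce tasks in the cluster. */
  public int getMaxReduceTasks() {
    return maxReduceTasks;
  }

  /**
   * Kept for one release so that existing callers still compile.
   * Returning the sum of both capacities is an assumption of this sketch.
   *
   * @deprecated Use {@link #getMaxMapTasks()} and {@link #getMaxReduceTasks()}.
   */
  @Deprecated
  public int getMaxTasks() {
    return maxMapTasks + maxReduceTasks;
  }

  public static void main(String[] args) {
    ClusterStatusSketch status = new ClusterStatusSketch(8, 4);
    System.out.println("map capacity = " + status.getMaxMapTasks());
    System.out.println("reduce capacity = " + status.getMaxReduceTasks());
  }
}
```

With this shape, callers of the old method get a compiler deprecation warning instead of a compile error, which gives them one release to migrate.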


> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538481 ] 

Devaraj Das commented on HADOOP-1274:
-------------------------------------

Arun, regarding comment #1, I think we can leave the conversion as is, since a tasktracker won't have an "int"-sized number of task slots; more likely 4, 8, or 10. We do save some network bandwidth by transmitting shorts.
Regarding comment #3, I think both map and reduce slots should continue to use the same padding. I don't think this is related to the issue being addressed here.
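
To put a number on the bandwidth point, here is a small sketch comparing the serialized size of two slot counts written as ints versus shorts with java.io.DataOutputStream. The assumption that a status message carries exactly two slot counts is made up for this illustration; the actual wire format is not shown in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch only: compares encoded sizes; it does not reproduce Hadoop's wire format.
class SlotWireSizeSketch {
  /** Bytes needed when both slot counts are written as 32-bit ints. */
  static int bytesAsInts(int mapSlots, int reduceSlots) {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeInt(mapSlots);
      out.writeInt(reduceSlots);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.size();
  }

  /** Bytes needed when both slot counts are narrowed to 16-bit shorts. */
  static int bytesAsShorts(int mapSlots, int reduceSlots) {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeShort(mapSlots); // writes only the low 16 bits
      out.writeShort(reduceSlots);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.size();
  }

  public static void main(String[] args) {
    // Slot counts in the single digits fit comfortably in a short.
    System.out.println("ints:   " + bytesAsInts(8, 4) + " bytes");   // 8 bytes
    System.out.println("shorts: " + bytesAsShorts(8, 4) + " bytes"); // 4 bytes
  }
}
```

The saving is two bytes per field per message, which only matters because such status messages are sent repeatedly by every tasktracker.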

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Open  (was: Patch Available)

Some whitespace-related issues.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543626 ] 

Doug Cutting commented on HADOOP-1274:
--------------------------------------

> we really need HADOOP-2160 (documentation by releases) before we can start enforcing this, correct?

Sigh.  I see your point.  If we commit the doc change to trunk and someone republishes the website from trunk, then the website documentation will document 0.16, which isn't yet released, rather than 0.15, the current release.  We could use 'svn switch' on people.apache.org so that the website is stuck on the 0.15 branch until we release 0.16, but that would prohibit other changes to the website.  It sure would be nice to fix HADOOP-2160...

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540225 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12368969/patch-1274.txt
against trunk revision r591880.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1063/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1063/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1063/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1063/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated HADOOP-1274:
---------------------------------

    Issue Type: Improvement  (was: Bug)

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Priority: Minor
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539054 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12368755/patch-1274.txt
against trunk revision r590273.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1038/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1038/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1038/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1038/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

Submitting patch again with ClusterStatus.getMaxTasks deprecated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Assigned: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu reassigned HADOOP-1274:
------------------------------------------------

    Assignee: Amareshwari Sri Ramadasu

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1274:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes makes sense to have more mappers than reducers assigned to each node.
> (I'm assuming the user either has a dedicated cluster or uses HOD.)


[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541702 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12369347/patch-1274.txt
against trunk revision r593855.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1091/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1091/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1091/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1091/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544774 ] 

Hudson commented on HADOOP-1274:
--------------------------------

Integrated in Hadoop-Nightly #311 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/311/])

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1274:
----------------------------------

    Status: Open  (was: Patch Available)

Amareshwari, the patch still doesn't deprecate {{ClusterStatus.getMaxTasks}}...

{noformat}
Index: src/java/org/apache/hadoop/mapred/ClusterStatus.java
===================================================================
--- src/java/org/apache/hadoop/mapred/ClusterStatus.java	(revision 593416)
+++ src/java/org/apache/hadoop/mapred/ClusterStatus.java	(working copy)
@@ -111,10 +117,31 @@
    * @return the maximum capacity for running tasks in the cluster.
    */
   public int getMaxTasks() {
-    return max_tasks;
+    LOG.warn("ClusterStatus.getMaxTasks()is deprecated. Use " +
+             "ClusterStatus.getMaxMapTasks() and " +
+             "ClusterStatus.getMaxReduceTasks() instead.");
+    return (max_map_tasks + max_reduce_tasks);
+  }
{noformat}

should look like:

{noformat}
Index: src/java/org/apache/hadoop/mapred/ClusterStatus.java
===================================================================
--- src/java/org/apache/hadoop/mapred/ClusterStatus.java	(revision 593416)
+++ src/java/org/apache/hadoop/mapred/ClusterStatus.java	(working copy)
@@ -111,10 +117,31 @@
    * @return the maximum capacity for running tasks in the cluster.
+   * @deprecated Use {@link #getMaxMapTasks()} and/or {@link #getMaxReduceTasks()}
    */
   public int getMaxTasks() {
-    return max_tasks;
+    LOG.warn("ClusterStatus.getMaxTasks()is deprecated. Use " +
+             "ClusterStatus.getMaxMapTasks() and " +
+             "ClusterStatus.getMaxReduceTasks() instead.");
+    return (max_map_tasks + max_reduce_tasks);
+  }
{noformat}

Also, could you consider submitting a test case for this one? Thanks!
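
For illustration, the deprecation requested above can be sketched as a small stand-alone class. This is not the committed patch: the class name ClusterStatusSketch and its constructor are hypothetical, and the @Deprecated annotation is an addition beyond the javadoc tag shown in the diff. Only the three getter names come from this thread.

```java
/**
 * Hypothetical stand-in for ClusterStatus, showing one way to deprecate
 * the combined getter while keeping it working for existing callers.
 */
final class ClusterStatusSketch {
    private final int maxMapTasks;
    private final int maxReduceTasks;

    ClusterStatusSketch(int maxMapTasks, int maxReduceTasks) {
        this.maxMapTasks = maxMapTasks;
        this.maxReduceTasks = maxReduceTasks;
    }

    /** @return the maximum capacity for running map tasks in the cluster. */
    int getMaxMapTasks() {
        return maxMapTasks;
    }

    /** @return the maximum capacity for running reduce tasks in the cluster. */
    int getMaxReduceTasks() {
        return maxReduceTasks;
    }

    /**
     * @return the maximum capacity for running tasks in the cluster.
     * @deprecated Use {@link #getMaxMapTasks()} and/or {@link #getMaxReduceTasks()}
     */
    @Deprecated
    int getMaxTasks() {
        // Old callers still get a total: the sum of both slot types.
        return maxMapTasks + maxReduceTasks;
    }

    public static void main(String[] args) {
        ClusterStatusSketch status = new ClusterStatusSketch(8, 4);
        System.out.println(status.getMaxMapTasks() + " map, "
            + status.getMaxReduceTasks() + " reduce");
    }
}
```

With the javadoc tag in place, the javadoc tool marks the method as deprecated in the generated API docs, which a runtime log warning alone does not do.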

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Priority: Major  (was: Minor)

Marking it major, since this patch also fixes the number of reducers in the sort example, which was changed by HADOOP-1245.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543245 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12369645/patch-1274.txt
against trunk revision r595563.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1106/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1106/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1106/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1106/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541696 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1274:
--------------------------------------------------

Added @deprecated

bq. Also, could you consider submitting a test case for this one? Thanks!
I don't think we need a test case for this, as it only changes the number of tasks per task tracker through a config variable.
I tested this on a 200-node cluster.


> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537577 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1274:
--------------------------------------------------

Preliminary patch attached.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

Submitting with comments incorporated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Attachment: patch-1274.txt

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sri Ramadasu updated HADOOP-1274:
---------------------------------------------

    Status: Patch Available  (was: Open)

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Issue Comment Edited: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537926 ] 

acmurthy edited comment on HADOOP-1274 at 10/26/07 4:30 AM:
-----------------------------------------------------------------

Comments:

1. Conversion to short here is unnecessary, probably dangerous:
{noformat}
-          metricsRecord.setMetric("taskSlots", (short)maxCurrentTasks);
+          metricsRecord.setMetric("mapTaskSlots", (short)maxCurrentMapTasks);
+          metricsRecord.setMetric("reduceTaskSlots", 
+                                      (short)maxCurrentReduceTasks);
{noformat}

2. Please deprecate {{mapred.tasktracker.tasks.maximum}} for this release, as per existing norms. We cannot remove it right away.

{noformat}
-    maxCurrentTasks = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
+    maxCurrentMapTasks = conf.getInt(
+                             "mapred.tasktracker.map.tasks.maximum", 2);
+    maxCurrentReduceTasks = conf.getInt(
+                             "mapred.tasktracker.reduce.tasks.maximum", 1);
{noformat}

{{mapred.tasktracker.tasks.maximum}} should be superseded by {{mapred.tasktracker.map.tasks.maximum}} and {{mapred.tasktracker.reduce.tasks.maximum}} for hadoop-0.16.0.

3. Should we consider having a different {{JobTracker.PAD_FRACTION}} for maps and reduces? Clearly the number of padded slots for reduces should be higher, since reduce failures are more expensive...
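
As a rough sketch of point 2, the deprecated key could be honoured as a fallback while the new keys take precedence. Everything here is an assumption made for illustration: the SlotLimits class is hypothetical, java.util.Properties stands in for the Hadoop configuration object, and the fallback policy itself is not confirmed by the patch. The key names and the defaults of 2 and 1 are taken from the diff quoted above.

```java
import java.util.Properties;

/**
 * Hypothetical helper, not part of the HADOOP-1274 patch: resolves the
 * per-TaskTracker slot limits from the new keys, falling back to the
 * deprecated combined key when a new key is absent (an assumed policy).
 */
final class SlotLimits {
    static final String DEPRECATED_KEY = "mapred.tasktracker.tasks.maximum";
    static final String MAP_KEY = "mapred.tasktracker.map.tasks.maximum";
    static final String REDUCE_KEY = "mapred.tasktracker.reduce.tasks.maximum";

    final int maxMapTasks;
    final int maxReduceTasks;

    private SlotLimits(int maxMapTasks, int maxReduceTasks) {
        this.maxMapTasks = maxMapTasks;
        this.maxReduceTasks = maxReduceTasks;
    }

    static SlotLimits fromProperties(Properties conf) {
        // Defaults of 2 (map) and 1 (reduce) are the ones shown in the quoted diff.
        return new SlotLimits(resolve(conf, MAP_KEY, 2),
                              resolve(conf, REDUCE_KEY, 1));
    }

    private static int resolve(Properties conf, String newKey, int defaultValue) {
        String value = conf.getProperty(newKey);
        if (value == null) {
            // New key not set: honour the deprecated key, but warn about it.
            value = conf.getProperty(DEPRECATED_KEY);
            if (value != null) {
                System.err.println("WARN: " + DEPRECATED_KEY
                    + " is deprecated; set " + newKey + " instead.");
            }
        }
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEPRECATED_KEY, "4");
        conf.setProperty(REDUCE_KEY, "3");
        SlotLimits limits = fromProperties(conf);
        System.out.println(limits.maxMapTasks + " map slots, "
            + limits.maxReduceTasks + " reduce slots");
    }
}
```

In this sketch an explicitly set new key always wins, so a site that has already migrated one of the two keys is not affected by a leftover value of the old one.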

      was (Author: acmurthy):
    Comments:

1. Conversion to short here is unnecessary, probably dangerous:
{noformat}
-          metricsRecord.setMetric("taskSlots", (short)maxCurrentTasks);
+          metricsRecord.setMetric("mapTaskSlots", (short)maxCurrentMapTasks);
+          metricsRecord.setMetric("reduceTaskSlots", 
+                                      (short)maxCurrentReduceTasks);
{noformat}

2. Please deprecate {{mapred.tasktracker.tasks.maximum}} for this release, as per existing norms. We cannot remove it right away. {{mapred.tasktracker.tasks.maximum}} should be superseded by {{mapred.tasktracker.map.tasks.maximum}} and {{mapred.tasktracker.reduce.tasks.maximum}} for 0.16.0.

{noformat}
-    maxCurrentTasks = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
+    maxCurrentMapTasks = conf.getInt(
+                             "mapred.tasktracker.map.tasks.maximum", 2);
+    maxCurrentReduceTasks = conf.getInt(
+                             "mapred.tasktracker.reduce.tasks.maximum", 1);
{noformat}

3. Should we consider having a different {{JobTracker.PAD_FRACTION}} for maps and reduces? Clearly the number of padded slots for reduces should be higher, since reduce failures are more expensive...
  
> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537896 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12368364/patch-1274.txt
against trunk revision r588341.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1004/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1004/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1004/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1004/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Updated: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-1274:
----------------------------------

    Fix Version/s: 0.16.0
           Status: Open  (was: Patch Available)

Comments:

1. Conversion to short here is unnecessary, probably dangerous:
{noformat}
-          metricsRecord.setMetric("taskSlots", (short)maxCurrentTasks);
+          metricsRecord.setMetric("mapTaskSlots", (short)maxCurrentMapTasks);
+          metricsRecord.setMetric("reduceTaskSlots", 
+                                      (short)maxCurrentReduceTasks);
{noformat}

2. Please deprecate {{mapred.tasktracker.tasks.maximum}} for this release, as per existing norms. We cannot remove it right away. {{mapred.tasktracker.tasks.maximum}} should be superseded by {{mapred.tasktracker.map.tasks.maximum}} and {{mapred.tasktracker.reduce.tasks.maximum}} for 0.16.0.

{noformat}
-    maxCurrentTasks = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
+    maxCurrentMapTasks = conf.getInt(
+                             "mapred.tasktracker.map.tasks.maximum", 2);
+    maxCurrentReduceTasks = conf.getInt(
+                             "mapred.tasktracker.reduce.tasks.maximum", 1);
{noformat}

3. Should we consider having a different {{JobTracker.PAD_FRACTION}} for maps and reduces? Clearly the number of padded slots for reduces should be higher, since reduce failures are more expensive...
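
To illustrate why the cast in point 1 is risky, here is a minimal stand-alone sketch of Java's narrowing conversion from int to short. The class and method names are hypothetical and nothing here calls the Hadoop metrics API. Realistic per-TaskTracker slot counts are far below the short range, so this shows the failure mode in principle rather than an observed bug.

```java
/** Hypothetical demo of the int-to-short narrowing used in the quoted diff. */
final class ShortCastDemo {
    /** Mirrors the (short) cast applied to the slot counts. */
    static short narrow(int slots) {
        return (short) slots;
    }

    public static void main(String[] args) {
        // Values within the short range survive the cast unchanged.
        System.out.println(narrow(4));      // prints 4
        // Values above Short.MAX_VALUE (32767) wrap around silently.
        System.out.println(narrow(40000));  // prints -25536
    }
}
```

Passing the int value through without a cast avoids the silent wrap-around, assuming the receiving method accepts an int, as the review comment implies.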

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>            Priority: Minor
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543261 ] 

Arun C Murthy commented on HADOOP-1274:
---------------------------------------

bq. Why shouldn't that be included with this patch? I think updating documentation in the same commit as APIs are changed is appropriate.

I completely agree in principle. However, we really need HADOOP-2160 (documentation by releases) before we can start enforcing this, correct?

Specifically, this patch deprecates -{{mapred.tasktracker.tasks.maximum}}- and adds {{mapred.tasktracker.map.tasks.maximum}} and {{mapred.tasktracker.reduce.tasks.maximum}} for 0.16.0.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)

[jira] Commented: (HADOOP-1274) Configuring different number of mappers and reducers per TaskTracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541364 ] 

Hadoop QA commented on HADOOP-1274:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12369211/patch-1274.txt
against trunk revision r592860.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1080/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1080/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1080/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1080/console

This message is automatically generated.

> Configuring different number of mappers and reducers per TaskTracker
> --------------------------------------------------------------------
>
>                 Key: HADOOP-1274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1274
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-1274.txt, patch-1274.txt, patch-1274.txt, patch-1274.txt
>
>
> Depending on the application, it sometimes make sense to have more mappers than reducers assigned to each node. 
> (I'm assuming user either has a dedicated cluster or use HOD.)
