You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2009/10/14 02:03:31 UTC

[jira] Created: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
--------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-1105
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: contrib/capacity-sched
    Affects Versions: 0.21.0
            Reporter: Arun C Murthy
            Priority: Blocker
             Fix For: 0.21.0


Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.

Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768326#action_12768326 ] 

Hudson commented on MAPREDUCE-1105:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #91 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/91/])
    . Remove max limit configuration in capacity scheduler in favor of max capacity percentage thus allowing the limit to go over queue capacity. Contributed by Rahul Kumar Singh.


> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768724#action_12768724 ] 

Hudson commented on MAPREDUCE-1105:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #120 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/120/])
    . Remove max limit configuration in capacity scheduler in favor of max capacity percentage thus allowing the limit to go over queue capacity. Contributed by Rahul Kumar Singh.


> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Release Note: 
Replaced the existing max task limits variables "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots"  with  "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" . 

max task limit variables were used to throttle the queue, i.e, these were the hard limit and not allowing queue to grow further
maximum-capacity variable defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use.
 
maximum-capacity variable  behavior is different from max task limit variables, as maximum-capacity is a percentage and it grows and shrinks in absolute terms based on total cluster capacity.Also same maximum-capacity percentage is applied to both map and reduce.

  was:
Remove the existing "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" variables in trunk and yahoo distribution, these where used to throttle the queue, i.e, these were the hard limit and not allowing queue to grow further.

Backported maximum-capacity feature from 21 to yahoo distribution.
This introduces new parameter "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" in yahoo distribution. maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use.
 
This is different from existing max task limit , as maximum-capacity is a percentage and it grows and shrinks in absolute terms based on total cluster capacity.


> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768261#action_12768261 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

Seems like the test failure is happening intermittently - MAPREDUCE-1124. At any rate, there is no relation to this patch as the patch only modifies capacity scheduler.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767209#action_12767209 ] 

rahul k singh commented on MAPREDUCE-1105:
------------------------------------------

Summary for yahoo distribution patch.
==============================
- Remove the existing "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" variables , these where used to throttle the queue, i.e, these were the hard limit and not allowing 
queue to grow further.
-Added the new parameter "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" 
maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use. By default, there is no limit.The maximum-capacity of a queue can only be greater than or equal to its minimum capacity. Default value of -1 implies a queue can use complete capacity of the cluster.

 This property could be to curtail certain jobs which are long running in nature from occupying more than a  certain percentage of the cluster, which in the absence of pre-emption, could lead to capacity guarantees of  other queues being affected. One important thing to note is that maximum-capacity is a percentage , so based on the cluster's capacity the max capacity would change. So if large no of nodes or racks get added to the cluster , max Capacity in  absolute terms would increase accordingly.

-Added some testcases for unit testing the maximum-capacity knob.
-remove testcase for max.map.slots and max.reduce.slots.


Summary of changes for patch 21.
===========================

-Removed "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" entries.

-Removed testcases for the same.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated MAPREDUCE-1105:
----------------------------------------

    Attachment: MAPREDUCE-1105_apache0202.txt

This is a back port of this (and likely other capacity changes) from Yahoo! 0.20 to Apache 0.20.2.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch, MAPREDUCE-1105_apache0202.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPREDUCE-1105-version20-2.patch

Implemented the all hemanth's comments except:

- checkMultipleAssignment is being called only when multiple assignment is set to true. So we can verify for the exact number of tasks returned, i.e. tasks.size should never be more than the expected number of tasks.

checkMultipleAssignment method takes mapAttempt and reduceAttempt as the string , and there is a check for assignTask value to be null or empty and checking if any of the mapAttempt or reduceAttempt being set.

Also there is a check to see if both mapAttempt and reduceAttempt are set and assignTask size is 1.



> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767210#action_12767210 ] 

rahul k singh commented on MAPREDUCE-1105:
------------------------------------------

Summary for yahoo distribution patch.
==============================
- Remove the existing "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" variables , these where used to throttle the queue, i.e, these were the hard limit and not allowing
queue to grow further.

- Added the new parameter "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" 
maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use. By default, there is no limit.The maximum-capacity of a queue can only be greater than or equal to its minimum capacity. Default value of -1 implies a queue can use complete capacity of the cluster.
This property could be to curtail certain jobs which are long running in nature from occupying more than a certain percentage of the cluster, which in the absence of pre-emption, could lead to capacity guarantees of other queues being affected. One important thing to note is that maximum-capacity is a percentage , so based on the cluster's capacity the max capacity would change. So if large no of nodes or racks get added to the cluster , max Capacity in absolute terms would increase accordingly.

- Added some testcases for unit testing the maximum-capacity knob.

- remove testcase for max.map.slots and max.reduce.slots.

Summary of changes for patch 21.
===========================

- Removed "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" entries.

- Removed testcases for the same.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPREDUCE-1105-yahoo-version20-4.patch
                MAPRED-1105-21-3.patch

Added hemanth's comments.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPREDUCE-1105-yahoo-version20-3.patch

Added the comments by hemanth

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated MAPREDUCE-1105:
----------------------------------------

    Attachment: MAPREDUCE-1105-yahoo-version20-5.patch

Attached file (MAPREDUCE-1105-yahoo-version20-5.patch) fixes an expected string in TestCapacitySchedulerConf that was causing a test case to fail. Re-ran all CapacityScheduler tests on the yahoo distribution to verify sanity.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated MAPREDUCE-1105:
----------------------------------------

    Attachment: MAPREDUCE-1105_apache0202.txt

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch, MAPREDUCE-1105_apache0202.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843042#action_12843042 ] 

Allen Wittenauer commented on MAPREDUCE-1105:
---------------------------------------------

For those patching capacity in 0.20.. you'll also want MAPREDUCE-1160.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch, MAPREDUCE-1105_apache0202.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPRED-1105-21-1.patch

Uploading the first cut for the trunk

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768166#action_12768166 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

Changes look fine. +1.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768187#action_12768187 ] 

Hadoop QA commented on MAPREDUCE-1105:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422783/MAPRED-1105-21-3.patch
  against trunk revision 827889.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 16 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/197/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/197/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/197/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/197/console

This message is automatically generated.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala reassigned MAPREDUCE-1105:
-------------------------------------------

    Assignee: rahul k singh

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Release Note: 
Replaced the existing max task limits variables "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots"  with  "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" . 

max task limit variables were used to throttle the queue, i.e, these were the hard limit and not allowing queue to grow further.
maximum-capacity variable defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use.
 
maximum-capacity variable  behavior is different from max task limit variables, as maximum-capacity is a percentage and it grows and shrinks in absolute terms based on total cluster capacity.Also same maximum-capacity percentage is applied to both map and reduce.

  was:
Replaced the existing max task limits variables "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots"  with  "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" . 

max task limit variables were used to throttle the queue, i.e, these were the hard limit and not allowing queue to grow further
maximum-capacity variable defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use.
 
maximum-capacity variable  behavior is different from max task limit variables, as maximum-capacity is a percentage and it grows and shrinks in absolute terms based on total cluster capacity.Also same maximum-capacity percentage is applied to both map and reduce.


> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767853#action_12767853 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

This is looking fine. I have a few minor comments on the trunk patch:

- TaskSchedulingContext.toString has a spurious comment like "//If maxTaskLimit is less than the capacity"
- Javadoc of areTasksInQueueOverCapacity needs some update. "check if numofslotsoccupied+numSlotsPerTask is greater maxCapacity" is missing a 'than'. maximum-capacity is spelt incorrectly as maxium-capacity at a few places.
- areTasksInQueueOverCapacity is returning false if tsi.getNumSlotsOccupied() + noOfSlotsPerTask is equal to maxCapacity. The problem is if noOfSlotsPerTask is 0, then this means the number of occupied slots is already at maxCapacity and we will end up scheduling more. This seems a bug. One approach that seems to work is the call from assignTasks can pass in noOfSlotsPerTask as 1 instead of 0. This way, the API is interpreted as saying if we add 1 more slot to the current number of slots, will we be over limit.
- In testHighMemoryBlockingWithMaxCapacity, memory for reduce tasks for job1 should be 1024 instead of 1.
- In the test cases related to max capacities, expectedStrings map is not being cleared. Why not use the convenience API populateExpectedStrings.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPREDUCE-1105-version20.patch.txt

Attaching the patch for yahoo distribution

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPRED-1105-21-2.patch

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767352#action_12767352 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

Comments on the 20 patch:

- Some of the comments in trunk apply for the 20 patch as well, please check them once to ensure they are covered too.
- Updates to forrest documentation needs to be done.
- Javadoc for the API CapacitySchedulerConf.getMaxCapacity can be improved. Something like: Return the maximum percentage of the cluster capacity that can be used by the given queue. Also, @param and @return values can be filled up.
- The call to getRaw seems redundant, as we can query with default value -1 right from the beginning.
- Since maxCapacity can be equal to capacity, in the IllegalArgumentException in getMaxCapacity, we can say maxCapacity should be greater than or equal to capacity.
- Comments in testHighMemoryBlockingWithMaxCapacity are broken. For instance, we have 4 maps and 4 reduces but are saying 2 reduces. Can we please not comment on or log things that are obvious ? Then we wouldn't have the pain of having to keep them in sync with code.
- Also, testHighMemoryBlockingWithMaxCapacity may need change after fixing the bug about adhering to limits while scheduling high RAM tasks, since we won't be able to assign tasks from the high RAM reduces without crossing limits.
- For testUserLimitsWithMaxCapacities, the last call should be null. As discussed offline, I think there's a bug with checkMultipleAssignment which is throwing a false positive.
- checkMultipleAssignment is being called only when multiple assignment is set to true. So we can verify for the exact number of tasks returned, i.e. tasks.size should never be more than the expected number of tasks.


> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated MAPREDUCE-1105:
----------------------------------------

    Attachment:     (was: MAPREDUCE-1105_apache0202.txt)

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch, MAPREDUCE-1105_apache0202.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Attachment: MAPRED-1105-21-3.patch

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Status: Patch Available  (was: Open)

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767236#action_12767236 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

Some comments on the patch for trunk:

- QueueSchedulingContext's constructor is not indented correctly.
- The check that the capacity of a queue is <= maxCapacity is done in CapacitySchedulerConf.getMaxCapacity, when the queue hierarchy is being constructed. This happens before spare capacity is distributed for queues whose capacity is undefined. If the capacity of a queue is undefined, but it's maxCapacity is defined, this leads to a case where the validation will pass, but after the spare capacity is distributed, this may no longer be valid.
- It doesn't seem like TaskSchedulingContext.LIMIT_NORMALIZED_CAPACITY_STRING is required at all. The only purpose it was there was to tell administrators that they are getting less than their capacities because of limits. Since this should never happen now, we should remove this. Likewise, the TaskSchedulingContext.toString() method will need modification to not have any checks for maxCapacity limits.
- Javadoc of TaskSchedulingContext.getCapacity needs update.
- TaskSchedulingContext.toString is not showing maxCapacity value for the queue. This seems an existing bug - not introduced in this patch.
- CapacityTaskScheduler.areTasksInQueueOverLimit is still referring to maxTaskLimit in a comment.
- In TestCapacityScheduler, rather than removing the test cases related to limits, I would think it is better to convert them into test cases for maxCapacities. There will be some change required, because in these tests we were setting the maxLimit to be less than the capacity. But, in general, these tests will ensure that features like handling high RAM jobs and user limits is working with maxCapacities.
- The current implementation of implementing limits on capacity may have a bug. Consider if a queue is one slot below it's maxCapacity. CapacityTaskScheduler.areTasksInQueueOverLimit will return false. Now suppose the job being scheduled is a high RAM job, then it could actually go beyond its capacity. Maybe when we are assigning a task from a job in a queue, before the assignment we should again check if the currently used slots + number of slots required for this job will cause the limit to exceed.
- TestCapacityScheduler.checkOccupiedSlots is adding zeros to some indices. This should be removed.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768265#action_12768265 ] 

rahul k singh commented on MAPREDUCE-1105:
------------------------------------------

ran ant test for yahoo distribution, all testcases passed except 2.

    [junit] Test org.apache.hadoop.hdfs.server.namenode.TestStartup FAILED
    [junit] Test org.apache.hadoop.fs.loadGenerator.TestLoadGenerator FAILED

both are related to hdfs. Current patch has no relation with these  , it only modifies capacity scheduler.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767868#action_12767868 ] 

Hemanth Yamijala commented on MAPREDUCE-1105:
---------------------------------------------

Regarding the 20 patch:

- Typo maxiumCapacity in the code that distributes capacities to queues without configured capacity.
- This patch is missing the optimizing check in assignTasks for a queue being over capacity.
- In the Forrest documentation, max capacity is being referred to as maximum-capacity sometimes and max capacity sometimes. They should be consistently called maximum-capacity.
- In checkMultipleAssignment, if the returned tasks is null or empty and some tasks are expected, we are failing the test, but also returning tasks. The return is a no-op and can be removed.

Also in both the trunk and 20 patch, I think it would be useful to add a test that ensures that if undistributed capacity results in maxCapacity lesser than configured capacity, scheduler fails on startup.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated MAPREDUCE-1105:
----------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this to trunk and branch 0.21. Thanks, Rahul !

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity

Posted by "rahul k singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1105:
-------------------------------------

    Release Note: 
Remove the existing "mapred.capacity-scheduler.queue.<queue-name>.max.map.slots" and "mapred.capacity-scheduler.queue.<queue-name>.max.reduce.slots" variables in trunk and yahoo distribution, these where used to throttle the queue, i.e, these were the hard limit and not allowing queue to grow further.

Backported maximum-capacity feature from 21 to yahoo distribution.
This introduces new parameter "mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity" in yahoo distribution. maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use.
 
This is different from existing max task limit , as maximum-capacity is a percentage and it grows and shrinks in absolute terms based on total cluster capacity.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Assignee: rahul k singh
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPRED-1105-21-2.patch, MAPRED-1105-21-3.patch, MAPRED-1105-21-3.patch, MAPREDUCE-1105-version20-2.patch, MAPREDUCE-1105-version20.patch.txt, MAPREDUCE-1105-yahoo-version20-3.patch, MAPREDUCE-1105-yahoo-version20-4.patch, MAPREDUCE-1105-yahoo-version20-5.patch
>
>
> Currently the CS caps a queue's capacity to it's actual capacity if a hard-limit is specified to be greater than it's actual capacity. We should allow the queue to go upto the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be percentage rather than #slots.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.