Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2008/09/16 01:47:44 UTC

[jira] Created: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

The capacity scheduler's defaults for queues should be configurable.
--------------------------------------------------------------------

                 Key: HADOOP-4178
                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/capacity-sched
            Reporter: Owen O'Malley
            Assignee: Hemanth Yamijala


The default values for the queue attributes should be configurable rather than hard coded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634067#action_12634067 ] 

Vinod K V commented on HADOOP-4178:
-----------------------------------

Some quick review comments:
- In the test-case, there is some code-duplication writing the tags "<property>", "<name>" etc. This can be avoided, say, by having a method that writes a single key-val pair to the configuration.
- Two look-ups are being made for each configuration item (in getReclaimTimeLimit, isPrioritySupported, etc.); this can be a single look-up. A single look-up would preclude logging proper messages when default values are used, but I don't know for sure whether we really need those log messages.
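
As a rough illustration of the first point above, a single helper could emit one property element per key-value pair. This is only a sketch: the class and method names are hypothetical (they are not taken from the patch), the values are made up, and no XML escaping is attempted.

```java
final class ConfFileWriter {
  // Hypothetical helper: renders one <property> element for a key-value pair,
  // so test code does not repeat the <property>/<name>/<value> tags.
  static String property(String name, String value) {
    StringBuilder sb = new StringBuilder();
    sb.append("  <property>\n");
    sb.append("    <name>").append(name).append("</name>\n");
    sb.append("    <value>").append(value).append("</value>\n");
    sb.append("  </property>\n");
    return sb.toString();
  }

  public static void main(String[] args) {
    StringBuilder file = new StringBuilder("<configuration>\n");
    // Illustrative values only.
    file.append(property("mapred.capacity-scheduler.default-supports-priority", "false"));
    file.append(property("mapred.capacity-scheduler.default-minimum-user-limit-percent", "100"));
    file.append("</configuration>\n");
    System.out.print(file);
  }
}
```

A test would then call the helper once per configuration item instead of writing the tags inline.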

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-8.patch

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch, HADOOP-4178-8.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4178:
----------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.19.0)
                   0.20.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this.

I can't see this as a blocker, so I changed the fix version to 0.20.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch, HADOOP-4178-8.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-4.patch

Attaching the latest patch, which fixes rounding errors. If we don't round off the values we are dividing, the actual usage of nodes might be reduced in very large clusters.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635296#action_12635296 ] 

Hemanth Yamijala commented on HADOOP-4178:
------------------------------------------

Some minor changes:

- capacity-scheduler-conf.xml.template - The new configuration items are not expanding tabs to spaces. We follow the same convention for tabs even for conf files.
- {{getMinimumUserLimitPercent}} is looking up the property name twice.
- Due to recent changes, the LOG variable is again unused.
- Tests for invalid values are not correct. They don't add the invalid values to the configuration file.
The tests are passing because no checks are done upon construction. You must do a get in order to exercise the error-checking code. Also, if an exception is expected, it is good to have a {{fail}} line in the success path. Otherwise an incorrect test will erroneously pass. So, the code should be something like this:

{code}
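try {
  testConf = new CapacitySchedulerConf(new Path(testConfFile));
  // Construction does no validation; the get is what exercises the check.
  testConf.getInvalidReclaimTimeLimit("default");
  // Reaching this line means no exception was thrown, so fail the test.
  fail("Expect invalid reclaim time limit to raise exception");
} catch(IllegalArgumentException e) {
  // Expected path: the invalid value was rejected.
  assertTrue(true);
}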
{code}

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Status: Patch Available  (was: Open)

Introducing default configurable parameters for CapacityScheduler

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch, HADOOP-4178-8.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634058#action_12634058 ] 

Hemanth Yamijala commented on HADOOP-4178:
------------------------------------------

I like the idea described for guaranteed capacity. The point of the default configurations is to give a shortcut to the administrators to define queues fast. The way we are defining defaults for capacities actually does not help us accomplish this purpose. So, I am +1 for the suggestion.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-1.patch

Configuration for the following default queue parameters has been added to the capacity_scheduler.xml.

- Default Guaranteed Capacity: _mapred.capacity-scheduler.default-guaranteed-capacity_
- Default Reclaim time limit: _mapred.capacity-scheduler.default-reclaim-time-limit_
- Default priority supported: _mapred.capacity-scheduler.default-supports-priority_
- Default User limit: _mapred.capacity-scheduler.default-minimum-user-limit-percent_

Modified the test case to write out the default configuration required by _CapacitySchedulerConf_, and added a new test case to test the default parameter settings.

Removed an empty method from _CapacitySchedulerConf_ and implemented logic for reading defaults first in _CapacitySchedulerConf_.
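
The "read defaults first" logic described above could look roughly like the following. This is a sketch, not the code in the patch: it uses java.util.Properties in place of Hadoop's Configuration, the per-queue key layout in queueKey is an assumption made for illustration, and the hard-coded fallback of "false" is likewise assumed.

```java
import java.util.Properties;

final class QueueDefaults {
  static final String PREFIX = "mapred.capacity-scheduler.";

  // Assumed per-queue key layout, for illustration only.
  static String queueKey(String queue, String attr) {
    return PREFIX + "queue." + queue + "." + attr;
  }

  // Reads the configurable default first, then lets a queue-specific
  // setting override it. Each key is looked up exactly once.
  static boolean supportsPriority(Properties conf, String queue) {
    String dflt = conf.getProperty(PREFIX + "default-supports-priority", "false");
    return Boolean.parseBoolean(
        conf.getProperty(queueKey(queue, "supports-priority"), dflt));
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(PREFIX + "default-supports-priority", "true");
    conf.setProperty(queueKey("research", "supports-priority"), "false");
    System.out.println(supportsPriority(conf, "research")); // queue-level override
    System.out.println(supportsPriority(conf, "default"));  // falls back to the default
  }
}
```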


> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4178-1.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4178:
-------------------------------------

             Priority: Blocker  (was: Major)
    Affects Version/s: 0.19.0
        Fix Version/s: 0.19.0

Marking this a blocker for 0.19, based on the decision being reached in HADOOP-4079.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636027#action_12636027 ] 

Hudson commented on HADOOP-4178:
--------------------------------

Integrated in Hadoop-trunk #620 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/620/])
    . Make the capacity scheduler's default values configurable.
(Sreekanth Ramakrishnan via omalley)


> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch, HADOOP-4178-8.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634789#action_12634789 ] 

Hemanth Yamijala commented on HADOOP-4178:
------------------------------------------

Looks mostly right. Couple of points:

- I think it will be best to fail if the configured values are invalid. Hiding it may show up as problems later.
- The computed percentages are float values, so I still don't see the need to round. Some rounding is done anyway when the actual capacity is computed; let's handle any rounding-related issues in one place, there. I think the computation of the allocated capacity should be simple: just divide equally at the configuration level.
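
To make the first point concrete, a fail-fast check might be sketched as below. The class name, method name, and the accepted range (greater than 0, at most 100) are assumptions for illustration and are not taken from the patch.

```java
final class ConfValidation {
  // Hypothetical check: reject an invalid user limit immediately instead of
  // silently substituting a default value.
  static int checkUserLimitPercent(String queue, int percent) {
    if (percent <= 0 || percent > 100) {
      throw new IllegalArgumentException(
          "Invalid user limit " + percent + " for queue " + queue);
    }
    return percent;
  }

  public static void main(String[] args) {
    System.out.println(checkUserLimitPercent("default", 25));
    try {
      checkUserLimitPercent("default", -5);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Throwing here surfaces the misconfiguration when the value is read, rather than later as unexplained scheduling behaviour.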

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634755#action_12634755 ] 

Sreekanth Ramakrishnan commented on HADOOP-4178:
------------------------------------------------

The check we are doing should exist, for one reason I can see: a capacity is configured for a queue and the user specifies it as negative. In that case the user has specified a wrong capacity; the user has the option of specifying only positive, non-zero values. The -1 is passed by the system and is private and internal. I don't think it is a good idea to allow users to define negative capacities.

In my opinion the checks should always be present, but the question is what the fail-over mechanism should be: should we pass on the default value, or should we throw an exception and stop the JT? I think we should do the checks, log.warn the invalid values, and continue processing, as we now have defaults configured for these values.

With respect to the computation, one place where I can see rounding errors possibly happening is as follows: 3 queues with no GC configured.

So the values for GC would be 33, 33, 33 (this would mean 1% of capacity is not used).

Or the case where 75% is to be allocated to 4 queues: you get 19 each, and 4 * 19 = 76, which is an over-allocation.

With respect to testGCAllocationToQueues: 25% is allocated to default, so you have the remaining 75% to be allocated to the 4 queues. So 75/4 = 18.75, which is rounded up to 19. Not all queues can have 19 as capacity: since we are internally balancing and computing the capacity so that it does not exceed 100, one of the queues would get 18.
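
The arithmetic above can be reproduced with a small sketch. The balancing shown here (round every share, then adjust the last queues until the total matches) is an assumed strategy for illustration; the patch may balance differently.

```java
import java.util.Arrays;

final class RoundingDemo {
  // Rounds each equal share to an integer, then nudges the last queues
  // by one so that the shares add up to exactly `total`.
  static int[] roundedShares(int total, int queues) {
    int[] shares = new int[queues];
    int rounded = Math.round((float) total / queues);
    int sum = 0;
    for (int i = 0; i < queues; i++) {
      shares[i] = rounded;
      sum += rounded;
    }
    // Over-allocation, e.g. 4 * 19 = 76 for a total of 75.
    for (int i = queues - 1; sum > total; i--) {
      shares[i]--;
      sum--;
    }
    // Under-allocation, e.g. 3 * 33 = 99 for a total of 100.
    for (int i = queues - 1; sum < total; i--) {
      shares[i]++;
      sum++;
    }
    return shares;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(roundedShares(75, 4)));  // [19, 19, 19, 18]
    System.out.println(Arrays.toString(roundedShares(100, 3))); // [33, 33, 34]
  }
}
```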



> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634050#action_12634050 ] 

Vinod K V commented on HADOOP-4178:
-----------------------------------

The patch submitted sets the default guaranteed capacity to 100. This doesn't seem right. For example, this won't let me get going just by creating two queues; the total capacity goes over 100 and the scheduler fails to start.

Instead, we could knock this configuration off and take the default guaranteed capacity of a queue to be -1. While starting the scheduler, we iterate over the list of queues. If the cumulative configured guaranteed capacities go over 100, fail; otherwise distribute the remaining capacity (100 less the cumulative guaranteed capacities) among the queues that are not configured. This way we can kick-start multiple queues just by creating them (i.e. simply via mapred.queue.names).

The other three configuration parameters are useful in the current format.
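
A minimal sketch of the distribution proposed above, assuming queue capacities are held in a map and -1 marks a queue with no configured guaranteed capacity. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CapacityDistribution {
  static final float UNSET = -1f; // sentinel for "no guaranteed capacity configured"

  // Fails if the configured capacities exceed 100; otherwise splits the
  // remainder equally among the queues that were left unconfigured.
  static Map<String, Float> distribute(Map<String, Float> configured) {
    float used = 0f;
    int unset = 0;
    for (float gc : configured.values()) {
      if (gc == UNSET) {
        unset++;
      } else {
        used += gc;
      }
    }
    if (used > 100f) {
      throw new IllegalArgumentException(
          "Configured guaranteed capacities add up to " + used + ", which is over 100");
    }
    float share = unset == 0 ? 0f : (100f - used) / unset;
    Map<String, Float> result = new LinkedHashMap<>();
    for (Map.Entry<String, Float> e : configured.entrySet()) {
      result.put(e.getKey(), e.getValue() == UNSET ? share : e.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Float> queues = new LinkedHashMap<>();
    queues.put("default", 25f);
    queues.put("q1", UNSET);
    queues.put("q2", UNSET);
    System.out.println(distribute(queues)); // {default=25.0, q1=37.5, q2=37.5}
  }
}
```

With this scheme, two queues created with no capacity settings at all would each receive 50 instead of causing a start-up failure.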

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-3.patch

Attaching the latest patch with changes incorporated from Vinod's comments:
- Added a new method in the Configuration class to set float values.
- Removed logging when setting default values.
- Refactored code in TestCapacitySchedulerConf.
- Added a new test case in TestCapacityScheduler to test the distribution of gc when the default gc is not present.


> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4178:
-------------------------------------

    Assignee: Sreekanth Ramakrishnan  (was: Hemanth Yamijala)

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635463#action_12635463 ] 

Hadoop QA commented on HADOOP-4178:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391124/HADOOP-4178-8.patch
  against trunk revision 700056.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3392/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3392/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3392/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3392/console

This message is automatically generated.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch, HADOOP-4178-8.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-2.patch

Attaching new patch with following:

- Added log.info statements when default values are applied to a particular queue.
- Added a new test case to check the scenario where one job queue overrides the default values whereas the rest use the default values.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-7.patch

Attaching patch with changes mentioned in the review. Modified test cases to catch the actual failure points in the code.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-6.patch

Attaching patch incorporating Hemanth's comments.
- Removed the rounding check.
- The scheduler now fails when an invalid user limit or reclaim-time-limit is set.
- Added a test case to confirm that the exceptions are thrown.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634479#action_12634479 ] 

Hemanth Yamijala commented on HADOOP-4178:
------------------------------------------

Comments:

capacity-scheduler.xml:
- Typo: "The default values would be applied to all the queues which don't" - should be "The default values would be applied to all the queues which don't have"
- My understanding of the default guaranteed capacity is that there will be *no* configuration variable for the default. Instead, in code, we use -1 as the default value so that we know when it is not defined.

CapacitySchedulerConf:
- defaultGuaranteedCapacity is not needed.
- getGuaranteedCapacity should return -1 if the value is not configured. So, there is no need to check if the value is not defined. The check for invalid values can ignore -1 and check for other negatives and values > 100.
- The LOG variable is not used, and should be removed.
- Very minor nit: There are some extra lines after the variables for the defaults are declared.
- Should we also sanity-check other variables, such as the reclaim time limit and the minimum user limit?
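
The -1 convention suggested above for getGuaranteedCapacity can be sketched roughly as follows. This is only an illustration, not the code in the patch: a plain Map stands in for the Hadoop configuration object, and the class name, the key() helper and the property-name layout are all hypothetical.

```java
import java.util.Map;

// Illustrative sketch only; not the actual CapacitySchedulerConf.
class GuaranteedCapacitySketch {
    /** Sentinel meaning "no guaranteed capacity configured for this queue". */
    static final float NOT_CONFIGURED = -1f;

    /** Hypothetical property-name layout; the real key names may differ. */
    static String key(String queue) {
        return "queue." + queue + ".guaranteed-capacity";
    }

    /**
     * Returns the configured guaranteed capacity as a percentage, or -1 when
     * the queue has no value. Any other negative value, or a value above 100,
     * is rejected.
     */
    static float getGuaranteedCapacity(Map<String, String> conf, String queue) {
        String raw = conf.get(key(queue));
        float value = (raw == null) ? NOT_CONFIGURED : Float.parseFloat(raw.trim());
        // -1 is the only negative value that passes: it marks "not configured".
        if (value != NOT_CONFIGURED && (value < 0f || value > 100f)) {
            throw new IllegalArgumentException(
                "Illegal guaranteed capacity for queue " + queue + ": " + value);
        }
        return value;
    }
}
```

With this shape the caller does not need a separate "is it defined?" check: a return value of -1 already says the queue has no configured capacity.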

CapacityTaskScheduler:
- The computation of remaining capacities can be simpler, I think.
{code}
remainingCapacity = 100 - totalCapacity;
if (gcNotConfiguredQueues.size() > 0) {
  remainingCapacityPerQueue = Math.round(remainingCapacity / gcNotConfiguredQueues.size());
}
// distribute this for all queues in gcNotConfiguredQueues
{code}
Would this work ?
- I would recommend a variable name like queuesWithoutConfiguredGC instead of gcNotConfiguredQueues
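
For reference, a self-contained version of the distribution step might look like the sketch below. It assumes capacities are percentages held as floats and that -1 marks a queue without a configured guaranteed capacity. The class and method names are made up for illustration, and the rounding from the snippet above is left out as a simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual CapacityTaskScheduler code.
class RemainingCapacitySketch {
    /** Assumed sentinel for "guaranteed capacity not configured". */
    static final float NOT_CONFIGURED = -1f;

    /**
     * Takes queue name -> configured capacity (or -1) and returns a copy in
     * which every unconfigured queue gets an equal share of what is left.
     */
    static Map<String, Float> assignCapacities(Map<String, Float> configured) {
        float totalCapacity = 0f;
        List<String> queuesWithoutConfiguredGC = new ArrayList<>();
        for (Map.Entry<String, Float> entry : configured.entrySet()) {
            float gc = entry.getValue();
            if (gc == NOT_CONFIGURED) {
                queuesWithoutConfiguredGC.add(entry.getKey());
            } else {
                totalCapacity += gc;
            }
        }
        if (totalCapacity > 100f) {
            throw new IllegalArgumentException(
                "Sum of configured capacities exceeds 100: " + totalCapacity);
        }
        Map<String, Float> result = new LinkedHashMap<>(configured);
        if (!queuesWithoutConfiguredGC.isEmpty()) {
            float remainingCapacityPerQueue =
                (100f - totalCapacity) / queuesWithoutConfiguredGC.size();
            // Distribute the remainder equally among the unconfigured queues.
            for (String queue : queuesWithoutConfiguredGC) {
                result.put(queue, remainingCapacityPerQueue);
            }
        }
        return result;
    }
}
```

For example, with queues a=40, b=20 and two unconfigured queues c and d, the remaining 40 percent is split so that c and d each get 20.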

Tests:
- testQueueWithUserDefinedDefaultProperties: The effective test would be to not define some properties for queues, and verify that the overridden defaults are returned. This will be similar to testQueueWithDefaultProperties.
- testGCAllocationToQueues: one of the expected values is 18f and the others are 19f. Shouldn't they all be the same?

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635337#action_12635337 ] 

Hemanth Yamijala commented on HADOOP-4178:
------------------------------------------

Some LOG.debug statements have crept into {{CapacityTaskScheduler}} which seem unrelated to this patch. Can you please remove them and submit to Hudson?

Otherwise, patch looks good to me.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch, HADOOP-4178-6.patch, HADOOP-4178-7.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Commented: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633202#action_12633202 ] 

Sreekanth Ramakrishnan commented on HADOOP-4178:
------------------------------------------------

According to [HADOOP-4079|https://issues.apache.org/jira/browse/HADOOP-4079], checking for invalid queues should be done at the global level, i.e. for the requisite parameters of specific job queues. The capacity scheduler should only check for the parameters that it requires to be present. If a parameter is not present, it should fall back to the defaults that are made configurable in this JIRA. So, if required, we can check for negative capacities and user limits here and consider them invalid. We assume that _QueueManager_ always gives us valid queues when _CapacityTaskScheduler_ is initialized.
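
A rough sketch of that fallback chain is below: the per-queue value first, then the configurable default, then a built-in value. It is not the patch itself. A plain Map stands in for the configuration, and the property names and the built-in value of 100 are assumptions made for illustration.

```java
import java.util.Map;

// Illustrative sketch only; key names and the built-in value are assumptions.
class QueueDefaultsSketch {
    /** Hypothetical key for the configurable default. */
    static final String DEFAULT_USER_LIMIT_KEY = "default-minimum-user-limit-percent";
    /** Assumed built-in value, used only when no default is configured either. */
    static final int BUILT_IN_USER_LIMIT = 100;

    /** Hypothetical per-queue key layout. */
    static String queueKey(String queue) {
        return "queue." + queue + ".minimum-user-limit-percent";
    }

    /** Per-queue value if present, else the configured default, else the built-in value. */
    static int getMinimumUserLimitPercent(Map<String, String> conf, String queue) {
        int configuredDefault = parseOr(conf.get(DEFAULT_USER_LIMIT_KEY), BUILT_IN_USER_LIMIT);
        return parseOr(conf.get(queueKey(queue)), configuredDefault);
    }

    private static int parseOr(String raw, int fallback) {
        return (raw == null) ? fallback : Integer.parseInt(raw.trim());
    }
}
```

Each item is read with one lookup that carries its fallback, so a queue that omits the property simply inherits the default.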

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4178-1.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.



[jira] Updated: (HADOOP-4178) The capacity scheduler's defaults for queues should be configurable.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4178:
-------------------------------------------

    Attachment: HADOOP-4178-5.patch

Attaching a patch incorporating the comments. I am still not sure about the test cases in _TestCapacitySchedulerConf_. I have removed a few values and used some negative values to test whether the class returns the default values.

> The capacity scheduler's defaults for queues should be configurable.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4178
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Owen O'Malley
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4178-1.patch, HADOOP-4178-2.patch, HADOOP-4178-3.patch, HADOOP-4178-4.patch, HADOOP-4178-5.patch
>
>
> The default values for the queue attributes should be configurable rather than hard coded.
