Posted to mapreduce-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2011/04/16 01:02:05 UTC
[jira] [Created] (MAPREDUCE-2441) regression: maximum limit of -1
doesn't appear to be unlimited anymore
regression: maximum limit of -1 doesn't appear to be unlimited anymore
----------------------------------------------------------------------
Key: MAPREDUCE-2441
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2441
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: benchmarks
Affects Versions: 0.20.203.0
Reporter: Allen Wittenauer
Priority: Blocker
The math around the slot usage when maximum-capacity=-1 appears to be faulty. See comments.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Allen Wittenauer (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved MAPREDUCE-2441.
-----------------------------------------
Resolution: Won't Fix
> regression: maximum limit of -1 + user-lmit math appears to be off
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-2441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2441
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/capacity-sched
> Affects Versions: 0.20.203.0
> Reporter: Allen Wittenauer
> Priority: Critical
> Attachments: capsched.xml
>
>
> The math around the slot usage when maximum-capacity=-1 appears to be faulty. See comments.
[jira] [Commented] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032190#comment-13032190 ]
Arun C Murthy commented on MAPREDUCE-2441:
------------------------------------------
Allen, I'm sorry I missed this ticket.
As we briefly discussed over IM, the capacity scheduler (CS) in 0.20.203 is designed not to allow a single user to go over the natural limit of the queue. As the docs say, you'll need to set the user-limit-factor for the queue to allow a user to go over... I'm pretty sure I told you in person ;)
[jira] [Commented] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032504#comment-13032504 ]
Allen Wittenauer commented on MAPREDUCE-2441:
---------------------------------------------
Nope, not about user-limit-factor. But doesn't this mean that the first jobs in an expanding queue can starve out jobs in another queue? In other words, if I have:
job1 = max-lim -1 queue
job2 = max-lim -1 queue
job3 = max-lim % queue
job1 and job2 could take all slots before job3 gets executed, especially if they are submitted by the same user and that is the only user in the job submission queue.
[jira] [Commented] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032165#comment-13032165 ]
Allen Wittenauer commented on MAPREDUCE-2441:
---------------------------------------------
Actually, it looks like queue spillage/task stealing doesn't work at all, whether the maximum is -1 or not. The problem code appears to be in assignSlotsToJob, which seems to have replaced the two-phase system of previous versions with a single phase. This single phase performs the following check to determine the limit:
{code}
int limit =
  Math.min(
    Math.max(divideAndCeil(currentCapacity, activeUsers),
             divideAndCeil(ulMin*currentCapacity, 100)),
    (int)(queueCapacity * ulMinFactor)
  );
{code}
In a two-queue system where one queue's maximum is -1 and the other's is a number, the maximum queue capacity ends up being set to the remainder. Without a second pass, any additional slots from other queues are essentially ignored.
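To make the effect of that check concrete, here is a minimal Java sketch that evaluates the same expression outside the scheduler. It is not the Hadoop source: the class name, the `userLimit` wrapper, and the body of `divideAndCeil` are stand-ins written for illustration. The 762 map slots come from this thread; the 35% queue share is purely an assumption, chosen because it happens to reproduce the 266 slots reported elsewhere in the thread. The actual values in capsched.xml are not shown here.

```java
// Illustrative sketch only; not the Hadoop source.
class UserLimitSketch {

    // Stand-in for the divideAndCeil helper named in the snippet above:
    // integer division rounded up, returning 0 for a zero divisor (assumed).
    static int divideAndCeil(int a, int b) {
        if (b == 0) {
            return 0;
        }
        return (a + (b - 1)) / b;
    }

    // Same expression as the quoted check, with its variables passed in.
    static int userLimit(int currentCapacity, int queueCapacity,
                         int activeUsers, int ulMin, float ulMinFactor) {
        return Math.min(
            Math.max(divideAndCeil(currentCapacity, activeUsers),
                     divideAndCeil(ulMin * currentCapacity, 100)),
            (int) (queueCapacity * ulMinFactor));
    }

    public static void main(String[] args) {
        int clusterSlots = 762;                       // map slots, from the thread
        int queueCapacity = clusterSlots * 35 / 100;  // hypothetical 35% share = 266 slots

        // One active user, factor 1: capped at the queue's own capacity.
        System.out.println(userLimit(clusterSlots, queueCapacity, 1, 100, 1.0f)); // 266
        // One active user, factor 3: the cap no longer binds below 762.
        System.out.println(userLimit(clusterSlots, queueCapacity, 1, 100, 3.0f)); // 762
    }
}
```

Under these assumptions the second argument of `Math.min` is what binds: however large `currentCapacity` grows, a single user cannot exceed `queueCapacity * ulMinFactor`.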
[jira] [Commented] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020466#comment-13020466 ]
Allen Wittenauer commented on MAPREDUCE-2441:
---------------------------------------------
or the queue limit.
So where does the 266 come from? The job was a terasort job with 1000 map tasks.
[jira] [Updated] (MAPREDUCE-2441) regression: maximum limit of -1
doesn't appear to be unlimited anymore
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated MAPREDUCE-2441:
----------------------------------------
Attachment: capsched.xml
This is our capacity scheduler configuration. On a test grid with 762 map slots, the first user running in the default queue only got 266 map slots. This doesn't appear to be either the user limit or the max limit.
[jira] [Updated] (MAPREDUCE-2441) regression: maximum limit of -1 +
user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated MAPREDUCE-2441:
----------------------------------------
Component/s: contrib/capacity-sched (was: benchmarks)
[jira] [Updated] (MAPREDUCE-2441) regression: maximum limit of -1 +
user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated MAPREDUCE-2441:
----------------------------------------
Summary: regression: maximum limit of -1 + user-lmit math appears to be off (was: regression: maximum limit of -1 doesn't appear to be unlimited anymore)
[jira] [Updated] (MAPREDUCE-2441) regression: maximum limit of -1 +
user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated MAPREDUCE-2441:
----------------------------------------
Priority: Critical (was: Blocker)
Changing this from a blocker, since no one but me apparently cares that the capacity scheduler doesn't actually work as advertised.
[jira] [Commented] (MAPREDUCE-2441) regression: maximum limit of -1
+ user-lmit math appears to be off
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032173#comment-13032173 ]
Allen Wittenauer commented on MAPREDUCE-2441:
---------------------------------------------
Actually, let me correct myself. Task stealing does work, but in a weird and unpredictable way. Basically, an individual user is limited to the "natural" size of the queue they submitted to. So if two users are in the same queue, that queue can steal up to 2x the queue size, and so on.
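The arithmetic behind that observation can be sketched in a few lines of Java. This is only an illustration of the statement above, not scheduler code: the class and method names are made up, it assumes every user has enough pending tasks to fill their cap, and it assumes each user's cap equals the queue's natural size. The 762-slot cluster comes from this thread; using 266 slots as the queue's natural size is a hypothetical choice.

```java
// Illustrative sketch only; names and numbers are assumptions.
class QueueStealSketch {

    // Upper bound on the slots one queue can occupy when each active user
    // is capped at the queue's natural size and the cluster size is the
    // only other limit.
    static int queueCeiling(int naturalSlots, int activeUsers, int clusterSlots) {
        long uncapped = (long) naturalSlots * activeUsers;
        return (int) Math.min(uncapped, (long) clusterSlots);
    }

    public static void main(String[] args) {
        int clusterSlots = 762;  // map slots, from the thread
        int naturalSlots = 266;  // hypothetical natural size of the queue

        for (int users = 1; users <= 4; users++) {
            System.out.println(users + " user(s): up to "
                + queueCeiling(naturalSlots, users, clusterSlots) + " slots");
        }
    }
}
```

With these numbers the ceiling is 266 slots for one user and 532 for two, and it reaches the full 762 from three users onward, so how far a queue can expand depends on how many distinct users happen to be submitting to it.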