Posted to mapreduce-issues@hadoop.apache.org by "nandan (JIRA)" <ji...@apache.org> on 2011/07/07 17:55:16 UTC
[jira] [Created] (MAPREDUCE-2653) dynamic map slots (in addition to
predefined) on each node which allows to execute cpu intensive jobs along
with memory intensive jobs thereby reducing wastage of cpu cycles
dynamic map slots (in addition to predefined) on each node which allows to execute cpu intensive jobs along with memory intensive jobs thereby reducing wastage of cpu cycles
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: MAPREDUCE-2653
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2653
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobtracker, tasktracker
Reporter: nandan
I have introduced a process monitoring system inside the Hadoop TaskTracker that analyses the CPU and memory utilization of each map task and lets me increase or decrease the maximum number of map slots dynamically on each node.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2653) dynamic map slots (in addition
to predefined) on each node which allows to execute cpu intensive jobs
along with memory intensive jobs thereby reducing wastage of cpu cycles
Posted by "nandan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090015#comment-13090015 ]
nandan commented on MAPREDUCE-2653:
-----------------------------------
In response to Allen Wittenauer's question:
How does this method work when the tasks are IO intensive?
The monitoring system on every TT categorizes each task it runs and stores it in one of two lists: CPU-intensive or non-CPU-intensive (the latter includes memory-intensive as well as IO-intensive tasks). It then generates a job request by selecting jobs from these two lists alternately, one at a time, taking into account the current CPU idle time and the CPU utilization of the task. The request consists of a list of jobs whose map tasks the TT can run as extra tasks. It is submitted to the JT through the heartbeat, and the JT processes the jobs in the request one by one.
So currently I am treating IO-intensive and memory-intensive tasks the same way.
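
The categorization and alternating selection described above could be sketched roughly as follows. This is only an illustration, not the code from the patch: the class and method names, the 50% threshold, and the rule of adding a job only while its measured CPU use fits into the remaining idle CPU are all assumptions. Both the task utilization and the idle time are assumed to be percentages of the node's total CPU capacity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these names come from Hadoop or from the patch.
class ExtraTaskRequestSketch {
    // Assumed cutoff: tasks at or above this CPU share count as CPU-intensive.
    static final double CPU_INTENSIVE_THRESHOLD = 50.0;

    // One monitored task: the job it belongs to and its measured CPU use in percent.
    static final class TaskSample {
        final String jobId;
        final double cpuPercent;

        TaskSample(String jobId, double cpuPercent) {
            this.jobId = jobId;
            this.cpuPercent = cpuPercent;
        }
    }

    static boolean isCpuIntensive(TaskSample t) {
        return t.cpuPercent >= CPU_INTENSIVE_THRESHOLD;
    }

    // Builds the ordered list of job IDs a TT would send to the JT.
    // Jobs are taken alternately from the two lists; a job is added only
    // if its measured CPU use still fits into the remaining idle CPU.
    static List<String> buildRequest(List<TaskSample> samples, double cpuIdlePercent) {
        Deque<TaskSample> intensive = new ArrayDeque<>();
        Deque<TaskSample> nonIntensive = new ArrayDeque<>();
        for (TaskSample t : samples) {
            (isCpuIntensive(t) ? intensive : nonIntensive).add(t);
        }

        List<String> request = new ArrayList<>();
        double remainingIdle = cpuIdlePercent;
        boolean takeIntensive = true;
        while (!intensive.isEmpty() || !nonIntensive.isEmpty()) {
            Deque<TaskSample> source = takeIntensive ? intensive : nonIntensive;
            takeIntensive = !takeIntensive;
            TaskSample next = source.poll();
            if (next == null) {
                continue; // this list is exhausted; the other one is tried next
            }
            if (next.cpuPercent <= remainingIdle && !request.contains(next.jobId)) {
                request.add(next.jobId);
                remainingIdle -= next.cpuPercent;
            }
        }
        return request;
    }
}
```

With 80% idle CPU and samples job_A (60%), job_B (10%), job_C (70%), job_D (5%), this sketch requests job_A, job_B and job_D in that order and skips job_C, because 70% no longer fits once job_A and job_B are accounted for.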
> dynamic map slots (in addition to predefined) on each node which allows to execute cpu intensive jobs along with memory intensive jobs thereby reducing wastage of cpu cycles
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2653
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2653
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker, tasktracker
> Affects Versions: 0.20.203.0
> Environment: linux
> Reporter: nandan
> Labels: map, scheduler, tasks
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> I have introduced a process monitoring system inside the TaskTracker that analyses the CPU and memory utilization of each map task and lets me increase or decrease the maximum number of map slots dynamically on each node. With this I can run CPU-intensive jobs alongside memory-intensive jobs, thereby reducing CPU idle time.
[jira] [Updated] (MAPREDUCE-2653) dynamic map slots (in addition to
predefined) on each node which allows to execute cpu intensive jobs along
with memory intensive jobs thereby reducing wastage of cpu cycles
Posted by "nandan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nandan updated MAPREDUCE-2653:
------------------------------
Description: I have introduced a process monitoring system inside the TaskTracker that analyses the CPU and memory utilization of each map task and lets me increase or decrease the maximum number of map slots dynamically on each node. With this I can run CPU-intensive jobs alongside memory-intensive jobs, thereby reducing CPU idle time. (was: I have introduced a process monitoring system inside the Hadoop TaskTracker that analyses the CPU and memory utilization of each map task and lets me increase or decrease the maximum number of map slots dynamically on each node.)
Environment: linux
Affects Version/s: 0.20.203.0
Tags: dynamic map slots
Labels: map scheduler tasks (was: )
Remaining Estimate: 672h
Original Estimate: 672h
[jira] [Commented] (MAPREDUCE-2653) dynamic map slots (in addition
to predefined) on each node which allows to execute cpu intensive jobs
along with memory intensive jobs thereby reducing wastage of cpu cycles
Posted by "nandan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063731#comment-13063731 ]
nandan commented on MAPREDUCE-2653:
-----------------------------------
To develop a proof of concept, I have concentrated only on CPU utilization.
Currently, I am running multiple jobs simultaneously.
Based on the CPU utilization of a task and the current CPU idle time, I decide whether I can run an extra task of that job (by dynamically increasing map slots), thereby coupling CPU-intensive jobs with jobs that are not CPU-intensive.
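
The decision described above, whether the node can take one more task of a job, might look roughly like the sketch below. It is only an illustration: the class name, the cap on extra slots, the 10% headroom, and the rule for shrinking back are assumptions and are not taken from the patch. The task utilization and the idle time are assumed to be percentages of the node's total CPU capacity.

```java
// Hypothetical sketch; none of these names come from Hadoop or from the patch.
class DynamicSlotSketch {
    // Assumed safety margin: percent of CPU that must stay free after adding a task.
    static final double HEADROOM_PERCENT = 10.0;

    // Predefined slot count, e.g. the value of mapred.tasktracker.map.tasks.maximum.
    private final int configuredMapSlots;
    // Assumed upper bound on how many slots may be added on top of the predefined ones.
    private final int maxExtraSlots;
    private int extraSlots = 0;

    DynamicSlotSketch(int configuredMapSlots, int maxExtraSlots) {
        this.configuredMapSlots = configuredMapSlots;
        this.maxExtraSlots = maxExtraSlots;
    }

    // True if one more task with the given measured CPU use fits into the idle CPU.
    boolean canRunExtraTask(double taskCpuPercent, double cpuIdlePercent) {
        return extraSlots < maxExtraSlots
                && taskCpuPercent + HEADROOM_PERCENT <= cpuIdlePercent;
    }

    int currentMaxMapSlots() {
        return configuredMapSlots + extraSlots;
    }

    // Called once per monitoring interval: grow by one slot if the task fits,
    // shrink by one extra slot if the node is nearly saturated, and never go
    // below the predefined slot count.
    int adjust(double taskCpuPercent, double cpuIdlePercent) {
        if (canRunExtraTask(taskCpuPercent, cpuIdlePercent)) {
            extraSlots++;
        } else if (extraSlots > 0 && cpuIdlePercent < HEADROOM_PERCENT) {
            extraSlots--;
        }
        return currentMaxMapSlots();
    }
}
```

Keeping the predefined slot count as a floor means the sketch only ever adds capacity on top of the configured value, which matches the "in addition to predefined" wording of the issue title.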
[jira] [Commented] (MAPREDUCE-2653) dynamic map slots (in addition
to predefined) on each node which allows to execute cpu intensive jobs
along with memory intensive jobs thereby reducing wastage of cpu cycles
Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062123#comment-13062123 ]
Allen Wittenauer commented on MAPREDUCE-2653:
---------------------------------------------
How does this method work when the tasks are IO intensive? What happens if the task forks subprocesses?