You are viewing a plain text version of this content. The canonical link for it is here.

Posted to mapreduce-issues@hadoop.apache.org by "Radim Kolar (JIRA)" <ji...@apache.org> on 2012/05/14 17:47:53 UTC

[jira] [Created] (MAPREDUCE-4256) Improve resource scheduling

Radim Kolar created MAPREDUCE-4256:
--------------------------------------

             Summary: Improve resource scheduling
                 Key: MAPREDUCE-4256
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: resourcemanager
            Reporter: Radim Kolar
             Fix For: 2.0.0, 3.0.0


Currently resource manager supports only Memory resource during container allocation.

I propose following improvements:

1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.

2. add support for custom resources. In node configuration will be something like:

name=node.resource.GPU, value=1 (node has 1 GPU).

If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400230#comment-13400230 ] 

Radim Kolar commented on MAPREDUCE-4256:
----------------------------------------

This ticket has 2 main points.

1. pluggable resources framework
2. resource scheduling using pluggable resources

If pluggable resources are implemented pluggable contained monitoring is not needed, you can not do much other things than kill application using too much resources. Possible extensions can be done by some kind of pluggable policy what to do if resource consumption is exceeded.

Instead of hacking hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Resource.java in MAPREDUCE-4327 for cpu cores, its better to add general purpose Resource management.
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480731#comment-13480731 ] 

Radim Kolar commented on MAPREDUCE-4256:
----------------------------------------

About optional resources like GPU for computation. Lets task to specify weight in application resource requests.

for example request like "gpu:1:.8" means requesting 1 of gpu resources with weight 0.8 -> give node 80% bonus if gpu resource is available

it could handle data locality as well, because for some compute tasks data locality its not needed because you do not move much data over net. In that case request will be "data:1:0.0"
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284356#comment-13284356 ] 

Radim Kolar commented on MAPREDUCE-4256:
----------------------------------------

resources will be detectable by plugin. Plugin will work in 2 modes (node wide and cluster wide) and report currently in use resource and max. available. If resource will not have plugin assigned, it will use simple allocation - just hardcode into hadoop config how much resources of that kind you have.
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Radim Kolar updated MAPREDUCE-4256:
-----------------------------------

    Attachment:     (was: hadoop-jdk.tools.txt)
    
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280049#comment-13280049 ] 

Radim Kolar commented on MAPREDUCE-4256:
----------------------------------------

Two kind of resources:

 1. cluster shared resources - every node in cluster can use them. For example WAN bandwidth.

 2. node local resources - for example GPU card.

these resources will be configured using standard hadoop configuration framework.

mapreduce.resources.cluster.available=resourcename:float,... (this option will be used by resourcemanager).
mapreduce.resources.node.available=resourcename:float (for use by nodemanager). Node manager will report available resources to resourcemanager

job config:
mapreduce.resources.mapper= list of resources needed by one mapper
mapreduce.resources.reducer= list of resources needed by one reducer

if same resource is configured per node and per cluster then per node version is preferred.
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>             Fix For: 2.0.0, 3.0.0
>
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292554#comment-13292554 ] 

Radim Kolar commented on MAPREDUCE-4256:
----------------------------------------

1 would leave optional resources and other stuff like tiered resources and data locality vs optional resources for Resource scheduling for hadoop-3. For hadoop-2 lets start with something simple.

1. handling different OS - one plugin for each operation system. If plugin runs on unsupported OS, it will throw exception during @PostConstruct method.

2. Resource restrictions can be supported if resource plugin will have optional function implemented - query how much resource of that type is currently in use by container.

3. Good point are resources which can change without any input from RM. Lets call them volatile resources - if resources is marked as such, RM will need to do periodic queries.

4. network accessible resources will be handled as global resources
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396894#comment-13396894 ] 

Robert Joseph Evans commented on MAPREDUCE-4256:
------------------------------------------------

I agree with you that we should start off with something simple.  I have no idea if we even want to implement everything that I had in the list.  I personally think that we do not want to, but I also want us to fully think about the problem so we don't paint ourselves into a corner.

Your initial ideas seem very reasonable to me.  I think the big thing here is coordinating with what others are trying to do in this area already.

Andrew Ferguson under MAPREDUCE-4351 is trying to make Container Monitoring Pluggable.  It looks like we are trying to split up container monitoring and make it pluggable, so we need work that out.

Arun is also pushing forward with adding in CPU as a resource under MAPREDUCE-4334.
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284926#comment-13284926 ] 

Robert Joseph Evans commented on MAPREDUCE-4256:
------------------------------------------------

@Radim,

I have been trying to think of any resource or resource request that does not really fit into this model.  There are a few that I have come up with.  I don't know if they should be supported or not.  I am including them here mostly for discussion to see what others think.

 # optional resources.  For example I have a mapper that could really be sped up by having access to a GPU card, but it can run without it.  So if there is one free I would like to use it, but if it isn't available I am OK without it.  This opens up a bit of a can of worms because how do I indicate in the request how much effort would I like to go into giving me access to that resource before giving up.
 # a subclass of this might be flexible resources.  I would love to have 10GB of RAM, it would really make my process run fast, with all of the caching I would like to do, but I can get by with as little as 2GB.
 # data locality/network accessible resources.  I want to be near a given resource but I don't necessarily need to be on the same box as the resource.  This could be used when reading from/writing to a filer, DB server, or even a different yet to be scheduled container. This is a bit tricky because we already try to do this, but in a different way.  With each container we request a node/rack that we would like to run on.  The reality is that at least for map reduce we are requesting an HDFS block that we would like to run near, but there are too many blocks to effectively put it into a resource model.  Instead it was generalized, and would probably work for most things, except for the case of I want to be near a yet to be scheduled container.  Even that would work, so long as we only launch containers one at a time.
 # global resources.  Things like bandwidth between datacenters for distcp copies.  Perhaps these could be modeled more as cluster wide resources. But the APIs would have to be set up assuming that the free/max resources can change without any input from the RM.  Meaning the scheduler may need to be able to deal with asking for those resources and having them now, not be available.
  # default resources.  For backwards compatibility with existing requests we may want to have some resources like CPU, that have a default value if not explicitly given.
  # tiered or dependent resources.  With network for example if I am doing a distcp across colos, I am not only using up network bandwidth for the given box that I am on, I am also using up network bandwidth on the rack switch, and the bandwidth going between the colos.  I am not a network engineer so if I am completely wrong with the example hopefully the general concept still holds.  Do we want to some how be able to tie different resources together so a request for one thing (intra-colo bandwidth) also implies a request for other resources?  

I also have a few questions.

How are the plug-ins intended to adapt to different operating environments? Determining the amount of free CPU on Windows is very different from doing it on Linux.  Would we have different plug-ins for different OS's? Would it be up to the end user to configure it all appropriately or would we try to auto-detect the environment and adjust automatically?

Do we want to handle enforcement of the resource limits as well?  We currently kill off containers that are using more memory than requested (with some leeway). What about other things like CPU or network, do we kill the container if it goes over, do we try to use the OS to restrict the amount used, do we warn the user that they went over, or do we just ignore it and hope everyone is a good citizen?
                
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4256) Improve resource scheduling

Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Radim Kolar updated MAPREDUCE-4256:
-----------------------------------

    Attachment: hadoop-jdk.tools.txt
    
> Improve resource scheduling
> ---------------------------
>
>                 Key: MAPREDUCE-4256
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Radim Kolar
>
> Currently resource manager supports only Memory resource during container allocation.
> I propose following improvements:
> 1. add support for CPU utilization. Node CPU used information can be obtained by ResourceCalculatorPlugin.
> 2. add support for custom resources. In node configuration will be something like:
> name=node.resource.GPU, value=1 (node has 1 GPU).
> If job will need to use GPU for computation, it will add "GPU=1" requirement to its job config and Resource Manager will allocate container on node with GPU available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira