Posted to dev@twill.apache.org by "Gourav Khaneja (JIRA)" <ji...@apache.org> on 2014/07/21 20:54:41 UTC

[jira] [Updated] (TWILL-87) Adding Container Placement control (Placement Policy API)

     [ https://issues.apache.org/jira/browse/TWILL-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gourav Khaneja updated TWILL-87:
--------------------------------

    Description: 
Yarn AMRMClient provides an API to control container placement. We need to enhance the Twill API so that users can specify container placement policies. Twill could then use AMRMClient to try allocating containers according to the specified placement policy.

Added a Placement Policy API in TwillSpecification. For now, the placement policy types include: (a) DISTRIBUTED, which tries to spawn the specified runnables on different hosts; (b) DEFAULT, i.e. no special placement requirements.

Implementation Detail: DISTRIBUTED runnable instances are provisioned one after another (as opposed to grouping provision requests into one allocate call based on ResourceSpecs). The AM blacklists the hosts on which existing DISTRIBUTED runnables are already running. If no container is provisioned after MAX_CONSTRAINED_PROVISION_ATTEMPTS attempts, the AM relaxes the blacklist (or any other constraint).
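The provisioning strategy above can be sketched as a small self-contained loop. Note this is only an illustration of the idea, not the actual Twill implementation: the class and method names are hypothetical, and a stand-in method replaces the real AMRMClient allocate call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the DISTRIBUTED provisioning strategy: instances are placed
// one at a time, hosts already in use are blacklisted, and the blacklist
// is relaxed after a bounded number of failed attempts.
public class DistributedPlacementSketch {
    static final int MAX_CONSTRAINED_PROVISION_ATTEMPTS = 20;

    /** Picks a host for each instance, preferring hosts not already used. */
    static List<String> provision(List<String> availableHosts, int instances) {
        List<String> placements = new ArrayList<>();
        Set<String> blacklist = new HashSet<>();
        for (int i = 0; i < instances; i++) {
            String host = null;
            int attempts = 0;
            while (host == null) {
                host = tryAllocate(availableHosts, blacklist);
                if (host == null && ++attempts >= MAX_CONSTRAINED_PROVISION_ATTEMPTS) {
                    blacklist.clear();  // relax constraints: allow host reuse
                }
            }
            placements.add(host);
            blacklist.add(host);  // later instances avoid this host
        }
        return placements;
    }

    /** Stand-in for an AMRMClient allocate call that honors the blacklist. */
    static String tryAllocate(List<String> hosts, Set<String> blacklist) {
        for (String h : hosts) {
            if (!blacklist.contains(h)) {
                return h;
            }
        }
        return null;  // no host satisfies the constraint
    }
}
```

With 3 hosts and 5 instances, the first 3 instances land on distinct hosts; the remaining 2 are placed only after the blacklist is relaxed.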

Also, it makes sense to specify Hosts and Racks through the Placement Policy API instead of through the Resource Specification, so that logic has been moved into the Placement Policy as well.
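A policy that carries the type together with optional host and rack constraints might be modeled as below. This is an illustrative sketch only; the names do not mirror the actual Twill classes in the PR.

```java
import java.util.Set;

// Illustrative placement policy record: the policy type plus optional
// host and rack constraints, as described above. Hypothetical names.
public class PlacementPolicySketch {
    enum Type { DEFAULT, DISTRIBUTED }

    final Type type;
    final Set<String> hosts;  // preferred hosts; may be empty
    final Set<String> racks;  // preferred racks; may be empty

    PlacementPolicySketch(Type type, Set<String> hosts, Set<String> racks) {
        this.type = type;
        this.hosts = hosts;
        this.racks = racks;
    }

    /** True when the policy imposes no placement constraint at all. */
    boolean isUnconstrained() {
        return type == Type.DEFAULT && hosts.isEmpty() && racks.isEmpty();
    }
}
```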
    

Tests
    1. Added unit tests for the Placement Policy (using MiniYarnCluster):
          (a) Specify DISTRIBUTED runnables and runnables with Hosts and Racks in a Twill app, and verify that all constraints are appropriately honored.
          (b) Specify DISTRIBUTED and DEFAULT runnables in a Twill app, and verify that all constraints are appropriately honored. Increase the number of instances for all runnables and verify again.
          (c) Tested the DISTRIBUTED placement policy under stress (i.e. not enough resources available to honor the constraints), and verified that the AM relaxes the constraints and tries its best.

    2. Tested on a real cluster.
         

Please review the API and changes in PR - https://github.com/apache/incubator-twill/pull/7
    
  





  was:
Yarn AMRMClient provides an API to control container placement. We need to enhance the Twill API so that users can specify a container placement policy. Twill could then use AMRMClient to try allocating containers according to the specified placement policy.

1. API: 
    a. Added Placement Policy API in TwillSpecificationBuilder.
    b. For now, the only Placement policy added is DISTRIBUTED, which tries to spawn specified runnables on different hosts.
    c. Added a placeholder for 'PlacementHints' through which user can specify host names they want the runnables to go to. 
    
2. Code Changes:
    a. Allocating Containers for one runnable instance at a time, instead of grouping them together based on Resource Specs.
    b. Added PlacementPolicy and PlacementPolicyGroup data structures /interface for storing placement policies.
    c. Storing node name too (apart from hostname) in TwillRunResource.


3. AMRMClient API Usage
    a. Using Blacklist to support DISTRIBUTED placement policy. 
    b. Not using addContainerRequest.relaxLocality = false for now.

4. Unit Tests
    a. Added PlacementPolicyTestRun Class with test method testDistributedPlacementPolicy.
    b. Running MiniYarnCluster with 3 nodes instead of 1.


Please review changes in PR - https://github.com/apache/incubator-twill/pull/7
    
  






> Adding Container Placement control (Placement Policy API) 
> ----------------------------------------------------------
>
>                 Key: TWILL-87
>                 URL: https://issues.apache.org/jira/browse/TWILL-87
>             Project: Apache Twill
>          Issue Type: New Feature
>          Components: api, yarn
>    Affects Versions: 0.3.0-incubating
>         Environment: Tested on Hadoop Yarn 2.2 and 2.3 running on Ubuntu-nodes (4 GB , 8 Cores) cluster. 
>            Reporter: Gourav Khaneja
>            Assignee: Gourav Khaneja
>              Labels: features, github
>             Fix For: 0.3.0-incubating
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)