Posted to yarn-issues@hadoop.apache.org by "Maysam Yabandeh (JIRA)" <ji...@apache.org> on 2013/08/24 11:17:53 UTC

[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

    [ https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749341#comment-13749341 ] 

Maysam Yabandeh commented on YARN-779:
--------------------------------------

I am thinking perhaps we can solve the problem without needing a complete change in the API. Since we are using Protocol Buffers, we can freely add new fields to the message.

What we need is a way to express, within a set of ResourceRequests, the disjunction between the containers requested by a single ContainerRequest. For that we can use a locally unique resourceRequestId generated by AMRMClientImpl. For example, if an application requires one container in (node1 || node2), #addContainerRequest decomposes it into two ResourceRequests tagged with the same resourceRequestId:
* ResourceRequest(node1, id1234);
* ResourceRequest(node2, id1234);
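
A rough sketch of that decomposition, to make the idea concrete. The class, field, and method names below (RequestTagging, TaggedRequest, decompose) are hypothetical stand-ins for illustration only and are not existing YARN API; the real ResourceRequest carries more fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: these types are illustrative stand-ins, not YARN classes.
class RequestTagging {

    /** Simplified stand-in for a ResourceRequest that carries the proposed ID. */
    static final class TaggedRequest {
        final String location;
        final long resourceRequestId;
        final int numContainers;

        TaggedRequest(String location, long resourceRequestId, int numContainers) {
            this.location = location;
            this.resourceRequestId = resourceRequestId;
            this.numContainers = numContainers;
        }

        @Override
        public String toString() {
            return "ResourceRequest(" + location + ", id" + resourceRequestId
                    + ", " + numContainers + ")";
        }
    }

    // Unique only within this client process, which is all the proposal asks for.
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    /** Emits one request per candidate node, all sharing a single fresh ID. */
    static List<TaggedRequest> decompose(List<String> nodes, int numContainers) {
        long id = NEXT_ID.getAndIncrement();
        List<TaggedRequest> requests = new ArrayList<>();
        for (String node : nodes) {
            requests.add(new TaggedRequest(node, id, numContainers));
        }
        return requests;
    }

    public static void main(String[] args) {
        // One container wanted in (node1 || node2).
        for (TaggedRequest r : decompose(Arrays.asList("node1", "node2"), 1)) {
            System.out.println(r);
        }
    }
}
```

Two separate calls to decompose get different IDs, so requests belonging to different ContainerRequests are never confused with each other.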

Later, when the ResourceManager services a ResourceRequest with ID id1234, it can update all other ResourceRequests from the same application that carry the same ID. Thanks to Protocol Buffers, old clients should stay compatible with new servers, and new clients with old servers.
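
On the ResourceManager side, the bookkeeping could look roughly like the sketch below. It is only an illustration under my own assumptions: SiblingRequestTable and its methods are hypothetical names, the table is per application, and the scheduler is assumed to call onContainerAllocated once per container it grants.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-application bookkeeping; not an existing YARN class.
class SiblingRequestTable {

    // resourceRequestId -> (location -> outstanding container count)
    private final Map<Long, Map<String, Integer>> outstanding = new HashMap<>();

    /** Records a request for n containers at a location under the given ID. */
    void add(long resourceRequestId, String location, int n) {
        outstanding
                .computeIfAbsent(resourceRequestId, k -> new HashMap<>())
                .merge(location, n, Integer::sum);
    }

    /**
     * Called when one container is granted against any request with this ID.
     * Every sibling request is decremented, so none is left dangling.
     */
    void onContainerAllocated(long resourceRequestId) {
        Map<String, Integer> siblings = outstanding.get(resourceRequestId);
        if (siblings == null) {
            return; // unknown or already satisfied ID
        }
        siblings.replaceAll((location, n) -> n - 1);
        siblings.values().removeIf(n -> n <= 0);
        if (siblings.isEmpty()) {
            outstanding.remove(resourceRequestId);
        }
    }

    /** Outstanding containers for this ID at a location, or 0 if none. */
    int outstandingAt(long resourceRequestId, String location) {
        Map<String, Integer> siblings = outstanding.get(resourceRequestId);
        return siblings == null ? 0 : siblings.getOrDefault(location, 0);
    }
}
```

With the example from the issue description, 10 containers requested in (node1 || node2) and 5 allocations on each node would bring both entries to zero, instead of leaving 5 outstanding on each.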

Feedback is appreciated.
                
> AMRMClient should clean up dangling unsatisfied request
> -------------------------------------------------------
>
>                 Key: YARN-779
>                 URL: https://issues.apache.org/jira/browse/YARN-779
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.0.4-alpha
>            Reporter: Alejandro Abdelnur
>            Priority: Critical
>         Attachments: YARN-779.patch, YARN-779.patch
>
>
> If an AMRMClient places a ContainerRequest for 10 containers in node1 or node2 (assuming a single rack), the resulting ResourceRequests will be
> {code}
> location - containers
> ---------------------
> node1    - 10
> node2    - 10
> rack     - 10
> ANY      - 10
> {code}
> Assuming 5 containers are allocated in node1 and 5 containers are allocated in node2, the following ResourceRequests will be outstanding on the RM.
> {code}
> location - containers
> ---------------------
> node1    - 5
> node2    - 5
> {code}
> If the AMRMClient does a new ContainerRequest allocation, this time for 5 containers in node3, the resulting outstanding ResourceRequests on the RM will be:
> {code}
> location - containers
> ---------------------
> node1    - 5
> node2    - 5
> node3    - 5
> rack     - 5
> ANY      - 5
> {code}
> At this point, the scheduler may assign 5 containers to node1 and it will never assign the 5 containers that were requested for node3.
> AMRMClient should keep track of the outstanding allocation count per ContainerRequest and, when it gets to zero, update the RACK/ANY requests by decrementing the dangling counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira