Posted to yarn-issues@hadoop.apache.org by "MENG DING (JIRA)" <ji...@apache.org> on 2015/06/08 18:05:01 UTC

[jira] [Updated] (YARN-1449) Protocol changes in NM side to support change container resource

     [ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

MENG DING updated YARN-1449:
----------------------------
    Attachment: YARN-1449.1.patch

Attaching patch for review.

The patch has passed the {{test-patch}} script, and includes the following changes:
* Added *ChangeContainersResourceRequest*/*ChangeContainersResourceResponse* protocol
* Added *changeContainersResource* method in *ContainerManagementProtocol*
* Updated *ContainerManagerImpl* to implement the container resource change actions
* Updated unit tests
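
For illustration only (this sketch is not taken from the patch): the notes further down say that failures are reported as a map from container ID to exception rather than as a bare list of failed IDs. A response of that shape might look roughly like the following. The class and method names here ({{ChangeResponseSketch}}, {{recordSuccess}}, {{recordFailure}}) are hypothetical stand-ins, and plain strings are assumed in place of real container IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not a class from the patch.
final class ChangeResponseSketch {
    // Containers whose resource change was accepted.
    private final List<String> succeeded = new ArrayList<>();
    // Failed containers, each mapped to the exception explaining the failure.
    private final Map<String, Exception> failed = new LinkedHashMap<>();

    void recordSuccess(String containerId) {
        succeeded.add(containerId);
    }

    void recordFailure(String containerId, Exception cause) {
        failed.put(containerId, cause);
    }

    List<String> getSucceeded() {
        return Collections.unmodifiableList(succeeded);
    }

    Map<String, Exception> getFailed() {
        return Collections.unmodifiableMap(failed);
    }

    public static void main(String[] args) {
        ChangeResponseSketch response = new ChangeResponseSketch();
        response.recordSuccess("container_01");
        response.recordFailure("container_02",
                new IllegalArgumentException("token expired"));

        // The caller can see why each container failed, not just that it did.
        for (Map.Entry<String, Exception> e : response.getFailed().entrySet()) {
            System.out.println(e.getKey() + " failed: " + e.getValue().getMessage());
        }
    }
}
```

Keeping the exception per container lets a caller handle a partially failed batch without a second round trip to find out what went wrong.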

The patch does *NOT* yet include the changes to *NodeStatus*. I would like to discuss the changes to NodeStatusProto further, especially now that we want to update the node heartbeat response to let the RM confirm the final resource changes with the NM. [~leftnoteasy], do you think it would be a good idea to reopen YARN-1644 so that I can start the discussion and post the NodeStatus patches in that thread? If you think that is not necessary, I will discuss it in this thread.

I was able to reuse a lot of the code from the original patch :-). The major differences are listed as follows:

* The *ChangeContainersResourceResponse* now returns a map from container ID to exception for failed requests, instead of a list of failed container IDs. This is consistent with other APIs.
* In {{ContainerManagerImpl.java}}
** Stricter checking of the resource change request, including checking token expiration and the RM identifier.
** Reject resource change requests with both resource increase and decrease specified for the same container in the same request.
** Check the validity of the target resource. For a decrease request, the target resource must fit in the current resource; otherwise the request is rejected right away.
** Added a {{CHANGE_CONTAINER}} event so that container resource change and nodemanager metrics updates will be routed to {{ContainerImpl}}. I believe this is more consistent with the current event model (e.g., from {{CONTAINER_LAUNCHED}} event to {{START_MONITORING_CONTAINER}}).
** Synchronize the calls to change/stop/getstatus of containers.
* In {{ContainerImpl}}
** The {{Resource}} field must now be updated after each successful resource change. It is used to detect any invalid resource change request coming from the AM.
** The nodemanager metrics need to be updated as well.
** Fire {{CHANGE_MONITORING_CONTAINER}} event in {{ContainerResourceChangeTransition}}.
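
For illustration only (not code from the patch): two of the checks listed above, rejecting a container that appears in both the increase and decrease sets, and requiring a decrease target to fit in the current resource, can be sketched as below. {{ResourceChangeSketch}}, {{Res}}, and {{validate}} are hypothetical names. The sketch also assumes a two-dimensional resource (memory and vcores), string container IDs, and string rejection reasons in place of exceptions. The token expiration and RM identifier checks are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for illustration; not the YARN classes.
final class ResourceChangeSketch {

    /** Simplified resource: memory in MB plus virtual cores (an assumption). */
    static final class Res {
        final int memoryMb;
        final int vcores;

        Res(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** True if this resource is no larger than other in every dimension. */
        boolean fitsIn(Res other) {
            return memoryMb <= other.memoryMb && vcores <= other.vcores;
        }
    }

    /**
     * Returns containerId -> rejection reason for every rejected container.
     * Token expiration and RM identifier checks are deliberately omitted.
     */
    static Map<String, String> validate(Map<String, Res> current,
                                        Map<String, Res> increases,
                                        Map<String, Res> decreases) {
        Map<String, String> rejected = new LinkedHashMap<>();

        // Rule 1: a container may not be in both the increase and decrease sets.
        Set<String> both = new HashSet<>(increases.keySet());
        both.retainAll(decreases.keySet());
        for (String id : both) {
            rejected.put(id, "both increase and decrease requested");
        }

        // Rule 2: a decrease target must fit in the container's current resource.
        for (Map.Entry<String, Res> e : decreases.entrySet()) {
            String id = e.getKey();
            if (rejected.containsKey(id)) {
                continue; // already rejected by rule 1
            }
            Res cur = current.get(id);
            if (cur == null) {
                // Assumption: unknown containers are rejected as well.
                rejected.put(id, "unknown container");
            } else if (!e.getValue().fitsIn(cur)) {
                rejected.put(id, "decrease target exceeds current resource");
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Map<String, Res> current = new HashMap<>();
        current.put("c1", new Res(2048, 2));
        current.put("c2", new Res(1024, 1));

        Map<String, Res> increases = new HashMap<>();
        increases.put("c1", new Res(4096, 2));

        Map<String, Res> decreases = new HashMap<>();
        decreases.put("c1", new Res(1024, 1)); // conflicts with the increase
        decreases.put("c2", new Res(2048, 1)); // larger than current memory

        System.out.println(validate(current, increases, decreases));
    }
}
```

Validation here only reads the current resource; in the flow described above, the stored {{Resource}} would be updated after a successful change so that later requests are compared against the new value.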

Thanks a lot.

> Protocol changes in NM side to support change container resource
> ----------------------------------------------------------------
>
>                 Key: YARN-1449
>                 URL: https://issues.apache.org/jira/browse/YARN-1449
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Wangda Tan (No longer used)
>         Attachments: YARN-1449.1.patch, yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, yarn-1449.5.patch
>
>
> As described in YARN-1197, we need to add the following API/implementation changes:
> 1) Add a "changeContainersResources" method in ContainerManagementProtocol
> 2) Return the succeeded/failed increased/decreased containers in the response of "changeContainersResources"
> 3) Add a "new decreased containers" field in NodeStatus which can help NM notify RM such changes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)