Posted to yarn-issues@hadoop.apache.org by "Rohith (JIRA)" <ji...@apache.org> on 2014/07/01 07:50:25 UTC

[jira] [Updated] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down

     [ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1366:
-------------------------

    Attachment: YARN-1366.9.patch

Thanks for reviewing the patch. I have updated it as per the comments. Please review the updated patch.

bq. “blacklistRemovals.addAll(blacklistToRemove);”, we don't need to add this in isResyncCommand check? as RM after restart will just forget all previously blacklisted nodes.
DONE.

bq. below code needs synchronize ? 
Yes

bq. “isApplicationMasterRegistered = false;” not needed in allocate and unregisterApplicationMaster.
DONE, removed this variable itself since it is not used.

bq. we may check pendingRelease isEmpty as well to avoid unnecessary loops
DONE

bq. Instead of adding a new core-site.xml file, we can just set the config in the test code conf object.
core-site.xml has to be there because SecurityUtil.java loads its configuration during class loading, so the setting cannot be passed through the conf object.
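
To illustrate the class-loading point in general terms, here is a minimal Java sketch. It does not use the real SecurityUtil or Configuration classes; `StaticInitSketch`, `Util`, `CLASSPATH_DEFAULTS` and the `mode` key are hypothetical names chosen only for this example. It shows that a value read in a static initializer is fixed when the class is first initialized, so a conf object built later in a test is never consulted.

```java
import java.util.HashMap;
import java.util.Map;

class StaticInitSketch {
    // Hypothetical stand-in for settings found on the classpath (e.g. in core-site.xml).
    static final Map<String, String> CLASSPATH_DEFAULTS = new HashMap<>();

    // Hypothetical stand-in for a utility class that reads its configuration
    // inside a static initializer, i.e. once, when the class is initialized.
    static class Util {
        static final String MODE;
        static {
            MODE = CLASSPATH_DEFAULTS.getOrDefault("mode", "default");
        }
    }

    public static void main(String[] args) {
        // Whatever is visible before the first use of Util is what Util sees.
        CLASSPATH_DEFAULTS.put("mode", "from-classpath");
        String seenAtInit = Util.MODE; // first use triggers the static initializer

        // A conf object built afterwards (as test code would do) is never consulted,
        // and later changes to the defaults do not affect the captured value.
        Map<String, String> testConf = new HashMap<>();
        testConf.put("mode", "from-test-conf");
        CLASSPATH_DEFAULTS.put("mode", "changed-later");

        System.out.println(seenAtInit.equals(Util.MODE));
    }
}
```

Under this assumption, the only way for a test to influence the value is to make it visible before the class is initialized, which is what a core-site.xml on the test classpath does.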

> AM should implement Resync with the ApplicationMasterService instead of shutting down
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, YARN-1366.4.patch, YARN-1366.5.patch, YARN-1366.6.patch, YARN-1366.7.patch, YARN-1366.8.patch, YARN-1366.9.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response to which the AM responds by shutting down. The AM behavior is expected to change to resyncing with the RM. Resync means resetting the allocate RPC sequence number to 0, and the AM should resend its entire set of outstanding requests to the RM. Note that if the AM is making its first allocate call to the RM then things should proceed as normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once.
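
The resync behavior described above can be sketched roughly as follows. This is a simplified illustration, not the code in the attached patch and not the real AMRMClient API: `ResyncSketch`, its fields and its methods are hypothetical names, and requests and container IDs are modeled as plain strings. It assumes the AM keeps a record of every outstanding request so that it can resend all of them, and that it de-duplicates completion reports because the RM may deliver some more than once.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified AM-side client state (not the real YARN classes).
class ResyncSketch {
    private int lastResponseId = 0;                                   // allocate RPC sequence number
    private final List<String> outstandingAsks = new ArrayList<>();   // all requests not yet satisfied
    private final List<String> pendingAsks = new ArrayList<>();       // requests to send on the next allocate
    private final Set<String> completedSeen = new LinkedHashSet<>();  // completions already processed

    void addRequest(String ask) {
        outstandingAsks.add(ask);
        pendingAsks.add(ask);
    }

    // Simulates one allocate call: sends the pending requests and advances the sequence number.
    List<String> allocate() {
        List<String> sent = new ArrayList<>(pendingAsks);
        pendingAsks.clear();
        lastResponseId++;
        return sent;
    }

    // On a resync response: reset the sequence number to 0 and queue the
    // entire set of outstanding requests for the next allocate call.
    void onResync() {
        lastResponseId = 0;
        pendingAsks.clear();
        pendingAsks.addAll(outstandingAsks);
    }

    // Returns true only the first time a completion is seen, so that
    // completions reported more than once are handled once.
    boolean onContainerCompleted(String containerId) {
        return completedSeen.add(containerId);
    }

    int getLastResponseId() {
        return lastResponseId;
    }

    public static void main(String[] args) {
        ResyncSketch am = new ResyncSketch();
        am.addRequest("ask-1");
        am.addRequest("ask-2");
        am.allocate();                      // normal allocate, requests sent once
        am.onResync();                      // RM asked the AM to resync
        System.out.println(am.allocate());  // both outstanding requests are sent again
    }
}
```

In this sketch an AM making its first allocate call already has sequence number 0 and all of its requests pending, so it needs no special handling.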



--
This message was sent by Atlassian JIRA
(v6.2#6252)