Posted to yarn-issues@hadoop.apache.org by "Li Lu (JIRA)" <ji...@apache.org> on 2014/07/26 01:32:40 UTC

[jira] [Updated] (YARN-2354) DistributedShell may allocate more containers than client specified after it restarts

     [ https://issues.apache.org/jira/browse/YARN-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lu updated YARN-2354:
------------------------

    Attachment: YARN-2354-072514.patch

The problem was with numRequestedContainers. In the previous version, it was initially set to numTotalContainers - previousAMRunningContainers.size(). Then, on container completion, the number of containers that need to be relaunched is calculated as numTotalContainers - numRequestedContainers, which normally equals previousAMRunningContainers.size(), so that many extra containers are requested. If containers are not reused (no -keep_containers_across_application_attempts), there are no previousAMRunningContainers, so this problem only occurs when -keep_containers_across_application_attempts is set.
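
To make the arithmetic concrete, here is a minimal standalone sketch. It is not the ApplicationMaster code or the attached patch: the class RequestedContainersSketch and the helper extraAsksOnCompletion are hypothetical names, and the example assumes 4 total containers with 1 kept from the previous attempt. It only models the counter described above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the counter arithmetic described above.
// Hypothetical names; this is not the DistributedShell ApplicationMaster.
public class RequestedContainersSketch {

    // Returns how many additional containers would be asked for on a
    // container-completion event, given how the counter was initialized.
    static int extraAsksOnCompletion(int numTotalContainers, int initialRequested) {
        AtomicInteger numRequestedContainers = new AtomicInteger(initialRequested);
        // Relaunch count as described: total minus what has been requested so far.
        int askCount = numTotalContainers - numRequestedContainers.get();
        numRequestedContainers.addAndGet(askCount);
        return askCount;
    }

    public static void main(String[] args) {
        int numTotalContainers = 4;  // assumed value of -num_containers
        int previousRunning = 1;     // assumed containers kept from the previous attempt

        // Old initialization: total minus containers kept from the previous AM.
        int oldInit = extraAsksOnCompletion(
                numTotalContainers, numTotalContainers - previousRunning);

        // Initialization suggested in the issue description: the full total.
        int newInit = extraAsksOnCompletion(numTotalContainers, numTotalContainers);

        System.out.println("old initialization, extra asks: " + oldInit);
        System.out.println("initialized to total, extra asks: " + newInit);
    }
}
```

With the old initialization, the first completion event asks for as many extra containers as were kept from the previous attempt. With the counter initialized to the total, it asks for none.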

I'm also fixing the testDSRestartWithPreviousRunningContainers UT associated with this issue. 

> DistributedShell may allocate more containers than client specified after it restarts
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-2354
>                 URL: https://issues.apache.org/jira/browse/YARN-2354
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Li Lu
>         Attachments: YARN-2354-072514.patch
>
>
> To reproduce, run distributed shell with the -num_containers option.
> In ApplicationMaster.java, the following code has an issue:
> {code}
>     int numTotalContainersToRequest =
>         numTotalContainers - previousAMRunningContainers.size();
>     for (int i = 0; i < numTotalContainersToRequest; ++i) {
>       ContainerRequest containerAsk = setupContainerAskForRM();
>       amRMClient.addContainerRequest(containerAsk);
>     }
>     numRequestedContainers.set(numTotalContainersToRequest);
> {code}
> numRequestedContainers doesn't account for the previous AM's requested containers, so numRequestedContainers should be set to numTotalContainers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)