Posted to yarn-issues@hadoop.apache.org by "Jian He (JIRA)" <ji...@apache.org> on 2014/04/12 03:03:23 UTC

[jira] [Commented] (YARN-1933) TestAMRestart and TestNodeHealthService failing sometimes on Windows

    [ https://issues.apache.org/jira/browse/YARN-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967318#comment-13967318 ] 

Jian He commented on YARN-1933:
-------------------------------

- TestAMRestart:
Removed the following check because, after we send the container-complete event, the container may be removed immediately from the liveContainers inside the schedulerAttempt, which causes an NPE:
{code}
     nm1.nodeHeartbeat(am1.getApplicationAttemptId(), 3, ContainerState.COMPLETE);
-    rm1.waitForState(nm1, containerId3, RMContainerState.COMPLETED);
{code}
Also changed some test logic to wait until the expected number of containers is reached.
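
The "wait until the expected number is reached" pattern could look roughly like the sketch below. This is not the code from the patch: WaitUtil, waitForCount, and its parameters are hypothetical names, and treating "reached" as "greater than or equal to" is an assumption.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the YARN test utilities.
final class WaitUtil {
    private WaitUtil() {}

    /**
     * Polls currentCount until it is at least expected or the timeout elapses.
     * Returns true if the expected count was reached in time, false otherwise.
     */
    static boolean waitForCount(IntSupplier currentCount, int expected,
                                long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (currentCount.getAsInt() >= expected) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

A test would then assert on the returned boolean instead of checking a single container state that may already have been cleaned up.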

- TestNodeHealthService:
Grant read and write permission on the script file, and move the close() call into a finally block.
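
A minimal sketch of that shape, assuming the script is written through a plain java.io writer; ScriptFileSketch and writeScript are hypothetical names, and the actual test code may differ. Closing in finally releases the file handle even if the write fails, which is presumably what matters on Windows, where an open handle can block later access to the file.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the code from the patch.
final class ScriptFileSketch {
    private ScriptFileSketch() {}

    static void writeScript(File script, String body) throws IOException {
        Writer writer = null;
        try {
            writer = new OutputStreamWriter(
                new FileOutputStream(script), StandardCharsets.UTF_8);
            writer.write(body);
        } finally {
            // Always release the handle, even when write() throws.
            if (writer != null) {
                writer.close();
            }
        }
        // Both calls return false if the permission could not be changed.
        script.setReadable(true);
        script.setWritable(true);
    }
}
```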

- Minor side fix in ZKRMStateStore.java: moved the error message to debug level, as I found that the createRootDir method throws NodeAlreadyExistsException if the root already exists, and the root always exists after the RM restarts.
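
The idea can be sketched as follows. Everything here is a stand-in for illustration: RootDirSketch, Store, and RootExistsException are hypothetical, java.util.logging replaces the logger used in Hadoop, and Level.FINE plays the role of debug level. None of it is the ZKRMStateStore or ZooKeeper API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-ins, not the real ZKRMStateStore code.
final class RootDirSketch {
    private static final Logger LOG = Logger.getLogger(RootDirSketch.class.getName());

    private RootDirSketch() {}

    /** Stand-in for the "node already exists" exception. */
    static class RootExistsException extends Exception {
        RootExistsException(String path) {
            super(path + " already exists");
        }
    }

    /** Stand-in for the store client that creates the node. */
    interface Store {
        void create(String path) throws RootExistsException;
    }

    /** Returns true if the root was created, false if it was already there. */
    static boolean createRootDir(Store store, String rootPath) {
        try {
            store.create(rootPath);
            return true;
        } catch (RootExistsException e) {
            // Expected after every restart, so log quietly instead of as an error.
            LOG.log(Level.FINE, "{0} znode already exists", rootPath);
            return false;
        }
    }
}
```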

> TestAMRestart and TestNodeHealthService failing sometimes on Windows
> --------------------------------------------------------------------
>
>                 Key: YARN-1933
>                 URL: https://issues.apache.org/jira/browse/YARN-1933
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-1933.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)