Posted to yarn-issues@hadoop.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/05/11 03:09:04 UTC

[jira] [Commented] (YARN-6583) Hadoop-sls failed to start because of premature state of RM

    [ https://issues.apache.org/jira/browse/YARN-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005829#comment-16005829 ] 

ASF GitHub Bot commented on YARN-6583:
--------------------------------------

GitHub user scutojr opened a pull request:

    https://github.com/apache/hadoop/pull/222

    YARN-6583 Hadoop-sls failed to start because of premature state of RM

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/scutojr/hadoop sls

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/222.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #222
    
----
commit 70d996ae4cd482aacfa8cdc0a4330e4433911bc1
Author: Jayce Au <ja...@outlook.com>
Date:   2017-05-10T14:04:31Z

    YARN-6583 Hadoop-sls failed to start because of premature state of RM

----


> Hadoop-sls failed to start because of premature state of RM
> -----------------------------------------------------------
>
>                 Key: YARN-6583
>                 URL: https://issues.apache.org/jira/browse/YARN-6583
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler-load-simulator
>    Affects Versions: 2.6.0
>            Reporter: JayceAu
>              Labels: easyfix
>
> During startup of SLS, after startRM() in SLSRunner.start(), BaseContainerTokenSecretManager has not yet generated its own internal key, or the key is not yet visible to other threads, so NM registration fails with the following exception. As a result, the whole SLS process crashes.
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.security.BaseContainerTokenSecretManager.getCurrentKey(BaseContainerTokenSecretManager.java:81)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.registerNodeManager(ResourceTrackerService.java:300)
>         at org.apache.hadoop.yarn.sls.nodemanager.NMSimulator.init(NMSimulator.java:105)
>         at org.apache.hadoop.yarn.sls.SLSRunner.startNM(SLSRunner.java:202)
>         at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:143)
>         at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:528)
> 17/05/11 10:21:06 INFO resourcemanager.ResourceManager: Recovery started
> 17/05/11 10:21:06 INFO recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> {noformat}
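
The race described above can be illustrated with a minimal, self-contained sketch. It is not the change made in pull request 222, and it does not use any Hadoop classes: KeyReadinessSketch, SecretManagerStub, awaitReady and register are hypothetical names invented for illustration. The sketch assumes only what the report states, namely that the key is null until startup finishes on another thread, and shows one possible way for a caller to wait for the key instead of dereferencing null.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class KeyReadinessSketch {

    /** Hypothetical stand-in for a secret manager whose key is created during startup. */
    static final class SecretManagerStub {
        // Stays null until start() runs; volatile so other threads see the write.
        private volatile String currentKey;
        private final CountDownLatch ready = new CountDownLatch(1);

        /** Simulates the startup step that generates the internal key. */
        void start() {
            currentKey = "master-key-1";
            ready.countDown();
        }

        /** Returns true once the key exists, or false if the timeout expires first. */
        boolean awaitReady(long timeoutMs) {
            try {
                return ready.await(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        String getCurrentKey() {
            return currentKey;
        }
    }

    /** Simulated node registration: waits for the key rather than reading null. */
    static String register(SecretManagerStub sm, long timeoutMs) {
        if (!sm.awaitReady(timeoutMs)) {
            throw new IllegalStateException("key not ready after " + timeoutMs + " ms");
        }
        return "registered with " + sm.getCurrentKey();
    }

    public static void main(String[] args) throws InterruptedException {
        SecretManagerStub sm = new SecretManagerStub();
        // Startup finishes on another thread slightly after registration begins.
        Thread starter = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            sm.start();
        });
        starter.start();
        System.out.println(register(sm, 5000));
        starter.join();
    }
}
```

Without the awaitReady call, register would read a null key whenever it ran before start(), which mirrors the NullPointerException in the trace above. Whether waiting, retrying, or reordering startup is the right remedy for SLS is a design decision for the actual patch.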



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
