Posted to issues@ambari.apache.org by "Dmytro Sen (JIRA)" <ji...@apache.org> on 2017/06/08 16:37:18 UTC

[jira] [Created] (AMBARI-21204) Yarn stopped by itself after start. HA run

Dmytro Sen created AMBARI-21204:
-----------------------------------

             Summary: Yarn stopped by itself after start. HA run
                 Key: AMBARI-21204
                 URL: https://issues.apache.org/jira/browse/AMBARI-21204
             Project: Ambari
          Issue Type: Bug
    Affects Versions: 2.5.1
            Reporter: Dmytro Sen
            Assignee: Dmytro Sen
            Priority: Critical
             Fix For: 2.5.2


From the RM logs:
{code}
2017-06-07 14:23:19,191 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
        at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        ... 7 more
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
{code}
The problem is that disabling security changes the ZooKeeper ACL on the ResourceManager's election znode, a step introduced by AMBARI-19331. Since the recent change in HDFS-11403, the RM checks the znode version at startup and fails if it differs from the expected one.
The correct fix could be to remove the znode when security is disabled, instead of breaking the election znode's consistency by manually changing its ACL to all. The RM should then create the znode itself with the proper ACL.
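
To make the failure mode and the proposed fix concrete, here is a minimal sketch using a toy in-memory model of a versioned znode. It is not the ZooKeeper client, the Hadoop elector, or Ambari code: `ToyZnodeStore`, `ensure_parent_znode`, the two `disable_security_*` helpers and the ACL strings are all hypothetical names made up for illustration. How the RM actually derives the version it expects is not shown in this report, so the sketch simply takes it as a parameter.

```python
PATH = "/yarn-leader-election"
SECURE_ACL = ["sasl:rm:cdrwa"]      # hypothetical ACL, for illustration only
OPEN_ACL = ["world:anyone:cdrwa"]   # "ACL to all"


class BadVersionError(Exception):
    """Toy stand-in for a BadVersion error from the coordination service."""


class ToyZnodeStore:
    """In-memory model: each znode has an ACL and an ACL version counter."""

    def __init__(self):
        self._nodes = {}

    def exists(self, path):
        return path in self._nodes

    def create(self, path, acl):
        self._nodes[path] = {"acl": list(acl), "acl_version": 0}

    def delete(self, path):
        self._nodes.pop(path, None)

    def set_acl(self, path, acl, expected_version):
        node = self._nodes[path]
        # -1 means "do not check the version"; anything else must match.
        if expected_version != -1 and expected_version != node["acl_version"]:
            raise BadVersionError(path)
        node["acl"] = list(acl)
        node["acl_version"] += 1


def ensure_parent_znode(store, path, acl, expected_version):
    """Toy RM startup step: create the znode, or set its ACL with a version check."""
    if not store.exists(path):
        store.create(path, acl)
        return "created"
    store.set_acl(path, acl, expected_version)
    return "acl-updated"


def disable_security_by_acl_change(store, path):
    """Behaviour described in this issue: open the ACL in place (bumps the version)."""
    store.set_acl(path, OPEN_ACL, -1)


def disable_security_by_delete(store, path):
    """Proposed behaviour: drop the znode and let the RM recreate it."""
    store.delete(path)
```

In this model, calling `disable_security_by_acl_change` and then `ensure_parent_znode(store, PATH, OPEN_ACL, 0)` raises `BadVersionError`, because the manual ACL change moved the version away from what the caller expects. After `disable_security_by_delete`, the same call takes the create branch and never performs a version check.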


