Posted to issues@ambari.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2017/06/08 17:02:21 UTC

[jira] [Commented] (AMBARI-21204) Yarn stopped by itself after start. HA run

    [ https://issues.apache.org/jira/browse/AMBARI-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043039#comment-16043039 ] 

Hadoop QA commented on AMBARI-21204:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12872100/AMBARI-21204.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-server.

Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/11638//console

This message is automatically generated.

> Yarn stopped by itself after start. HA run
> ------------------------------------------
>
>                 Key: AMBARI-21204
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21204
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Dmytro Sen
>            Assignee: Dmytro Sen
>            Priority: Critical
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21204.patch
>
>
> From the RM logs:
> {code}
> 2017-06-07 14:23:19,191 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>         at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
> Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>         at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
>         at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 7 more
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
> {code}
> The problem is that disabling security changes the ZooKeeper ACL for the ResourceManager as part of AMBARI-19331. After the recent change in HDFS-11403, the RM checks the znode version and fails if it differs from the expected one.
> The correct fix could be to remove the znode when security is disabled, rather than breaking the election znode's consistency by manually changing its ACL to all. The RM should then create it with the proper ACL.
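
The failure mode described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `AclVersionSketch`, its methods, and the ACL strings are hypothetical and are not the ZooKeeper, Hadoop, or Ambari API. It shows only the general idea that a version-checked ACL update fails after an out-of-band ACL change, and that removing the znode lets the starting process recreate it cleanly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a versioned znode store; not the real ZooKeeper API. */
class AclVersionSketch {

    /** Raised when the caller's expected ACL version does not match the stored one. */
    static class BadVersionException extends RuntimeException {
        BadVersionException(String path) {
            super("BadVersion for " + path);
        }
    }

    /** A znode reduced to the two fields that matter for this illustration. */
    private static final class Znode {
        String acl;
        int aclVersion = 0;

        Znode(String acl) {
            this.acl = acl;
        }
    }

    private final Map<String, Znode> nodes = new HashMap<>();

    void create(String path, String acl) {
        nodes.put(path, new Znode(acl));
    }

    void delete(String path) {
        nodes.remove(path);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    /** Version-checked update: succeeds only if expectedVersion matches the stored version. */
    void setAcl(String path, String acl, int expectedVersion) {
        Znode node = nodes.get(path);
        if (node.aclVersion != expectedVersion) {
            throw new BadVersionException(path);
        }
        node.acl = acl;
        node.aclVersion++;
    }

    /** Unchecked update, standing in for an external tool that rewrites the ACL. */
    void forceSetAcl(String path, String acl) {
        Znode node = nodes.get(path);
        node.acl = acl;
        node.aclVersion++;
    }

    /** Assumed startup step: create the parent znode if missing, else do a version-checked ACL update. */
    void ensureParentZnode(String path, String acl, int expectedVersion) {
        if (!exists(path)) {
            create(path, acl);
            return;
        }
        setAcl(path, acl, expectedVersion);
    }

    public static void main(String[] args) {
        String path = "/yarn-leader-election";
        AclVersionSketch store = new AclVersionSketch();
        store.create(path, "secure-acl");

        // Out-of-band change, analogous to manually opening the ACL to all.
        store.forceSetAcl(path, "open-acl");
        try {
            store.ensureParentZnode(path, "open-acl", 0);
        } catch (BadVersionException e) {
            System.out.println("startup failed: " + e.getMessage());
        }

        // Proposed alternative: remove the znode and let startup recreate it.
        store.delete(path);
        store.ensureParentZnode(path, "open-acl", 0);
        System.out.println("recreated: " + store.exists(path));
    }
}
```

In this model the version-checked update fails only because something else modified the ACL first; deleting the znode sidesteps the check because the create path needs no expected version.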



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)