Posted to issues@ambari.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2017/07/28 08:52:00 UTC

[jira] [Commented] (AMBARI-21593) RU: AMS stopped after RU [AMS distributed mode]

    [ https://issues.apache.org/jira/browse/AMBARI-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104676#comment-16104676 ] 

Hadoop QA commented on AMBARI-21593:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12879292/AMBARI-21593.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-metrics/ambari-metrics-timelineservice.

Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/11882//console

This message is automatically generated.

> RU: AMS stopped after RU [AMS distributed mode]
> -----------------------------------------------
>
>                 Key: AMBARI-21593
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21593
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-metrics
>    Affects Versions: 2.5.2
>            Reporter: Aravindan Vijayan
>            Assignee: Aravindan Vijayan
>            Priority: Blocker
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21593.patch
>
>
> *PROBLEM*
> When 2 metric collectors are started up simultaneously, both of them fail to start.
> *BUG*
> There is a race condition in the Metric Collector HA controller initialization, introduced through AMBARI-20179. When a Helix controller instance finds that the /ambari-metrics-collector znode exists but a child node does not exist, it deletes the entire znode and recreates it. If another controller instance initializes at the same time, each instance can end up cancelling the work of the other.
> *FIX*
> Do not delete and recreate the znode. Wait and retry for a few seconds to check whether /ambari-metrics-collector was fully initialized.
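
For illustration only, the wait-and-retry idea in the FIX section could be sketched roughly as below. This is not the code from AMBARI-21593.patch: the FakeZk class is an in-memory stand-in rather than a real ZooKeeper or Helix client, and the child znode name, retry count, and delay are assumptions made up for the example.

```python
import time

ROOT = "/ambari-metrics-collector"   # znode path named in the issue
CHILD = ROOT + "/CONTROLLER"         # hypothetical child name, illustration only


class FakeZk:
    """In-memory stand-in for a znode tree; not a real ZooKeeper client."""

    def __init__(self):
        self.nodes = set()

    def exists(self, path):
        return path in self.nodes

    def create(self, path):
        self.nodes.add(path)


def wait_until_initialized(zk, retries=5, delay_s=0.01):
    """Poll for the child znode instead of deleting and recreating the root.

    Returns True once the child exists, False if it never appears
    within the allowed number of retries. The root znode is never deleted.
    """
    for _ in range(retries):
        if zk.exists(CHILD):
            return True
        time.sleep(delay_s)  # give the other controller time to finish
    return zk.exists(CHILD)
```

The point of the sketch is that a controller that sees a partially initialized root only reads and waits, so it cannot undo the work of a peer that is still creating child nodes.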



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)