Posted to dev@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2016/03/04 18:41:40 UTC

[jira] [Created] (AMBARI-15303) New Alerts Do Not Honor Existing Maintenance Mode Setting

Jonathan Hurley created AMBARI-15303:
----------------------------------------

             Summary: New Alerts Do Not Honor Existing Maintenance Mode Setting
                 Key: AMBARI-15303
                 URL: https://issues.apache.org/jira/browse/AMBARI-15303
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.0.0
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Critical
             Fix For: 2.2.0


Alerts are "suppressed" in maintenance mode (MM) by carrying a {{maintenance_state}} attribute alongside the actual state being reported:

{code}
      "Alert": {
        "cluster_name": "c1",
        "component_name": "METRICS_COLLECTOR",
        "definition_id": 43,
        "definition_name": "ams_metrics_collector_process",
        "host_name": "c6401.ambari.apache.org",
        "id": 28,
        "instance": null,
        "label": "Metrics Collector Process",
        "latest_timestamp": 1457108946118,
        "maintenance_state": "ON",
        "original_timestamp": 1457108646099,
        "scope": "ANY",
        "service_name": "AMBARI_METRICS",
        "state": "CRITICAL",
        "text": "Connection failed: [Errno 111] Connection refused to c6401.ambari.apache.org"
      }
{code}

When a host, service, or component is placed into maintenance mode (MM), the database is updated so that every affected {{alert_current}} row has its maintenance state updated as well.
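As a rough illustration of that update (not Ambari's actual code), the sketch below models {{alert_current}} rows as plain in-memory objects. {{CurrentAlertRow}}, {{MaintenanceModeSketch}} and {{setServiceMaintenance}} are hypothetical names, and the two-value "ON"/"OFF" state is an assumption based only on the JSON above.

```java
import java.util.List;

// Hypothetical in-memory stand-in for one alert_current row.
class CurrentAlertRow {
    final String hostName;
    final String serviceName;
    final String componentName;
    String maintenanceState = "OFF"; // mirrors the "maintenance_state" field

    CurrentAlertRow(String hostName, String serviceName, String componentName) {
        this.hostName = hostName;
        this.serviceName = serviceName;
        this.componentName = componentName;
    }
}

class MaintenanceModeSketch {
    // Updates every existing row that belongs to the given service and
    // returns how many rows actually changed.
    static int setServiceMaintenance(List<CurrentAlertRow> rows,
                                     String serviceName,
                                     boolean enabled) {
        String newState = enabled ? "ON" : "OFF";
        int updated = 0;
        for (CurrentAlertRow row : rows) {
            if (serviceName.equals(row.serviceName)
                    && !newState.equals(row.maintenanceState)) {
                row.maintenanceState = newState;
                updated++;
            }
        }
        return updated;
    }
}
```

Note that this kind of update can only touch rows that already exist: a {{CurrentAlertRow}} created after the call still starts out as "OFF".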

However, this misses any alert that has no {{alert_current}} row at the time MM is set, which happens in two scenarios:
- In a brand new cluster, the alert hasn't been received yet.
- The alert definition was disabled, which removed all of its current alerts, and was then re-enabled.

In both cases, when constructing a new {{AlertCurrentEntity}}, we need to calculate the correct maintenance state.
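One possible shape for this fix, sketched under stated assumptions rather than taken from the actual patch: look up the maintenance state of the alert's host, service, and component at the moment the new current-alert record is built. All of the types below ({{SketchMaintenanceState}}, {{SketchAlertCurrent}}, {{SketchMaintenanceRegistry}}) are hypothetical stand-ins, not Ambari classes, and the simple ON/OFF state is an assumption based only on the JSON above.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified two-value state; an assumption for this sketch.
enum SketchMaintenanceState { OFF, ON }

// Hypothetical stand-in for AlertCurrentEntity.
class SketchAlertCurrent {
    final String hostName;
    final String serviceName;
    final String componentName;
    final SketchMaintenanceState maintenanceState;

    SketchAlertCurrent(String hostName, String serviceName, String componentName,
                       SketchMaintenanceState maintenanceState) {
        this.hostName = hostName;
        this.serviceName = serviceName;
        this.componentName = componentName;
        this.maintenanceState = maintenanceState;
    }
}

// Hypothetical registry of everything currently in maintenance mode.
class SketchMaintenanceRegistry {
    private final Set<String> hosts = new HashSet<>();
    private final Set<String> services = new HashSet<>();
    private final Set<String> components = new HashSet<>(); // keyed "host/component"

    void putHostInMaintenance(String host) { hosts.add(host); }
    void putServiceInMaintenance(String service) { services.add(service); }
    void putComponentInMaintenance(String host, String component) {
        components.add(host + "/" + component);
    }

    // An alert counts as ON if its host, service, or component is in MM.
    SketchMaintenanceState calculate(String host, String service, String component) {
        boolean inMaintenance = hosts.contains(host)
                || services.contains(service)
                || components.contains(host + "/" + component);
        return inMaintenance ? SketchMaintenanceState.ON : SketchMaintenanceState.OFF;
    }

    // Computes the state at construction time instead of defaulting to OFF.
    SketchAlertCurrent newAlertCurrent(String host, String service, String component) {
        return new SketchAlertCurrent(host, service, component,
                calculate(host, service, component));
    }
}
```

With this approach, an alert that first arrives after AMBARI_METRICS was placed into MM would be created with the state ON, even though no row existed when MM was set.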


