Posted to dev@ambari.apache.org by "Siddharth Wagle (JIRA)" <ji...@apache.org> on 2014/05/11 00:08:38 UTC

[jira] [Comment Edited] (AMBARI-5707) Replace Ganglia with high performant and pluggable Metrics System

    [ https://issues.apache.org/jira/browse/AMBARI-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992398#comment-13992398 ] 

Siddharth Wagle edited comment on AMBARI-5707 at 5/8/14 4:31 AM:
-----------------------------------------------------------------

h2. Proposed Architecture
Please refer to the attachment
*Legend*: Green: New components / services, Blue: Integration points, -> Arrows indicate direction of data flow.

h2. Details

*Ambari Metrics Sink*: 
A replacement for the Ganglia sink that implements the Hadoop MetricsSink interface: http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/metrics2/MetricsSink.html
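The MetricsSink interface linked above has three methods (init, putMetrics, flush). A minimal sketch of the proposed sink's buffering behavior, in Python for brevity — the real sink would be Java, and the class name, URL, and record shape here are illustrative assumptions, with the wire push stubbed out:

```python
class AmbariMetricsSink:
    """Sketch of a sink modeled on Hadoop's MetricsSink (putMetrics/flush)."""

    def __init__(self, collector_url):
        self.collector_url = collector_url   # e.g. "http://collector:6188" (hypothetical)
        self.buffer = []                     # records queued between flushes

    def put_metrics(self, record):
        """Queue one metrics record (mirrors MetricsSink.putMetrics)."""
        self.buffer.append(record)

    def flush(self):
        """Drain the buffer (mirrors MetricsSink.flush); the real sink would
        serialize the batch and push it to the collector over the wire."""
        sent, self.buffer = self.buffer, []
        return sent

sink = AmbariMetricsSink("http://collector:6188")
sink.put_metrics({"name": "rpc.queue_time", "value": 12})
batch = sink.flush()   # batch holds the queued record; buffer is empty again
```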

*Ambari Metrics Collector Service*:  A replacement for gmetad.
- Instead of writing metrics to the filesystem and reading them back in response to calls from GangliaPropertyProvider, we use a lightweight wire protocol to push metrics to a collector service, which writes them to a local DB (MySQL/Postgres/Oracle) as well as to a pluggable storage service layer.
- The write to the DB can be done in parallel with the push to a remote long-term storage and analysis solution such as OpenTSDB; the collector service uses a named pipe to do this asynchronously and independently of its own process space.
- The remote storage service provider will be expected to supply a jar file implementing a shared Sink interface for pushing metrics in real time. The vision is to allow users to extend the Sink interface and hook in their own metrics storage.
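The dual write path above can be sketched as follows — a collector that persists each metric locally and hands it to any registered pluggable sinks. This is an illustrative Python sketch, not the proposal's implementation: sqlite stands in for the local DB, class and method names are invented, and the named-pipe/async handoff is simplified to a direct call:

```python
import sqlite3

class RemoteSink:
    """The shared Sink interface a remote storage provider would implement."""
    def push(self, metric):
        raise NotImplementedError

class InMemorySink(RemoteSink):
    """Toy sink standing in for an OpenTSDB-style remote store."""
    def __init__(self):
        self.received = []
    def push(self, metric):
        self.received.append(metric)

class MetricsCollector:
    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute("CREATE TABLE metrics (host TEXT, name TEXT, value REAL, ts INTEGER)")
        self.sinks = []

    def register_sink(self, sink):
        self.sinks.append(sink)

    def collect(self, host, name, value, ts):
        # local DB write and remote push happen per metric; in the real
        # service the remote push would be asynchronous
        self.db.execute("INSERT INTO metrics VALUES (?,?,?,?)", (host, name, value, ts))
        for sink in self.sinks:
            sink.push((host, name, value, ts))

collector = MetricsCollector()
remote = InMemorySink()
collector.register_sink(remote)
collector.collect("host-1", "cpu_user", 12.5, 1000)
collector.collect("host-1", "cpu_user", 13.0, 1010)
```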

*Ambari Metrics Service*: 
- An API layer that provides access to the stored metric data and the capability to query it, along with pluggability in terms of where the fine-grained metrics data is written for long-term storage. 
- The Ambari admin can configure this to use their own metric storage and thereby configure the collectors.
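As a hypothetical sketch of what the query layer offers over the stored data — point-in-time and temporal lookups — the following uses sqlite as a stand-in for the pluggable backend; method names, the schema, and the metric names are all assumptions:

```python
import sqlite3

class AmbariMetricsService:
    """Illustrative query API over whatever store the collectors write to."""

    def __init__(self, db):
        self.db = db

    def query_range(self, name, start_ts, end_ts):
        """Temporal query: all datapoints for a metric within [start_ts, end_ts]."""
        cur = self.db.execute(
            "SELECT ts, value FROM metrics WHERE name=? AND ts BETWEEN ? AND ? ORDER BY ts",
            (name, start_ts, end_ts))
        return cur.fetchall()

    def query_point(self, name, ts):
        """Point-in-time query: latest datapoint at or before ts."""
        cur = self.db.execute(
            "SELECT ts, value FROM metrics WHERE name=? AND ts<=? ORDER BY ts DESC LIMIT 1",
            (name, ts))
        return cur.fetchone()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE metrics (host TEXT, name TEXT, value REAL, ts INTEGER)")
db.executemany("INSERT INTO metrics VALUES (?,?,?,?)",
               [("h1", "cpu_user", 10.0, 100), ("h1", "cpu_user", 20.0, 200)])
svc = AmbariMetricsService(db)
points = svc.query_range("cpu_user", 0, 300)   # both datapoints, in ts order
```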

*Host Metrics Collector Daemon*: This is a replacement for the gmond running on hosts.
- Host-level metrics such as CPU, disk, etc. are captured by the Ganglia monitor daemon. We should be able to re-purpose this to push metrics to the Ambari Metrics Collector Service.
- The long-term goal is to rewrite gmond and create our own collector to achieve the following:
-- Reduce network traffic by reducing the number of packets sent over the wire
-- Reduce the number of processes running per host for the monitoring workload
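The packet-reduction goal above can be illustrated as follows: rather than emitting one packet per metric, the host daemon samples everything and ships a single batched payload per interval. The metric names, hostname, and payload shape are invented for the sketch; a real daemon would read /proc or an OS API rather than return static values:

```python
import json
import time

def sample_host_metrics():
    # placeholder values; a real collector would sample the OS here
    return {"cpu_user": 12.5, "cpu_system": 3.1,
            "disk_free_gb": 140.2, "mem_used_mb": 4096}

def build_batch(hostname, metrics, ts):
    """One wire payload carrying every metric sampled this interval."""
    return json.dumps({"host": hostname, "ts": ts, "metrics": metrics})

# one payload instead of four separate packets for this interval
payload = build_batch("host-1.example.com", sample_host_metrics(), int(time.time()))
```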

*HA Requirements*:
Ambari Metrics Service: This is a master daemon and might have built-in HA support in the future.

*Scaling out*:
The Ambari Metrics Collector can be envisioned as a slave; a typical cluster should be able to deploy multiple instances of this service, achieving fan-out based on the number of hosts in the cluster.
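One way the fan-out could work, sketched under the assumption that each host deterministically picks one of the deployed collector instances (the hash-modulo scheme and addresses below are illustrative, not part of the proposal):

```python
import hashlib

def assign_collector(hostname, collectors):
    """Deterministically map a host to one collector instance so load
    spreads across instances as the cluster grows."""
    digest = hashlib.md5(hostname.encode("utf-8")).hexdigest()
    return collectors[int(digest, 16) % len(collectors)]

collectors = ["collector-1:6188", "collector-2:6188"]   # hypothetical endpoints
target = assign_collector("host-42.example.com", collectors)
```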




> Replace Ganglia with high performant and pluggable Metrics System
> -----------------------------------------------------------------
>
>                 Key: AMBARI-5707
>                 URL: https://issues.apache.org/jira/browse/AMBARI-5707
>             Project: Ambari
>          Issue Type: New Feature
>          Components: agent, controller
>    Affects Versions: 1.6.0
>            Reporter: Siddharth Wagle
>            Assignee: Siddharth Wagle
>            Priority: Critical
>         Attachments: MetricsSystemArch.png
>
>
> Ambari Metrics System
> - Ability to collect metrics from Hadoop and other Stack services
> - Ability to retain metrics at a high precision for a configurable time period (say 5 days)
> - Ability to automatically purge metrics after retention period
> - At collection time, provide clear integration point for external system (such as TSDB)
> - At purge time, provide clear integration point for metrics retention by external system
> - Should provide default options for external metrics retention (say “HDFS”)
> - Provide tools / utilities for analyzing metrics in retention system (say “Hive schema, Pig scripts, etc” that can be used with the default retention store “HDFS”)
> System Requirements
> - Must be portable and platform independent
> - Must not conflict with any existing metrics system (such as Ganglia)
> - Must not conflict with existing SNMP infra
> - Must not run as root
> - Must have HA story (no SPOF)
> Usage
> - Ability to obtain metrics from Ambari REST API (point in time and temporal)
> - Ability to view metric graphs in Ambari Web (currently, fixed)
> - Ability to configure custom metric graphs in Ambari Web (currently, we have metric graphs “fixed” into the UI)
> - Need to improve metric graph “navigation” in Ambari Web (currently, metric graphs do not allow navigation at arbitrary timeframes, but only at ganglia aggregation intervals) 
> - Ability to “view cluster” at point in time (i.e. see all metrics at that point)
> - Ability to define metrics (and how + where to obtain) in Stack Definitions



--
This message was sent by Atlassian JIRA
(v6.2#6252)