You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Travis Crawford (JIRA)" <ji...@apache.org> on 2010/04/19 23:54:50 UTC

[jira] Updated: (ZOOKEEPER-744) Add monitoring four-letter word

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated ZOOKEEPER-744:
--------------------------------------

    Attachment: zk-ganglia.png

Attaching a Ganglia screen capture showing an example of how this data would be used. Many monitoring systems have a way to collect & store timeseries data; exporting these data as easily-parsable plaintext will make writing importers to any monitoring system easy.

> Add monitoring four-letter word
> -------------------------------
>
>                 Key: ZOOKEEPER-744
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-744
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: server
>            Reporter: Travis Crawford
>         Attachments: zk-ganglia.png
>
>
> Filing a feature request based on a zookeeper-user discussion.
> Zookeeper should have a new four-letter word that returns key-value pairs appropriate for importing to a monitoring system (such as Ganglia which has a large installed base)
> This command should initially export the following:
> (a) Count of instances in the ensemble.
> (b) Count of up-to-date instances in the ensemble.
> But be designed such that in the future additional data can be added. For example, the output could define the statistic in a comment, then print a key "space character" value line:
> """
> # Total number of instances in the ensemble
> zk_ensemble_instances_total 5
> # Number of instances currently participating in the quorum.
> zk_ensemble_instances_active 4
> """
> From the mailing list:
> """
> Date: Mon, 19 Apr 2010 12:10:44 -0700
> From: Patrick Hunt <ph...@apache.org>
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: Recovery issue - how to debug?
> On 04/19/2010 11:55 AM, Travis Crawford wrote:
> > It would be a lot easier from the operations perspective if the leader
> > explicitly published some health stats:
> >
> > (a) Count of instances in the ensemble.
> > (b) Count of up-to-date instances in the ensemble.
> >
> > This would greatly simplify monitoring&  alerting - when an instance
> > falls behind one could configure their monitoring system to let
> > someone know and take a look at the logs.
> That's a great idea. Please enter a JIRA for this - a new 4 letter word 
> and JMX support. It would also be a great starter project for someone 
> interested in becoming more familiar with the server code.
> Patrick
> """

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.