Posted to user@zookeeper.apache.org by Matt Wise <ma...@nextdoor.com> on 2013/03/16 17:39:32 UTC

Re: What do you use to monitor ZooKeeper Performance?

For what it's worth, we wrote a quick shell script to integrate with our collectd-based monitoring system:

#!/bin/bash

# Defaults
PORT=2181
HOSTNAME="${COLLECTD_HOSTNAME:-localhost}"
INTERVAL="${COLLECTD_INTERVAL:-5}"

# Generic function for putting a value out there
put_gauge() {
    NAME=$1
    VAL=$2

    # If the VAL or NAME are empty for some reason, we just drop the stat.
    if [ -z "$NAME" ] || [ -z "$VAL" ]; then
        return
    fi

    echo "PUTVAL \"$HOSTNAME/zookeeper/gauge-$NAME\" interval=$INTERVAL N:$VAL"
}

# Generic function for putting a timestamped counter (or other non-gauge type) value out there
put_counter() {
    NAME=$1
    VAL=$2
    TYPE="${3:-counter}"
    NOW=`date +%s`

    # If the VAL or NAME are empty for some reason, we just drop the stat.
    if [ -z "$NAME" ] || [ -z "$VAL" ]; then
        return
    fi

    echo "PUTVAL \"$HOSTNAME/zookeeper/$TYPE-$NAME\" interval=$INTERVAL $NOW:$VAL"
}

# Gather 'global' server stats from the 'srvr' four-letter word.
get_srvr() {
    RAW_STAT=`echo srvr | nc 127.0.0.1 $PORT`

    # Total server connection count
    put_gauge 'connections' `echo $RAW_STAT | egrep -o 'Connections:\ ([0-9]+)' | awk '{print $2}'`

    # Number of outstanding requests
    put_gauge 'outstanding-requests' `echo $RAW_STAT | egrep -o 'Outstanding:\ ([0-9]+)' | awk '{print $2}'`

    # Total number of zNodes registered on this node
    put_gauge 'nodes' `echo $RAW_STAT | egrep -o 'Node count:\ ([0-9]+)' | awk '{print $3}'`

    # Request latency (min/avg/max)
    LATENCY=`echo $RAW_STAT | egrep -o 'Latency min\/avg\/max:\ ([0-9]+)/([0-9]+)/([0-9]+)' | awk '{print $3}'`
    put_gauge 'latency-min' `echo $LATENCY | awk -F\/ '{print $1}'`
    put_gauge 'latency-avg' `echo $LATENCY | awk -F\/ '{print $2}'`
    put_gauge 'latency-max' `echo $LATENCY | awk -F\/ '{print $3}'`

    # Packets in and out.
    RECEIVED=`echo $RAW_STAT | egrep -o 'Received:\ ([0-9]+)' | awk '{print $2}'`
    SENT=`echo $RAW_STAT | egrep -o 'Sent:\ ([0-9]+)' | awk '{print $2}'`
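    # Only report traffic when both counters parsed; otherwise VAL would be a bare ':'.
    if [ -n "$RECEIVED" ] && [ -n "$SENT" ]; then
        put_counter 'traffic' "$RECEIVED:$SENT" 'if_packets'
    fi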
}

# Gather stats on the total number of 'watches' established
get_wchs() {
    RAW_STAT=`echo wchs | nc 127.0.0.1 $PORT`

    # Total number of watches established
    put_gauge 'local-watches-total' `echo $RAW_STAT | egrep -o 'Total watches:([0-9]+)' | awk -F: '{print $2}'`

    # Number of unique paths being watched
    put_gauge 'local-watches-unique-paths' `echo $RAW_STAT | egrep -o 'watching ([0-9]+) paths' | awk '{print $2}'`
}

# Run all of our stat collection functions and print their output every $INTERVAL seconds
while sleep "${INTERVAL}"; do
    get_srvr
    get_wchs
done
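
If you want to check the field parsing without a live server, something like the following may help. It is only a rough sketch: the sample text is an abbreviated, made-up 'srvr' response, and srvr_field is a hypothetical helper that is not part of the script above. It pulls values out by label with a single awk call instead of the egrep/awk pairs.

```shell
#!/bin/bash

# Abbreviated, hypothetical 'srvr' output; the numbers are made up for illustration.
SAMPLE_SRVR='Latency min/avg/max: 0/2/15
Received: 1024
Sent: 1023
Connections: 4
Outstanding: 0
Mode: follower
Node count: 27'

# Print the value that follows "<label>: " in srvr text read from stdin.
# (Hypothetical helper, not part of the original script.)
srvr_field() {
    awk -F': ' -v label="$1" '$1 == label { print $2; exit }'
}

CONNECTIONS=$(printf '%s\n' "$SAMPLE_SRVR" | srvr_field 'Connections')
NODES=$(printf '%s\n' "$SAMPLE_SRVR" | srvr_field 'Node count')
LATENCY=$(printf '%s\n' "$SAMPLE_SRVR" | srvr_field 'Latency min/avg/max')

# Keep only the part after the last '/' (the max latency).
LATENCY_MAX=${LATENCY##*/}

echo "connections=$CONNECTIONS nodes=$NODES latency-max=$LATENCY_MAX"
```

Against a real server you would presumably feed it `echo srvr | nc 127.0.0.1 $PORT` instead of the sample text, and hand the results to put_gauge as before.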


On Feb 24, 2013, at 4:50 AM, Erez Mazor <er...@gmail.com> wrote:

> Andrei Savu <sa...@...> writes:
> 
>> 
>> There are some scripts available that you can use to send ZooKeeper
>> related metrics to Nagios, Ganglia or Cacti.
>> 
>> See https://github.com/andreisavu/zookeeper-monitoring for more details.
>> 
>> The monitoring code will be available as a contrib in the upcoming 3.4 release.
>> 
> 
> We are using graphite with JMXTrans to monitor Zookeeper:
> 
> http://techo-ecco.com/blog/monitoring-apache-hadoop-cassandra-and-zookeeper-using-graphite-and-jmxtrans/