Posted to user@storm.apache.org by Matthew Lowe <gi...@gmail.com> on 2016/05/30 12:47:24 UTC

Storm monitoring

Hello all. 

What kind of monitoring solutions do you use with storm?

For example, I have a bash script that reads the JSON data from the REST UI and alerts if any bolts have high capacity values.

It's only small and hacky, but I am genuinely interested in how you all monitor your topologies.

Best Regards
Matthew Lowe

Re: Storm monitoring

Posted by Abhishek Agarwal <ab...@gmail.com>.
You can check out this tool I have built. It sends an email for *new*
errors/exceptions, or if the topology is deactivated/killed:
https://github.com/abhishekagarwal87/storm-monitor
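
A rough sketch of the same kind of check against the Storm REST API (the host,
the status handling and the mail address below are placeholders and assumptions,
not taken from the tool above):

#!/bin/bash
# Sketch only: alert if any topology in the summary is not ACTIVE.
NIMBUS_UI="http://ADDYOURHOSTNAMEHERE:8080"   # placeholder host
ALERT_EMAIL="FOO@BAR.com"                     # placeholder address

SUMMARY=$(curl -s "${NIMBUS_UI}/api/v1/topology/summary")
# Each topology entry carries a "status" field (e.g. ACTIVE, INACTIVE, KILLED).
if echo "${SUMMARY}" | json_pp | grep '"status"' | grep -qv '"ACTIVE"' ; then
    echo "${SUMMARY}" | mail -s "A Storm topology is not ACTIVE" "${ALERT_EMAIL}"
fi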

On Mon, May 30, 2016 at 6:23 PM, anshu shukla <an...@gmail.com>
wrote:

> hello ,
>
> Can you please share the bash script that you are talking about . I am
> also interested in trying that .
>
> Thanks,
> Anshu  Shukla
> IISC,Bangalore
>
>
> On Mon, May 30, 2016 at 6:17 PM, Matthew Lowe <gi...@gmail.com>
> wrote:
>
>> Hello all.
>>
>> What kind of monitoring solutions do you use with storm?
>>
>> For example I have a bash script that reads the Json data from the REST
>> UI and alerts if there are any bolts with high capacities.
>>
>> It's only small and hacky, but I am genuinely interested to how you all
>> monitor your topologies.
>>
>> Best Regards
>> Matthew Lowe
>
>
>
>
> --
> Thanks & Regards,
> Anshu Shukla
>



-- 
Regards,
Abhishek Agarwal

Re: Storm monitoring

Posted by Matthew Lowe <gi...@gmail.com>.
The script gathers the topology JSON data for each topology, collects all the
boltId and capacity values, and then checks whether each capacity is over the
limit, which in this case is 0.75. Any bolt that meets this criterion triggers
an email.



#!/bin/bash

# The maximum capacity you want to allow before alerting.
CAPACITY_LIMIT='0.75'

# The email address you would like to inform of the issue.
TRIGGER_EMAIL="FOO@BAR.com"

# Get all the topology ids from the Storm UI REST API.
TOPOLOGY_IDS=$(curl -s http://ADDYOURHOSTNAMEHERE:8080/api/v1/topology/summary | json_pp | grep '"id"' | sed 's/.* : "\(.*\)".*/\1/')

for id in ${TOPOLOGY_IDS} ; do
    TOPOLOGY=$(curl -s http://ADDYOURHOSTNAMEHERE:8080/api/v1/topology/${id}?sys=false)

    # Collect all the capacities in the topology.
    CAPACITIES=$(echo ${TOPOLOGY} | json_pp | grep '"capacity"' | sed 's/.* : "\(.*\)".*/\1/')
    CAPACITIES_ARRAY=()
    for capacity in ${CAPACITIES} ; do
        CAPACITIES_ARRAY+=("${capacity}")
    done

    # Collect all of the bolt ids in the topology.
    BOLT_IDS=$(echo ${TOPOLOGY} | json_pp | grep '"boltId"' | sed 's/.* : "\(.*\)".*/\1/')
    BOLT_IDS_ARRAY=()
    for boltId in ${BOLT_IDS} ; do
        BOLT_IDS_ARRAY+=("${boltId}")
    done

    # Check and collect bad capacities.
    BAD_CAPACITIES_ARRAY=()
    BAD_BOLT_ID_ARRAY=()
    for (( x=0; x<${#BOLT_IDS_ARRAY[@]}; x++ )) ; do
        result=$(echo "${CAPACITIES_ARRAY[$x]} > ${CAPACITY_LIMIT}" | bc -l)
        if [ "${result}" -eq "1" ] ; then
            BAD_CAPACITIES_ARRAY+=("${CAPACITIES_ARRAY[$x]}")
            BAD_BOLT_ID_ARRAY+=("${BOLT_IDS_ARRAY[$x]}")
        fi
    done

    # Send a trigger if needed.
    # The capacity and bolt id arrays will always be the same size.
    for (( x=0; x<${#BAD_CAPACITIES_ARRAY[@]}; x++ )) ; do
        mail -s "${id} is nearing capacity!" "${TRIGGER_EMAIL}" <<EOF
"${BAD_BOLT_ID_ARRAY[$x]}" has a capacity of "${BAD_CAPACITIES_ARRAY[$x]}".
This is an early warning suggesting that you look into allocating more resources to the given topology.
This email is set to trigger when a bolt capacity is more than "${CAPACITY_LIMIT}".
EOF
    done

done
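
One way to run the script above periodically (sketch only; the path and the
interval are placeholders, not something from this thread) is a cron entry:

# Hypothetical crontab entry: run the capacity check every 5 minutes.
*/5 * * * * /usr/local/bin/storm-capacity-check.sh >> /var/log/storm-capacity-check.log 2>&1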




On Mon, May 30, 2016 at 2:53 PM, anshu shukla <an...@gmail.com>
wrote:

> hello ,
>
> Can you please share the bash script that you are talking about . I am
> also interested in trying that .
>
> Thanks,
> Anshu  Shukla
> IISC,Bangalore
>
>
> On Mon, May 30, 2016 at 6:17 PM, Matthew Lowe <gi...@gmail.com>
> wrote:
>
>> Hello all.
>>
>> What kind of monitoring solutions do you use with storm?
>>
>> For example I have a bash script that reads the Json data from the REST
>> UI and alerts if there are any bolts with high capacities.
>>
>> It's only small and hacky, but I am genuinely interested to how you all
>> monitor your topologies.
>>
>> Best Regards
>> Matthew Lowe
>
>
>
>
> --
> Thanks & Regards,
> Anshu Shukla
>

Re: Storm monitoring

Posted by anshu shukla <an...@gmail.com>.
Hello,

Could you please share the bash script you are talking about? I am also
interested in trying it.

Thanks,
Anshu Shukla
IISc, Bangalore


On Mon, May 30, 2016 at 6:17 PM, Matthew Lowe <gi...@gmail.com>
wrote:

> Hello all.
>
> What kind of monitoring solutions do you use with storm?
>
> For example I have a bash script that reads the Json data from the REST UI
> and alerts if there are any bolts with high capacities.
>
> It's only small and hacky, but I am genuinely interested to how you all
> monitor your topologies.
>
> Best Regards
> Matthew Lowe




-- 
Thanks & Regards,
Anshu Shukla

Re: Storm monitoring

Posted by Julien Nioche <li...@gmail.com>.
Hi,

I wrote a MetricsConsumer for CloudWatch some time ago. The code would need
a bit of reworking; I might release it if I find the time.

For StormCrawler <http://stormcrawler.net/> topologies, our Elasticsearch
module
<https://github.com/DigitalPebble/storm-crawler/tree/master/external/elasticsearch>
has a MetricsConsumer which sends the metrics to an ES index; we use Kibana
to display the graphs and monitor the crawls.
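
For a quick look at what ends up in the index (sketch only; the host and the
index name "metrics" are assumptions, not the module's actual defaults):

# Fetch a handful of recent metric documents from Elasticsearch.
curl -s 'http://localhost:9200/metrics/_search?pretty&size=5'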

Julien

On 31 May 2016 at 14:19, Matthew Lowe <gi...@gmail.com> wrote:

> Thought you all might be interested.
>
> I have now got capacity monitoring working with AWS Cloudwatch. These are
> 2 bolts within the same topology:
>
> ​
>
>
> #! /bin/bash
>
> API_KEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
>
> # get all the topology id's
> TOPOLOGY_IDS=`curl -s
> http://XXXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/summary | json_pp |
> grep '"id"' | sed 's/.* : "\(.*\)".*/\1/'`
> for id in ${TOPOLOGY_IDS} ; do
> TOPOLOGY=`curl -s
> http://XXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/${id}?sys=false`
> <http://XXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/$%7Bid%7D?sys=false>
>
> # check the capacity for each task
> CAPACITIES=`echo ${TOPOLOGY} | json_pp | grep '"capacity"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> CAPACITIES_ARRAY=()
> for capactiy in ${CAPACITIES} ; do
> CAPACITIES_ARRAY+=("${capactiy}")
> done
>
> # get all of the bolt ids
> BOLT_IDS=`echo ${TOPOLOGY} | json_pp | grep '"boltId"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> BOLT_IDS_ARRAY=()
> for boltId in ${BOLT_IDS} ; do
> BOLT_IDS_ARRAY+=("${boltId}")
> done
>
>     # send a trigger if needed
>     # capacity and bolt id arrays will always be the same size
> for(( x=0; x<${#CAPACITIES_ARRAY[@]}; x++ ));  do
> TIMESTAMP=`date +"%Y-%m-%dT%H:%M:%S.%N"`
> /usr/local/bin/aws cloudwatch put-metric-data --metric-name
> "${BOLT_IDS_ARRAY[$x]}-capacity" --namespace "${id}" --value
> "${CAPACITIES_ARRAY[$x]}" --timestamp "${TIMESTAMP}"
> done
>
>
> done
>
> On Mon, May 30, 2016 at 5:37 PM, Radhwane Chebaane <
> r.chebaane@mindlytix.com> wrote:
>
>>
>> Using graphite API was so helpful for us, but since it doesn't support
>> *Tags* introduced since *InfluxDB 0.9*, we created a *new metric
>> Consumer <https://github.com/mathieuboniface/storm-metrics-influxdb>*
>> that supports the new InfluxDB API instead of Grahpite API. *Tags* are
>> important when filtering dashboard metrics based on components, bolt or
>> worker name.
>>
>> Regards,
>> Radhwane
>>
>>
>> On 30/05/2016 17:19, Stephen Powis wrote:
>>
>> +1 for graphite and grafana via Verisign's plugin.
>>
>> Using graphite a few years ago was a real game changer for us, and more
>> recently grafana to help build out dashboards instead of copy/pasting
>> graphite urls around.  Here's two different dashboards we have relating to
>> our storm topologies.  We're able to correlate information from all parts
>> of our app, hardware monitoring metrics (via zabbix) and of course storm.
>> Additionally we use seyren on top of graphite for our alerting as well.
>>
>> bolt specific dashboard <http://i.imgur.com/ftKtci5.png>
>>
>> Dashboard correlating lots of related information from various sources
>> <http://i.imgur.com/t7yJ8d5.jpg>
>>
>>
>>
>> On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane <
>> <r....@mindlytix.com> wrote:
>>
>>> Hi Matthew,
>>>
>>> We actually use the InfluxData <https://influxdata.com/> Stack
>>> (InfluxDb + Grafana).
>>> We send our data directly to a time-series database, *InfluxDB*
>>> <https://influxdata.com/time-series-platform/influxdb/>. Then, we
>>> visualize metrics with a customizable dashboard, *Grafana*
>>> <http://grafana.org/>.
>>> This way, you can have real-time metrics on your Storm topology. You may
>>> also add custom metrics for enhanced monitoring.
>>>
>>> To export Storm metrics to InfluxDB you can use this *MetricsConsumer *which
>>> is compatible with the latest version of InfluxDB and Storm 1.0.0:
>>> https://github.com/mathieuboniface/storm-metrics-influxdb
>>>
>>> Or you can use the old Verisign plug-in with Graphite protocol:
>>> https://github.com/verisign/storm-graphite
>>>
>>> Best regards,
>>> Radhwane CHEBAANE
>>>
>>>
>>> On 30/05/2016 14:47, Matthew Lowe wrote:
>>>
>>> Hello all.
>>>
>>> What kind of monitoring solutions do you use with storm?
>>>
>>> For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>>>
>>> It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>>>
>>> Best Regards
>>> Matthew Lowe
>>>
>>>
>>>
>>
>>
>


-- 

*Open Source Solutions for Text Engineering*

http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble <http://twitter.com/digitalpebble>

Re: Storm monitoring

Posted by Otis Gospodnetić <ot...@gmail.com>.
For anyone interested in Storm monitoring and not interested in the DIY
approach, SPM has nice built-in Storm metrics monitoring/charting/alerting:
https://sematext.com/spm/

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, May 31, 2016 at 9:19 AM, Matthew Lowe <gi...@gmail.com>
wrote:

> Thought you all might be interested.
>
> I have now got capacity monitoring working with AWS Cloudwatch. These are
> 2 bolts within the same topology:
>
> ​
>
>
> #! /bin/bash
>
> API_KEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
>
> # get all the topology id's
> TOPOLOGY_IDS=`curl -s
> http://XXXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/summary | json_pp |
> grep '"id"' | sed 's/.* : "\(.*\)".*/\1/'`
> for id in ${TOPOLOGY_IDS} ; do
> TOPOLOGY=`curl -s
> http://XXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/${id}?sys=false`
> <http://XXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/$%7Bid%7D?sys=false>
>
> # check the capacity for each task
> CAPACITIES=`echo ${TOPOLOGY} | json_pp | grep '"capacity"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> CAPACITIES_ARRAY=()
> for capactiy in ${CAPACITIES} ; do
> CAPACITIES_ARRAY+=("${capactiy}")
> done
>
> # get all of the bolt ids
> BOLT_IDS=`echo ${TOPOLOGY} | json_pp | grep '"boltId"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> BOLT_IDS_ARRAY=()
> for boltId in ${BOLT_IDS} ; do
> BOLT_IDS_ARRAY+=("${boltId}")
> done
>
>     # send a trigger if needed
>     # capacity and bolt id arrays will always be the same size
> for(( x=0; x<${#CAPACITIES_ARRAY[@]}; x++ ));  do
> TIMESTAMP=`date +"%Y-%m-%dT%H:%M:%S.%N"`
> /usr/local/bin/aws cloudwatch put-metric-data --metric-name
> "${BOLT_IDS_ARRAY[$x]}-capacity" --namespace "${id}" --value
> "${CAPACITIES_ARRAY[$x]}" --timestamp "${TIMESTAMP}"
> done
>
>
> done
>
> On Mon, May 30, 2016 at 5:37 PM, Radhwane Chebaane <
> r.chebaane@mindlytix.com> wrote:
>
>>
>> Using graphite API was so helpful for us, but since it doesn't support
>> *Tags* introduced since *InfluxDB 0.9*, we created a *new metric
>> Consumer <https://github.com/mathieuboniface/storm-metrics-influxdb>*
>> that supports the new InfluxDB API instead of Grahpite API. *Tags* are
>> important when filtering dashboard metrics based on components, bolt or
>> worker name.
>>
>> Regards,
>> Radhwane
>>
>>
>> On 30/05/2016 17:19, Stephen Powis wrote:
>>
>> +1 for graphite and grafana via Verisign's plugin.
>>
>> Using graphite a few years ago was a real game changer for us, and more
>> recently grafana to help build out dashboards instead of copy/pasting
>> graphite urls around.  Here's two different dashboards we have relating to
>> our storm topologies.  We're able to correlate information from all parts
>> of our app, hardware monitoring metrics (via zabbix) and of course storm.
>> Additionally we use seyren on top of graphite for our alerting as well.
>>
>> bolt specific dashboard <http://i.imgur.com/ftKtci5.png>
>>
>> Dashboard correlating lots of related information from various sources
>> <http://i.imgur.com/t7yJ8d5.jpg>
>>
>>
>>
>> On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane <
>> <r....@mindlytix.com> wrote:
>>
>>> Hi Matthew,
>>>
>>> We actually use the InfluxData <https://influxdata.com/> Stack
>>> (InfluxDb + Grafana).
>>> We send our data directly to a time-series database, *InfluxDB*
>>> <https://influxdata.com/time-series-platform/influxdb/>. Then, we
>>> visualize metrics with a customizable dashboard, *Grafana*
>>> <http://grafana.org/>.
>>> This way, you can have real-time metrics on your Storm topology. You may
>>> also add custom metrics for enhanced monitoring.
>>>
>>> To export Storm metrics to InfluxDB you can use this *MetricsConsumer *which
>>> is compatible with the latest version of InfluxDB and Storm 1.0.0:
>>> https://github.com/mathieuboniface/storm-metrics-influxdb
>>>
>>> Or you can use the old Verisign plug-in with Graphite protocol:
>>> https://github.com/verisign/storm-graphite
>>>
>>> Best regards,
>>> Radhwane CHEBAANE
>>>
>>>
>>> On 30/05/2016 14:47, Matthew Lowe wrote:
>>>
>>> Hello all.
>>>
>>> What kind of monitoring solutions do you use with storm?
>>>
>>> For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>>>
>>> It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>>>
>>> Best Regards
>>> Matthew Lowe
>>>
>>>
>>>
>>
>>
>

Re: Storm monitoring

Posted by Matthew Lowe <gi...@gmail.com>.
Thought you all might be interested.

I have now got capacity monitoring working with AWS CloudWatch. Below are two
bolts within the same topology:

[inline screenshot not preserved in the plain-text archive]


#!/bin/bash

# Not referenced below; the aws CLI reads credentials from its own configuration.
API_KEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

# Get all the topology ids from the Storm UI REST API.
TOPOLOGY_IDS=$(curl -s http://XXXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/summary | json_pp | grep '"id"' | sed 's/.* : "\(.*\)".*/\1/')

for id in ${TOPOLOGY_IDS} ; do
    TOPOLOGY=$(curl -s http://XXXXXXXXXXXXXXXXXXXXXXX:8080/api/v1/topology/${id}?sys=false)

    # Check the capacity for each task.
    CAPACITIES=$(echo ${TOPOLOGY} | json_pp | grep '"capacity"' | sed 's/.* : "\(.*\)".*/\1/')
    CAPACITIES_ARRAY=()
    for capacity in ${CAPACITIES} ; do
        CAPACITIES_ARRAY+=("${capacity}")
    done

    # Get all of the bolt ids.
    BOLT_IDS=$(echo ${TOPOLOGY} | json_pp | grep '"boltId"' | sed 's/.* : "\(.*\)".*/\1/')
    BOLT_IDS_ARRAY=()
    for boltId in ${BOLT_IDS} ; do
        BOLT_IDS_ARRAY+=("${boltId}")
    done

    # Push each bolt's capacity to CloudWatch.
    # The capacity and bolt id arrays will always be the same size.
    for (( x=0; x<${#CAPACITIES_ARRAY[@]}; x++ )) ; do
        TIMESTAMP=$(date +"%Y-%m-%dT%H:%M:%S.%N")
        /usr/local/bin/aws cloudwatch put-metric-data --metric-name "${BOLT_IDS_ARRAY[$x]}-capacity" \
            --namespace "${id}" --value "${CAPACITIES_ARRAY[$x]}" --timestamp "${TIMESTAMP}"
    done

done
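
To turn those metrics into alerts, one option (sketch only; the alarm name,
namespace, metric name, threshold and SNS topic below are placeholders, not
values from this thread) is a CloudWatch alarm on the pushed metric:

# Alarm when a bolt's reported capacity averages above 0.75 for two periods.
/usr/local/bin/aws cloudwatch put-metric-alarm \
    --alarm-name "mytopology-mybolt-capacity" \
    --namespace "mytopology" \
    --metric-name "mybolt-capacity" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 0.75 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "arn:aws:sns:eu-west-1:123456789012:storm-alerts"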

On Mon, May 30, 2016 at 5:37 PM, Radhwane Chebaane <r.chebaane@mindlytix.com
> wrote:

>
> Using graphite API was so helpful for us, but since it doesn't support
> *Tags* introduced since *InfluxDB 0.9*, we created a *new metric Consumer
> <https://github.com/mathieuboniface/storm-metrics-influxdb>* that
> supports the new InfluxDB API instead of Grahpite API. *Tags* are
> important when filtering dashboard metrics based on components, bolt or
> worker name.
>
> Regards,
> Radhwane
>
>
> On 30/05/2016 17:19, Stephen Powis wrote:
>
> +1 for graphite and grafana via Verisign's plugin.
>
> Using graphite a few years ago was a real game changer for us, and more
> recently grafana to help build out dashboards instead of copy/pasting
> graphite urls around.  Here's two different dashboards we have relating to
> our storm topologies.  We're able to correlate information from all parts
> of our app, hardware monitoring metrics (via zabbix) and of course storm.
> Additionally we use seyren on top of graphite for our alerting as well.
>
> bolt specific dashboard <http://i.imgur.com/ftKtci5.png>
>
> Dashboard correlating lots of related information from various sources
> <http://i.imgur.com/t7yJ8d5.jpg>
>
>
>
> On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane <
> <r....@mindlytix.com> wrote:
>
>> Hi Matthew,
>>
>> We actually use the InfluxData <https://influxdata.com/> Stack (InfluxDb
>> + Grafana).
>> We send our data directly to a time-series database, *InfluxDB*
>> <https://influxdata.com/time-series-platform/influxdb/>. Then, we
>> visualize metrics with a customizable dashboard, *Grafana*
>> <http://grafana.org/>.
>> This way, you can have real-time metrics on your Storm topology. You may
>> also add custom metrics for enhanced monitoring.
>>
>> To export Storm metrics to InfluxDB you can use this *MetricsConsumer *which
>> is compatible with the latest version of InfluxDB and Storm 1.0.0:
>> https://github.com/mathieuboniface/storm-metrics-influxdb
>>
>> Or you can use the old Verisign plug-in with Graphite protocol:
>> https://github.com/verisign/storm-graphite
>>
>> Best regards,
>> Radhwane CHEBAANE
>>
>>
>> On 30/05/2016 14:47, Matthew Lowe wrote:
>>
>> Hello all.
>>
>> What kind of monitoring solutions do you use with storm?
>>
>> For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>>
>> It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>>
>> Best Regards
>> Matthew Lowe
>>
>>
>>
>
>

Re: Storm monitoring

Posted by Radhwane Chebaane <r....@mindlytix.com>.
Using the Graphite API was very helpful for us, but since it doesn't support
*tags*, introduced in *InfluxDB 0.9*, we created a *new metrics
consumer <https://github.com/mathieuboniface/storm-metrics-influxdb>*
that supports the new InfluxDB API instead of the Graphite API. *Tags* are
important when filtering dashboard metrics by component, bolt or
worker name.
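
For illustration (sketch only; the database name, measurement and tag keys
below are assumptions, not necessarily what the consumer emits), a tagged
point written over the InfluxDB 0.9+ HTTP API looks like this:

# Write one "capacity" point tagged with topology, bolt and worker.
curl -i -XPOST 'http://localhost:8086/write?db=storm' --data-binary \
  'capacity,topology=mytopology,bolt=mybolt,worker=worker-6701 value=0.82'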

Regards,
Radhwane

On 30/05/2016 17:19, Stephen Powis wrote:
> +1 for graphite and grafana via Verisign's plugin.
>
> Using graphite a few years ago was a real game changer for us, and 
> more recently grafana to help build out dashboards instead of 
> copy/pasting graphite urls around.  Here's two different dashboards we 
> have relating to our storm topologies.  We're able to correlate 
> information from all parts of our app, hardware monitoring metrics 
> (via zabbix) and of course storm. Additionally we use seyren on top of 
> graphite for our alerting as well.
>
> bolt specific dashboard <http://i.imgur.com/ftKtci5.png>
>
> Dashboard correlating lots of related information from various sources 
> <http://i.imgur.com/t7yJ8d5.jpg>
>
>
>
> On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane 
> <r.chebaane@mindlytix.com <ma...@mindlytix.com>> wrote:
>
>     Hi Matthew,
>
>     We actually use the InfluxData <https://influxdata.com/> Stack
>     (InfluxDb + Grafana).
>     We send our data directly to a time-series database, *InfluxDB*
>     <https://influxdata.com/time-series-platform/influxdb/>. Then, we
>     visualize metrics with a customizable dashboard, *Grafana*
>     <http://grafana.org/>.
>     This way, you can have real-time metrics on your Storm topology.
>     You may also add custom metrics for enhanced monitoring.
>
>     To export Storm metrics to InfluxDB you can use this
>     *MetricsConsumer *which is compatible with the latest version of
>     InfluxDB and Storm 1.0.0:**
>     https://github.com/mathieuboniface/storm-metrics-influxdb
>
>     Or you can use the old Verisign plug-in with Graphite protocol:
>     https://github.com/verisign/storm-graphite
>
>     Best regards,
>     Radhwane CHEBAANE
>
>
>     On 30/05/2016 14:47, Matthew Lowe wrote:
>>     Hello all.
>>
>>     What kind of monitoring solutions do you use with storm?
>>
>>     For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>>
>>     It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>>
>>     Best Regards
>>     Matthew Lowe
>
>


Re: Storm monitoring

Posted by Stephen Powis <sp...@salesforce.com>.
+1 for Graphite and Grafana via Verisign's plugin.

Using Graphite a few years ago was a real game changer for us, and more
recently Grafana has helped us build out dashboards instead of copy/pasting
Graphite URLs around. Here are two different dashboards relating to
our Storm topologies. We're able to correlate information from all parts
of our app, hardware monitoring metrics (via Zabbix) and of course Storm.
We also use Seyren on top of Graphite for alerting.

bolt specific dashboard <http://i.imgur.com/ftKtci5.png>

Dashboard correlating lots of related information from various sources
<http://i.imgur.com/t7yJ8d5.jpg>



On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane <r.chebaane@mindlytix.com
> wrote:

> Hi Matthew,
>
> We actually use the InfluxData <https://influxdata.com/> Stack (InfluxDb
> + Grafana).
> We send our data directly to a time-series database, *InfluxDB*
> <https://influxdata.com/time-series-platform/influxdb/>. Then, we
> visualize metrics with a customizable dashboard, *Grafana*
> <http://grafana.org/>.
> This way, you can have real-time metrics on your Storm topology. You may
> also add custom metrics for enhanced monitoring.
>
> To export Storm metrics to InfluxDB you can use this *MetricsConsumer *which
> is compatible with the latest version of InfluxDB and Storm 1.0.0:
> https://github.com/mathieuboniface/storm-metrics-influxdb
>
> Or you can use the old Verisign plug-in with Graphite protocol:
> https://github.com/verisign/storm-graphite
>
> Best regards,
> Radhwane CHEBAANE
>
>
> On 30/05/2016 14:47, Matthew Lowe wrote:
>
> Hello all.
>
> What kind of monitoring solutions do you use with storm?
>
> For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>
> It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>
> Best Regards
> Matthew Lowe
>
>
>

Re: Storm monitoring

Posted by Radhwane Chebaane <r....@mindlytix.com>.
Hi Matthew,

We actually use the InfluxData <https://influxdata.com/> stack (InfluxDB
+ Grafana).
We send our data directly to a time-series database, *InfluxDB*
<https://influxdata.com/time-series-platform/influxdb/>. Then we
visualize the metrics with a customizable dashboard, *Grafana*
<http://grafana.org/>.
This way, you get real-time metrics on your Storm topology. You can
also add custom metrics for enhanced monitoring.

To export Storm metrics to InfluxDB you can use this *MetricsConsumer*,
which is compatible with the latest version of InfluxDB and Storm 1.0.0:
https://github.com/mathieuboniface/storm-metrics-influxdb
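
As a quick sanity check that metrics are arriving (sketch only; the database,
measurement, field and tag names are assumptions, not what the consumer
actually writes), you can query InfluxDB directly before wiring up Grafana:

# Average capacity per bolt over the last hour, via the InfluxDB HTTP API.
curl -G 'http://localhost:8086/query' \
  --data-urlencode "db=storm" \
  --data-urlencode "q=SELECT mean(value) FROM capacity WHERE time > now() - 1h GROUP BY bolt"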

Or you can use the old Verisign plug-in with Graphite protocol:
https://github.com/verisign/storm-graphite

Best regards,
Radhwane CHEBAANE

On 30/05/2016 14:47, Matthew Lowe wrote:
> Hello all.
>
> What kind of monitoring solutions do you use with storm?
>
> For example I have a bash script that reads the Json data from the REST UI and alerts if there are any bolts with high capacities.
>
> It's only small and hacky, but I am genuinely interested to how you all monitor your topologies.
>
> Best Regards
> Matthew Lowe