Posted to user@cassandra.apache.org by Daniel Doubleday <da...@gmx.net> on 2010/12/17 11:48:43 UTC

Cassandra Monitoring

Hi all,

I just wanted to share a simple way we monitor Cassandra internals with Zabbix.

We use a minimal HTTP server that reads JMX values and returns them in a properties format. Zabbix reads that every 30 seconds.
The server is started together with Cassandra:

https://gist.github.com/744761

Output looks something like:

dd@caladan[~]$ curl http://b22:9090/jmxexport
OperationMode=Normal
Load=151.379
ReadOperations=506334
WriteOperations=865867
TotalReadLatencyMicros=6663882635
TotalWriteLatencyMicros=352292885
BytesCompacted=0
BytesTotalInProgress=0
PendingTasks=0
HeapUsed=1153810280

How / what are you monitoring? Best practices someone?

Cheers,

Daniel Doubleday,
smeet.com, Berlin
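
For readers who want to post-process this kind of output, here is a minimal Python sketch. It assumes the endpoint returns one Key=Value pair per line, as in the sample above, and that TotalReadLatencyMicros and TotalWriteLatencyMicros are cumulative totals over the operations counted by ReadOperations and WriteOperations (the names suggest this, but the thread does not confirm it). The function names are made up for illustration.

```python
def parse_properties(text):
    """Parse 'Key=Value' lines into a dict, skipping blank or malformed lines."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            metrics[key] = value
    return metrics


def average_latency_micros(metrics, total_key, count_key):
    """Lifetime average latency in microseconds, or None if nothing was counted."""
    count = int(metrics[count_key])
    if count == 0:
        return None
    return int(metrics[total_key]) / count


# The sample output from the message above, copied verbatim.
SAMPLE = """\
OperationMode=Normal
Load=151.379
ReadOperations=506334
WriteOperations=865867
TotalReadLatencyMicros=6663882635
TotalWriteLatencyMicros=352292885
BytesCompacted=0
BytesTotalInProgress=0
PendingTasks=0
HeapUsed=1153810280
"""

metrics = parse_properties(SAMPLE)
read_avg = average_latency_micros(metrics, "TotalReadLatencyMicros", "ReadOperations")
write_avg = average_latency_micros(metrics, "TotalWriteLatencyMicros", "WriteOperations")
```

With the sample values this works out to a lifetime average of roughly 13.2 ms per read and 0.4 ms per write. These are averages since startup, so a poller would normally diff two successive samples to see current behaviour.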

Re: Cassandra Monitoring

Posted by Ivan Ho <ih...@evidentsoftware.com>.
Another option is Evident ClearStone (
http://www.evidentsoftware.com/products/clearstone-for-cassandra/).

It collects the Cassandra metrics via JMX as well. As long as one node in
the cluster is configured, it'll find the rest of them. The UI is written in
Adobe Flex. The Cassandra management pack comes with some pre-built
visualizations for Cassandra. However, you can easily create ad hoc
visualizations to monitor any other metric. Users can set thresholds and
alerts on JVM heap, GC, LiveSSTables, disk usage, cache hit rate,
compactions, CPU utilization, and other metrics. In addition, some of the
nodetool features have been incorporated into the UI for quick and simple
access.

In our upcoming Q1 release, we're adding support for system-level monitoring
(via SAR) for the purpose of correlating application performance with
system performance.

Check it out on our website.


On Sun, Dec 19, 2010 at 10:01 AM, Ran Tavory <ra...@gmail.com> wrote:

> FYI, I just added an mx4j section to the bottom of this page
> http://wiki.apache.org/cassandra/Operations


-- 
----------------------------------------------------
Ivan Ho <ih...@evidentsoftware.com>  <http://www.linkedin.com/in/ivan1818>
CTO/Co-Founder
Evident Software, Inc. <http://www.evidentsoftware.com>
<http://twitter.com/evidentsoftware>
Office: 973-622-5656 ext. 288
----------------------------------------------------

Re: Cassandra Monitoring

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, Jul 14, 2011 at 8:58 AM, Albert Vila <av...@imente.com> wrote:

> Does anyone have Cassandra Cacti templates for server 0.7.4+?


http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp

There is some preliminary support for 0.7.X, but I have not ported over all
the graphs yet. Look for updates over the next couple of days.

Edward

Re: Cassandra Monitoring

Posted by Albert Vila <av...@imente.com>.
Does anyone have Cassandra Cacti templates for server 0.7.4+?




-- 
Albert Vila Puig
<av...@imente.com>
iMente.com <http://www.imente.com>

Re: Cassandra Monitoring

Posted by Edward Capriolo <ed...@gmail.com>.
On Sun, Dec 19, 2010 at 10:37 PM, Dave Viner <da...@gmail.com> wrote:
> Can you share the code for run_column_family_stores.sh ?

That script is just a wrapper:

For example we can have our NMS directly call the JMX fetch code like this:
java -cp /usr/lib64/nagios/plugins/cassandra-cacti-m6.jar \
  com.jointhegrid.m6.cassandra.CFStores \
  service:jmx:rmi:///jndi/rmi://<host>:<port>/jmxrmi <user> <pass> \
  org.apache.cassandra.db:columnfamily=<columnfamily>,keyspace=<keyspace>,type=ColumnFamilyStores

But as mentioned, this puts a lot of pressure on the monitoring node to
open up all these JMX connections. With NRPE I can "farm" the requests
out, so nodes end up executing their checks locally.

# cat /usr/lib64/nagios/plugins/run_column_family_stores.sh
java -cp /usr/lib64/nagios/plugins/cassandra-cacti-m6.jar \
  com.jointhegrid.m6.cassandra.CFStores \
  "service:jmx:rmi:///jndi/rmi://${1}:${2}/jmxrmi" "${3}" "${4}" \
  "org.apache.cassandra.db:columnfamily=${5},keyspace=${6},type=ColumnFamilyStores"

All the code is up here:
http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
http://www.jointhegrid.com/svn/cassandra-cacti-m6/trunk/src/com/jointhegrid/m6/cassandra/CFStores.java

My main goal was to point out that you do not need REST bridges and
embedded web servers to run JMX checks remotely.
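
To make the argument mapping of the wrapper explicit, here is a minimal Python sketch of the same idea: six positional values become a JMX service URL, credentials, and an ObjectName. The jar path and class name are taken from the message above. The function name and the example host, port, keyspace, and column family are hypothetical, and the sketch only builds the argument list; it does not run java.

```python
JAR = "/usr/lib64/nagios/plugins/cassandra-cacti-m6.jar"
MAIN_CLASS = "com.jointhegrid.m6.cassandra.CFStores"


def build_cfstores_command(host, port, user, password, column_family, keyspace):
    """Build the java argument list the wrapper script would execute.

    The argument order mirrors $1..$6 in run_column_family_stores.sh.
    """
    jmx_url = f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"
    object_name = (
        "org.apache.cassandra.db:"
        f"columnfamily={column_family},keyspace={keyspace},type=ColumnFamilyStores"
    )
    # A list (not a single string) avoids shell word splitting of the values.
    return ["java", "-cp", JAR, MAIN_CLASS, jmx_url, user, password, object_name]


# Hypothetical values, for illustration only.
example = build_cfstores_command("cass1", 8080, "monitor", "secret", "Users", "App")
```

The resulting list could be handed to subprocess.run on the node itself, which is the point of the NRPE setup: the JMX connection is opened locally rather than from the monitoring station.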

Re: Cassandra Monitoring

Posted by Dave Viner <da...@gmail.com>.
Can you share the code for run_column_family_stores.sh ?

On Sun, Dec 19, 2010 at 6:14 PM, Edward Capriolo <ed...@gmail.com> wrote:

> There is a lot of overhead on your monitoring station to kick up so
> many JMX connections. There can also be nat/hostname problems for
> remote JMX.
>
> My solution is to execute JMX over nagios remote plugin executor (NRPE).
>
> command[run_column_family_stores]=/usr/lib64/nagios/plugins/run_column_family_stores.sh
> $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$
>
> Maybe not as fancy as a rest-jmx bridge, but solves most of the RMI
> issues involved in pulling stats over JMX,
>

Re: Cassandra Monitoring

Posted by Edward Capriolo <ed...@gmail.com>.
On Sun, Dec 19, 2010 at 2:01 PM, Ran Tavory <ra...@gmail.com> wrote:
> Mx4j is in process, same jvm, you just need to throw mx4j-tools.jar in
> the lib before you start Cassandra jmx-to-rest runs in a separate jvm.
>  It also has a nice useful HTML interface that you can look into any
> running host.

There is a lot of overhead on your monitoring station to kick up so
many JMX connections. There can also be NAT/hostname problems for
remote JMX.

My solution is to execute JMX over the Nagios Remote Plugin Executor (NRPE):

command[run_column_family_stores]=/usr/lib64/nagios/plugins/run_column_family_stores.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$

Maybe not as fancy as a REST-JMX bridge, but it solves most of the RMI
issues involved in pulling stats over JMX.

Re: Cassandra Monitoring

Posted by Ran Tavory <ra...@gmail.com>.
mx4j runs in process, in the same JVM; you just need to drop mx4j-tools.jar
into the lib directory before you start Cassandra. jmx-to-rest runs in a
separate JVM. It also has a nice, useful HTML interface that lets you look
into any running host.

On Sunday, December 19, 2010, Dave Viner <da...@gmail.com> wrote:
> How does mx4j compare with the earlier jmx-to-rest bridge listed in the operations page:
> "JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge"
>
> Thanks
> Dave Viner

-- 
/Ran

Re: Cassandra Monitoring

Posted by Dave Viner <da...@gmail.com>.
How does mx4j compare with the earlier jmx-to-rest bridge listed in the
operations page:

"JMX-to-REST bridge available at
http://code.google.com/p/polarrose-jmx-rest-bridge"

Thanks
Dave Viner


On Sun, Dec 19, 2010 at 7:01 AM, Ran Tavory <ra...@gmail.com> wrote:

> FYI, I just added an mx4j section to the bottom of this page
> http://wiki.apache.org/cassandra/Operations

Re: Cassandra Monitoring

Posted by Ran Tavory <ra...@gmail.com>.
FYI, I just added an mx4j section to the bottom of this page
http://wiki.apache.org/cassandra/Operations


On Sun, Dec 19, 2010 at 4:30 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> mx4j? https://issues.apache.org/jira/browse/CASSANDRA-1068



-- 
/Ran

Re: Cassandra Monitoring

Posted by Jonathan Ellis <jb...@gmail.com>.
mx4j? https://issues.apache.org/jira/browse/CASSANDRA-1068

On Sun, Dec 19, 2010 at 8:36 AM, Peter Schuller <peter.schuller@infidyne.com> wrote:

> > How / what are you monitoring? Best practices someone?
>
> I recently set up monitoring using the cassandra-munin-plugins
> (https://github.com/jamesgolick/cassandra-munin-plugins). However, due
> to various little details that wasn't too fun to integrate properly
> with munin-node-configure and automated configuration management. A
> problem is also the starting of a JVM for each use of jmxquery, which
> can become a problem with many column families.
>
> I like your web server idea. Something persistent that can sit there
> and do the JMX acrobatics, and expose something more easily consumed
> for stuff like munin/zabbix/etc. It would be pretty nice to have that
> out of the box with Cassandra, though I expect that would be
> considered bloat. :)
>
> --
> / Peter Schuller
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Cassandra Monitoring

Posted by Adrian Cockcroft <ac...@netflix.com>.
I'm currently working to configure AppDynamics to monitor cassandra. It
does byte-code instrumentation, so there is an agent added to the
cassandra JVM, which gives the ability to capture latency for requests and
see where the bottleneck is coming from. We have been using it on our
other Java apps. They have a free version to try it out. It doesn't track
thrift calls out of the box, but I'm encouraging AD to figure out a way to
do that, and working on a config for capturing the entry points in the
meantime.

The way the page cache works is that pages stay in memory linked to a
specific file. If you delete that file, the pages are all considered
invalid at that point, so they get zeroed out and go to the start of the free
list. So compaction creates a new file first (which is competing with
existing read traffic to try and keep its pages in memory) then removes
the old files that were being merged, so at that point there is a supply
of blank pages, but disk reads will be needed to warm up the cache again.
The use case that I'm working with is more like a persistent memcached
replacement, so we are trying to have more RAM than data on m2.4xl EC2
instances (~70GB) and keep all reads in memory all the time.

Adrian

On 12/19/10 5:36 AM, "Peter Schuller" <pe...@infidyne.com> wrote:

>> How / what are you monitoring? Best practices someone?
>
>I recently set up monitoring using the cassandra-munin-plugins
>(https://github.com/jamesgolick/cassandra-munin-plugins). However, due
>to various little details that wasn't too fun to integrate properly
>with munin-node-configure and automated configuration management. A
>problem is also the starting of a JVM for each use of jmxquery, which
>can become a problem with many column families.
>
>I like your web server idea. Something persistent that can sit there
>and do the JMX acrobatics, and expose something more easily consumed
>for stuff like munin/zabbix/etc. It would be pretty nice to have that
>out of the box with Cassandra, though I expect that would be
>considered bloat. :)
>
>-- 
>/ Peter Schuller
>


Re: Cassandra Monitoring

Posted by Peter Schuller <pe...@infidyne.com>.
> How / what are you monitoring? Best practices someone?

I recently set up monitoring using the cassandra-munin-plugins
(https://github.com/jamesgolick/cassandra-munin-plugins). However, due
to various little details, it wasn't much fun to integrate properly
with munin-node-configure and automated configuration management.
Another problem is that a JVM is started for each use of jmxquery, which
can become a problem with many column families.

I like your web server idea. Something persistent that can sit there
and do the JMX acrobatics, and expose something more easily consumed
for stuff like munin/zabbix/etc. It would be pretty nice to have that
out of the box with Cassandra, though I expect that would be
considered bloat. :)

-- 
/ Peter Schuller
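
As an illustration of the "something persistent" idea, here is a minimal Python sketch of the HTTP side of such an exporter. It is not the code from the gist mentioned earlier in the thread. The JMX read is replaced by a stand-in function, read_metrics, that returns fixed values, and all names, the /jmxexport path handling, and the default port are assumptions modelled on the sample curl output.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_metrics():
    """Stand-in for the real JMX read (assumption: returns a flat dict)."""
    return {"OperationMode": "Normal", "PendingTasks": 0}


def render_properties(metrics):
    """Render a dict as sorted 'Key=Value' lines, one metric per line."""
    return "".join(f"{key}={metrics[key]}\n" for key in sorted(metrics))


class ExportHandler(BaseHTTPRequestHandler):
    """Serve the current metrics as plain text on /jmxexport."""

    def do_GET(self):
        if self.path != "/jmxexport":
            self.send_error(404)
            return
        body = render_properties(read_metrics()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=9090, host="127.0.0.1"):
    """Build the exporter; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), ExportHandler)
```

Running make_server().serve_forever() keeps one long-lived process per node, so a poller such as munin or zabbix only pays for a cheap HTTP GET instead of starting a JVM per query.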

Re: Cassandra Monitoring

Posted by Dan Kuebrich <da...@gmail.com>.
Is anyone using cassandra with monit?  All I have is this embarrassing bit
of monit config:

check process cassandra with pidfile /var/run/cassandra.pid
  start program = "/etc/init.d/cassandra start" with timeout 60 seconds
  stop program  = "/etc/init.d/cassandra stop"
  if failed port 9160 type tcp
     with timeout 15 seconds
     then restart
  if 3 restarts within 5 cycles then timeout
  group server

I'm sure there are some good numbers available via JMX to alert on as well, but
I'm not sure of the best way to poll them from monit.  Comments/contributions
appreciated.

dan
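
The monit rule above only verifies that port 9160 accepts a TCP connection within 15 seconds. For pollers other than monit, the same check can be sketched in a few lines of Python. The function name is made up for illustration, and the default timeout simply copies the value from the monit config.

```python
import socket


def port_is_open(host, port, timeout=15.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("127.0.0.1", 9160) would mirror the monit test. Like the monit rule, this only shows that the process is listening, not that the node is healthy.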

On Fri, Dec 17, 2010 at 11:03 AM, Edward Capriolo <ed...@gmail.com> wrote:

> On Fri, Dec 17, 2010 at 5:48 AM, Daniel Doubleday
> <da...@gmx.net> wrote:
> > Hi all
> > I just wanted to share a simple way we monitor Cassandra internals with
> > Zabbix.
> > We use a minimal HTTP server that reads JMX values and returns them in a
> > properties format. Zabbix reads that every 30 seconds.
> > The server is started together with Cassandra:
> > https://gist.github.com/744761
> > Output looks something like:
> > dd@caladan[~]$ curl http://b22:9090/jmxexport
> > OperationMode=Normal
> > Load=151.379
> > ReadOperations=506334
> > WriteOperations=865867
> > TotalReadLatencyMicros=6663882635
> > TotalWriteLatencyMicros=352292885
> > BytesCompacted=0
> > BytesTotalInProgress=0
> > PendingTasks=0
> > HeapUsed=1153810280
> > How / what are you monitoring? Best practices someone?
> > Cheers,
> > Daniel Doubleday,
> > smeet.com, Berlin
>
> Using Cacti and http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
> Many people are using Munin; there is good support there.
>
> Best Practices:
> Monitor SSTable sizes and growth.
> Monitor reads/writes per second.
> Monitor cache hit rate.
> Monitor compactions (what % of the day an average node is compacting).
> Monitor SSTable count (make sure you do not have too many).
> Monitor IO wait (make sure you are not disk bound).
> Monitor JVM memory (make sure you have some headroom for bursts of traffic).
>

Re: Cassandra Monitoring

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Dec 17, 2010 at 5:48 AM, Daniel Doubleday
<da...@gmx.net> wrote:
> Hi all
> I just wanted to share a simple way we monitor Cassandra internals with
> Zabbix.
> We use a minimal HTTP server that reads JMX values and returns them in a
> properties format. Zabbix reads that every 30 seconds.
> The server is started together with Cassandra:
> https://gist.github.com/744761
> Output looks something like:
> dd@caladan[~]$ curl http://b22:9090/jmxexport
> OperationMode=Normal
> Load=151.379
> ReadOperations=506334
> WriteOperations=865867
> TotalReadLatencyMicros=6663882635
> TotalWriteLatencyMicros=352292885
> BytesCompacted=0
> BytesTotalInProgress=0
> PendingTasks=0
> HeapUsed=1153810280
> How / what are you monitoring? Best practices someone?
> Cheers,
> Daniel Doubleday,
> smeet.com, Berlin

Using Cacti and http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
Many people are using Munin; there is good support there.

Best Practices:
Monitor SSTable sizes and growth.
Monitor reads/writes per second.
Monitor cache hit rate.
Monitor compactions (what % of the day an average node is compacting).
Monitor SSTable count (make sure you do not have too many).
Monitor IO wait (make sure you are not disk bound).
Monitor JVM memory (make sure you have some headroom for bursts of traffic).
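
Two of the items above (reads/writes per second and cache hit rate) are derived values rather than raw counters. A minimal Python sketch of the arithmetic follows, assuming the poller sees cumulative counters at a known interval. The function names are hypothetical, and any alert thresholds are left to the reader.

```python
def per_second_rate(prev_count, curr_count, interval_seconds):
    """Operations per second between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_count - prev_count
    if delta < 0:
        # The counter went backwards, typically after a restart; report unknown.
        return None
    return delta / interval_seconds


def hit_rate(hits, requests):
    """Cache hit rate as a fraction between 0 and 1, or None with no requests."""
    if requests == 0:
        return None
    return hits / requests
```

For example, a read counter that moves from 506334 to 509334 over a 30 second polling interval corresponds to 100 reads per second.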