Posted to user@cassandra.apache.org by Schubert Zhang <zs...@gmail.com> on 2010/07/16 19:32:55 UTC

Re: Cassandra Health Monitoring

We integrate with Ganglia.

On Mon, Jun 28, 2010 at 1:53 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> short version:
>
> if o.a.c.concurrent.{ROW-READ-STAGE,ROW-MUTATION-STAGE} and
> o.a.c.db.CompactionManager have
>
>  - completed task count increasing
>  - pending tasks stable (for RRS and RMS, stable in low hundreds or
> less, for CM stable in single digits or less)
>  - the log isn't spitting out Error lines
>
> then the node is completing requests and keeping up with demand reasonably
> well.
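
A minimal sketch of the rule of thumb quoted above, in Python. It assumes the completed/pending counters have already been sampled twice (for example over JMX) and that the Error lines in the log have been counted separately. The names StageSample, PENDING_LIMITS and node_is_healthy are hypothetical, and the numeric limits (300 for "low hundreds", 9 for "single digits") are one reading of the quoted text, not official thresholds. "Stable" is simplified here to "at or below the limit".

```python
from dataclasses import dataclass


@dataclass
class StageSample:
    """One reading of a stage's counters (hypothetical container)."""
    completed: int
    pending: int


# Assumed interpretation of "low hundreds or less" and "single digits or less".
PENDING_LIMITS = {
    "ROW-READ-STAGE": 300,
    "ROW-MUTATION-STAGE": 300,
    "CompactionManager": 9,
}


def node_is_healthy(previous, current, error_lines):
    """Apply the three checks to two samples taken some interval apart.

    previous/current map a stage name to a StageSample; error_lines is the
    number of Error lines seen in the log during the interval.
    """
    if error_lines > 0:
        return False
    for name, limit in PENDING_LIMITS.items():
        before, now = previous[name], current[name]
        # Completed task count must be increasing between samples.
        if now.completed <= before.completed:
            return False
        # Pending tasks must stay at or below the assumed limit.
        if now.pending > limit:
            return False
    return True
```

How the samples are collected is left open on purpose; on a lightly loaded node the sampling interval has to be long enough for the completed counts to actually move.
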
>
> On Tue, Jun 22, 2010 at 3:41 PM, Andrew Psaltis
> <An...@webtrends.com> wrote:
> > All,
> > We have been working through some operations scenarios so that we are
> > ready to deploy our first Cassandra cluster into production in the coming
> > months. During this process our operations folks have asked us to provide a
> > Health Check service. I am using the word "service" here very liberally;
> > really we just need to provide a way for the folks in our NOC to know that
> > not only is the Cassandra process running (which they will get with their
> > monitoring tools), but that it is actually alive and well. We do not
> > intend to verify that the data is valid, just that every node in the
> > cluster that is known to be running is actually alive and healthy. My
> > questions are: What does it mean for a Cassandra node to be healthy? What
> > are the minimum things (in terms of impact on a node's performance) we can
> > check to make sure that a node is not a zombie?
> >
> > Any and all input is greatly appreciated.
> >
> > Thanks,
> > Andrew
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>