Posted to user@hbase.apache.org by Peter Wolf <op...@gmail.com> on 2012/02/24 23:16:53 UTC

Ready to Deploy... Need Monitoring (2)

OK I have Ganglia up and running.  I can see tons of metrics.  Quite 
overwhelming...

I want to monitor my system for potential trouble.  Can anyone suggest a 
top ten that I should watch?

For example, "HBase: The Definitive Guide" states "The compaction queue 
size is another recommended early indicator of trouble..." and "Similar 
to the compaction queue you will see a sharp rise in count for the flush 
queue when, for example, your servers are under I/O duress..."
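A "sharp rise" in a queue metric can be checked mechanically. The following is only a minimal sketch, not anything taken from the book or from HBase itself: it assumes the compaction or flush queue size is already being sampled at a fixed interval, and the class name, window, factor, and floor values are all hypothetical and would need tuning per cluster.

```python
from collections import deque


class RiseDetector:
    """Flags a sample that jumps well above the average of the recent window.

    All parameters are illustrative assumptions, not HBase defaults.
    """

    def __init__(self, window=5, factor=3.0, floor=10):
        self.samples = deque(maxlen=window)  # most recent queue sizes
        self.factor = factor                 # how many times the baseline counts as "sharp"
        self.floor = floor                   # ignore jumps while the queue is still tiny

    def observe(self, value):
        """Record one sample; return True if it looks like a sharp rise."""
        history = list(self.samples)
        self.samples.append(value)
        if len(history) < self.samples.maxlen:
            return False  # not enough history to form a baseline yet
        baseline = sum(history) / len(history)
        return value >= self.floor and value > self.factor * max(baseline, 1.0)


detector = RiseDetector(window=3, factor=3.0, floor=10)
results = [detector.observe(v) for v in [1, 2, 1, 2, 40]]
print(results)
```

Comparing against a rolling baseline rather than a fixed limit is one way to catch the "sharp rise" the book describes even when the normal queue depth differs between clusters.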

Is there a list posted somewhere?  What do others use?

Thanks
Peter


On 2/24/12 2:48 PM, Tom wrote:
>
>
> On 02/24/2012 11:20 AM, Peter Wolf wrote:
>> Hello again all,
>>
>> We have had a very successful time with HBase and are now ready to
>> deploy.
>
> If I get this right, you were doing your first installation on HBase 
> only 45 days ago? If so, that is impressive progress; congratulations 
> and all the best for your launch!
>
>
>> Our application needs to deal with millions of interactions per
>> day, and is hosted on Amazon. We are currently using a 3 machine cluster
>> for HBase.
>>
>> We need to set up automatic alarms and reports, so we can see trouble
>> before it affects our customers. We like CloudWatch for our alarms.
>>
>> We have currently set up Ganglia and started with Ken Weiner's blog
>>
>> http://blog.kenweiner.com/2010/10/monitor-hbase-hadoop-with-ganglia-on.html 
>>
>>
>> What other tools are available? What issues should we monitor, and how
>> should we monitor them? What guides should I read?
>>
>> Thanks in advance
>> Peter
>>
>>
>>
>

Re: Ready to Deploy... Need Monitoring (2)

Posted by Jean-Daniel Cryans <jd...@apache.org>.

My TSDB dashboard (that pulls in those metrics) has:

- request rate
- GC
- compactions queue
- IO wait
- User CPU
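
A short list like this could be turned into alarm checks roughly as follows. This is a sketch under stated assumptions only: the metric keys and every limit below are hypothetical placeholders, not real Ganglia or HBase metric names and not recommended values, and the sample is assumed to be a plain dict already pulled from whatever collects the metrics.

```python
# Hypothetical watch list: keys and limits are placeholders to be
# replaced with the real metric names and thresholds for a given cluster.
WATCH_LIST = {
    "request_rate": 5000,     # requests per second
    "gc_pause_ms": 1000,      # longest GC pause in the interval
    "compaction_queue": 20,   # queued compactions
    "io_wait_pct": 30,        # percent of CPU time in IO wait
    "user_cpu_pct": 85,       # percent of CPU time in user mode
}


def breaches(sample, limits=WATCH_LIST):
    """Return the sorted names of metrics whose value exceeds its limit.

    Metrics missing from the sample are treated as 0 and never alarm.
    """
    return sorted(
        name for name, limit in limits.items() if sample.get(name, 0) > limit
    )


sample = {"compaction_queue": 35, "io_wait_pct": 12, "user_cpu_pct": 90}
print(breaches(sample))
```

The returned names could then be forwarded to whatever alarm system is in use.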

J-D

On Fri, Feb 24, 2012 at 2:16 PM, Peter Wolf <op...@gmail.com> wrote:
> OK I have Ganglia up and running.  I can see tons of metrics.  Quite
> overwhelming...
>
> I want to monitor my system for potential trouble.  Can anyone suggest a top
> ten that I should watch?
>
> For example, "HBase: The Definitive Guide" states "The compaction queue size
> is another recommended early indicator of trouble..." and "Similar to the
> compaction queue you will see a sharp rise in count for the flush queue
> when, for example, your servers are under I/O duress..."
>
> Is there a list posted somewhere?  What do others use?
>
> Thanks
> Peter