Posted to dev@oozie.apache.org by Robert Kanter <rk...@cloudera.com> on 2014/05/05 22:23:47 UTC

Instrumentation

Hi all,

The JIRA at OOZIE-1817
<https://issues.apache.org/jira/browse/OOZIE-1817> wants to make the
instrumentation timers biased; that is, to make them look
only at the last X amount of time, instead of forever.  When you're trying
to monitor Oozie and your Oozie server has been running a long time, the
current all-time timers become less useful.
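
To make "only the last X amount of time" concrete, here is a minimal
sketch of a sliding-time-window timer. It is an illustration only: the
class name SlidingWindowTimer and its methods are hypothetical, it is not
the OOZIE-1817 patch or Oozie's existing Cron/Instrumentation code, and it
uses a plain sliding window rather than an exponentially decaying
("biased") sample. The clock is passed in explicitly and is assumed to be
non-decreasing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: remembers only durations recorded in the last windowMillis. */
class SlidingWindowTimer {
    /** One recorded duration plus the time at which it was recorded. */
    private static final class Sample {
        final long timestampMillis;
        final long durationMillis;

        Sample(long timestampMillis, long durationMillis) {
            this.timestampMillis = timestampMillis;
            this.durationMillis = durationMillis;
        }
    }

    private final long windowMillis;
    private final Deque<Sample> samples = new ArrayDeque<>();
    private long totalMillis = 0;

    SlidingWindowTimer(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one duration observed at nowMillis (assumed non-decreasing across calls). */
    synchronized void record(long nowMillis, long durationMillis) {
        evictOlderThanWindow(nowMillis);
        samples.addLast(new Sample(nowMillis, durationMillis));
        totalMillis += durationMillis;
    }

    /** Average duration over the samples still inside the window, or 0.0 if none remain. */
    synchronized double average(long nowMillis) {
        evictOlderThanWindow(nowMillis);
        return samples.isEmpty() ? 0.0 : (double) totalMillis / samples.size();
    }

    /** Number of samples still inside the window. */
    synchronized int count(long nowMillis) {
        evictOlderThanWindow(nowMillis);
        return samples.size();
    }

    /** Drops samples whose timestamp is at or before (now - window). */
    private void evictOlderThanWindow(long nowMillis) {
        long cutoff = nowMillis - windowMillis;
        while (!samples.isEmpty() && samples.peekFirst().timestampMillis <= cutoff) {
            totalMillis -= samples.removeFirst().durationMillis;
        }
    }
}
```

With a window like this, a server that has been up for months reports
averages that reflect only recent activity, at the cost of keeping every
sample in the window in memory.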

While looking at some of the instrumentation code, I also saw that in a lot
of places, we create and use a timer (i.e. a Cron object), but we never add
it to the Instrumentation, so it's never actually reported to the user, and
is essentially wasteful and useless.

We haven't really touched our instrumentation code in a while, and I was
thinking it might be a good idea to switch to using a library like Codahale
Metrics (also called Yammer Metrics).  They include a bunch of stuff for us
(e.g. JVM and servlet metrics) and do all of the math for computing
standard deviation, percentiles, etc.  It also looks like most of the
metric types can be updated immediately, instead of every minute.
Their API looks pretty similar to what we already have, so it shouldn't be
too hard to switch over.  Their "Getting Started" page is here if you want
to see more about it: http://metrics.codahale.com/getting-started/

We'd have to keep the current Instrumentation for Oozie 4.x, but we could
add a /v2/metrics URL for the new one and deprecate the old one.

Thoughts?


thanks
- Robert

Re: Instrumentation

Posted by Mona Chitnis <ch...@yahoo-inc.com>.
Good idea. The current instrumentation gives a whole lot of very low-level information, some of which does not contribute to understanding the system load and behavior.
E.g. counters for the various commands: knowing 'coord.action.input.check' is useful, but 'kill.preconditionfailed' not so much. We could possibly exclude the unimportant ones and expose only the required, most impactful metrics.

In addition to JVM metrics, the following are some of the most useful metrics. No latency-related information is currently captured, so I'm not sure how much work that will involve; the 2nd and 3rd are already captured, just not in a very digestible form.

  *   Average latency of the different database retrieval and update APIs
  *   A trend of the number of jobs handled by the Recovery Service
  *   A trend of callable queue occupancy


Re: Instrumentation

Posted by Rohini Palaniswamy <ro...@gmail.com>.
+1



Re: Instrumentation

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
+1,

I've used Codahale Metrics in my last few projects (Llama is one of them)
and I love it.

I'd suggest the following: let's create a new Instrumentation class that
delegates to Metrics; then people can choose to upgrade. In Oozie 5 we can
swap the default instrumentation in the oozie-default.xml to Metrics.
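
A rough sketch of the delegation idea, for illustration only. Every name
here (InstrumentationSketch, MetricsBackendSketch,
DelegatingInstrumentationSketch, and their methods) is hypothetical: the
interface is a stand-in for whatever API Oozie code calls today, and the
backend is a stand-in for a metrics library, not the real Oozie or
Codahale classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the instrumentation API that callers already use. */
interface InstrumentationSketch {
    void incr(String group, String name, long count);

    long counter(String group, String name);
}

/** Hypothetical stand-in for a metrics library that stores counters by flat name. */
class MetricsBackendSketch {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    void inc(String metricName, long n) {
        counters.computeIfAbsent(metricName, k -> new AtomicLong()).addAndGet(n);
    }

    long get(String metricName) {
        AtomicLong c = counters.get(metricName);
        return c == null ? 0L : c.get();
    }
}

/** Keeps the caller-facing API unchanged and forwards every call to the backend. */
class DelegatingInstrumentationSketch implements InstrumentationSketch {
    private final MetricsBackendSketch backend;

    DelegatingInstrumentationSketch(MetricsBackendSketch backend) {
        this.backend = backend;
    }

    @Override
    public void incr(String group, String name, long count) {
        // Assumption: group and name are flattened into one dotted metric name.
        backend.inc(group + "." + name, count);
    }

    @Override
    public long counter(String group, String name) {
        return backend.get(group + "." + name);
    }
}
```

Because callers only see the interface, which implementation gets
instantiated could be decided by a configuration property, which is what
makes swapping the default in a later release cheap.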

I'd also add the Hadoop JMX JSON servlet to Oozie: if Metrics is not used,
the servlet would just report JVM metrics; if Metrics is used, everything
will come out there.

Thx.


-- 
Alejandro