Posted to user@oozie.apache.org by Alex McLintock <al...@diversebooks.com> on 2014/12/01 16:27:21 UTC

Application Level Monitoring and Oozie

Is there a "best practice" way of monitoring the tasks started by Oozie,
but in a data-centric way?

For example:

If I am sqooping data every day using Oozie coordinators to specify a day
then I might want to check that I fetched a similar number of records as
yesterday.
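
To make that first case concrete, here is a minimal sketch in Python of the
kind of check I have in mind. The function name and the 20% default tolerance
are made up for illustration; nothing here is provided by Oozie or Sqoop, and
how the two counts are obtained is left open.

```python
def count_is_similar(today: int, yesterday: int, tolerance: float = 0.2) -> bool:
    """Return True if today's record count is within `tolerance`
    (a fraction, e.g. 0.2 for 20%) of yesterday's count."""
    if yesterday == 0:
        # No baseline to compare against: only an empty load counts as similar.
        return today == 0
    # Relative difference between the two daily counts.
    return abs(today - yesterday) / yesterday <= tolerance


if __name__ == "__main__":
    # Hypothetical counts; in practice they would come from the import job.
    print(count_is_similar(1050, 1000))
    print(count_is_similar(500, 1000))
```

A script wrapping this could exit non-zero when the check fails, so that a
shell action running it would take its error transition.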

If I am populating a Hive partition every day using Oozie then I might want
to check that the new partition exists - and has a sensible looking number
of records.
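
A rough sketch of that partition check in Python, assuming the partition
names have already been listed (for example from the output of Hive's
SHOW PARTITIONS) and the row count has already been queried. The function
name, the "dt=..." partition naming and the min_rows threshold are all
hypothetical.

```python
from typing import Iterable


def partition_looks_ok(partitions: Iterable[str], expected: str,
                       row_count: int, min_rows: int = 1) -> bool:
    """Return True if `expected` is among the listed partitions and the
    partition holds at least `min_rows` records."""
    # Strip whitespace so lines read straight from command output still match.
    present = expected in {p.strip() for p in partitions}
    return present and row_count >= min_rows


if __name__ == "__main__":
    # Hypothetical listing and counts for one day's load.
    listed = ["dt=2014-11-30", "dt=2014-12-01\n"]
    print(partition_looks_ok(listed, "dt=2014-12-01", row_count=12000, min_rows=1000))
    print(partition_looks_ok(listed, "dt=2014-12-02", row_count=12000, min_rows=1000))
```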

The best I can come up with so far is:

a) Shell scripts which are kicked off by Oozie at the end of my current
workflow.
b) Possibly adding a conditional which emails me if there was an error
condition.


Other ideas include doing my data monitoring in a Pig script and writing
its results either to a file somewhere or directly to some other
monitoring tool using a custom SerDe.

But then I wonder what that monitoring tool should be. Typically we have
Ganglia and Nagios, so those are good starting points, but they are very
much geared towards hardware and network monitoring rather than
application monitoring.

Is there a really obvious tool or setup I should be following to make it
clear that my batch Oozie bundles really have worked?

Alex