Posted to user@oozie.apache.org by Alex McLintock <al...@diversebooks.com> on 2014/12/01 16:27:21 UTC
Application Level Monitoring and Oozie
Is there a "best practice" way of monitoring the tasks started by Oozie,
but in a data-centric way?
For example:
If I am sqooping data every day, using Oozie coordinators to specify the day,
then I might want to check that I fetched a similar number of records as
yesterday.
If I am populating a Hive partition every day using Oozie, then I might want
to check that the new partition exists and has a sensible-looking number
of records.
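To make "a similar number of records as yesterday" concrete, here is a minimal sketch of the comparison I have in mind. The function name and the 20% default tolerance are my own placeholders, and it assumes the two counts have already been obtained some other way (for example from a Hive count query or the Sqoop job counters).

```python
def count_looks_sensible(today: int, yesterday: int, tolerance: float = 0.2) -> bool:
    """Return True if today's record count is within `tolerance` of yesterday's.

    `tolerance` is a fraction of yesterday's count (0.2 means +/- 20%).
    The default is an arbitrary illustration, not a recommendation.
    """
    if today < 0 or yesterday < 0:
        raise ValueError("record counts must be non-negative")
    if yesterday == 0:
        # No baseline to compare against; only an empty load matches an empty one.
        return today == 0
    return abs(today - yesterday) / yesterday <= tolerance
```

A fixed percentage is probably too crude for data with weekly cycles; comparing against the same weekday last week would be the obvious variation.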
The best I can come up with so far is:
a) shell scripts which are kicked off by Oozie at the end of my current
workflow.
b) Possibly adding a conditional which emails me if there was an error
condition.
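As a rough sketch of (a) and (b) together: the script run at the end of the workflow could collapse its individual checks into one exit code, on the assumption that a non-zero exit fails the Oozie shell action, so the workflow's error transition can lead to an email action. The function name and the check names are hypothetical.

```python
from typing import Dict, Tuple


def summarize_checks(results: Dict[str, bool]) -> Tuple[int, str]:
    """Collapse named pass/fail checks into (exit_code, message).

    Exit code 0 means every check passed; 1 means at least one failed.
    """
    failed = sorted(name for name, ok in results.items() if not ok)
    if failed:
        return 1, "FAILED: " + ", ".join(failed)
    return 0, "all %d checks passed" % len(results)


# At the end of a real script one would typically do:
#   code, message = summarize_checks(results)
#   print(message)
#   sys.exit(code)
```

The message could then double as the body of the email.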
Other ideas include doing my data monitoring in a Pig script and either
writing its results to a file somewhere or writing them directly to some
other monitoring tool using a custom SerDe.
But then I wonder what that monitoring tool should be. Typically we have
Ganglia and Nagios, so those are good starts, but they are very much
geared towards hardware and network monitoring rather than application
monitoring.
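If Nagios were the target, I imagine a data check could be dressed up as an ordinary plugin, using the usual plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN) plus a one-line status message. A minimal sketch, where the function name and both thresholds are made up for illustration:

```python
from typing import Tuple

# Exit codes from the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_status(today: int, yesterday: int,
                  warn: float = 0.2, crit: float = 0.5) -> Tuple[int, str]:
    """Map the relative change in record count to (exit_code, status_line).

    `warn` and `crit` are fractional changes versus yesterday; the defaults
    are illustrative assumptions only.
    """
    if yesterday <= 0 or today < 0:
        # Without a positive baseline the relative change is undefined.
        return UNKNOWN, "UNKNOWN - no usable baseline (today=%d yesterday=%d)" % (
            today, yesterday)
    change = abs(today - yesterday) / yesterday
    detail = "today=%d yesterday=%d change=%.0f%%" % (today, yesterday, change * 100)
    if change >= crit:
        return CRITICAL, "CRITICAL - " + detail
    if change >= warn:
        return WARNING, "WARNING - " + detail
    return OK, "OK - " + detail
```

A wrapper would print the status line and exit with the code, but that still leaves open how the counts get from the cluster to the Nagios host.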
Is there a really obvious tool or setup I should be following to make it
clear that my batch Oozie bundles really have worked?
Alex