Posted to users@nifi.apache.org by Richard Duarte <ri...@gmail.com> on 2016/12/06 17:29:49 UTC

Data SLA

Hello,

I'm working on a project that I'd like to use NiFi for, but I'm trying to
figure out whether there's a way to alert when data is not delivered
according to an agreement.

Example:
1. Team X has agreed to provide files in directory Y in HDFS
2. Team X has agreed to provide those files once per day by 9 AM PST
3. Team X has agreed to name those files following naming convention
[x]-[date]-[abc]


Is there any way for NiFi to alert on Team X violating any of those
agreements? That is, can we get NiFi to alert on the files not being there,
the files not being there in time, and/or the files not following the
correct naming convention?

If not, are there any other tools that tie into NiFi to help with the
Data SLA?

Thanks,
Richard

Re: Data SLA

Posted by Jeremy Farbota <jf...@payoff.com>.
Richard,

Piggybacking on Andy's comment, you could also use GetHDFS to look for files
based on the date/time or whatever other parameters you need, then use the
failure relationship to post to Slack or send an email.
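
The alerting step described above could look roughly like the sketch below.
It is standalone Python rather than a NiFi processor configuration, and the
function names, the directory argument, and the webhook URL passed to
post_alert are all hypothetical. The only assumption about Slack is that an
incoming webhook accepts a JSON body with a "text" field.

```python
import json
from urllib.request import Request, urlopen


def build_slack_payload(directory, violations):
    """Build the JSON body for a Slack incoming webhook (hypothetical helper)."""
    lines = [f"Data SLA violation in {directory}:"]
    lines += [f"- {violation}" for violation in violations]
    return json.dumps({"text": "\n".join(lines)}).encode("utf-8")


def post_alert(webhook_url, payload):
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request) as response:
        return response.status
```

Inside a flow, the same message would more likely be sent by whichever HTTP,
Slack, or email processor is attached to the relationship that signals the
violation.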

On Tue, Dec 6, 2016 at 10:23 AM, Andy LoPresto <al...@apache.org> wrote:

> Richard,
>
> In addition to the flow that loads those files and performs whatever
> follow-on transformation and routing you need, you can have a parallel flow
> that monitors the directory, uses cron scheduling to run at a specific time
> (0900 daily), and is configured to expect a threshold number of files
> matching a specific naming pattern. This separate flow can route to email
> processors, etc., to alert the relevant teams. If ListHDFS does not meet
> your requirements for monitoring, ExecuteScript is very versatile in
> performing non-standard actions within NiFi.
>
> You should also take a look at the monitoring capabilities of the NiFi
> UI/API such as data provenance, queue interaction, and
> statistics/monitoring. These features provide live monitoring for your
> administrators/data flow managers. You can read from these
> resources/queries via external monitoring tools if necessary as well.


-- 

Jeremy Farbota
Software Engineer, Data
jfarbota@payoff.com • (217) 898-8110


Re: Data SLA

Posted by Andy LoPresto <al...@apache.org>.
Richard,

In addition to the flow that loads those files and performs whatever follow-on transformation and routing you need, you can have a parallel flow that monitors the directory, uses cron scheduling to run at a specific time (0900 daily), and is configured to expect a threshold number of files matching a specific naming pattern. This separate flow can route to email processors, etc., to alert the relevant teams. If ListHDFS does not meet your requirements for monitoring, ExecuteScript is very versatile in performing non-standard actions within NiFi.
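
As a rough illustration of the check such a parallel flow would perform, here is a minimal sketch of the logic only. It is standalone Python 3, not a drop-in ExecuteScript body, and it makes several assumptions that are not from this thread: the convention [x]-[date]-[abc] is read as <prefix>-<YYYYMMDD>-<suffix>, and the names NAME_PATTERN, check_sla, and min_files are hypothetical. For the schedule itself, NiFi's CRON-driven scheduling uses Quartz syntax, so a daily 09:00 run would be written as 0 0 9 * * ? (check which time zone your instance evaluates it in).

```python
import re
from datetime import date

# Assumed reading of "[x]-[date]-[abc]": <prefix>-<YYYYMMDD>-<suffix>.
NAME_PATTERN = re.compile(
    r"(?P<prefix>[A-Za-z0-9]+)-(?P<date>\d{8})-(?P<suffix>[A-Za-z0-9]+)"
)


def check_sla(filenames, expected_date, min_files=1):
    """Return a list of human-readable SLA violations (empty if none)."""
    violations = []
    on_time = []
    wanted = expected_date.strftime("%Y%m%d")
    for name in filenames:
        match = NAME_PATTERN.fullmatch(name)
        if match is None:
            # The file does not follow the agreed naming convention.
            violations.append(f"bad name: {name}")
        elif match.group("date") == wanted:
            on_time.append(name)
    if len(on_time) < min_files:
        # Too few correctly named files for the expected date.
        violations.append(
            f"expected at least {min_files} file(s) for "
            f"{expected_date.isoformat()}, found {len(on_time)}"
        )
    return violations
```

Run at the deadline against the directory listing, an empty result means the agreement was met; anything else can be routed to the alerting branch.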

You should also take a look at the monitoring capabilities of the NiFi UI/API such as data provenance, queue interaction, and statistics/monitoring. These features provide live monitoring for your administrators/data flow managers. You can read from these resources/queries via external monitoring tools if necessary as well.
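
To make the external-monitoring idea concrete, here is a hedged sketch of a tool polling the REST API. It assumes an unsecured instance, and the /nifi-api/flow/status path and the controllerStatus.flowFilesQueued field are my recollection of the API rather than something stated in this thread, so verify both against the REST API documentation for your version. The function names and the max_flowfiles threshold are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_status(base_url):
    """Fetch the overall flow status document (endpoint path is an assumption)."""
    with urlopen(f"{base_url}/nifi-api/flow/status") as response:
        return json.load(response)


def queue_backlog_alert(status, max_flowfiles):
    """Return an alert message if more FlowFiles are queued than allowed, else None."""
    queued = status.get("controllerStatus", {}).get("flowFilesQueued", 0)
    if queued > max_flowfiles:
        return f"{queued} FlowFiles queued (limit {max_flowfiles})"
    return None
```

Keeping the threshold check separate from the HTTP call means the check can be exercised with a saved status document before it is pointed at a live instance.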

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
