Posted to users@nifi.apache.org by KhajaAsmath Mohammed <md...@gmail.com> on 2020/06/02 15:16:13 UTC

Batch Dependency in NIFI - GetFile

Hi,

I have a use case where data is read from a file location using GetFile and
loaded into a database. I would like to fire a trigger once the database
load has succeeded for all the files.

I tried the Wait/Notify approach, but it does not work for this because it
operates on individual files. Let's say I have 1300 files; the trigger
should fire only after all 1300 files have completed.

The file count is dynamic and changes from run to run. Any suggestions on
how to approach this, please?

Thanks,
Asmath

Re: Batch Dependency in NIFI - GetFile

Posted by Boris Tyukin <bo...@boristyukin.com>.
Dependencies in NiFi are something I wish worked better. I feel your pain
for sure! I wish there was an easier way, as we all have to do ETL
batch-type dependencies eventually.

I also tried Wait/Notify but it was a very confusing setup and felt a bit
overengineered for what I wanted to do.

The best option I came up with that is still simple is this:

1. A GenerateFlowFile processor schedules your flow (let's say 7am every
day). It generates one flowfile, and we also record some helpful audit
attributes on it for our framework.
2. This flowfile triggers your other flowfiles. In your case GetFile will
produce 1300 flowfiles today or 1320 tomorrow.
3. Once you have the count of files, you initialize the attributes for the
MergeContent processor (the number of files becomes the number of
fragments).
4. Do your thing here (in your case, the database load).
5. The final step is a MergeContent processor, which waits for all the
fragments/files to finish. Only then does it proceed further. Here you can
also set a timeout in case something went wrong and you got 1290 files
instead of the expected 1300.
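
The counting logic behind steps 3 and 5 can be sketched in plain Python.
This is only an illustration of the idea, not NiFi configuration and not
how MergeContent is implemented. The names tag_batch and BatchGate are
hypothetical. The attribute names fragment.identifier, fragment.index and
fragment.count are the ones I recall MergeContent's Defragment strategy
reading, so check the processor documentation for your NiFi version before
relying on them.

```python
import uuid
from dataclasses import dataclass, field


def tag_batch(filenames):
    """Step 3: stamp every file of one run with shared batch attributes."""
    batch_id = str(uuid.uuid4())  # one identifier per run (assumed scheme)
    total = len(filenames)        # dynamic: 1300 today, 1320 tomorrow
    return [
        {
            "filename": name,
            "fragment.identifier": batch_id,
            "fragment.index": str(i),
            "fragment.count": str(total),
        }
        for i, name in enumerate(filenames)
    ]


@dataclass
class BatchGate:
    """Step 5: hold finished flowfiles until the whole batch has arrived."""

    max_age_seconds: float
    # batch id -> (time the first fragment arrived, set of indexes seen)
    bins: dict = field(default_factory=dict)

    def offer(self, attrs, now):
        """Register one finished flowfile; return "complete" or "waiting"."""
        batch_id = attrs["fragment.identifier"]
        _first_seen, seen = self.bins.setdefault(batch_id, (now, set()))
        seen.add(attrs["fragment.index"])
        if len(seen) == int(attrs["fragment.count"]):
            del self.bins[batch_id]  # batch done, fire the trigger
            return "complete"
        return "waiting"

    def expire(self, now):
        """Drop and return ids of batches older than max_age_seconds."""
        expired = [
            batch_id
            for batch_id, (first_seen, _seen) in self.bins.items()
            if now - first_seen > self.max_age_seconds
        ]
        for batch_id in expired:
            del self.bins[batch_id]  # e.g. only 1290 of 1300 ever arrived
        return expired


if __name__ == "__main__":
    files = tag_batch([f"file_{n}.csv" for n in range(1300)])
    gate = BatchGate(max_age_seconds=3600)
    results = [gate.offer(f, now=0.0) for f in files]
    print(results.count("waiting"), results[-1])
```

Only the last flowfile of a batch yields "complete", which is the single
point where the alert would fire. The expire call plays the role of the
timeout in step 5: a batch that never fills up is reported instead of
waiting forever.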

