Posted to users@nifi.apache.org by James McMahon <js...@gmail.com> on 2019/03/08 13:44:09 UTC

Processor(s) to monitor for new subdirectories?

Hello. I frequently use ListFile / FetchFile to monitor a subdirectory for
new files that appear. I have a new requirement to monitor a parent
directory for new subdirectories that appear, issuing an alert when I
detect them. It does not look like ListFile will do that. Has anyone used
NiFi processors to accomplish such a goal, and if so how did you solve this
challenge? Thank you in advance for any suggestions.
- Jim

Re: Processor(s) to monitor for new subdirectories?

Posted by Joe Witt <jo...@gmail.com>.
Denes

You might want to make sure FetchFile can recognize such flowfiles as well,
and that it either skips them or handles them gracefully, so that it doesn't
try to fetch a directory.
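
As a rough illustration of that guard outside NiFi, here is a minimal sketch
in plain Python. It is not FetchFile's actual behavior, and
split_files_and_dirs is a hypothetical helper name. It partitions a listing so
that only regular files would be handed to a fetch step:

```python
import os


def split_files_and_dirs(paths):
    """Partition paths into regular files and directories (hypothetical helper)."""
    files = [p for p in paths if os.path.isfile(p)]
    dirs = [p for p in paths if os.path.isdir(p)]
    return files, dirs
```

Only the first list would be passed on for fetching; the second could be
routed to whatever raises the alert.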

Thanks

On Mon, Mar 11, 2019 at 12:51 PM Denes Arvay <de...@apache.org> wrote:

> Hi Jim,
>
> I suppose you want to monitor the newly created but still empty
> subdirectories, right? ListFile doesn't list those and I'm not aware of any
> processor for this purpose.
> I created a quick patch for ListFile, feel free to use it as is or as a
> starting point:
> https://github.com/apache/nifi/compare/master...adenes:listfile-list-dirs?expand=1
> I'm curious whether the community knows of any other solution. If not, and
> this seems to be a useful feature, I'm happy to file a Jira and open a pull
> request.
>
> Best,
> Denes
>

Re: Processor(s) to monitor for new subdirectories?

Posted by James McMahon <js...@gmail.com>.
Thank you both very much for responding, Denes and Joe. Denes, I looked at
your code and see that you built a custom ListFile processor that leverages
the state info to identify and save new directories that have appeared since
the prior run (please do correct me if I don't have that quite right after an
initial review). I'm going to try this too. And I think this would be a very
useful feature.

Most of the things I get tasked with here are of the "get it done
yesterday" variety, and so on Friday evening I rolled my own to get
something up and running. You asked about other solutions, and so I'll tell
you what I did very briefly.

1- I use a GenerateFlowFile processor to generate a small 1KB trigger file.
That processor is configured as CRON driven. I can adjust that easily to
whatever periodicity the customer requires.
2- That trigger flowFile then causes an ExecuteScript processor to run. It
executes a very simple python script.
3- The script first reads into a list the subdirectories we've already seen,
which I persist in a small configuration file.
4- The code then uses os commands to generate a list of current
subdirectories in the snapshot.
5- It compares the two lists using python list functions, and the
difference is the set of new directories for which my flow issues alerts.
6- The trigger file payload is replaced by the product of the list
difference, and I create a few attributes too.
7- As a final step I append to the config file the new subdirs identified
this cycle.

I am happy to share the script here if anyone wants to see it. In truth it
is pretty underwhelming, with just a few interesting list operations.
Jim
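
The steps above can be sketched roughly as follows. This is not the actual
script from the flow: it is a standalone illustration in plain Python that
leaves out the ExecuteScript and flowfile handling entirely, and the function
names and the one-name-per-line state file format are assumptions made for the
example.

```python
import os


def load_seen(state_path):
    """Return the subdirectory names recorded by earlier runs (assumed one per line)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as fh:
        return {line.strip() for line in fh if line.strip()}


def list_subdirs(parent):
    """Return the names of the immediate subdirectories of parent, empty ones included."""
    with os.scandir(parent) as entries:
        return {entry.name for entry in entries if entry.is_dir()}


def detect_new_subdirs(parent, state_path):
    """Return the newly seen subdirectories, sorted, and append them to the state file."""
    new_dirs = sorted(list_subdirs(parent) - load_seen(state_path))
    if new_dirs:
        with open(state_path, "a") as fh:
            for name in new_dirs:
                fh.write(name + "\n")
    return new_dirs
```

Each scheduled run would call detect_new_subdirs once and alert on whatever it
returns. A set difference is used here in place of list operations so that the
comparison does not depend on listing order.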

Re: Processor(s) to monitor for new subdirectories?

Posted by Denes Arvay <de...@apache.org>.
Hi Jim,

I suppose you want to monitor the newly created but still empty
subdirectories, right? ListFile doesn't list those and I'm not aware of any
processor for this purpose.
I created a quick patch for ListFile, feel free to use it as is or as a
starting point:
https://github.com/apache/nifi/compare/master...adenes:listfile-list-dirs?expand=1
I'm curious whether the community knows of any other solution. If not, and
this seems to be a useful feature, I'm happy to file a Jira and open a pull
request.

Best,
Denes