Posted to dev@nifi.apache.org by Jim Murphy <uf...@gmail.com> on 2018/02/23 23:14:17 UTC

Advice on ListS3 Processor

Hey all,

I'm a bit of a newbie to nifi <cringe>. But I am hoping someone might have
some advice for me here. I am using the ListS3 processor and the prefix
field just doesn't seem to like anything but static filenames. I cannot
reference directories or use the nifi expression language, where the docs
say that I should be able to.

I'd ideally like to reference files like this:

/folder1/folder1a/${filename:endsWith('txt')}

to find all files with a .txt extension 2 levels down in that specific path.

Any advice?

I totally know it's something stupid I am doing (or not doing). But
research shows scant examples for this processor out there.

Any help is appreciated greatly.

Thanks,

Jim

Re: Advice on ListS3 Processor

Posted by Sivaprasanna <si...@gmail.com>.
Jim,

${filename} is a NiFi Expression Language reference that returns the name of
the flowfile currently in the flow. For example, assume this simple pipeline:
GetFile -> UpdateAttribute -> PutFile. GetFile reads the files from the
input directory, and each flowfile gets an attribute ‘filename’ which
is the actual name of the file. You can use ${filename} in the downstream
processors, in this case UpdateAttribute and/or PutFile, and manipulate it
however you want. It can’t be used in ListS3, since that is the source
processor and has no incoming flowfile to take ${filename} from, so it will
be replaced with an empty string. Hence, it will actually look for
‘folder1/folder1a/ .txt’.
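
To illustrate why a pattern can't go in the prefix: as far as I understand
it, an S3 listing prefix is compared literally against the start of each
object key, with no wildcard or suffix matching. A minimal Python sketch of
that behaviour, using made-up keys and a hypothetical helper
list_with_prefix (this is not NiFi or AWS code):

```python
# Hypothetical object keys, for illustration only.
KEYS = [
    "folder1/folder1a/report.txt",
    "folder1/folder1a/data.csv",
    "folder1/folder1b/notes.txt",
]


def list_with_prefix(keys, prefix):
    """Return keys that start with the literal prefix (assumed S3-like behaviour)."""
    return [k for k in keys if k.startswith(prefix)]


# A static "directory" prefix matches everything underneath it.
static_matches = list_with_prefix(KEYS, "folder1/folder1a/")

# A pattern-like prefix is treated as plain text, so nothing matches.
pattern_matches = list_with_prefix(KEYS, "folder1/folder1a/*.txt")

print(static_matches)
print(pattern_matches)
```

So the prefix can only narrow the listing to a "directory"; filtering by
extension has to happen after the listing.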

Having said that, for your use case, what you can do is: ListS3 (provide
the source *directory* alone in the Prefix property) -> RouteOnAttribute
(create a property with a name like ‘txtFilesOnly’ and the value
${filename:endsWith('txt')}) -> any downstream processor you want. When
connecting to the downstream processor, choose the ‘txtFilesOnly’
relationship.

How this works: ListS3 lists the files present under the provided path in
the S3 bucket and emits each one as a separate flowfile. These flowfiles
have an attribute called ‘filename’, and you use that attribute in the
RouteOnAttribute processor to say: only if the filename ends with txt,
route those particular flowfiles to the downstream processors. BTW,
ListS3 doesn’t fetch the content, so you probably want to use FetchS3Object
after RouteOnAttribute to actually read the needed files, i.e. the txt
files, from the S3 bucket.
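
The list-then-route idea can be sketched in plain Python. This is only a
simulation under stated assumptions, not NiFi code: FlowFile, list_s3 and
route_on_attribute are hypothetical stand-ins for the processors, the keys
are made up, and ‘filename’ is assumed to hold the object key.

```python
from dataclasses import dataclass, field

# Hypothetical object keys, for illustration only.
KEYS = [
    "folder1/folder1a/report.txt",
    "folder1/folder1a/data.csv",
    "folder1/folder1b/notes.txt",
]


@dataclass
class FlowFile:
    """Stand-in for a flowfile: attributes only, no content."""
    attributes: dict = field(default_factory=dict)


def list_s3(keys, prefix):
    """Stand-in for ListS3: one content-less FlowFile per key under the prefix."""
    return [FlowFile({"filename": k}) for k in keys if k.startswith(prefix)]


def route_on_attribute(flowfiles):
    """Stand-in for RouteOnAttribute with txtFilesOnly = ${filename:endsWith('txt')}."""
    routes = {"txtFilesOnly": [], "unmatched": []}
    for ff in flowfiles:
        matched = ff.attributes["filename"].endswith("txt")
        routes["txtFilesOnly" if matched else "unmatched"].append(ff)
    return routes


routes = route_on_attribute(list_s3(KEYS, "folder1/folder1a/"))
txt_names = [ff.attributes["filename"] for ff in routes["txtFilesOnly"]]
other_names = [ff.attributes["filename"] for ff in routes["unmatched"]]
print(txt_names)
print(other_names)
```

Only the flowfiles on the ‘txtFilesOnly’ route would then go on to the
fetch step; the rest can be dropped or auto-terminated.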
