Posted to users@nifi.apache.org by Mike Harding <mi...@gmail.com> on 2016/04/18 15:50:06 UTC

howto dynamically change the PutHDFS target directory

Hi All,

I have a requirement to write a data stream into HDFS, where the flowfiles
received each day are grouped into one directory, so I would end up with a
folder structure as follows:

data/18-04-16
data/19-04-16
data/20-04-16 ... etc

Currently I can specify a target directory in the config for the PutHDFS
processor, but I want this to change and point to a new directory as each
day ends.

So using NiFi I'd like to 1) be able to create new directories in HDFS
(although I could potentially write a bash script to do the directory
creation) and 2) change the target directory as the day changes.

Any help much appreciated,

Mike

Re: howto dynamically change the PutHDFS target directory

Posted by Mike Harding <mi...@gmail.com>.
Awesome! Thanks for the heads up. I'll give that a try.

Mike

Re: howto dynamically change the PutHDFS target directory

Posted by Bryan Bende <bb...@gmail.com>.
Mike,

If I understand correctly, I think this can be done today. The
Directory property on PutHDFS supports the NiFi Expression Language, so you
could set it to a value like:

/data/${now():format('dd-MM-yy')}/
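
To make the pattern concrete: the format function takes a Java
SimpleDateFormat pattern, so the result of the expression above can be
previewed in plain Java. This is only a sketch for checking the pattern, not
NiFi code; the class name DailyDirectorySketch and the method directoryFor
are made up for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hypothetical helper (not part of NiFi): previews the directory that
// /data/${now():format('dd-MM-yy')}/ would produce for a given instant.
class DailyDirectorySketch {

    static String directoryFor(Date when) {
        // Same pattern string as in the expression: day-month-two-digit-year.
        SimpleDateFormat fmt = new SimpleDateFormat("dd-MM-yy");
        return "/data/" + fmt.format(when) + "/";
    }

    public static void main(String[] args) {
        // A fixed sample date (18 April 2016, local time zone).
        Date sample = new GregorianCalendar(2016, Calendar.APRIL, 18, 15, 50).getTime();
        System.out.println(directoryFor(sample));

        // The current time, which is what now() corresponds to.
        System.out.println(directoryFor(new Date()));
    }
}
```

For the sample date this gives a path ending in 18-04-16, matching the
folder names in the original question.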

This could be set directly in PutHDFS, although it is also a common pattern
to stick an UpdateAttribute processor in front of PutHDFS that sets the
filename and hadoop.dir attributes, and then reference those in PutHDFS as
${filename} and ${hadoop.dir}.

The advantage to the UpdateAttribute approach is that you can have a single
PutHDFS processor that actually writes to many different locations.
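
As a rough illustration of why that works: the property value stays the
same, but it is resolved against each flowfile's own attributes. The Java
below is a toy stand-in for that lookup, not NiFi's actual Expression
Language implementation; the class AttributeDirectorySketch, the resolve
method, and the sample attribute values are all assumptions made up for the
example, and it only handles plain ${name} references (no functions).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model (not NiFi code): substitutes ${attribute} references in a
// property value using the attributes of one flowfile.
class AttributeDirectorySketch {

    // Matches simple references such as ${hadoop.dir}; no function calls.
    private static final Pattern REF = Pattern.compile("\\$\\{([^}:]+)\\}");

    static String resolve(String propertyValue, Map<String, String> attributes) {
        Matcher m = REF.matcher(propertyValue);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption for this sketch: a missing attribute becomes "".
            String value = attributes.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // One Directory value shared by every flowfile.
        String directoryProperty = "${hadoop.dir}";

        // Two flowfiles whose attributes were set upstream (sample values).
        Map<String, String> first = new HashMap<>();
        first.put("hadoop.dir", "/data/18-04-16");
        Map<String, String> second = new HashMap<>();
        second.put("hadoop.dir", "/data/19-04-16");

        System.out.println(resolve(directoryProperty, first));
        System.out.println(resolve(directoryProperty, second));
    }
}
```

The two flowfiles end up with different target directories even though
the processor configuration never changes.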

Hope that helps.

-Bryan


Re: howto dynamically change the PutHDFS target directory

Posted by Oleg Zhurakousky <oz...@hortonworks.com>.
Mike

Indeed a very common requirement and we should support it.
Would you mind raising a JIRA for it? https://issues.apache.org/jira/browse/NIFI

Cheers
Oleg