Posted to users@nifi.apache.org by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com> on 2016/08/24 20:33:42 UTC

Need to read a small local file into a flow file property

Hi folks,

I’m looking for some ideas here.  I need to read the content of a small local file into a flow file attribute.  I can’t find a processor that does this.  Did I miss one that does?

Without such a processor, I’ve been trying to do this using a MergeContent processor.

First, I assign a correlation UUID and store it in an attribute.

Then I split each file down two processing paths.  The left-hand path goes straight to the MergeContent processor.

In the right-hand path I:

1. Read the content of the local file using FetchFile

2. Pull the content of the FlowFile into an attribute using EvaluateJsonPath

3. Clear the content of the FlowFile using ReplaceText


Then I combine the left and right legs with MergeContent, using the assigned correlation UUID to merge the files.

This generally works, except when it doesn’t. ☺

The problem seems to be that the left-hand path flows faster than the right-hand path, which makes sense.  This can lead to the “bins” in the MergeContent processor being reused before the file in the bin can be merged with the file traveling down the right-hand path.  Uncorrelated files are then sent to the merged output.

Does it sound like I am using the MergeContent processor in the right way?

Any other ideas?


Thanks in advance,

Chris McDermott

Remote Business Analytics
STaTS/StoreFront Remote
HPE Storage
Hewlett Packard Enterprise
Mobile: +1 978-697-5315


Re: Need to read a small local file into a flow file property

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Thanks, everyone.  I’ll give the ExecuteScript solution a try.

RE: Need to read a small local file into a flow file property

Posted by "Oxenberg, Jeff" <je...@hpe.com>.
Yeah, that would work. Here’s a quick example in Python that works on my local machine.

https://gist.github.com/jeffoxenberg/327b0dfeaa6bb63882279dd290222582

Thanks,


Jeff Oxenberg

Re: Need to read a small local file into a flow file property

Posted by Andre <an...@fucs.org>.
Wouldn't a scripted task using ExecuteScript solve this issue?

You could simply use Jython, Groovy, JRuby, LuaJ or JavaScript to read the
contents and add them to the attributes. Just be mindful that, if I recall
correctly, attributes are size-constrained.

Cheers
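
A minimal sketch of the ExecuteScript approach suggested above, written so it should run under Jython as well as plain Python. It is not the script from the gist linked elsewhere in this thread. The file path, the attribute name, and the size cap are hypothetical values chosen for illustration; `session`, `REL_SUCCESS` and `REL_FAILURE` are the variables ExecuteScript is expected to bind for the script. The flow file content is never touched, only an attribute is added.

```python
import io
import os

# Hypothetical values, not from the thread: adjust them for your flow.
FILE_PATH = '/tmp/small-file.txt'
ATTRIBUTE_NAME = 'local.file.content'
MAX_BYTES = 64 * 1024  # arbitrary cap, since attributes are held in memory


def read_small_file(path, max_bytes=MAX_BYTES):
    """Return the file's text, refusing files larger than max_bytes."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError('%s is %d bytes; limit is %d' % (path, size, max_bytes))
    with io.open(path, 'r', encoding='utf-8') as handle:
        return handle.read()


def add_file_attribute(session, flow_file, path, name, success, failure):
    """Copy the file's text into an attribute; content is left untouched."""
    try:
        text = read_small_file(path)
    except (IOError, OSError, ValueError):
        # Missing, unreadable, oversized or undecodable file.
        session.transfer(flow_file, failure)
        return None
    updated = session.putAttribute(flow_file, name, text)
    session.transfer(updated, success)
    return updated


# Inside ExecuteScript the name `session` is bound by NiFi.
# Outside NiFi it is undefined, so the processing step is skipped.
try:
    nifi_session = session
except NameError:
    nifi_session = None

if nifi_session is not None:
    flowFile = nifi_session.get()
    if flowFile is not None:
        add_file_attribute(nifi_session, flowFile, FILE_PATH, ATTRIBUTE_NAME,
                           REL_SUCCESS, REL_FAILURE)
```

Because the attribute is added to the incoming flow file in a single step, there is no second path to correlate and no MergeContent bin to time out. The size check is there because of the caution above about attribute size.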


Re: Need to read a small local file into a flow file property

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Sorry, I should have been clearer.  I have a flow file with content.  To that flow file, I need to add the content of a disk file as an attribute without losing the original content.

Does that better explain things?


Re: Need to read a small local file into a flow file property

Posted by Matt Burgess <ma...@gmail.com>.
Chris,

Are you looking to have a flow file that has its own content also as an
attribute? With EvaluateJsonPath, are you taking in the entire document? If
so, you could use ExtractText with a regex that captures all text and puts
it in an attribute; I believe the content of the flow file is left untouched.

Please let me know if I've misunderstood your use case; I'm a little
confused as to why you have two paths and step 3. Wouldn't #1 and #2 (with
"flowfile-attribute" as the Destination) read the file into an attribute
and also keep it in the content?

Regards,
Matt
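
To illustrate the "regex that captures all text" idea, here is a small sketch that uses Python's re module as a stand-in for the Java regex that ExtractText evaluates. The pattern and the sample content are assumptions for illustration, not a configuration taken from this thread. The point is that `.` does not match newlines unless DOTALL mode is on, so a multi-line file needs the `(?s)` flag (or the processor's equivalent setting) to land in the attribute whole.

```python
import re

# Sample content with a line break; purely illustrative.
content = 'first line\nsecond line\n'

# Without DOTALL, '.' stops at the first newline.
first_line_only = re.match(r'(.*)', content).group(1)

# With the inline (?s) flag, '.' also matches newlines, so the
# greedy group captures the entire text.
whole_text = re.match(r'(?s)(.*)', content).group(1)

print(repr(first_line_only))
print(repr(whole_text))
```

If I recall correctly, ExtractText also caps how much content it buffers and how long a captured group may be, so those limits are worth checking even for a small file.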


Re: Need to read a small local file into a flow file property

Posted by James McMahon <js...@gmail.com>.
Greetings Chris. I have an idea that *may* work for you to get the file
data into a flow file attribute. Word of caution though: I am relatively
new to NiFi, so this may be harder than it needs to be <lol>. If nothing
else, perhaps it will give you some food for thought.

Have you seen this?
http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
I found this to be a good stepping stone that permits me to use Python
callbacks to do just about anything I need with attributes - build JSON
objects from attributes, add new attributes, modify existing attributes,
etc. I *think* I read somewhere that the actual data for a flow file is
stored in a metadata field called DATA. If that turns out to be the case,
you could mimic this Jython script (thank you to Matt Burgess for his blog
and for these helpful examples), access the data from that DATA field, and
save it to whatever attribute value you'd like. I've done something similar many
times to extract xml attributes and values from complex xml metadata,
saving that to a flow file attribute that I add. Use an ExecuteScript
processor to execute the Jython script.

Or if it turns out that there is an attribute metadata field entitled DATA,
perhaps you can just use that?

I hope this helps a little. I hope that I did not misunderstand your
question.
Jim


