Posted to users@nifi.apache.org by David Klim <da...@hotmail.com> on 2015/09/23 17:16:29 UTC
Generate flowfiles from flowfile content
Hello,
In a flow I am defining, I receive a flowfile containing a JSON string. Using the SplitJson processor I can extract some JSON paths pointing to files I need to retrieve, but the filename ends up as the content of the generated flowfile. I need to read that content and generate a flowfile with that name instead. How could I do that?
Thanks!
Re: Generate flowfiles from flowfile content
Posted by Joe Witt <jo...@gmail.com>.
David,
I think if I read your case correctly this should be supported really
well. The flow would be something like:
GetSQS -> SplitJson -> EvaluateJsonPath -> FetchS3Object
In SplitJson you'll break apart the original object into smaller valid
JSON objects.
In EvaluateJsonPath you'll promote the filename/URL you need from the
JSON content to flowfile attributes.
In FetchS3Object you'll go grab the item based on the name/URL you pulled
out in EvaluateJsonPath.
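For readers following along, the data movement in that flow can be sketched outside NiFi in a few lines of Python. This is only an illustration, not NiFi code: it assumes the SQS message body follows the standard S3 event notification layout (a top-level "Records" array with s3.bucket.name and s3.object.key), which David has not confirmed in this thread. The bucket name, object keys, function names and the "s3.bucket" attribute name are all made up for the example.

```python
import json

# Hypothetical SQS message body. Assumes the standard S3 event notification
# layout; the bucket name and object keys are invented for illustration.
SAMPLE_EVENT = json.dumps({
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/a.csv"}}},
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/b.csv"}}},
    ]
})


def split_records(message_body):
    """Roughly what SplitJson does: one JSON document per array element."""
    return [json.dumps(record) for record in json.loads(message_body)["Records"]]


def promote_attributes(record_json):
    """Roughly what EvaluateJsonPath does when writing to attributes:
    copy selected values out of the content into a name/value map."""
    record = json.loads(record_json)
    return {
        "s3.bucket": record["s3"]["bucket"]["name"],
        "filename": record["s3"]["object"]["key"],
    }


def attributes_for_message(message_body):
    """Split the event, then build the attribute map for each piece."""
    return [promote_attributes(r) for r in split_records(message_body)]


if __name__ == "__main__":
    for attrs in attributes_for_message(SAMPLE_EVENT):
        print(attrs)
```

Each resulting attribute map is what a fetch step would then read to decide which object to retrieve, so the content itself no longer needs to carry the filename.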
Bryan: Any chance you could put together a quick template for David to
check out?
Thanks
Joe
On Wed, Sep 23, 2015 at 3:41 PM, David Klim <da...@hotmail.com> wrote:
> Hello Bryan,
>
> I should have been more specific. What I am trying to do is to fetch files
> from S3. I am using the GetSQS processor to get new object (files) events,
> and each event is a json containing the list of new objects (files) in the
> bucket. The output of the GetSQS is processed by SplitJson and I get
> flowfiles containing one object key (filename) each. I need to feed this
> into FetchS3Object to retrieve the actual file, but FetchS3Object expects the
> flowfile filename attribute (or any other) to be the filename. So I guess
> the problem is moving the filename string from the flowfile content to some
> attribute.
>
> If there is no other alternative, I will implement this processor.
>
> Thanks!
>
> ________________________________
> From: rbraddy@softnas.com
> To: users@nifi.apache.org
> Subject: RE: Generate flowfiles from flowfile content
> Date: Wed, 23 Sep 2015 19:59:21 +0000
>
>
> Good idea, Adam.
>
>
>
> I will post a separate review thread on the dev@ list to track comments.
>
>
>
> Here’s the repository link: https://github.com/rickbraddy/nifishare
>
>
>
>
>
> Thanks
>
> Rick
>
>
>
> From: Adam Taft [mailto:adam@adamtaft.com]
> Sent: Wednesday, September 23, 2015 1:48 PM
> To: users@nifi.apache.org
> Subject: Re: Generate flowfiles from flowfile content
>
>
>
> Not speaking for the entire community, but I am sure that such a
> contribution would (at minimum) be appreciated for review, consideration and
> potential inclusion. The best thing would be ideally hosting the source
> code somewhere that the rest of the community could go to for review. Maybe
> you could host the GetFileData and PutFileData processors on a GitHub
> repository somewhere?
>
> I think the idea you proposed is good, but might need to be aligned with the
> work (if any) for the referenced ListFile and FetchFile implementation. And
> the differences in your PutFileData vs. PutFile would ideally be well vetted
> as well.
>
> Adam
>
>
>
>
>
>
>
> On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rb...@softnas.com> wrote:
>
> We have already developed a modified GetFile called GetFileData
> that takes an incoming FlowFile containing the path to the file/directory
> that needs to be transferred. There is a corresponding PutFileData on the
> other side that accepts the incoming file/directory that creates the
> directory/tree as needed or writes the file, then sets the permissions and
> ownership. GetFileData also receives a file.rootdir attribute that gets
> passed along to PutFileData, so it can rebase the original file’s location
> relative to the configured target directory. Unlike GetFile/PutFile, these
> processors work with entire directory trees and are triggered by incoming
> FlowFiles to GetFileData.
>
>
>
> Eventually, we want to further enhance these two processors so they can
> break large files into “chunks” and send as multi-part files that get
> reassembled by PutFileData, resolving the limitations associated with huge
> files and content repository size; e.g., there are default 100MB chunk
> threshold and 10MB chunk size properties that will control the chunking, if
> enabled.
>
>
>
> If the community is interested and would benefit from these processors, we’re
> happy to consider further generalizing and contributing these processors,
> along with any further refinements based upon community review and feedback.
>
>
>
> I believe these processors would address both the Jira and David’s original
> inquiry.
>
>
>
> Rick
>
>
>
> From: Adam Taft [mailto:adam@adamtaft.com]
> Sent: Wednesday, September 23, 2015 1:09 PM
> To: users@nifi.apache.org
> Subject: Re: Generate flowfiles from flowfile content
>
>
>
> Right. This would be the use case that FetchFile [1] would help solve.
>
> [1] https://issues.apache.org/jira/browse/NIFI-631
>
>
>
> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Hi David,
>
>
>
> When you say "files I need to retrieve", are you referring to files on the
> local filesystem where NiFi is running?
>
>
>
> If so, I am not aware of an existing processor that does that. Currently we
> have GetFile which polls a directory, but that is not what you want here.
>
>
>
> It would be fairly straightforward to implement with a custom processor
> though... You would read the incoming FlowFile content to get the filename,
> then create a new FlowFile with your desired name, and write the content of
> the local file to the new FlowFile.
>
>
>
> -Bryan
>
>
>
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com> wrote:
>
> Hello,
>
>
>
> In a flow I am defining, I receive a flowfile containing json string. Using
> the splitJson processor I can extract some json paths pointing to some files
> I need to retrieve, but the filename is the content of the generated
> flowfile. So I would need to be able to read the content and generate a
> flowfile with that name instead. How could I do that?
>
>
>
> Thanks!
>
>
>
>
>
>
>
>
RE: Generate flowfiles from flowfile content
Posted by David Klim <da...@hotmail.com>.
ExtractText did the job! Thank you very much! :-)
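David did not post his ExtractText configuration, so the following is only a sketch of the idea in plain Python, not NiFi code: a regular expression with one capture group is applied to the flowfile content, and the captured text is stored under an attribute name. The attribute name "filename", the regular expression, the function name and the sample content are assumptions chosen for illustration. The pattern tolerates the value being wrapped in double quotes or not.

```python
import re


def extract_text(content, properties):
    """Apply each regex to the content and store the first capture group
    of the first match under the given attribute name."""
    attributes = {}
    for name, pattern in properties.items():
        match = re.search(pattern, content)
        if match:
            attributes[name] = match.group(1)
    return attributes


if __name__ == "__main__":
    # Hypothetical content of one split flowfile: a single object key.
    content = '"incoming/a.csv"'
    # Capture everything between optional surrounding double quotes.
    attrs = extract_text(content, {"filename": r'^"?(.*?)"?$'})
    print(attrs)
```

Once the key is available as an attribute, a downstream fetch step can refer to it by name instead of reading the content.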
> Date: Wed, 23 Sep 2015 16:05:44 -0700
> Subject: Re: Generate flowfiles from flowfile content
> From: joe.witt@gmail.com
> To: users@nifi.apache.org
>
> Bryan - you may be right that ExtractText will be the right play once
> SplitJson is done doing its thing. Perhaps either will work. Maybe
> we can show both options. If the schema is fairly well known I'm
> thinking EvaluateJsonPath would be the winner.
>
> thanks
> Joe
>
> On Wed, Sep 23, 2015 at 4:04 PM, Bryan Bende <bb...@gmail.com> wrote:
> > Sorry I missed Joe's email while sending mine... I can put together a
> > template showing this.
> >
> >
> > On Wednesday, September 23, 2015, Bryan Bende <bb...@gmail.com> wrote:
> >>
> >> David,
> >>
> >> Take a look at ExtractText, it is for pulling FlowFile content into
> >> attributes. I think that will do what you are looking for.
> >>
> >> -Bryan
> >>
Re: Generate flowfiles from flowfile content
Posted by Joe Witt <jo...@gmail.com>.
Bryan - you may be right that ExtractText will be the right play once
SplitJson is done doing its thing. Perhaps either will work. Maybe
we can show both options. If the schema is fairly well known I'm
thinking EvaluateJsonPath would be the winner.
thanks
Joe
On Wed, Sep 23, 2015 at 4:04 PM, Bryan Bende <bb...@gmail.com> wrote:
> Sorry I missed Joe's email while sending mine... I can put together a
> template showing this.
>
>
> On Wednesday, September 23, 2015, Bryan Bende <bb...@gmail.com> wrote:
>>
>> David,
>>
>> Take a look at ExtractText, it is for pulling FlowFile content into
>> attributes. I think that will do what you are looking for.
>>
>> -Bryan
>>
Re: Generate flowfiles from flowfile content
Posted by Bryan Bende <bb...@gmail.com>.
Sorry I missed Joe's email while sending mine... I can put together a
template showing this.
On Wednesday, September 23, 2015, Bryan Bende <bb...@gmail.com> wrote:
> David,
>
> Take a look at ExtractText, it is for pulling FlowFile content into
> attributes. I think that will do what you are looking for.
>
> -Bryan
>
Re: Generate flowfiles from flowfile content
Posted by Joe Witt <jo...@gmail.com>.
David,
Could you share a sample of your JSON that you get from pulling in from SQS?
Thanks
Joe
On Wed, Sep 23, 2015 at 4:01 PM, Bryan Bende <bb...@gmail.com> wrote:
> David,
>
> Take a look at ExtractText, it is for pulling FlowFile content into
> attributes. I think that will do what you are looking for.
>
> -Bryan
>
>
Re: Generate flowfiles from flowfile content
Posted by Bryan Bende <bb...@gmail.com>.
David,
Take a look at ExtractText; it is for pulling FlowFile content into
attributes. I think that will do what you are looking for.
-Bryan
On Wednesday, September 23, 2015, David Klim <da...@hotmail.com> wrote:
> Hello Bryan,
>
> I should have been more specific. What I am trying to do is to fetch files
> from S3. I am using the GetSQS processor to get new object (files) events,
> and each event is a json containing the list of new objects (files) in the
> bucket. The output of the GetSQS is processed by SplitJson and I get
> flowfiles containing one object key (filename) each. I need to feed this
> into FetchS3Object to retrieve the actual file, but FetchS3Object expects
> the flowfile filename attribute (or any other) to be the filename. So I
> guess the problem is moving the filename string from the flowfile content
> to some attribute.
>
> If there is no other alternative, I will implement this processor.
>
> Thanks!
>
> ------------------------------
> From: rbraddy@softnas.com
> To: users@nifi.apache.org
> Subject: RE: Generate flowfiles from flowfile content
> Date: Wed, 23 Sep 2015 19:59:21 +0000
>
> Good idea, Adam.
>
>
>
> I will post a separate review thread on the dev@ list to track comments.
>
>
>
> Here’s the repository link: https://github.com/rickbraddy/nifishare
>
>
>
>
>
> Thanks
>
> Rick
>
>
>
> *From:* Adam Taft [mailto:adam@adamtaft.com]
> *Sent:* Wednesday, September 23, 2015 1:48 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Generate flowfiles from flowfile content
>
>
>
> Not speaking for the entire community, but I am sure that such a
> contribution would (at minimum) be appreciated for review, consideration
> and potential inclusion. The best thing would be ideally hosting the
> source code somewhere that the rest of the community could go to for
> review. Maybe you could host the GetFileData and PutFileData processors on
> a GitHub repository somewhere?
>
> I think the idea you proposed is good, but might need to be aligned with
> the work (if any) for the referenced ListFile and FetchFile
> implementation. And the differences in your PutFileData vs. PutFile would
> ideally be well vetted as well.
>
> Adam
>
>
>
>
>
>
>
> On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rbraddy@softnas.com> wrote:
>
> We have already developed a modified GetFile called GetFileData
> that takes an incoming FlowFile containing the path to the file/directory
> that needs to be transferred. There is a corresponding PutFileData on the
> other side that accepts the incoming file/directory that creates the
> directory/tree as needed or writes the file, then sets the permissions and
> ownership. GetFileData also receives a file.rootdir attribute that gets
> passed along to PutFileData, so it can rebase the original file’s location
> relative to the configured target directory. Unlike GetFile/PutFile, these
> processors work with entire directory trees and are triggered by incoming
> FlowFiles to GetFileData.
>
>
>
> Eventually, we want to further enhance these two processors so they can
> break large files into “chunks” and send as multi-part files that get
> reassembled by PutFileData, resolving the limitations associated with huge
> files and content repository size; e.g., there are default 100MB chunk
> threshold and 10MB chunk size properties that will control the chunking, if
> enabled.
>
>
>
> If the community is interested and would benefit from these processors, we’re
> happy to consider further generalizing and contributing these processors,
> along with any further refinements based upon community review and feedback.
>
>
>
> I believe these processors would address both the Jira and David’s
> original inquiry.
>
>
>
> Rick
>
>
>
> *From:* Adam Taft [mailto:adam@adamtaft.com]
> *Sent:* Wednesday, September 23, 2015 1:09 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Generate flowfiles from flowfile content
>
>
>
> Right. This would be the use case that FetchFile [1] would help solve.
>
> [1] https://issues.apache.org/jira/browse/NIFI-631
>
>
>
> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bbende@gmail.com> wrote:
>
> Hi David,
>
>
>
> When you say "files I need to retrieve", are you referring to files on the
> local filesystem where NiFi is running?
>
>
>
> If so, I am not aware of an existing processor that does that. Currently
> we have GetFile which polls a directory, but that is not what you want here.
>
>
>
> It would be fairly straightforward to implement with a custom processor
> though... You would read the incoming FlowFile content to get the filename,
> then create a new FlowFile with your desired name, and write the content of
> the local file to the new FlowFile.
>
>
>
> -Bryan
>
>
>
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <davidklmlg@hotmail.com> wrote:
>
> Hello,
>
>
>
> In a flow I am defining, I receive a flowfile containing a JSON
> string. Using the SplitJson processor I can extract some JSON paths
> pointing to some files I need to retrieve, but the filename is the content
> of the generated flowfile. So I would need to be able to read the content
> and generate a flowfile with that name instead. How could I do that?
>
>
>
> Thanks!
--
Sent from Gmail Mobile
RE: Generate flowfiles from flowfile content
Posted by David Klim <da...@hotmail.com>.
Hello Bryan,
I should have been more specific. What I am trying to do is to fetch files from S3. I am using the GetSQS processor to get new object (file) events, and each event is a JSON document containing the list of new objects (files) in the bucket. The output of GetSQS is processed by SplitJson and I get flowfiles containing one object key (filename) each. I need to feed this into FetchS3Object to retrieve the actual file, but FetchS3Object expects the flowfile filename attribute (or any other attribute) to hold the filename. So I guess the problem is moving the filename string from the flowfile content to some attribute.
If there is no other alternative, I will implement this processor.
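To make the data flow concrete, here is a rough Python sketch (not NiFi code) of what the split-and-extract steps amount to. It assumes the SQS message body is a standard S3 event notification, where each entry under "Records" carries the object key at s3.object.key and the key is URL-encoded; if the notification is wrapped in another envelope, the paths would differ. The function names and the sample event are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def object_keys(sqs_message_body: str) -> list:
    """Return the object keys listed in one S3 event notification (assumed layout)."""
    event = json.loads(sqs_message_body)
    return [
        unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        for record in event.get("Records", [])
    ]


def to_flowfiles(keys: list) -> list:
    """One FlowFile-like dict per key, with the key stored as an attribute, not as content."""
    return [{"attributes": {"filename": key}, "content": b""} for key in keys]


# Hypothetical event with two new objects in the bucket.
sample_body = json.dumps({
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"}, "object": {"key": "logs/app+1.log"}}},
        {"s3": {"bucket": {"name": "my-bucket"}, "object": {"key": "logs/app2.log"}}},
    ]
})

flowfiles = to_flowfiles(object_keys(sample_body))
for flowfile in flowfiles:
    print(flowfile["attributes"]["filename"])
```

The point of the sketch is the last step: the key ends up in an attribute, which is where a fetch step can read it.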
Thanks!
From: rbraddy@softnas.com
To: users@nifi.apache.org
Subject: RE: Generate flowfiles from flowfile content
Date: Wed, 23 Sep 2015 19:59:21 +0000
Good idea, Adam.
I will post a separate review thread on the dev@ list to track comments.
Here’s the repository link:
https://github.com/rickbraddy/nifishare
Thanks
Rick
From: Adam Taft [mailto:adam@adamtaft.com]
Sent: Wednesday, September 23, 2015 1:48 PM
To: users@nifi.apache.org
Subject: Re: Generate flowfiles from flowfile content
Not speaking for the entire community, but I am sure that such a contribution would (at minimum) be appreciated for review, consideration and potential inclusion. The best thing would be ideally hosting the
source code somewhere that the rest of the community could go to for review. Maybe you could host the GetFileData and PutFileData processors on a GitHub repository somewhere?
I think the idea you proposed is good, but might need to be aligned with the work (if any) for the referenced ListFile and FetchFile implementation. And the differences in your PutFileData vs. PutFile would
ideally be well vetted as well.
Adam
On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rb...@softnas.com> wrote:
We have already developed a modified GetFile called GetFileData that takes an incoming FlowFile containing the path to the file/directory that needs to be transferred.
There is a corresponding PutFileData on the other side that accepts the incoming file/directory that creates the directory/tree as needed or writes the file, then sets the permissions and ownership. GetFileData also receives a file.rootdir attribute that
gets passed along to PutFileData, so it can rebase the original file’s location relative to the configured target directory. Unlike GetFile/PutFile, these processors work with entire directory trees and are triggered by incoming FlowFiles to GetFileData.
Eventually, we want to further enhance these two processors so they can break large files into “chunks” and send as multi-part files that get reassembled by PutFileData, resolving
the limitations associated with huge files and content repository size; e.g., there are default 100MB chunk threshold and 10MB chunk size properties that will control the chunking, if enabled.
If the community is interested and would benefit from these processors, we’re happy to consider further generalizing and contributing these processors, along with any further refinements
based upon community review and feedback.
I believe these processors would address both the Jira and David’s original inquiry.
Rick
From: Adam Taft [mailto:adam@adamtaft.com]
Sent: Wednesday, September 23, 2015 1:09 PM
To: users@nifi.apache.org
Subject: Re: Generate flowfiles from flowfile content
Right. This would be the use case that FetchFile [1] would help solve.
[1] https://issues.apache.org/jira/browse/NIFI-631
On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
Hi David,
When you say "files I need to retrieve", are you referring to files on the local filesystem where NiFi is running?
If so, I am not aware of an existing processor that does that. Currently we have GetFile which polls a directory, but that is not what you want here.
It would be fairly straightforward to implement with a custom processor though... You would read the incoming FlowFile content to get the filename, then create a new FlowFile with
your desired name, and write the content of the local file to the new FlowFile.
-Bryan
On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com> wrote:
Hello,
In a flow I am defining, I receive a flowfile containing a JSON string. Using the SplitJson processor I can extract some JSON paths pointing to some files I need to retrieve, but
the filename is the content of the generated flowfile. So I would need to be able to read the content and generate a flowfile with that name instead. How could I do that?
Thanks!
RE: Generate flowfiles from flowfile content
Posted by Rick Braddy <rb...@softnas.com>.
Good idea, Adam.
I will post a separate review thread on the dev@ list to track comments.
Here’s the repository link: https://github.com/rickbraddy/nifishare
Thanks
Rick
From: Adam Taft [mailto:adam@adamtaft.com]
Sent: Wednesday, September 23, 2015 1:48 PM
To: users@nifi.apache.org
Subject: Re: Generate flowfiles from flowfile content
Not speaking for the entire community, but I am sure that such a contribution would (at minimum) be appreciated for review, consideration and potential inclusion. The best thing would be ideally hosting the source code somewhere that the rest of the community could go to for review. Maybe you could host the GetFileData and PutFileData processors on a GitHub repository somewhere?
I think the idea you proposed is good, but might need to be aligned with the work (if any) for the referenced ListFile and FetchFile implementation. And the differences in your PutFileData vs. PutFile would ideally be well vetted as well.
Adam
On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rb...@softnas.com> wrote:
We have already developed a modified GetFile called GetFileData that takes an incoming FlowFile containing the path to the file/directory that needs to be transferred. There is a corresponding PutFileData on the other side that accepts the incoming file/directory and creates the directory/tree as needed or writes the file, then sets the permissions and ownership. GetFileData also receives a file.rootdir attribute that gets passed along to PutFileData, so it can rebase the original file’s location relative to the configured target directory. Unlike GetFile/PutFile, these processors work with entire directory trees and are triggered by incoming FlowFiles to GetFileData.
Eventually, we want to further enhance these two processors so they can break large files into “chunks” and send as multi-part files that get reassembled by PutFileData, resolving the limitations associated with huge files and content repository size; e.g., there are default 100MB chunk threshold and 10MB chunk size properties that will control the chunking, if enabled.
If the community is interested and would benefit from these processors, we’re happy to consider further generalizing and contributing these processors, along with any further refinements based upon community review and feedback.
I believe these processors would address both the Jira and David’s original inquiry.
Rick
From: Adam Taft [mailto:adam@adamtaft.com]
Sent: Wednesday, September 23, 2015 1:09 PM
To: users@nifi.apache.org
Subject: Re: Generate flowfiles from flowfile content
Right. This would be the use case that FetchFile [1] would help solve.
[1] https://issues.apache.org/jira/browse/NIFI-631
On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
Hi David,
When you say "files I need to retrieve", are you referring to files on the local filesystem where NiFi is running?
If so, I am not aware of an existing processor that does that. Currently we have GetFile which polls a directory, but that is not what you want here.
It would be fairly straightforward to implement with a custom processor though... You would read the incoming FlowFile content to get the filename, then create a new FlowFile with your desired name, and write the content of the local file to the new FlowFile.
-Bryan
On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com> wrote:
Hello,
In a flow I am defining, I receive a flowfile containing a JSON string. Using the SplitJson processor I can extract some JSON paths pointing to some files I need to retrieve, but the filename is the content of the generated flowfile. So I would need to be able to read the content and generate a flowfile with that name instead. How could I do that?
Thanks!
Re: Generate flowfiles from flowfile content
Posted by Adam Taft <ad...@adamtaft.com>.
Not speaking for the entire community, but I am sure that such a
contribution would (at minimum) be appreciated for review, consideration
and potential inclusion. The best thing would be ideally hosting the
source code somewhere that the rest of the community could go to for
review. Maybe you could host the GetFileData and PutFileData processors on
a GitHub repository somewhere?
I think the idea you proposed is good, but might need to be aligned with
the work (if any) for the referenced ListFile and FetchFile
implementation. And the differences in your PutFileData vs. PutFile would
ideally be well vetted as well.
Adam
On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rb...@softnas.com> wrote:
> We have already developed a modified GetFile called GetFileData
> that takes an incoming FlowFile containing the path to the file/directory
> that needs to be transferred. There is a corresponding PutFileData on the
> other side that accepts the incoming file/directory that creates the
> directory/tree as needed or writes the file, then sets the permissions and
> ownership. GetFileData also receives a file.rootdir attribute that gets
> passed along to PutFileData, so it can rebase the original file’s location
> relative to the configured target directory. Unlike GetFile/PutFile, these
> processors work with entire directory trees and are triggered by incoming
> FlowFiles to GetFileData.
>
>
>
> Eventually, we want to further enhance these two processors so they can
> break large files into “chunks” and send as multi-part files that get
> reassembled by PutFileData, resolving the limitations associated with huge
> files and content repository size; e.g., there are default 100MB chunk
> threshold and 10MB chunk size properties that will control the chunking, if
> enabled.
>
>
>
> If the community is interested and would benefit from these processors, we’re
> happy to consider further generalizing and contributing these processors,
> along with any further refinements based upon community review and feedback.
>
>
>
> I believe these processors would address both the Jira and David’s
> original inquiry.
>
>
>
> Rick
>
>
>
> *From:* Adam Taft [mailto:adam@adamtaft.com]
> *Sent:* Wednesday, September 23, 2015 1:09 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Generate flowfiles from flowfile content
>
>
>
> Right. This would be the use case that FetchFile [1] would help solve.
>
> [1] https://issues.apache.org/jira/browse/NIFI-631
>
>
>
> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Hi David,
>
>
>
> When you say "files I need to retrieve", are you referring to files on the
> local filesystem where NiFi is running?
>
>
>
> If so, I am not aware of an existing processor that does that. Currently
> we have GetFile which polls a directory, but that is not what you want here.
>
>
>
> It would be fairly straightforward to implement with a custom processor
> though... You would read the incoming FlowFile content to get the filename,
> then create a new FlowFile with your desired name, and write the content of
> the local file to the new FlowFile.
>
>
>
> -Bryan
>
>
>
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com>
> wrote:
>
> Hello,
>
>
>
> In a flow I am defining, I receive a flowfile containing a JSON
> string. Using the SplitJson processor I can extract some JSON paths
> pointing to some files I need to retrieve, but the filename is the content
> of the generated flowfile. So I would need to be able to read the content
> and generate a flowfile with that name instead. How could I do that?
>
>
>
> Thanks!
RE: Generate flowfiles from flowfile content
Posted by Rick Braddy <rb...@softnas.com>.
We have already developed a modified GetFile called GetFileData that takes an incoming FlowFile containing the path to the file/directory that needs to be transferred. There is a corresponding PutFileData on the other side that accepts the incoming file/directory and creates the directory/tree as needed or writes the file, then sets the permissions and ownership. GetFileData also receives a file.rootdir attribute that gets passed along to PutFileData, so it can rebase the original file’s location relative to the configured target directory. Unlike GetFile/PutFile, these processors work with entire directory trees and are triggered by incoming FlowFiles to GetFileData.
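As an illustration of the rebasing step only, here is a small Python sketch. It is not taken from the GetFileData/PutFileData source; the function name rebase and the example paths are hypothetical, and it assumes POSIX-style paths with the original file located under the directory named by file.rootdir.

```python
from pathlib import PurePosixPath


def rebase(original_path: str, root_dir: str, target_dir: str) -> str:
    """Map a source path under `root_dir` to the same relative location under `target_dir`.

    Raises ValueError if `original_path` is not located under `root_dir`.
    """
    relative = PurePosixPath(original_path).relative_to(root_dir)
    return str(PurePosixPath(target_dir) / relative)


# Assumed values: root_dir plays the role of the file.rootdir attribute,
# target_dir the role of the directory configured on the receiving side.
print(rebase("/data/export/logs/2015/app.log", "/data/export", "/mnt/target"))
```

Carrying the root directory as an attribute lets the receiving side rebuild the tree without knowing the sender's layout in advance.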
Eventually, we want to further enhance these two processors so they can break large files into “chunks” and send as multi-part files that get reassembled by PutFileData, resolving the limitations associated with huge files and content repository size; e.g., there are default 100MB chunk threshold and 10MB chunk size properties that will control the chunking, if enabled.
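The chunking described above is planned work, so the following is only a sketch of how such a scheme could behave, written in plain Python. The 100MB threshold and 10MB chunk size come from the message; everything else (the function names, the choice that files at or below the threshold travel whole, and reassembly by byte offset) is an assumption for illustration.

```python
MB = 1024 * 1024


def plan_chunks(file_size: int, threshold: int = 100 * MB, chunk_size: int = 10 * MB) -> list:
    """Return (offset, length) pairs describing how a file would be sent.

    Assumption: files at or below the threshold are sent as a single piece.
    """
    if file_size <= threshold:
        return [(0, file_size)]
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def reassemble(chunks: list) -> bytes:
    """Rebuild the payload from (offset, data) pairs that may arrive out of order."""
    return b"".join(data for _, data in sorted(chunks, key=lambda chunk: chunk[0]))


plan = plan_chunks(105 * MB)
print(len(plan), plan[-1])
print(reassemble([(3, b"def"), (0, b"abc")]))
```

Because each piece stays small, the content repository never has to hold the whole file at once, which is the limitation the message is aiming at.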
If the community is interested and would benefit from these processors, we’re happy to consider further generalizing and contributing these processors, along with any further refinements based upon community review and feedback.
I believe these processors would address both the Jira and David’s original inquiry.
Rick
From: Adam Taft [mailto:adam@adamtaft.com]
Sent: Wednesday, September 23, 2015 1:09 PM
To: users@nifi.apache.org
Subject: Re: Generate flowfiles from flowfile content
Right. This would be the use case that FetchFile [1] would help solve.
[1] https://issues.apache.org/jira/browse/NIFI-631
On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
Hi David,
When you say "files I need to retrieve", are you referring to files on the local filesystem where NiFi is running?
If so, I am not aware of an existing processor that does that. Currently we have GetFile which polls a directory, but that is not what you want here.
It would be fairly straightforward to implement with a custom processor though... You would read the incoming FlowFile content to get the filename, then create a new FlowFile with your desired name, and write the content of the local file to the new FlowFile.
-Bryan
On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com> wrote:
Hello,
In a flow I am defining, I receive a flowfile containing a JSON string. Using the SplitJson processor I can extract some JSON paths pointing to some files I need to retrieve, but the filename is the content of the generated flowfile. So I would need to be able to read the content and generate a flowfile with that name instead. How could I do that?
Thanks!
Re: Generate flowfiles from flowfile content
Posted by Adam Taft <ad...@adamtaft.com>.
Right. This would be the use case that FetchFile [1] would help solve.
[1] https://issues.apache.org/jira/browse/NIFI-631
On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
> Hi David,
>
> When you say "files I need to retrieve", are you referring to files on the
> local filesystem where NiFi is running?
>
> If so, I am not aware of an existing processor that does that. Currently
> we have GetFile which polls a directory, but that is not what you want here.
>
> It would be fairly straightforward to implement with a custom processor
> though... You would read the incoming FlowFile content to get the filename,
> then create a new FlowFile with your desired name, and write the content of
> the local file to the new FlowFile.
>
> -Bryan
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com>
> wrote:
>
>> Hello,
>>
>> In a flow I am defining, I receive a flowfile containing a JSON
>> string. Using the SplitJson processor I can extract some JSON paths
>> pointing to some files I need to retrieve, but the filename is the content
>> of the generated flowfile. So I would need to be able to read the content
>> and generate a flowfile with that name instead. How could I do that?
>>
>> Thanks!
>>
>>
>
Re: Generate flowfiles from flowfile content
Posted by Mark Payne <ma...@hotmail.com>.
One thing to note, if you are trying to pull a file from the local file system, is that there is already a ticket [1] that would allow
us to pull a file from the local file system using an attribute value. I know this ticket is actively being worked on, but I
don't know exactly when we expect to have it included in the build.
Thanks
-Mark
[1] https://issues.apache.org/jira/browse/NIFI-631
> On Sep 23, 2015, at 1:11 PM, Bryan Bende <bb...@gmail.com> wrote:
>
> Hi David,
>
> When you say "files I need to retrieve", are you referring to files on the local filesystem where NiFi is running?
>
> If so, I am not aware of an existing processor that does that. Currently we have GetFile which polls a directory, but that is not what you want here.
>
> It would be fairly straightforward to implement with a custom processor though... You would read the incoming FlowFile content to get the filename, then create a new FlowFile with your desired name, and write the content of the local file to the new FlowFile.
>
> -Bryan
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <davidklmlg@hotmail.com> wrote:
> Hello,
>
> In a flow I am defining, I receive a flowfile containing a JSON string. Using the SplitJson processor I can extract some JSON paths pointing to some files I need to retrieve, but the filename is the content of the generated flowfile. So I would need to be able to read the content and generate a flowfile with that name instead. How could I do that?
>
> Thanks!
>
>
Re: Generate flowfiles from flowfile content
Posted by Bryan Bende <bb...@gmail.com>.
Hi David,
When you say "files I need to retrieve", are you referring to files on the
local filesystem where NiFi is running?
If so, I am not aware of an existing processor that does that. Currently we
have GetFile which polls a directory, but that is not what you want here.
It would be fairly straightforward to implement with a custom processor
though... You would read the incoming FlowFile content to get the filename,
then create a new FlowFile with your desired name, and write the content of
the local file to the new FlowFile.
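The three steps just described can be sketched as follows. This is plain Python standing in for the logic only; a real processor would be written in Java against the NiFi API. The FlowFile dataclass, the function fetch_local_file, and the sample file are all hypothetical.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi FlowFile: a payload plus string attributes."""
    content: bytes = b""
    attributes: dict = field(default_factory=dict)


def fetch_local_file(incoming: FlowFile) -> FlowFile:
    """Treat the incoming content as a local path and return a FlowFile holding that file."""
    path = incoming.content.decode("utf-8").strip()  # step 1: read the filename from the content
    with open(path, "rb") as handle:
        data = handle.read()                         # step 3: the local file's bytes become the content
    attributes = {
        "filename": os.path.basename(path),          # step 2: name the new FlowFile after the file
        "path": os.path.dirname(path),
    }
    return FlowFile(content=data, attributes=attributes)


# Usage with a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as workdir:
    local_path = os.path.join(workdir, "report.csv")
    with open(local_path, "wb") as handle:
        handle.write(b"id,value\n1,42\n")
    result = fetch_local_file(FlowFile(content=local_path.encode("utf-8")))
    print(result.attributes["filename"], len(result.content))
```

Reading the whole file into memory keeps the sketch short; a real implementation would more likely stream the bytes and route missing files to a failure path.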
-Bryan
On Wed, Sep 23, 2015 at 11:16 AM, David Klim <da...@hotmail.com> wrote:
> Hello,
>
> In a flow I am defining, I receive a flowfile containing a JSON
> string. Using the SplitJson processor I can extract some JSON paths
> pointing to some files I need to retrieve, but the filename is the content
> of the generated flowfile. So I would need to be able to read the content
> and generate a flowfile with that name instead. How could I do that?
>
> Thanks!
>
>