Posted to users@nifi.apache.org by KhajaAsmath Mohammed <md...@gmail.com> on 2020/10/25 18:00:06 UTC

NIFI - Wait before merging the files

Hi,

I have a use case where I need to hold flow files for 2 minutes and then
merge them using a merge processor. Is there a processor or script that
can do this?

Thanks,
Asmath

Re: NIFI - Wait before merging the files

Posted by Michael Loftis <ml...@wgops.com>.
Both MergeRecord and MergeContent have a Max Bin Age property.
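
As a rough illustration of how a bin-age timeout can hold items back and
then release them together, here is a simplified Python model. It is not
NiFi's implementation: the AgedBin class and its min_entries and
max_bin_age fields are names made up for this sketch, and the flush rule
(release when the bin is full enough or its oldest item is too old) is an
assumption about the general idea.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgedBin:
    """Hypothetical bin: flushes when full enough OR when it is too old."""
    min_entries: int            # assumed threshold for "full enough"
    max_bin_age: float          # assumed maximum age in seconds
    items: List[str] = field(default_factory=list)
    created_at: Optional[float] = None

    def offer(self, item: str, now: float) -> None:
        # The bin's age is measured from the arrival of its first item.
        if not self.items:
            self.created_at = now
        self.items.append(item)

    def poll(self, now: float) -> Optional[List[str]]:
        # Return the merged batch if the bin is ready, otherwise None.
        if not self.items:
            return None
        full_enough = len(self.items) >= self.min_entries
        too_old = now - self.created_at >= self.max_bin_age
        if not (full_enough or too_old):
            return None
        merged, self.items, self.created_at = self.items, [], None
        return merged


# A high entry threshold means only the 2-minute age releases the bin.
bin_ = AgedBin(min_entries=1000, max_bin_age=120.0)
for t, name in [(0.0, "a"), (30.0, "b"), (90.0, "c")]:
    bin_.offer(name, now=t)

early = bin_.poll(now=100.0)   # bin is neither full enough nor old enough
late = bin_.poll(now=120.0)    # bin has reached its maximum age
print(early, late)
```

In this model, keeping the entry threshold well above the expected volume
is what makes the age limit the only trigger, so the batch waits the full
2 minutes before it is released.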


Re: NIFI - Wait before merging the files

Posted by KhajaAsmath Mohammed <md...@gmail.com>.
Yes Dave, ours is streaming data. We do ETL on the live data before loading it into the target. We parse the JSON data, and since the volume is high and we don’t know how many events we will get, I am looking to load the data by merging it instead of doing individual inserts. This avoids extra threads on the database and performs a bulk update.

MergeContent works when I stop the prior processor, so I was looking to implement a wait strategy.

Thanks,
Asmath 


Re: NIFI - Wait before merging the files

Posted by David Early <da...@grokstream.com>.
Can you expand on your use case?  Do you need to wait for a specific amount
of time after receiving the first flow file?  Why would a scheduled run
every 2 min not work?  Is the issue that you need all related flowfiles
merged together?

Dave


Re: NIFI - Wait before merging the files

Posted by KhajaAsmath Mohammed <md...@gmail.com>.
Thanks David, I have the same issue. I was never able to get it to work with MergeRecord directly. Since my flow has a stream of data, I can’t implement cron in the middle of the job. I might have to see whether a Groovy script has an option to wait for a specified amount of time before the merge operation.


Re: NIFI - Wait before merging the files

Posted by David Early <da...@grokstream.com>.
I have a case where I have a single stream of data items that need to be
merged into a single file.

I do this by setting the number of bins in merge to 1 and using the cron
scheduler to run the merge every 15 min.  I never got the bin age to work
the way I wanted.

I set the number of flowfiles to include in the output to a value much
greater than the expected number of flowfiles that will appear in the input
queue.

This creates a single output flowfile on a timed schedule.

In my case, I follow this with a record-based query that deduplicates the
data.
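
The approach above can be sketched in plain Python as follows. This only
models the idea (one bin, a merge that runs once per scheduled tick, an
entry limit far above the expected queue depth, then deduplication); the
function names, the "id" key, and the sample records are all hypothetical
and not part of NiFi.

```python
from collections import deque
from typing import Deque, Dict, Hashable, List


def run_scheduled_merge(queue: Deque[Dict], max_entries: int) -> List[Dict]:
    """One scheduled run: drain up to max_entries items into a single batch."""
    merged: List[Dict] = []
    while queue and len(merged) < max_entries:
        merged.append(queue.popleft())
    return merged


def deduplicate(records: List[Dict], key: str) -> List[Dict]:
    """Keep the first record seen for each value of `key`, preserving order."""
    seen: set[Hashable] = set()
    unique: List[Dict] = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


# Hypothetical input queue: five records, some sharing the same "id".
queue = deque({"id": i % 3, "value": i} for i in range(5))

# max_entries is set far above the expected queue depth, so a single
# scheduled run empties the queue into one merged output.
merged = run_scheduled_merge(queue, max_entries=10_000)
unique = deduplicate(merged, key="id")
print(len(merged), len(unique))
```

Because the entry limit is never reached, the schedule alone decides when
a merged output is produced, which is what gives one output per interval.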

Dave
