Posted to users@nifi.apache.org by "Jens M. Kofoed" <jm...@gmail.com> on 2020/03/24 10:05:38 UTC

How to block a flow until first file is finish

Hi

I'm trying to build a flow where a PutFile processor is only allowed to write
a file once the previous file has finished the following block of the flow.
But I can't work it out with Wait/Notify, since the first file can't get
through the Wait block because the Notify comes after the Wait block.

kind regards
Jens

Re: How to block a flow until first file is finish

Posted by Juan Pablo Gardella <ga...@gmail.com>.
I think these common use cases are easily implemented with Pentaho
Transformation and Pentaho Job constructs. Below is a short summary of how
each type of construct works:

1) *Process Group (PG)*: Every processor inside the PG runs forever; that
is, the processors are always being triggered.
2) *Pentaho Transformation*: Every processor (a.k.a. step in Pentaho
parlance) starts. The equivalent of onTrigger (processRow) returns a
boolean. When it returns false, the step stops. So, if you have N steps,
the transformation completes once all of them have returned false. All
steps start at the same time.
3) *Pentaho Job*: Used for coordination. It can connect different
transformations. For example, run transformation T1; if T1 completes,
execute T2.
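
The completion semantics described in points 2 and 3 can be sketched in a few
lines of Python. This is only an illustration, not Pentaho or NiFi code: the
names make_step, run_transformation and run_job are hypothetical, and the steps
are driven round-robin in a single thread rather than truly in parallel.

```python
from typing import Callable, List

# Hypothetical stand-in for a Pentaho step: calling it plays the role of
# processRow and returns False once the step has nothing left to do.
Step = Callable[[], bool]


def make_step(rows: int, log: List[str], name: str) -> Step:
    """Build a step that handles `rows` rows and then reports completion."""
    remaining = rows

    def process_row() -> bool:
        nonlocal remaining
        if remaining == 0:
            return False  # nothing left: the step stops
        remaining -= 1
        log.append(name)
        return True  # more work may follow

    return process_row


def run_transformation(steps: List[Step]) -> None:
    """Start all steps together; finish once every step has returned False."""
    active = list(steps)
    while active:
        # Keep only the steps that still report work to do.
        active = [step for step in active if step()]


def run_job(transformations: List[List[Step]]) -> None:
    """Coordinate transformations: the next one starts only after the
    previous one has completed."""
    for steps in transformations:
        run_transformation(steps)


log: List[str] = []
t1 = [make_step(2, log, "T1.a"), make_step(1, log, "T1.b")]
t2 = [make_step(1, log, "T2.a")]
run_job([t1, t2])
print(log)
```

In this sketch no T2 entry can appear in the log before every T1 step has
returned False, which is the kind of coordination the summary describes.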

It is difficult to implement this type of flow/ETL in NiFi. We know NiFi is
not an ETL tool, but sometimes it is used as one, and a lot of the time we
need that type of coordination between processors, which is currently not
easy to implement. It would be great to have some new type of
component/construct, like a *Process Group For ETL*, that can be used like a
transformation: once all processors *complete*, continue with the next step.

Data flow and ETL are different types of design. For data flow NiFi is
great, but for ETL, having to coordinate things makes the NiFi flow
complex.

Juan




Re: How to block a flow until first file is finish

Posted by "Jens M. Kofoed" <jm...@gmail.com>.
Hi Chris

Thanks for the links, but yes, I've read them before and I'm using similar
flows in other use cases.
In both examples from the links, a processor splits the data into two flows,
flow A and flow B. In flow A a Wait processor blocks the rest of flow A. At
the end of flow B a Notify processor triggers flow A.
My issue is that I only have one flow. It could be something like the flow
below, where the problem is that the output of the ExecuteProcess uses static
filenames (stupid, I know). So if the ExecuteProcess runs multiple times
before the GetFile is done, it will overwrite old files. Therefore I need
some way to block the PutFile and ExecuteProcess until the GetFile is done
or the output folder is empty:
GetData - UpdateAttribute - WAIT (or block until output folder from
ExecuteProcess is empty) - PutFile - ExecuteProcess - GetFile - NOTIFY - do
something more.
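
The blocking behaviour wanted here resembles a gate that holds a single
permit. The Python sketch below is only a model of that idea, not NiFi
configuration: Gate, try_enter and leave are hypothetical names, with
try_enter standing in for the WAIT step and leave for the NOTIFY step. It
also shows why the first file gets stuck when the gate starts with no permit.
How such an initial permit would be seeded in a real NiFi flow is not covered
in this thread.

```python
import threading


class Gate:
    """One-at-a-time gate: try_enter plays the WAIT role, leave the NOTIFY role."""

    def __init__(self, initial_permits: int) -> None:
        self._sem = threading.Semaphore(initial_permits)

    def try_enter(self) -> bool:
        # Non-blocking, so the sketch reports a refusal instead of hanging.
        return self._sem.acquire(blocking=False)

    def leave(self) -> None:
        # Called once GetFile has drained the output folder.
        self._sem.release()


# Gate seeded with one permit: the first file passes straight away.
gate = Gate(initial_permits=1)
first_file_enters = gate.try_enter()
# A second file arriving while the first is still in flight is held back.
second_file_enters = gate.try_enter()
# The first file reaches the end of the block and hands the permit back.
gate.leave()
second_file_retry = gate.try_enter()

# Gate with no initial permit: even the first file is blocked, because the
# only release happens after the gate it is waiting at.
unseeded_first_file_enters = Gate(initial_permits=0).try_enter()

print(first_file_enters, second_file_enters, second_file_retry,
      unseeded_first_file_enters)
```

The point of the sketch is the initial state: with one permit available up
front, the release at the end of the block is enough to let files through one
at a time.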

regards
Jens





Re: How to block a flow until first file is finish

Posted by Chris Sampson <ch...@naimuri.com>.
Have you looked at some Wait/Notify examples? It sounds like that is what
you want to use:

https://gist.github.com/ijokarumawak/20125d663d2116c6dae1eecae8d7acbc

https://pierrevillard.com/2018/06/27/nifi-workflow-monitoring-wait-notify-pattern-with-split-and-merge/

Your Notify should be on a different "branch" of your Flow than your Wait -
send duplicate copies of FlowFiles to the Wait and also to the part of your
flow that does the "real" processing.
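
As a rough model of that branching, the Python sketch below parks one copy of
each FlowFile at a Wait step and releases it only after the processing branch
has signalled for the same id. It is a simplified illustration with
hypothetical names (signals, fork, notify, poll_wait); it does not use the
NiFi API, and a plain dict stands in for whatever shared store holds the
signals.

```python
from collections import deque

signals = {}        # stand-in for the shared signal store, keyed by FlowFile id
waiting = deque()   # copies parked at the Wait step
released = []       # copies that have passed the Wait step


def fork(flowfile_id: str) -> str:
    """Park one copy at the Wait step and hand the other to the processing branch."""
    waiting.append(flowfile_id)
    return flowfile_id


def notify(flowfile_id: str) -> None:
    """End of the processing branch: record a signal for this FlowFile."""
    signals[flowfile_id] = signals.get(flowfile_id, 0) + 1


def poll_wait() -> None:
    """Release every parked copy whose signal has arrived; keep the rest waiting."""
    for _ in range(len(waiting)):
        flowfile_id = waiting.popleft()
        if signals.get(flowfile_id, 0) > 0:
            released.append(flowfile_id)
        else:
            waiting.append(flowfile_id)


in_flight = [fork("f1"), fork("f2")]
poll_wait()
released_before_notify = list(released)   # nothing has been signalled yet

notify(in_flight[0])                      # processing branch finished "f1"
poll_wait()
released_after_notify = list(released)
still_waiting = list(waiting)
print(released_before_notify, released_after_notify, still_waiting)
```

Because the Notify sits on a separate branch from the Wait, the signal for a
FlowFile can arrive while its copy is still parked, so the Wait never depends
on something that only happens downstream of itself.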


*Chris Sampson*
IT Consultant
*Tel:* 07867 843 675
chris.sampson@naimuri.com


