Posted to dev@nifi.apache.org by ski n <ra...@gmail.com> on 2019/07/22 14:17:17 UTC

Best practices for working loosely-coupled and error handling

I am working on migrating a large ESB process to a NiFi flow. This process
contains around 40 events (40 different flowfiles). On the ESB a loosely
coupled pattern was used with the help of JMS queues. In NiFi I can use
ports, but then I need to connect those ports. The canvas soon becomes
messy.

Is there a way to use something like a ‘topic’ in NiFi? That is, some kind of
endpoint that does not require connecting items (processors/process groups)
directly, or is this against the dataflow concept, so that you always need an
external broker like Kafka or ActiveMQ for this?

Another question is what to do with failure messages. Can you configure a
default ‘endpoint’ for all failures within a certain process? Currently I
connect all processors to a failure-handling step/port, but this soon gets
messy as well. What is the best practice for errors? Do most people use
auto-termination?
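To make the repetition concrete, this is roughly what the manual wiring amounts
to if scripted against the NiFi REST API (a rough sketch only; the group and
port ids are placeholders and the exact payloads may differ per NiFi version):

import requests

NIFI = "http://localhost:8080/nifi-api"  # assumes an unsecured local instance
GROUP_ID = "<process-group-uuid>"        # placeholder: the group whose failures are collected
FAILURE_PORT_ID = "<output-port-uuid>"   # placeholder: one shared 'failures' output port

# List every processor in the group.
processors = requests.get(f"{NIFI}/process-groups/{GROUP_ID}/processors").json()["processors"]

for proc in processors:
    names = [r["name"] for r in proc["component"]["relationships"]]
    if "failure" not in names:
        continue
    # Route this processor's failure relationship to the shared failure port.
    connection = {
        "revision": {"version": 0},
        "component": {
            "source": {"id": proc["id"], "groupId": GROUP_ID, "type": "PROCESSOR"},
            "destination": {"id": FAILURE_PORT_ID, "groupId": GROUP_ID, "type": "OUTPUT_PORT"},
            "selectedRelationships": ["failure"],
        },
    }
    requests.post(f"{NIFI}/process-groups/{GROUP_ID}/connections", json=connection)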


Regards,


Raymond

Re: Best practices for working loosely-coupled and error handling

Posted by ski n <ra...@gmail.com>.
Thanks for the advice. Grouping multiple flows into one process group and
embedding process groups within process groups are options for keeping some
overview, but they have their limits.

1) If multiple workflows are grouped into one Process Group, then either
    a) you need to create a different output port for each flow, or
    b) you connect multiple flows to the same output port.

It is not allowed to have multiple output ports with the same name in one
Process Group.

2) Embedding also has limitations, as the parent Process Group has no access
to the output port of the embedded Process Group.

Using multiple process groups is also not very practical. For example, I have
two-phase processing of messages: the first phase consists of about ten
process groups and the second phase also has ten process groups. Each process
group of the first phase can connect to all ten process groups of the second
phase, which leads to 100 connections crossing each other.

Basically the functionality is what I want; it is just hard to see what is
happening with so many process groups/ports connected. I see two options:

1) Use queues or topics on an external broker (JMS/AMQP/Kafka); see the sketch below
2) Use Wormhole Connections (when they get implemented...):
https://cwiki.apache.org/confluence/display/NIFI/Wormhole+Connections

I would prefer the second, as then I don't need to use a third-party tool,
but for now I will use the first one. If someone has a better idea, I would
of course like to hear it.
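
For reference, outside of NiFi option 1 boils down to something like the
sketch below (kafka-python; the broker address and topic name are
placeholders). Inside NiFi I would do the same decoupling with
PublishKafka/ConsumeKafka processors pointed at the same topic, so the
phase-one groups never connect directly to the phase-two groups:

from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"       # placeholder broker address
TOPIC = "phase-two-events"      # placeholder topic name

# Phase one: publish the event instead of connecting to a downstream port.
producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send(TOPIC, value=b'{"event": "order-created"}')
producer.flush()

# Phase two: any interested flow consumes the topic, wherever the event came from.
consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER,
                         group_id="phase-two", auto_offset_reset="earliest")
for message in consumer:
    print(message.value)
    break  # one message is enough for the sketch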

Kind regards,

Raymond

Re: Best practices for working loosely-coupled and error handling

Posted by Mike Thomsen <mi...@gmail.com>.
More specifically, I mean wrap each of your 40 workflows in a process
group. I have a workflow that processes some financial data, and it has 3
levels of process groups at its most extreme points to group common
functions and isolate edge cases so none of them are distracting when
looking at the data flow from a higher level while it's running. It's about
100 processors total, but the canvas is quite clean because all of the
functionality is neatly encapsulated in well-organized process groups that
allow us to do things like add new sources and then drop them safely when
they're no longer needed.


Re: Best practices for working loosely-coupled and error handling

Posted by Mike Thomsen <mi...@gmail.com>.
> In NiFi I can use ports, but then I need to connect those ports.

You can wrap each operation in a process group and then connect the process
groups via ports so your main canvas is substantially less cluttered. You can
also nest process groups inside each other; that works really well for
organizing related functionality.
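
If the repetition across 40 flows becomes tedious, the wrapping can also be
scripted. A rough sketch against the NiFi REST API (names and ids are
placeholders, and the exact payloads may vary per NiFi version):

import requests

NIFI = "http://localhost:8080/nifi-api"  # assumes an unsecured local instance
PARENT_ID = "<parent-group-uuid>"        # placeholder: the canvas that holds the wrappers

# Create a wrapper process group for one operation.
pg = requests.post(
    f"{NIFI}/process-groups/{PARENT_ID}/process-groups",
    json={"revision": {"version": 0},
          "component": {"name": "Operation A", "position": {"x": 0.0, "y": 0.0}}},
).json()

# Give it an output port so the parent canvas can connect it to the next group.
requests.post(
    f"{NIFI}/process-groups/{pg['id']}/output-ports",
    json={"revision": {"version": 0},
          "component": {"name": "out", "position": {"x": 0.0, "y": 0.0}}},
)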
