Posted to users@nifi.apache.org by Jean-Sebastien Vachon <js...@brizodata.com> on 2019/04/01 18:29:04 UTC

Load balancing strategy/thoughts

Hi all,

Over the last couple of days, I've been playing with the different load balancing options.
They all seem to do what they are designed for, but I have a small issue and I am not sure how to deal with it...

Let's say I have a processor A whose output is load-balanced across all the nodes in my cluster to a processor B. Once everything has been processed on each node, I want to bring everything back to a single node to merge the results together and perform additional processing.

Should I use a MergeContent on the main/primary node, or configure the output queue to use the "Single node" load balancing strategy? The documentation for the latter says that "the node the flows will be sent to is not defined".

What is the recommended approach for this?

Thanks

Re: Load balancing strategy/thoughts

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi,

The final steps of my flow produce a file containing the information computed in the various processors and save it as a single file to be delivered to a customer. I would rather send one larger file than many small ones.

Thanks for your comments; I will configure the MergeContent processor's queue and see how it goes.
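As a sketch of what that configuration might look like, the settings below are expressed as the "properties" map one would send when updating the processor through the NiFi REST API. The property names follow the standard MergeContent processor; the concrete values, and the `batch.id` correlation attribute, are illustrative assumptions rather than recommendations.

```python
# Hypothetical MergeContent property map for merging many small flow files
# into one larger delivery file. Values are illustrative assumptions.
merge_content_properties = {
    # "Defragment" would instead require fragment.identifier/index/count attributes
    "Merge Strategy": "Bin-Packing Algorithm",
    # concatenate content into a single output file
    "Merge Format": "Binary Concatenation",
    # hypothetical attribute that groups one customer batch together
    "Correlation Attribute Name": "batch.id",
    "Minimum Number of Entries": "1000",
    "Maximum Number of Entries": "10000",
    # flush a bin even if it never reaches the minimum
    "Max Bin Age": "5 min",
}

for name, value in merge_content_properties.items():
    print(f"{name} = {value}")
```

Tuning the entry counts and "Max Bin Age" controls the trade-off between output file size and how long flow files wait in a bin before being merged.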
________________________________
From: Bryan Bende <bb...@gmail.com>
Sent: Monday, April 1, 2019 2:52 PM
To: users@nifi.apache.org
Subject: Re: Load balancing strategy/thoughts

Hello,

Is there a reason why it has to be brought back to the same original node?

As long as MergeContent is scheduled on all nodes, then you can choose
"Single node" strategy for the queue leading into MergeContent, and
one of the nodes will get all the pieces and can do the
merge/defragment.

-Bryan

On Mon, Apr 1, 2019 at 2:29 PM Jean-Sebastien Vachon
<js...@brizodata.com> wrote:
>
> Hi all,
>
> Over the last couple of days, I've been playing with the different load balancing options.
> They all seem to do what they are designed for, but I have a small issue and I am not sure how to deal with it...
>
> Let's say I have a processor A whose output is load-balanced across all the nodes in my cluster to a processor B. Once everything has been processed on each node, I want to bring everything back to a single node to merge the results together and perform additional processing.
>
> Should I use a MergeContent on the main/primary node, or configure the output queue to use the "Single node" load balancing strategy? The documentation for the latter says that "the node the flows will be sent to is not defined".
>
> What is the recommended approach for this?
>
> Thanks

Re: Load balancing strategy/thoughts

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

Is there a reason why it has to be brought back to the same original node?

As long as MergeContent is scheduled on all nodes, then you can choose
"Single node" strategy for the queue leading into MergeContent, and
one of the nodes will get all the pieces and can do the
merge/defragment.
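The queue setting described above can also be applied programmatically. The sketch below builds the PUT body that the NiFi REST API expects when updating a connection's load balance strategy; the connection id, revision version, and host are placeholders that would come from a prior GET of the connection in a real cluster.

```python
# Sketch: switch a queue to the "Single node" load balancing strategy via
# the NiFi REST API (available in NiFi 1.8.0+). Ids and versions here are
# placeholders, not values from a real cluster.
import json


def single_node_update(connection_id, revision_version):
    """Build the PUT body that sets loadBalanceStrategy to SINGLE_NODE."""
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": connection_id,
            "loadBalanceStrategy": "SINGLE_NODE",
            # compression is optional; DO_NOT_COMPRESS is the default
            "loadBalanceCompression": "DO_NOT_COMPRESS",
        },
    }


payload = single_node_update("1234-abcd", 3)
print(json.dumps(payload, indent=2))

# The actual request would be something like:
#   requests.put(f"{host}/nifi-api/connections/{connection_id}", json=payload)
```

Other accepted strategy values are DO_NOT_LOAD_BALANCE, ROUND_ROBIN, and PARTITION_BY_ATTRIBUTE, mirroring the choices in the connection configuration dialog.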

-Bryan

On Mon, Apr 1, 2019 at 2:29 PM Jean-Sebastien Vachon
<js...@brizodata.com> wrote:
>
> Hi all,
>
> Over the last couple of days, I've been playing with the different load balancing options.
> They all seem to do what they are designed for, but I have a small issue and I am not sure how to deal with it...
>
> Let's say I have a processor A whose output is load-balanced across all the nodes in my cluster to a processor B. Once everything has been processed on each node, I want to bring everything back to a single node to merge the results together and perform additional processing.
>
> Should I use a MergeContent on the main/primary node, or configure the output queue to use the "Single node" load balancing strategy? The documentation for the latter says that "the node the flows will be sent to is not defined".
>
> What is the recommended approach for this?
>
> Thanks