Posted to issues@nifi.apache.org by "Denis Jakupovic (Jira)" <ji...@apache.org> on 2022/12/03 18:11:00 UTC

[jira] [Commented] (NIFI-9598) Load Balancing on labeled nodes and/or fixed amount of usable nodes in process groups

    [ https://issues.apache.org/jira/browse/NIFI-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642883#comment-17642883 ] 

Denis Jakupovic commented on NIFI-9598:
---------------------------------------

Any news on this?

> Load Balancing on labeled nodes and/or fixed amount of usable nodes in process groups
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-9598
>                 URL: https://issues.apache.org/jira/browse/NIFI-9598
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.15.3
>            Reporter: Denis Jakupovic
>            Priority: Trivial
>
> One of NiFi's great features is that it scales linearly simply by adding more nodes. However, the only options today are the DistributeLoad processor and the load-balancing settings on a connection (round robin, partition by attribute name, or single node), so a more granular way of distributing flowfiles through the cluster would be useful.
> Let's assume we have a 10-node NiFi cluster.
> Round Robin: Each node would get 1/10 of the flowfiles.
> Single Node: Only one node would process all flowfiles (FF). The chance that another process group distributes to the same node is 1/10.
> By Attribute: Between 1 and 10 nodes could get the data, and it would not be evenly partitioned.
> Distribute Load Processor: A manual and fixed process that does not scale when more nodes are added to the cluster and needs 
> When several dataflows with different use cases and enormous variance in computational cost share a cluster, one or a few dataflows can slow down all the others. A solution could therefore be to partition the data to labeled nodes, or to set a maximum number of nodes that a process group or connection may use for FF partitioning/load balancing.
> In the cluster configuration, each node could be given a label. Round-robin distribution would then send FF only to the nodes with the matching label. Distribution by attribute name, by contrast, would require building the attribute accordingly, and it cannot be built dynamically.
> Another great feature would be a maximum number of nodes that a process group can use to distribute flowfiles.
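
To make the two proposed behaviours concrete, here is a minimal sketch in plain Python. It is not NiFi code and does not use any NiFi API: the node names, the "heavy" label, and the functions eligible_nodes and round_robin are all hypothetical and exist only for illustration. It assumes the 10-node cluster from the description, filters the nodes by label and/or a maximum node count, and then distributes flowfiles round robin over the nodes that remain.

```python
from collections import Counter
from itertools import cycle


def eligible_nodes(nodes, label=None, max_nodes=None):
    """Return the node names a process group may use (hypothetical helper).

    nodes: dict mapping node name -> set of labels on that node.
    label: if given, keep only nodes carrying this label.
    max_nodes: if given, keep at most this many nodes.
    """
    selected = sorted(
        name for name, labels in nodes.items()
        if label is None or label in labels
    )
    if max_nodes is not None:
        selected = selected[:max_nodes]
    return selected


def round_robin(flowfiles, targets):
    """Assign each flowfile to the next target node in turn."""
    if not targets:
        raise ValueError("no eligible nodes for this label/limit")
    return {ff: node for ff, node in zip(flowfiles, cycle(targets))}


# Assumed example cluster: 10 nodes, three of them labeled "heavy".
cluster = {f"node-{i:02d}": set() for i in range(1, 11)}
for name in ("node-01", "node-02", "node-03"):
    cluster[name].add("heavy")

flowfiles = [f"ff-{i}" for i in range(6)]

# Round robin restricted to labeled nodes.
heavy_assignment = round_robin(flowfiles, eligible_nodes(cluster, label="heavy"))

# Round robin restricted to a maximum of two nodes.
capped_assignment = round_robin(flowfiles, eligible_nodes(cluster, max_nodes=2))

print(Counter(heavy_assignment.values()))
print(Counter(capped_assignment.values()))
```

With these assumed inputs, the six flowfiles land only on the three "heavy" nodes in the first case and only on two nodes in the second, while the remaining nodes stay free for other dataflows. Which nodes a real implementation would pick under a node limit is an open design question; sorting by name here is just a way to keep the sketch deterministic.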



--
This message was sent by Atlassian Jira
(v8.20.10#820010)