Posted to users@nifi.apache.org by Ryan Hendrickson <ry...@gmail.com> on 2021/08/09 20:59:30 UTC

NiFi Cluster Relationship Back Pressure & Size Threshold Math

Hi all,
To confirm, when using a NiFi Cluster, are the Relationship settings
"Back Pressure Object Threshold" and "Size Threshold" applied per node, or
cluster-wide?

For example, if we have a 10 node cluster and set the Back Pressure Object
Threshold to 100, would we then expect the Relationship to queue up to
1,000 FlowFiles before exceeding the threshold?
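To make the two possible readings concrete, here is a minimal arithmetic sketch using the hypothetical 10-node / 100-object numbers above (the variable names are mine, not anything from NiFi):

```python
# Two possible readings of a 100-object Back Pressure Object Threshold
# on a 10-node cluster (the hypothetical example above).
nodes = 10
threshold = 100

# Per-node reading: each node's local queue trips back pressure at 100
# FlowFiles, so up to nodes * threshold FlowFiles can be queued
# cluster-wide before every node is back-pressured.
per_node_capacity = nodes * threshold

# Cluster-wide reading: the threshold applies to the summed queue, so
# back pressure trips once the total across all nodes reaches 100.
cluster_wide_capacity = threshold

print(per_node_capacity)      # 1000
print(cluster_wide_capacity)  # 100
```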

We have the following setup:
Update Attribute -----Relationship----> JoltTransform

In our case, we set a 70,000 object threshold and have 7 servers in the
cluster.

When hovering on the Relationship's status bar, it says:   "Queue: 100%
full (based on 70,000 object threshold)"

There are two things that don't make sense about that message:
1. The Relationship only has ~350,000 FlowFiles in it; for it to be 100%
full, it would need 490,000.
2. There are 7 nodes in the cluster, so shouldn't the "based on xx object
threshold" message say "based on 490,000 object threshold"?

We also have a 2GB "Size Threshold" set on the Relationship.  The
Relationship hover text reads: "Queue 36% full (based on 2GB data size
threshold)".

The math doesn't work out if you check it yourself: 7 nodes x a 2GB limit
each equals 14GB cluster-wide.

Taking the reported value of 3.45 GB in the queue and dividing it by 14 GB
gives ~25%, which is 11 percentage points off from the 36% noted in the
hover text.
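Checking the reported figures against both readings (a quick sketch using the observed values above; this is plain arithmetic, not anything from the NiFi API):

```python
nodes = 7
object_threshold = 70_000   # configured Back Pressure Object Threshold
size_threshold_gb = 2.0     # configured Size Threshold

queued_objects = 350_000    # observed FlowFile count in the Relationship
queued_gb = 3.45            # observed queue size

# Object threshold: percent full under each reading.
obj_pct_single = queued_objects / object_threshold * 100             # 500.0
obj_pct_cluster = queued_objects / (nodes * object_threshold) * 100  # ~71.4

# Size threshold: percent full under each reading.
size_pct_single = queued_gb / size_threshold_gb * 100              # 172.5
size_pct_cluster = queued_gb / (nodes * size_threshold_gb) * 100   # ~24.6

# Neither reading reproduces the UI's "100% full" or "36% full",
# which is exactly what prompts the question.
print(round(obj_pct_cluster, 1), round(size_pct_cluster, 1))
```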

We are running NiFi 1.13.2 on these servers.  All servers appear to be
communicating and processing data, per the UI's NiFi Cluster overview
(thread count, queue size, status, etc.).

Any thoughts on this would be appreciated.

Thanks,
Ryan

Re: NiFi Cluster Relationship Back Pressure & Size Threshold Math

Posted by Ryan Hendrickson <ry...@gmail.com>.
Hi all,

Today, 1 node of the 7-node cluster was underperforming.  That single
node's queue exceeded the object back pressure limit, which caused the
other 6 nodes to stop processing data while waiting for the one node to
complete its work.
Question 1: Is this expected behavior?

To remedy the situation, we set a round-robin load-balance strategy on
that relationship.  We observed FlowFiles moving to the other nodes, but
this unnecessarily saturates the network.
Question 2: When a FlowFile is Load Balanced from one node to another, is
the entire Content Claim load balanced?  Or just the small portion
necessary?
Question 3: When the Load Balancing begins, how many threads can it use?
And how many files can be moved in parallel?
Question 4: When Load Balancing uses Round Robin, will it skip a node
whose Relationship has already exceeded the object back pressure limit,
similar to how DistributeLoad's "Next Available" strategy works?

Thanks,
Ryan
