Posted to users@nifi.apache.org by Kien Truong <du...@gmail.com> on 2018/12/03 11:41:48 UTC

Load Balancing connection issue

Hi all,

We're testing the new load-balanced connection feature of NiFi 1.8.

After some cluster-wide restarts, some connections with load balancing 
enabled seem to get stuck.

The connection is always shown as actively balancing; however, the size 
of the queue shows very little change, and the queue contents cannot be 
viewed.

NiFi always reports that the queue has 0 flow files when we try to view 
the queue content, despite showing a non-zero number in the UI and in REST.

When this happens, if we disable load balancing, we are able to view 
the queue content immediately, but the problem returns if we enable 
the load balancing feature again.


In addition, we sometimes see a negative queue size exception in the log 
when this happens, but not always.


Regards,

Kien



Re: Load Balancing connection issue

Posted by dan young <da...@gmail.com>.
We've been seeing issues, three times now, where it seems like a flowfile
is stuck in a load balanced queue. We're not able to empty the queue or
view the flowfile that appears to be in the queue. The only resolution for
us right now is to detach the node where the flowfile is in the queue, then
restart that node. After that, the flowfile is gone.

I've enabled flowfile expiration after X time, but that doesn't appear to
help...

Regards

Dano


Re: Load Balancing connection issue

Posted by Mark Payne <ma...@hotmail.com>.
Hi Kien,

Thanks for the details! Can you tell us a bit more about how the Connection is configured?
What is the Load Balance Strategy that you're using? Do you have Back-Pressure enabled? 
Is it configured for the default 10,000 FlowFiles / 1 GB, or have you changed those settings?
Are you reaching that backpressure threshold? I.e., is backpressure being applied?

The fact that you cannot view data in the queue is not surprising. When you attempt to view the listing,
it will only show what is available in the 'local' queue. I.e., any data that is waiting to be load balanced
to another node will not be shown. That's why, when you turn off load-balancing, the data is viewable again,
because it is no longer waiting to go to another node but instead is waiting in the 'local' queue.

One thing that I would recommend, to get more information, is to go to the REST endpoint (in your browser is fine)
/nifi-api/processors/<processor id>/diagnostics

where <processor id> is the UUID of either the source or the destination of the Connection in question. This gives us
a lot of information about the internals of the Connection. The easiest way to get that Processor ID is to click on the
processor on the canvas and look at the Operate palette on the left-hand side. You can copy & paste from there. If you
then send the diagnostics information to us, we can analyze it to help understand what's happening.
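
If a browser is not convenient, the same request can be scripted. The following is only a minimal sketch in Python, assuming an unsecured NiFi instance that answers plain HTTP without authentication and returns JSON. The function names (diagnostics_url, fetch_diagnostics) and the example base URL are hypothetical and not part of NiFi; only the /nifi-api/processors/<processor id>/diagnostics path comes from the message above.

```python
import json
import urllib.request
from urllib.parse import quote


def diagnostics_url(base_url: str, processor_id: str) -> str:
    """Build the diagnostics URL for one processor.

    base_url is an assumption, e.g. "http://localhost:8080";
    processor_id is the UUID copied from the Operate palette.
    """
    base = base_url.rstrip("/")
    return f"{base}/nifi-api/processors/{quote(processor_id, safe='')}/diagnostics"


def fetch_diagnostics(base_url: str, processor_id: str, timeout: float = 30.0) -> dict:
    """GET the diagnostics document and parse it as JSON.

    Assumes no authentication is required; a secured cluster would
    need credentials or a client certificate, which this sketch omits.
    """
    request = urllib.request.Request(
        diagnostics_url(base_url, processor_id),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_diagnostics("http://localhost:8080", "<processor id>") and writing the result out with json.dumps(..., indent=2) would give a file that can be attached to a reply on the list.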

Many thanks!
-Mark

