Posted to users@nifi.apache.org by Axel Schwarz <Ax...@emailn.de> on 2021/08/03 07:08:25 UTC

Re: Re: Re: Re: No Load Balancing since 1.13.2

Hey guys,

I think I found the "trick" for at least version 1.13.2 and of course I'll share it with you.
I now use the following load balancing properties:

# cluster load balancing properties #
nifi.cluster.load.balance.host=192.168.1.10
nifi.cluster.load.balance.port=6342
nifi.cluster.load.balance.connections.per.node=4
nifi.cluster.load.balance.max.thread.count=8
nifi.cluster.load.balance.comms.timeout=30 sec

So I use the host's IP address for balance.host instead of 0.0.0.0 or the FQDN, and have no balance.address property at all.
This led to only partial load balancing in my case, as already mentioned. It looked like I needed one more step to reach the goal, and this step seems to be deleting all state management files.
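To verify the binding, I ran roughly the following on every node (the IP and port are from my setup; nc is just one way to test reachability and may need to be installed):

# on each node: the balance port should be bound to the node's own IP
ss -tlnp | grep 6342        # expect e.g. 192.168.1.10:6342 ... LISTEN
# from each of the other nodes: the port must be reachable
nc -zv 192.168.1.10 6342    # expect a "succeeded"/"open" result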

Through the state-management.xml config file I had moved the state management directory outside of the NiFi installation, because the config file says it "is important that the directory be copied over to the new version when upgrading NiFi". So every time I upgraded or reinstalled NiFi during my load balancing odyssey, the state management remained completely untouched.
As soon as I changed that, by deleting the entire state management directory before reinstalling NiFi with the above mentioned properties, load balancing immediately worked throughout the whole cluster.
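Concretely, the procedure looked roughly like this (the state path is only an example; use whatever directory your state-management.xml points to):

# on every node:
./bin/nifi.sh stop
rm -rf /var/lib/nifi/state/local   # example path for the local state provider directory
# reinstall/upgrade NiFi, set nifi.cluster.load.balance.host=<this node's IP>
./bin/nifi.sh start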


I think for my flow it is not that bad to delete the state management, as I only use one stateful processor to increase some counter. And in the tests I have run so far, I could not find any wrong behaviour whatsoever. But of course I can't test everything, so if any of you have some important facts about deleting the state management, please let me know :)

Besides that, I now feel like this solved my problem. I'll have to keep an eye on it when updating to version 1.14.0 later on, but I think I can figure that out. So thanks for all your support! :)

--- Original message ---
From: "Jens M. Kofoed" <jm...@gmail.com>
Date: 29.07.2021 11:08:28
To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
Subject: Re: Re: Re: No Load Balancing since 1.13.2

> Hmm... I can't remember :-( sorry
>
> My configuration for version 1.13.2 is like this:
> # cluster node properties (only configure for cluster nodes) #
> nifi.cluster.is.node=true
> nifi.cluster.node.address=nifi-node01.domaine.com
> nifi.cluster.node.protocol.port=9443
> nifi.cluster.node.protocol.threads=10
> nifi.cluster.node.protocol.max.threads=50
> nifi.cluster.node.event.history.size=25
> nifi.cluster.node.connection.timeout=5 sec
> nifi.cluster.node.read.timeout=5 sec
> nifi.cluster.node.max.concurrent.requests=100
> nifi.cluster.firewall.file=
> nifi.cluster.flow.election.max.wait.time=5 mins
> nifi.cluster.flow.election.max.candidates=3
>
> # cluster load balancing properties #
> nifi.cluster.load.balance.address=192.168.1.11
> nifi.cluster.load.balance.port=6111
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
>
> So I defined "nifi.cluster.node.address" with the hostname and not an IP
> address, and "nifi.cluster.load.balance.address" with the IP address of
> the server.
> And triple-check the configuration on all servers :-)
>
> Kind Regards
> Jens M. Kofoed
>
>
> On Thu, 29 Jul 2021 at 10:11, Axel Schwarz <Ax...@emailn.de> wrote:
>
> > Hey Jens,
> >
> > in issue NIFI-8643 you wrote the last comment, describing exactly the
> > same behaviour as we're experiencing now: 2 of 3 nodes were load
> > balancing. How did you get the third node to participate in load
> > balancing? An update to 1.14.0 does not change anything for us.
> >
> > https://issues.apache.org/jira/browse/NIFI-8643?focusedCommentId=17361418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17361418
> >
> > --- Original message ---
> > From: "Jens M. Kofoed" <jm...@gmail.com>
> > Date: 28.07.2021 12:07:50
> > To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> > Subject: Re: Re: No Load Balancing since 1.13.2
> >
> > > hi
> > >
> > > I can see that you have configured
> > > nifi.cluster.load.balance.address=0.0.0.0
> > >
> > > Have you tried to set the correct IP address?
> > > node1: nifi.cluster.load.balance.address=192.168.1.10
> > > node2: nifi.cluster.load.balance.address=192.168.1.11
> > > node3: nifi.cluster.load.balance.address=192.168.1.12
> > >
> > > regards
> > > Jens M. Kofoed
> > >
> > > On Wed, 28 Jul 2021 at 11:17, Axel Schwarz <Axelkopter@emailn.de> wrote:
> > >
> > >
> > > > Just tried Java 11. But it still does not work. Nothing changed. :(
> > > >
> > > > --- Original message ---
> > > > From: Jorge Machado <jo...@me.com>
> > > > Date: 27.07.2021 13:08:55
> > > > To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> > > > Subject: Re: No Load Balancing since 1.13.2
> > > >
> > > > > Did you try Java 11? I have a client running a similar setup to
> > > > > yours, but with a lower NiFi version, and it works fine. Maybe it
> > > > > is worth a try.
> > > > >
> > > > >
> > > > > > On 27. Jul 2021, at 12:42, Axel Schwarz <Ax...@emailn.de> wrote:
> > > > > >
> > > > > > I did indeed, but I updated from u161 to u291, as this was the
> > > > > > newest version at that time, because I thought it could help.
> > > > > > So the issue started under u161. But I just saw that u301 is out.
> > > > > > I will try this as well.
> > > > > >
> > > > > > --- Original message ---
> > > > > > From: Pierre Villard <pi...@gmail.com>
> > > > > > Date: 27.07.2021 10:18:38
> > > > > > To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> > > > > > Subject: Re: No Load Balancing since 1.13.2
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I believe the minor u291 is known to have issues (for some of its
> > > > > > early builds). Did you upgrade the Java version recently?
> > > > > >
> > > > > > Thanks,
> > > > > > Pierre
> > > > > >
> > > > > > On Tue, 27 Jul 2021 at 08:07, Axel Schwarz <Axelkopter@emailn.de> wrote:
> > > > > > Dear Community,
> > > > > >
> > > > > > we're running a secured 3-node NiFi cluster on Java 8_u291 and
> > > > > > Debian 7 and are experiencing problems with load balancing since
> > > > > > version 1.13.2.
> > > > > >
> > > > > > I'm fully aware of issue NIFI-8643 and tested a lot around this,
> > > > > > but I have to say that this is not our problem. Mainly because the
> > > > > > balance port never binds to localhost, but also because I
> > > > > > implemented all workarounds under version 1.13.2 and even tried
> > > > > > version 1.14.0 by now, but load balancing still does not work.
> > > > > > What we experience is best described as "the primary node balances
> > > > > > with itself"...
> > > > > >
> > > > > > So what it does is open the balancing connections to its own IP
> > > > > > instead of the IPs of the other two nodes. And the other two nodes
> > > > > > don't open balancing connections at all.
> > > > > >
> > > > > > When executing "ss | grep 6342" on the primary node, this is what
> > > > > > it looks like:
> > > > > >
> > > > > > [root@nifiHost1 conf]# ss | grep 6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51380      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51376      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51378      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51370      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51372      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51376
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51374      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51374
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51366      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51370
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51366
> > > > > > tcp    ESTAB      0      0      192.168.1.10:51368      192.168.1.10:6342
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51372
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51378
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51368
> > > > > > tcp    ESTAB      0      0      192.168.1.10:6342       192.168.1.10:51380
> > > > > >
> > > > > > Executing it on the other, non-primary nodes returns absolutely
> > > > > > nothing.
> > > > > >
> > > > > > Netstat shows the following on each server:
> > > > > >
> > > > > > [root@nifiHost1 conf]# netstat -tulpn
> > > > > > Active Internet connections (only servers)
> > > > > > Proto Recv-Q Send-Q Local Address      Foreign Address    State    PID/Program name
> > > > > > tcp        0      0 192.168.1.10:6342  0.0.0.0:*          LISTEN   10352/java
> > > > > >
> > > > > > [root@nifiHost2 conf]# netstat -tulpn
> > > > > > Active Internet connections (only servers)
> > > > > > Proto Recv-Q Send-Q Local Address      Foreign Address    State    PID/Program name
> > > > > > tcp        0      0 192.168.1.11:6342  0.0.0.0:*          LISTEN   31562/java
> > > > > >
> > > > > > [root@nifiHost3 conf]# netstat -tulpn
> > > > > > Active Internet connections (only servers)
> > > > > > Proto Recv-Q Send-Q Local Address      Foreign Address    State    PID/Program name
> > > > > > tcp        0      0 192.168.1.12:6342  0.0.0.0:*          LISTEN   31685/java
> > > > > >
> > > > > > And here is what our load balancing properties look like:
> > > > > >
> > > > > > # cluster load balancing properties #
> > > > > > nifi.cluster.load.balance.host=nifiHost1.contoso.com
> > > > > > nifi.cluster.load.balance.address=0.0.0.0
> > > > > > nifi.cluster.load.balance.port=6342
> > > > > > nifi.cluster.load.balance.connections.per.node=4
> > > > > > nifi.cluster.load.balance.max.thread.count=8
> > > > > > nifi.cluster.load.balance.comms.timeout=30 sec
> > > > > >
> > > > > > When running NiFi version 1.12.1 on the exact same setup in the
> > > > > > exact same environment, load balancing works absolutely fine.
> > > > > > There was a time when load balancing even worked in version
> > > > > > 1.13.2, but I'm not able to reproduce this; it just stopped
> > > > > > working one day after some restart, without any property change
> > > > > > whatsoever.
> > > > > >
> > > > > > If any more information would be helpful, please let me know and
> > > > > > I'll try to provide it as fast as possible.



Re: No Load Balancing since 1.13.2

Posted by "Jens M. Kofoed" <jm...@gmail.com>.
Hi Mark

Just back at the office after a small holiday :-)
I have tested my setup with NiFi 1.14.0 regarding hostname and FQDN.
If I run nslookup node01.domain.lan, I get the address 192.168.1.11.
If I configure nifi.cluster.load.balance.host=node01.domain.lan, netstat -l
shows the following:
tcp        0      0 localhost:6342         0.0.0.0:*               LISTEN

If I configure nifi.cluster.load.balance.host=192.168.1.11, netstat -l shows
the following:
tcp        0      0 node01.domain.lan:6342 0.0.0.0:*               LISTEN

I don't know why it behaves differently from yours, since I do get the
correct IP via nslookup.
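For completeness, this is the whole check I run per node (names and IPs are from my lab, adjust to yours):

nslookup node01.domain.lan    # should return the node's IP, here 192.168.1.11
netstat -l | grep 6342        # shows which address the balance port is bound to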

Kind regards
Jens M. Kofoed

On Fri, 6 Aug 2021 at 15:48, Mark Payne <ma...@hotmail.com> wrote:

> [snip]

Re: No Load Balancing since 1.13.2

Posted by Mark Payne <ma...@hotmail.com>.
Jens,

You’re right - my mistake, the change from “nifi.cluster.load.balance.address” to “nifi.cluster.load.balance.host” was in 1.14.0, not early on. In 1.14.0, only nifi.cluster.load.balance.host is used. The documentation and properties file both used .host, but the code was making use of .address instead. So the code was fixed in 1.14.0 to match what the documentation and nifi.properties file specified.

I just did some testing locally on my MacBook regarding the IP address vs.
hostname.
What I found is that if I use the IP address, it listens as expected.
If I use just <hostname> (not fully qualified), interestingly it listens on
localhost only.
If I run "nslookup <hostname>", I get back <hostname>.lan as the FQDN.
If I use "<hostname>.lan" in my properties, it listens as expected.
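So, just to illustrate the shape of a config that listens correctly in 1.14.0 (example values only, not from a real cluster):

# nifi.properties on node 1 (example values)
nifi.cluster.node.address=node01.domain.lan
# an IP address or a fully qualified, resolvable hostname; not 0.0.0.0 and not a bare hostname
nifi.cluster.load.balance.host=192.168.1.11
nifi.cluster.load.balance.port=6342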

Thanks
-Mark

On Aug 6, 2021, at 12:28 AM, Jens M. Kofoed <jm...@gmail.com> wrote:

> [snip]

Re: No Load Balancing since 1.13.2

Posted by "Jens M. Kofoed" <jm...@gmail.com>.
Hi Mark

In version 1.13.2 (at least) the file
"main/nifi-commons/nifi-properties/src/main/java/org/apache/nifi/util/NiFiProperties.java"
is looking for a property called "nifi.cluster.load.balance.address", which
was reported in https://issues.apache.org/jira/browse/NIFI-8643 and
fixed in version 1.14.0.
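You can verify this yourself in a source checkout, e.g. (the tag name here assumes NiFi's usual rel/nifi-x.y.z release tag convention):

git clone https://github.com/apache/nifi.git
cd nifi && git checkout rel/nifi-1.13.2
# show which load balance property name the code actually reads
grep -n "load.balance" nifi-commons/nifi-properties/src/main/java/org/apache/nifi/util/NiFiProperties.java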

In version 1.14.0 the only way I can get it to work is to type in the IP
address. If I don't specify it, or if I type in the FQDN, the load balance
port will bind to localhost, which has been reported in
https://issues.apache.org/jira/browse/NIFI-9010
The result from running netstat -l:
tcp 0 0 localhost:6342 0.0.0.0:* LISTEN

Kind regards
Jens M. Kofoed



On Thu, 5 Aug 2021 at 23:08, Mark Payne <ma...@hotmail.com> wrote:

> Axel,
>
> I think that I can help clarify some of these things.
>
> First of all: nifi.cluster.load.balance.host vs.
> nifi.cluster.load.balance.address
> * The nifi.cluster.load.balance.host property is what matters.
>
> * The nifi.cluster.load.balance.address is not a real property. NiFi has
> never looked at this property. However, in the first release that included
> load-balancing, there was a typo in which the nifi.properties file had
> “…address” instead of “…host”. This was later addressed.
>
> * So if you have a value for “nifi.cluster.load.balance.address”, it does
> nothing and is always ignored.
>
>
>
> Next: nifi.cluster.load.balance.host property
>
> * nifi.cluster.load.balance.host can be either an IP address or a
> hostname. But if set, other nodes in the cluster MUST be able to
> communicate with the node using whatever value you put here. So using a
> value of 0.0.0.0 will not work. Also, if set, NiFi will listen for incoming
> connections ONLY on that hostname. So if you set it to “localhost”, for
> instance, no other node can connect to it, because no other host can
> connect to the node using “localhost”. So this needs to be an address that
> both the NiFi instance knows about/can bind to, and other nodes in the
> cluster can connect to.
>
> * If nifi.cluster.load.balance.host is NOT set: NiFi will listen for
> incoming requests on all network interfaces / hostnames. It will advertise
> its hostname to other nodes in the cluster according to whatever is set for
> the “nifi.cluster.node.address” property. Meaning that other nodes in the
> cluster must be able to connect to this node using whatever hostname is set
> for the “nifi.cluster.node.address” property. If
> the “nifi.cluster.node.address” property is not set, it advertises its
> hostname as localhost - which means other nodes won’t be able to send to
> it.
>
> So you must specify either the “nifi.cluster.load.balance.host” property
> or the “nifi.cluster.node.address” property.
>
>
>
> Finally: having to delete the state directory
>
> If you change the “nifi.cluster.load.balance.host” or
> “nifi.cluster.load.balance.port” property and restart a node, you must
> restart all nodes in the cluster. Otherwise, the other nodes won’t be able
> to send to that node.
> So, for example, when you changed the load.balance.host from fqdn or
> 0.0.0.0 to the IP address - the other nodes in the cluster would stop
> sending. I created a JIRA [1] for that. In my testing, when I changed the
> hostname, the other nodes stopped sending. But restarting them got things
> back on track. I wasn’t able to replicate the issue after restarting all
> nodes.
>
> Hope this is helpful!
> -Mark
>
> [1] https://issues.apache.org/jira/browse/NIFI-9017
>
>
> On Aug 3, 2021, at 3:08 AM, Axel Schwarz <Ax...@emailn.de> wrote:
>
> Hey guys,
>
> I think I found the "trick" for at least version 1.13.2 and of course I'll
> share it with you.
> I now use the following load balancing properties:
>
> # cluster load balancing properties #
> nifi.cluster.load.balance.host=192.168.1.10
> nifi.cluster.load.balance.port=6342
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
>
> So I use the hosts IP address for balance.host instead of 0.0.0.0 or the
> fqdn and have no balance.address property at all.
> This led to partly load balancing in my case as already mentioned. It
> looked like I needed to do one more step to reach the goal and this step
> seems to be deleting all statemanagement files.
>
> Through the state-management.xml config file I changed the state
> management directory to be outside of the nifi installation, because the
> config file says "it is important, that the directory be copied over to the
> new version when upgrading nifi". So everytime when I upgraded or
> reinstalled Nifi during my load balancing odyssey, the statemanagement
> remained completely untouched.
> As soon as I changed that, by deleting the entire state management
> directory before reinstalling Nifi with above mentioned properties, load
> balancing was immediately working throughout the whole cluster.
>
>
> I think for my flow it is not quite that bad to delete the state
> management as I only use one statefull processor to increase some counter.
> And the times I already tried this by now, I could not encounter any wrong
> behaviour whatsoever. But of course I can't test everything, so when any of
> you have some important facts about deleting the state management, please
> let me know :)
>
> Beside that I now feel like this solved my problem. Gotta have an eye on
> that when updating to version 1.14.0 later on, but I think I can figure
> this out. So thanks for all your support! :)
>
> --- Ursprüngliche Nachricht ---
> Von: "Jens M. Kofoed" <jm...@gmail.com>
> Datum: 29.07.2021 11:08:28
> An: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> Betreff: Re: Re: Re: No Load Balancing since 1.13.2
>
> Hmm... I can't remember :-( sorry
>
> My configuration for version 1.13.2 is like this:
> # cluster node properties (only configure for cluster nodes) #
> nifi.cluster.is.node=true
> nifi.cluster.node.address=nifi-node01.domaine.com
> nifi.cluster.node.protocol.port=9443
> nifi.cluster.node.protocol.threads=10
> nifi.cluster.node.protocol.max.threads=50
> nifi.cluster.node.event.history.size=25
> nifi.cluster.node.connection.timeout=5 sec
> nifi.cluster.node.read.timeout=5 sec
> nifi.cluster.node.max.concurrent.requests=100
> nifi.cluster.firewall.file=
> nifi.cluster.flow.election.max.wait.time=5 mins
> nifi.cluster.flow.election.max.candidates=3
>
> # cluster load balancing properties #
> nifi.cluster.load.balance.address=192.168.1.11
> nifi.cluster.load.balance.port=6111
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
>
> So I defined "nifi.cluster.node.address" with the hostname and
> not an ip
> adress and the "nifi.cluster.load.balance.address" with the ip
> address of
> the server.
> And triple check the configuration at all servers :-)
>
> Kind Regards
> Jens M. Kofoed
>
>
> Den tor. 29. jul. 2021 kl. 10.11 skrev Axel Schwarz <Axelkopter@emailn.de
> >:
>
>
> Hey Jens,
>
> in Issue Nifi-8643 you wrote the last comment with the exactly same
>
>
> behaviour as we're experiencing now. 2 of 3 nodes were load balancing.
>
>
> How did you get the third node to participate in load balancing? An
>
> update
>
> to 1.14.0 does not change anything for us.
>
>
>
> https://issues.apache.org/jira/browse/NIFI-8643?focusedCommentId=17361418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17361418
>
>
>
>
> --- Ursprüngliche Nachricht ---
> Von: "Jens M. Kofoed" <jm...@gmail.com>
> Datum: 28.07.2021 12:07:50
> An: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
>
>
> Betreff: Re: Re: No Load Balancing since 1.13.2
>
> hi
>
> I can see that you have configured
>
> nifi.cluster.load.balance.address=0.0.0.0
>
>
> Have your tried to set the correct ip adress?
> node1: nifi.cluster.load.balance.address=192.168.1.10
> node2: nifi.cluster.load.balance.address=192.168.1.11
> node3: nifi.cluster.load.balance.address=192.168.1.12
>
> regards
> Jens M. Kofoed
>
> Den ons. 28. jul. 2021 kl. 11.17 skrev Axel Schwarz <
>
> Axelkopter@emailn.de>:
>
>
>
> Just tried Java 11. But still does not work. Nothing changed.
>
> :(
>
>
> --- Original Message ---
> From: Jorge Machado <jo...@me.com>
> Date: 27.07.2021 13:08:55
> To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> Subject: Re: No Load Balancing since 1.13.2
>
> Did you try Java 11? I have a client running a similar setup to yours
> but with a lower NiFi version and it works fine. Maybe it is worth
> trying it.
>
>
> On 27. Jul 2021, at 12:42, Axel Schwarz <Ax...@emailn.de> wrote:
>
> I did indeed, but I updated from u161 to u291, as this was the newest
> version at that time, because I thought it could help.
>
> So the issue started under u161. But I just saw that u301 is out. I
> will try this as well.
>
> --- Original Message ---
> From: Pierre Villard <pi...@gmail.com>
> Date: 27.07.2021 10:18:38
> To: users@nifi.apache.org, Axel Schwarz <Ax...@emailn.de>
> Subject: Re: No Load Balancing since 1.13.2
>
> Hi,
>
> I believe the minor u291 is known to have issues (for some of its
> early builds). Did you upgrade the Java version recently?
>
> Thanks,
> Pierre
>
> On Tue, Jul 27, 2021 at 08:07, Axel Schwarz <Axelkopter@emailn.de> wrote:
>
> Dear Community,
>
> we're running a secured 3-node NiFi cluster on Java 8_u291 and Debian 7
> and experiencing problems with load balancing since version 1.13.2.
>
> I'm fully aware of Issue NIFI-8643 and tested a lot around this, but
> gotta say that this is not our problem. Mainly because the balance port
> never binds to localhost, but also because I implemented all workarounds
> under version 1.13.2 and even tried version 1.14.0 by now, but load
> balancing still does not work.
> What we experience is best described as "the primary node balances
> with itself"...
>
> So what it does is open the balancing connections to its own IP instead
> of the IPs of the other two nodes. And the other two nodes don't open
> balancing connections at all.
>
>
> When executing "ss | grep 6342" on the
>
> primary node,
>
> this
>
> is what it looks like:
>
>
> [root@nifiHost1 conf]# ss | grep 6342
> tcp    ESTAB      0      0      192.168.1.10:51380
>
> <
>
> http://192.168.1.10:51380/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51376
>
> <
>
> http://192.168.1.10:51376/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51378
>
> <
>
> http://192.168.1.10:51378/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51370
>
> <
>
> http://192.168.1.10:51370/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51372
>
> <
>
> http://192.168.1.10:51372/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51376 <http://192.168.1.10:51376/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51374
>
> <
>
> http://192.168.1.10:51374/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51374 <http://192.168.1.10:51374/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51366
>
> <
>
> http://192.168.1.10:51366/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51370 <http://192.168.1.10:51370/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51366 <http://192.168.1.10:51366/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:51368
>
> <
>
> http://192.168.1.10:51368/>
>
>               192.168.1.10:6342 <http://192.168.1.10:6342/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51372 <http://192.168.1.10:51372/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51378 <http://192.168.1.10:51378/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51368 <http://192.168.1.10:51368/>
>
>
>
>
> tcp    ESTAB      0      0      192.168.1.10:6342
>
> <
>
> http://192.168.1.10:6342/>
>
>                192.168.1.10:51380 <http://192.168.1.10:51380/>
>
>
>
>
> Executing it on the other non primary nodes, just
>
> returns
>
> absolutely
>
> nothing.
>
>
> Netstat shows the following on each server:
>
> [root@nifiHost1 conf]# netstat -tulpn
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 192.168.1.10:6342       0.0.0.0:*               LISTEN      10352/java
>
> [root@nifiHost2 conf]# netstat -tulpn
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 192.168.1.11:6342       0.0.0.0:*               LISTEN      31562/java
>
> [root@nifiHost3 conf]# netstat -tulpn
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 192.168.1.12:6342       0.0.0.0:*               LISTEN      31685/java
>
>
>
> And here is what our load balancing properties look like:
>
> # cluster load balancing properties #
> nifi.cluster.load.balance.host=nifiHost1.contoso.com
> nifi.cluster.load.balance.address=0.0.0.0
> nifi.cluster.load.balance.port=6342
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
>
> When running NiFi in version 1.12.1 on the exact same setup in the
> exact same environment, load balancing is working absolutely fine.
> There was a time when load balancing even worked in version 1.13.2.
> But I'm not able to reproduce this and it just stopped working one day
> after some restart, without changing any property whatsoever.
>
> If any more information would be helpful please let me know and I'll
> try to provide it as fast as possible.

Re: No Load Balancing since 1.13.2

Posted by Mark Payne <ma...@hotmail.com>.
Axel,

I think that I can help clarify some of these things.

First of all: nifi.cluster.load.balance.host vs. nifi.cluster.load.balance.address
* The nifi.cluster.load.balance.host property is what matters.

* The nifi.cluster.load.balance.address is not a real property. NiFi has never looked at this property. However, in the first release that included load-balancing, there was a typo in which the nifi.properties file had “…address” instead of “…host”. This was later addressed.

* So if you have a value for “nifi.cluster.load.balance.address”, it does nothing and is always ignored.
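In practice that means a nifi.properties carried over from one of those early releases may still contain the dead "…address" key. A minimal before/after sketch (the IP is an example value, not taken from any particular setup):

# never read by NiFi in any release - safe to delete
nifi.cluster.load.balance.address=0.0.0.0

# the key NiFi actually reads
nifi.cluster.load.balance.host=192.168.1.10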



Next: nifi.cluster.load.balance.host property

* nifi.cluster.load.balance.host can be either an IP address or a hostname. But if set, other nodes in the cluster MUST be able to communicate with the node using whatever value you put here. So using a value of 0.0.0.0 will not work. Also, if set, NiFi will listen for incoming connections ONLY on that hostname. So if you set it to “localhost”, for instance, no other node can connect to it, because no other host can connect to the node using “localhost”. So this needs to be an address that both the NiFi instance knows about/can bind to, and other nodes in the cluster can connect to.
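Applied to a three-node cluster like the one in this thread, that means each node sets its own externally reachable address. A sketch using the example IPs from earlier in the thread:

# node1 nifi.properties
nifi.cluster.load.balance.host=192.168.1.10
# node2 nifi.properties
nifi.cluster.load.balance.host=192.168.1.11
# node3 nifi.properties
nifi.cluster.load.balance.host=192.168.1.12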

* If nifi.cluster.load.balance.host is NOT set: NiFi will listen for incoming requests on all network interfaces / hostnames. It will advertise its hostname to other nodes in the cluster according to whatever is set for the “nifi.cluster.node.address” property. Meaning that other nodes in the cluster must be able to connect to this node using whatever hostname is set for the “nifi.cluster.node.address” property. If the “nifi.cluster.node.address” property is not set, it advertises its hostname as localhost - which means other nodes won’t be able to send to it.

So you must specify either the “nifi.cluster.load.balance.host” property or the “nifi.cluster.node.address” property.
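A sketch of the second option, leaving the load-balance host unset and letting the node address do the advertising (the hostname is borrowed from Jens' example configuration and must be resolvable by the other nodes):

# advertised to peers for load balancing when load.balance.host is unset
nifi.cluster.node.address=nifi-node01.domaine.com
# left empty on purpose: NiFi then listens on all interfaces
nifi.cluster.load.balance.host=
nifi.cluster.load.balance.port=6342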



Finally: having to delete the state directory

If you change the “nifi.cluster.load.balance.host” or “nifi.cluster.load.balance.port” property and restart a node, you must restart all nodes in the cluster. Otherwise, the other nodes won’t be able to send to that node.
So, for example, when you changed the load.balance.host from the FQDN or 0.0.0.0 to the IP address, the other nodes in the cluster would stop sending. I created a JIRA [1] for that. In my testing, when I changed the hostname, the other nodes stopped sending, but restarting them got things back on track. I wasn't able to replicate the issue after restarting all nodes.
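A quick way to verify the outcome, reusing the ss diagnostics from earlier in this thread (6342 is the load-balance port from Axel's configuration):

# run on each node after ALL nodes have been restarted
ss -tlnp | grep 6342   # the node should listen on its own address
ss -tnp  | grep 6342   # established rows should include the other nodes'
                       # IPs, not only the node talking to itself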

Hope this is helpful!
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-9017

