Posted to dev@nifi.apache.org by Michal Klempa <mi...@gmail.com> on 2017/01/10 08:50:18 UTC

Re: NiFi Clustering

Hi,
we have been doing some tests with a NiFi cluster and similar questions arose.
Our configuration is as follows:
NiFi ClusterA:
172.31.12.232 nifi-cluster-04 (sample configuration
nifi-cluster-04.properties in attachment)
172.31.5.194 nifi-cluster-05
172.31.15.84 nifi-cluster-06
Standalone ZooKeeper, 3 instances, sample configuration
nifi-cluster-04.zoo.cfg in attachment.

NiFi ClusterB:
172.31.9.147 nifi-cluster-01 (sample configuration
nifi-cluster-01.properties in attachment)
172.31.24.77 nifi-cluster-02
172.31.8.152 nifi-cluster-03
Standalone ZooKeeper, 3 instances, sample configuration
nifi-cluster-01.zoo.cfg in attachment.
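
For context, the cluster-related part of such a nifi.properties looks roughly like this (an illustrative excerpt with NiFi 1.x property names and example ports, taking nifi-cluster-04 as an example; the real files are in the attachment):
```
# Illustrative excerpt only (NiFi 1.x property names); the actual
# nifi-cluster-0x.properties files are in the attachment.
nifi.web.http.port=8081
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-cluster-04
nifi.cluster.node.protocol.port=8082
nifi.remote.input.host=nifi-cluster-04
nifi.remote.input.socket.port=8083
nifi.zookeeper.connect.string=nifi-cluster-04:2181,nifi-cluster-05:2181,nifi-cluster-06:2181
```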

We have tested the following:
ClusterA_flow (template in attachment):
GenerateFlowFile -> output_port ("to_clusterB" - the port to be
imported as RemoteProcessGroup from ClusterB)
                                 -> PutFile ("/tmp/clusterA", create
missing dirs: false)

ClusterB_flow (template in attachment):
RemoteProcessGroup (attached to 172.31.12.232:8081/nifi, remote ports:
"to_clusterB") > PutFile ("/tmp/clusterB", create missing dirs: false)

The testing scenario is: GenerateFlowFile in ClusterA sends the file to the
output port "to_clusterB" and also to PutFile ("/tmp/clusterA"); the FlowFile
is received from the RemoteProcessGroup in ClusterB and saved to "/tmp/clusterB"
on the ClusterB machines.

The following situations were tested:
Situation 1: all nodes are up and running. Three FlowFiles are generated in
ClusterA, one on each node, and all three files are transferred to ClusterB,
although the distribution on ClusterB is not even. When we rerun
GenerateFlowFile (e.g. every 10 sec) 4 times, we get 12 FlowFiles generated
in ClusterA (4 on each node), but on ClusterB we got 6 flow files on node
nifi-cluster-01, 2 on node nifi-cluster-02 and 4 on node nifi-cluster-03.
Although the distribution is not even, the FlowFiles are properly transferred
to ClusterB, and that is what matters.
Conclusion: if everything is green, everything works as expected (and the
same as with separate NiFi instances).

Situation 2: We ran GenerateFlowFile 2 times on ClusterA and the FlowFiles
were successfully transferred to ClusterB. Then we removed the target
directory "/tmp/clusterB" on node nifi-cluster-01. We executed
GenerateFlowFile two more times. As the PutFile there is configured NOT to
create target directories, we expected errors. The key point, however, is
how the NiFi cluster can help resolve them. Although the Failure relationship
from PutFile is routed back to PutFile's input, the result was: 12 FlowFiles
generated in ClusterA (4 on each node), but after the directory removal on
node nifi-cluster-01, 6 flow files remained stuck on node nifi-cluster-01,
circling around PutFile with a "target directory does not exist" error.
Conclusion: From this we can see that although we have a cluster setup, the
nodes do balance somewhere inside the RemoteProcessGroup but do not
rebalance FlowFiles stuck on relationships once they enter the flow, even
after they are penalized by the processor. Is this the desired behavior?
Are there any plans to improve on this?

Situation 3: We ran GenerateFlowFile 2 times on ClusterA and the FlowFiles
were successfully transferred to ClusterB. Then we shielded node
nifi-cluster-01 (ClusterB) using iptables, so that both NiFi and ZooKeeper
became unreachable on this node.
Iptables commands used:
```
iptables -A INPUT -p tcp --sport 513:65535 --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 --dport 513:65535 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -j DROP
iptables -A OUTPUT -j DROP
```
This should simulate a HW failure from NiFi's and ZooKeeper's point of view.
We executed GenerateFlowFile two more times. The result: 12 FlowFiles
generated in ClusterA in total (4 on each node); after shielding the
nifi-cluster-01 node, the 6 flow files from the two later runs were still
transferred to ClusterB (distributed unevenly across nodes nifi-cluster-02
and nifi-cluster-03).
Conclusion: From this we can see that the NiFi cluster setup does help us
transfer FlowFiles if one of the destination nodes becomes unavailable. For
separate NiFi instances, we are currently trying to figure out how to
arrange the flows to achieve this behavior. Any ideas?

Situation 4: We ran GenerateFlowFile 2 times on ClusterA and the FlowFiles
were successfully transferred to ClusterB. Then we shielded node
nifi-cluster-04 (ClusterA) using iptables, so that both NiFi and ZooKeeper
became unreachable on this node.
Iptables commands used:
```
iptables -A INPUT -p tcp --sport 513:65535 --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 --dport 513:65535 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -j DROP
iptables -A OUTPUT -j DROP
```
This should simulate a HW failure from NiFi's and ZooKeeper's point of view.

GenerateFlowFile kept executing on its schedule and we were unable to stop
it, because the UI became unavailable on ClusterA. After shielding the
nifi-cluster-04 node, the remaining 2 nodes in ClusterA kept generating flow
files and these were transferred to ClusterB, so the flow was running. But
it was unmanageable, as the UI was unavailable.
Conclusion: From this we can see that the NiFi cluster setup does help us
transfer FlowFiles if one of the source nodes becomes unavailable.
Unfortunately, we experienced UI issues. For separate NiFi instances, we are
currently trying to figure out how to arrange the flows to achieve this
behavior. Any ideas?

* * *

Moreover, we tested the upgrade process for flow.xml.gz. Currently, we are
using separate NiFi instances managed by Ansible (+Jenkins). The flow.xml.gz
upgrade job basically consists of (see the shell sketch below):
1. service nifi stop
2. back up the old flow.xml.gz and place the new one into the NiFi conf/ directory
3. service nifi start
As our flows are pre-tested in a staging environment, we have never
experienced issues in production, such as NiFi failing to start because of a
damaged flow.xml.gz. Everything works fine.
Even if something broke, we have other separate hot production NiFi
instances running with the old flow.xml.gz, so the overall flow keeps
running through the other nodes (with a performance hit, of course). We can
still revert to the original flow.xml.gz on the single node we are upgrading
at a time.
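
For illustration, the per-node upgrade roughly corresponds to this shell sketch (paths and the service name are our own conventions, not NiFi defaults):
```
#!/usr/bin/env bash
# Rough sketch of the per-node flow.xml.gz upgrade driven from Ansible/Jenkins.
# NIFI_CONF and NEW_FLOW are illustrative paths, not NiFi defaults.
set -euo pipefail

NIFI_CONF=/opt/nifi/conf
NEW_FLOW=/tmp/flow.xml.gz   # pre-tested flow coming from the staging environment

service nifi stop
cp "${NIFI_CONF}/flow.xml.gz" "${NIFI_CONF}/flow.xml.gz.$(date +%Y%m%d%H%M%S).bak"
cp "${NEW_FLOW}" "${NIFI_CONF}/flow.xml.gz"
service nifi start
```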

Now the question is: if we are going to use the NiFi cluster feature, how
can we achieve rolling upgrades of flow.xml.gz? Should we run a separate
NiFi cluster and switch between the two clusters?
We experienced this behavior: a NiFi instance does not join the NiFi cluster
if its flow.xml.gz differs. We had to turn off all NiFi instances in the
cluster for a while and start a single one with the new flow.xml.gz to
populate the flow pool with the new version. Then we were forced to deploy
the new flow.xml.gz to the other 2 nodes as well, as they refused to join
the cluster :) A rough sketch of that sequence is shown below.
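
Hostnames, paths and the service name below are only examples; the point is the ordering of the steps:
```
# Illustrative only: the whole-cluster flow.xml.gz swap we ended up doing.
set -euo pipefail

# 1. Stop every node in the cluster.
for host in nifi-cluster-01 nifi-cluster-02 nifi-cluster-03; do
  ssh "${host}" 'service nifi stop'
done

# 2. Start a single node with the new flow so it becomes the cluster's flow.
scp flow.xml.gz nifi-cluster-01:/opt/nifi/conf/flow.xml.gz
ssh nifi-cluster-01 'service nifi start'

# 3. Push the same flow to the remaining nodes; with the old flow they refuse to join.
for host in nifi-cluster-02 nifi-cluster-03; do
  scp flow.xml.gz "${host}":/opt/nifi/conf/flow.xml.gz
  ssh "${host}" 'service nifi start'
done
```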

* * *

For our use cases, for now, we find using separate NiFi instances superior
to using a NiFi cluster, mainly because of the flow.xml.gz upgrade (unless
somebody gives us advice on this! thank you).
Regarding flow balancing and the setup of inter-cluster communication, we do
not know how to achieve this without a NiFi cluster setup. As our flow is
currently very simple and can basically run in parallel on multiple single
instances, the separate NiFi instances work well (our source system even
supports balancing across multiple IPs, so we do not have to set up IP
balancing on routers).

Any comments are welcome. Thanks.
Michal Klempa

On Sat, Dec 10, 2016 at 9:03 AM, Caton, Nigel <ni...@cgi.com> wrote:
> Thanks Bryan.
>
> On 2016-12-09 15:32 (-0000), Bryan Bende <b....@gmail.com> wrote:
>> Nigel,
>>
>> The advantage of using a cluster is that whenever you change something in
>> the UI, it will be changed on all nodes, and you also get a central view of
>> the metrics/stats across all nodes. If you use standalone nodes you would
>> have to go to each node and make the same changes.
>>
>> It sounds like you are probably doing automatic deployments of a flow that
>> you set up elsewhere and aren't planning to ever modify the production
>> nodes, so maybe the above is a non-issue for you.
>>
>> The rolling deployment scenario depends on whether you are updating the
>> flow, or just code. For example, if you are just updating code then you
>> should be able to do a rolling deployment in a cluster, but if you are
>> updating the flow then I don't think it will work, because a node will
>> come up with the new flow and attempt to join the cluster, and the cluster
>> won't accept it because the flow is different.
>>
>> Hope that helps.
>>
>> -Bryan
>>
>>
>> On Fri, Dec 9, 2016 at 9:33 AM, Caton, Nigel <ni...@cgi.com> wrote:
>>
>> > Are there any views of the pros/cons of running a native NiFi cluster
>> > versus a cluster of standalone

Re: NiFi Clustering

Posted by Bryan Bende <bb...@gmail.com>.
Hi Michal,

Trying to answer some of your questions...

Situation 2, Conclusion: From this, we can see that although we have a
cluster setup, the nodes do balance somewhere inside the RemoteProcessGroup
but do not rebalance the FlowFiles stuck on relationships once they enter
the flow, even after they are penalized by the processor. Is this the
desired behavior? Are there any plans to improve on this?

There are no plans that I know of to change this. The way NiFi clustering
works is that data is partitioned across the nodes, which depends a lot on
how you bring the data into the cluster, in your case via site-to-site.
Once data is on a node it is never automatically moved to another node
unless the data flow is configured to do this (site-to-site back to itself).

Situation 3: For separate NiFi instances, we are currently trying to figure
out how to arrange the flows to achieve this behavior. Any ideas?

If you have standalone NiFi instances, you probably have two options...

- The standalone instances can use ListenHTTP as the entry point and you can
stick a load balancer in front; the source NiFi would use PostHTTP with the
URL of the load balancer (see the curl sketch after these options).

- The source NiFi could use the DistributeLoad processor to round-robin flow
files to RemoteProcessGroups, and you would have an RPG for each standalone
instance.
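
For the first option, a minimal sketch of what a sender (or a quick manual test) would do against the load balancer; the hostname and port are made up, and "contentListener" is ListenHTTP's usual default base path:
```
# Minimal sketch: POST a file through a load balancer to standalone NiFi
# instances running ListenHTTP. nifi-lb.example.com:9999 is a made-up
# address; in the real flow, PostHTTP would point at the same URL.
curl -X POST \
     --data-binary @/tmp/clusterA/sample.txt \
     http://nifi-lb.example.com:9999/contentListener
```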

Upgrading: Now the question is, if we are going to use the NiFi cluster
feature, how can we achieve rolling upgrades of flow.xml.gz? Should we run
a separate NiFi cluster and switch between the two clusters?

I have seen the second-cluster approach work successfully. Again, a lot of
it depends on how the data is coming into the clusters. You need a way to
easily switch the data source over to a new destination, or redirect it to
the new destination using DNS changes. Then you can upgrade the original
cluster and, once it is up, switch the source back.

Hope that helps.

-Bryan


On Tue, Jan 10, 2017 at 4:24 AM, Michal Klempa <mi...@gmail.com>
wrote:

> All the files are here: https://gist.github.com/michalklempa

Re: NiFi Clustering

Posted by Michal Klempa <mi...@gmail.com>.
All the files are here: https://gist.github.com/michalklempa
