Posted to dev@nifi.apache.org by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com> on 2016/03/31 23:06:05 UTC

Question about NiFi HA

Hey, guys and gals. I’m pretty new to NiFi, so it's quite possible, if not likely, that I am missing something. That said, would someone please comment on this observation, conclusion, and question?

My observation is that FlowFiles get queued in local queues on individual nodes, and those FlowFiles and queues are not replicated to other nodes. So it's my conclusion that if a node is lost, the FlowFiles queued on that node are also lost (or at least stuck there if and until the node is recovered).

While I understand that there might be ways to work around this, such as using separate NiFi clusters and remote process groups and duplicating the flow processing, is there anything on the roadmap that would address this in NiFi itself? Storing the queues and FlowFiles on a distributed file system like HDFS comes to mind, so that any “local” queue is distributed to more than one node.

Thanks in advance,

Chris

Re: Question about NiFi HA

Posted by Joe Witt <jo...@gmail.com>.

Chris,

Absolutely: https://cwiki.apache.org/confluence/display/NIFI/High+Availability+Processing

Thanks
Joe

On Thu, Mar 31, 2016 at 5:06 PM, McDermott, Chris Kevin (MSDU -
STaTS/StorefrontRemote) <ch...@hpe.com> wrote:
> Hey, guys and gals. I’m pretty new to NiFi, so it's quite possible, if not likely, that I am missing something. That said, would someone please comment on this observation, conclusion, and question?
>
> My observation is that FlowFiles get queued in local queues on individual nodes, and those FlowFiles and queues are not replicated to other nodes. So it's my conclusion that if a node is lost, the FlowFiles queued on that node are also lost (or at least stuck there if and until the node is recovered).
>
> While I understand that there might be ways to work around this, such as using separate NiFi clusters and remote process groups and duplicating the flow processing, is there anything on the roadmap that would address this in NiFi itself? Storing the queues and FlowFiles on a distributed file system like HDFS comes to mind, so that any “local” queue is distributed to more than one node.
>
> Thanks in advance,
>
> Chris