Posted to dev@nifi.apache.org by "Woschitz, Janosch" <Ja...@thinkbiganalytics.com> on 2017/07/19 14:40:55 UTC

Replication of flow file and content repositories (HA setup)

Hello everyone,

In general, NiFi seems to support HA semantics by establishing multi-master/no-master clustering via ZooKeeper. This works well for reaching consensus on the currently deployed flow. However, it is not clear to me how I can protect against local disk failure apart from relying on a RAID 10 setup.

The flow file and content repositories are stored on local disks. With a rather complex flow, a full outage of a cluster node could result in loss of the data that is "flowing" through that node at the time.

I found the following feature proposals in the wiki which seem to address this problem:
https://cwiki.apache.org/confluence/display/NIFI/Data+Replication
https://cwiki.apache.org/confluence/display/NIFI/High+Availability+Processing
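
For context, the repositories in question are configured per node in conf/nifi.properties; a minimal sketch of pointing them at a dedicated (e.g. RAID-backed) volume might look like the following. The /data/raid10/... paths are hypothetical examples, not defaults:

```properties
# Flow file repository: write-ahead log of flow file attributes and queue state
nifi.flowfile.repository.directory=/data/raid10/flowfile_repository

# Content repository: the actual bytes of flow file content
nifi.content.repository.directory.default=/data/raid10/content_repository

# Provenance repository: lineage/event history
nifi.provenance.repository.directory.default=/data/raid10/provenance_repository
```

Note that these settings only control where each node writes locally; they do not provide any cross-node replication, which is exactly the gap the proposals above discuss.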

Unfortunately, I was not able to find any pointers on the current state of these proposals. Is there any active work happening in this direction?

How could one support the project in achieving the goals laid out in these proposals? I would think the work needs to be broken down into smaller work packages beforehand to allow smooth integration into master/upstream.

Thanks,
Janosch

Re: Replication of flow file and content repositories (HA setup)

Posted by Joe Witt <jo...@gmail.com>.
Janosch,

You're right that distributed durability (across nodes) is not
natively supported today.  You can use traditional RAID techniques to
get strong durability within a node, though yes, if the node is down
its data is not available during that time.  In environments with
shared storage mechanisms like Ceph/Gluster/EBS there are some really
powerful options that can be explored.  We need to understand the
trade-off between application-level volume management backed by such
distributed shared storage and application-level data replication as
HDFS and Kafka do it.  There are pros and cons to each, and I agree it
needs to be broken down further.

Thanks
Joe
