Posted to users@nifi.apache.org by Kiran <b....@gmail.com> on 2017/02/15 21:58:40 UTC
Re[2]: MergeContent across a NiFi cluster
Thanks for the reply Joe.
I'm glad I wasn't missing something obvious. I'm afraid I'm stuck with
the file size limitation, but I'll have a word with the guys who
configure the load balancer to see what affinity options they have.
Thanks
Brian
------ Original Message ------
From: "Joe Witt" <jo...@gmail.com>
To: users@nifi.apache.org; "Kiran" <b....@gmail.com>
Sent: 15/02/2017 21:36:41
Subject: Re: MergeContent across a NiFi cluster
>Brian,
>
>Great use case, and you're right that we don't have an easy way of
>handling this now. If you do have a load balancer in front of the
>receiving NiFi cluster and it supports affinity of some kind, then I
>believe you can set a header on the HTTP POST. The header value would
>come from a flowfile attribute that is present on each split and holds
>the hash of the full object. If the load balancer ensured that all
>splits with a matching header went to the same machine, then you'd be
>in business. Some load balancers do this (I'm thinking of a commercial
>one). But I admit that is a lot of moving parts to keep in mind. We
>need to improve our site-to-site feature to do things like
>automatically split content for you and handle the
>partitioning/affinity logic I suggested. You might also consider
>avoiding the splitting for now to keep things super simple, though I
>recognize that exposes alternative tradeoffs.
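
[Editor's sketch] The affinity idea above can be illustrated with a few
lines of Python. This is not NiFi or load balancer code. It only shows
why routing on a hash of a shared header value keeps every split of one
object on the same node. The header name X-Affinity-Key, the node names
and the pick_node helper are all assumptions made for illustration.

```python
import hashlib

def pick_node(affinity_key: str, nodes: list[str]) -> str:
    """Deterministically map a header value to one backend node."""
    digest = hashlib.sha256(affinity_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap
    # it onto the list of nodes.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

# Hypothetical names for a 4 node receiving cluster.
NODES = ["nifi-1", "nifi-2", "nifi-3", "nifi-4"]

# Three splits of one object. Each carries the same header value
# (hypothetical: the hash of the full, unsplit object).
splits = [{"X-Affinity-Key": "object-hash-abc123", "part": i} for i in range(3)]

# Every split resolves to the same node, so a merge step on that
# node would see all of the fragments.
targets = {pick_node(s["X-Affinity-Key"], NODES) for s in splits}
```

A plain modulo mapping like this reshuffles keys whenever the node list
changes, so fragments that are in flight during a change to the cluster
could still end up on different nodes.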
>
>Great case for us to work on/rally around though.
>
>Thanks
>Joe
>
>On Wed, Feb 15, 2017 at 4:29 PM, Kiran <b....@gmail.com> wrote:
>>Hello,
>>
>>I need to send data from one organisation to another but there are data
>>size limits between them (this isn't my choice and has been enforced on
>>me). I've got a 4 node NiFi cluster in each organisation.
>>
>>The sending NiFi cluster has the following data flow:
>>Ingest the data by various means
>> -> Compress Data using CompressContent
>> -> If file size > X amount I use SplitContent
>> -> HTTPS POST to load balancer sitting in front of the NiFi
>>cluster in the other organisation
>>
>>On the receiving NiFi cluster I wanted to:
>> -> Receive the data
>> -> MergeContent
>> -> Do whatever else with the data...
>>
>>The problem I can't get round is that if I split the content into 3
>>fragments and send them to the receiving NiFi cluster, then because
>>it's behind a load balancer I can't guarantee that the 3 fragments are
>>received by the same node.
>>
>>Q1) I'm assuming that for MergeContent to work all the fragments of a
>>single piece of data have to arrive on the same NiFi node, or is there
>>an option to have it work across a cluster?
>>
>>Q2) How long does the MergeContent processor wait for all the
>>fragments? If one of the fragments gets lost, does it time out after a
>>certain period?
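
[Editor's sketch] The two questions above can be made concrete with a
small in-memory model of defragmentation. It is loosely modelled on my
understanding of MergeContent's Defragment strategy, which groups
flowfiles by the fragment.identifier, fragment.index and fragment.count
attributes and uses the Max Bin Age property as a timeout. The
Defragmenter class and its method names are hypothetical and this is
not NiFi's implementation. The point is that the bins live in one
process, so fragments that land on different nodes never complete a
bin, and an incomplete bin only goes away when it ages out.

```python
from dataclasses import dataclass, field

@dataclass
class Bin:
    count: int        # number of fragments expected
    created: float    # time the first fragment arrived, in seconds
    parts: dict = field(default_factory=dict)   # fragment index -> bytes

class Defragmenter:
    """Hypothetical single-node model of merge-by-fragment with a timeout."""

    def __init__(self, max_bin_age: float):
        self.max_bin_age = max_bin_age
        self.bins = {}   # fragment identifier -> Bin

    def offer(self, identifier, index, count, payload, now):
        """Add a fragment. Return the merged bytes once all have arrived, else None."""
        b = self.bins.setdefault(identifier, Bin(count=count, created=now))
        b.parts[index] = payload
        if len(b.parts) == b.count:
            del self.bins[identifier]
            # Reassemble in index order, whatever the arrival order was.
            return b"".join(b.parts[i] for i in sorted(b.parts))
        return None

    def expire(self, now):
        """Drop bins older than max_bin_age and return their identifiers."""
        stale = [k for k, b in self.bins.items()
                 if now - b.created > self.max_bin_age]
        for k in stale:
            del self.bins[k]
        return stale
```

In this model a caller would run expire() periodically and treat the
returned identifiers as failed merges.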
>>
>>I was thinking one way to solve this is to have the HTTPListener on the
>>receiving NiFi only listening on the primary node, which would ensure
>>all the fragments arrive on the same node. The downside would be that I
>>end up with idle NiFi nodes.
>>
>>Is there anything obvious that I've missed that would solve my issue?
>>
>>Thanks in advance,
>>
>>Brian
>>
>